Re: tail

From: hjp-pyt...@hjp.at (Peter J. Holzer)
Newsgroups: comp.lang.python
Subject: Re: tail
Date: Sun, 24 Apr 2022 00:02:29 +0200
Lines: 77
Message-ID: <mailman.224.1650751356.20749.python-list@python.org>
References: <CABbU2U98YKdcnJkDPfzE3Pqso+6LL72usB8hrSBVR0WbhauRoQ@mail.gmail.com>
<CAPTjJmr3AiCyvxXt=-nqNLrJfyQHmG=pvSsM7nU_XxhSe94zgA@mail.gmail.com>
<20220423220229.6lvry4nwsbk2llcd@hjp.at>
 by: Peter J. Holzer - Sat, 23 Apr 2022 22:02 UTC
Attachments: signature.asc (application/pgp-signature)

On 2022-04-24 04:57:20 +1000, Chris Angelico wrote:
> On Sun, 24 Apr 2022 at 04:37, Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
> > What about introducing a method for text streams that reads the lines
> > from the bottom? Java has also a ReversedLinesFileReader with Apache
> > Commons IO.
>
> It's fundamentally difficult to get precise. In general, there are
> three steps to reading the last N lines of a file:
>
> 1) Find out the size of the file (currently, if it's being grown)
> 2) Seek to the end of the file, minus some threshold that you hope
> will contain a number of lines
> 3) Read from there to the end of the file, split it into lines, and
> keep the last N
[...]
> This is quite inefficient in general. It would be far FAR easier to do
> this instead:
>
> 1) Read the entire file and decode bytes to text
> 2) Split into lines
> 3) Iterate backwards over the lines

Which one is more efficient depends very much on the size of the file.
For a file of a few kilobytes, the second solution is probably more
efficient. But for a few gigabytes, that's almost certainly not the
case.
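
For the large-file case, a rough sketch of the seek-based approach (steps
1-3 in the first list quoted above) might look like the following. This is
only an illustration, not a proposal for the standard library: the name
`tail_lines` and the `block_size` parameter are made up here, and it
assumes an encoding such as UTF-8 or Latin-1 in which the byte 0x0A only
ever means "newline" (so it would not work for UTF-16).

```python
import os


def tail_lines(path, n, block_size=4096, encoding="utf-8"):
    """Return the last n lines of a file, reading backwards in blocks.

    Hypothetical helper for illustration; assumes the byte 0x0A never
    occurs inside a multi-byte character (true for UTF-8, not UTF-16).
    """
    if n <= 0:
        return []
    with open(path, "rb") as f:
        # Step 1: find the current size of the file.
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Step 2: move backwards block by block until the buffer holds
        # more than n newlines (so the oldest wanted line is complete)
        # or the start of the file is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    # Step 3: split into lines and keep the last n.
    return [line.decode(encoding) for line in data.splitlines()[-n:]]
```

Only the blocks near the end are read, so the cost depends on the length
of the last n lines rather than on the size of the file. The price is the
extra bookkeeping, and the encoding assumption stated above.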

> Tada! Done. And in Python, quite easy. The downside, of course, is
> that you have to store the entire file in memory.

It's not just memory: you have to read the whole file in the first place,
which is hardly efficient if you only need a tiny fraction of it.
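
To illustrate the difference between the two costs: a variant that keeps
only the last n lines in memory still has to read every byte of the file.
A minimal sketch, using `collections.deque` from the standard library (the
function name `tail_by_full_scan` is made up for this example):

```python
from collections import deque


def tail_by_full_scan(path, n, encoding="utf-8"):
    """Return the last n lines by iterating over the whole file.

    Memory use is bounded by n lines, but the read cost still grows
    with the size of the file.
    """
    with open(path, encoding=encoding) as f:
        # deque(maxlen=n) discards older lines as new ones arrive.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]
```

So bounding memory is the easy part. Avoiding the full read is what
requires seeking from the end.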

> Personally, unless the file is tremendously large and I know for sure
> that I'm not going to end up iterating over it all, I would pay the
> memory price.

Me, too. The problem with a library function (as Marco proposes) is that
you don't know how it will be used, so it cannot assume that files are
small.

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
