Re: tail

From: ros...@gmail.com (Chris Angelico)
Newsgroups: comp.lang.python
Subject: Re: tail
Date: Sun, 24 Apr 2022 04:57:20 +1000
Lines: 42
Message-ID: <mailman.214.1650740253.20749.python-list@python.org>
References: <CABbU2U98YKdcnJkDPfzE3Pqso+6LL72usB8hrSBVR0WbhauRoQ@mail.gmail.com>
<CAPTjJmr3AiCyvxXt=-nqNLrJfyQHmG=pvSsM7nU_XxhSe94zgA@mail.gmail.com>
 by: Chris Angelico - Sat, 23 Apr 2022 18:57 UTC

On Sun, 24 Apr 2022 at 04:37, Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
>
> What about introducing a method for text streams that reads the lines
> from the bottom? Java has also a ReversedLinesFileReader with Apache
> Commons IO.

It's fundamentally difficult to do this precisely. In general, there are
three steps to reading the last N lines of a file:

1) Find out the current size of the file (it may still be growing)
2) Seek to the end of the file, minus some threshold that you hope
will contain at least N lines
3) Read from there to the end of the file, split it into lines, and
keep the last N
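
The three steps above can be sketched roughly as follows. This is only an illustration, not a standard library function: the name tail_lines, the 4096-byte starting threshold, and the "double the threshold and retry" policy are all assumptions made for the example. The file is opened in binary mode and lines are decoded only after splitting.

```python
import os

def tail_lines(path, n, threshold=4096, encoding="utf-8"):
    """Return the last n lines of the file at path as a list of str."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        # Step 1: find the current size (the file may grow afterwards).
        size = f.seek(0, os.SEEK_END)
        while True:
            # Step 2: seek to the end minus the threshold (clamped at 0).
            start = max(0, size - threshold)
            f.seek(start)
            # Step 3: read up to the size seen in step 1 and split into lines.
            lines = f.read(size - start).splitlines()
            # Unless we started at byte 0, the first piece may be a partial
            # line, so we need more than n pieces to trust the last n.
            if start == 0 or len(lines) > n:
                return [line.decode(encoding) for line in lines[-n:]]
            threshold *= 2  # the guess was too small; widen it and retry
```

Something like tail_lines("app.log", 10) would then return the last ten lines (the file name is hypothetical). Because the first piece is always discarded when the read did not start at byte 0, a seek that lands in the middle of a line or of a multi-byte character does no harm here.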

Reading the preceding N lines is basically a matter of repeating the
same exercise, but instead of "end of the file", use the byte position
of the line you last read.
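
As a rough sketch of that repetition, the generator below keeps moving the "end" position backwards one block at a time and yields lines from last to first. The name reversed_lines and the block_size parameter are made up for the example, and it assumes "\n" line endings and an encoding such as UTF-8 in which the newline byte never occurs inside a multi-byte character.

```python
import os

def reversed_lines(path, block_size=4096, encoding="utf-8"):
    """Yield the lines of a file from last to first, without line endings."""
    with open(path, "rb") as f:
        end = f.seek(0, os.SEEK_END)
        if end == 0:
            return  # empty file, no lines at all
        # Ignore one trailing newline so it does not produce an empty line.
        f.seek(end - 1)
        if f.read(1) == b"\n":
            end -= 1
        tail = b""  # tail part of a line whose beginning has not been read yet
        while end > 0:
            # Same exercise as before, with "end" standing in for end of file.
            start = max(0, end - block_size)
            f.seek(start)
            pieces = (f.read(end - start) + tail).split(b"\n")
            tail = pieces[0]  # may be incomplete; finish it on the next pass
            for piece in reversed(pieces[1:]):
                yield piece.decode(encoding)
            end = start
        yield tail.decode(encoding)  # whatever is left is the first line
```

Only complete lines are ever decoded, so a block boundary that falls inside a multi-byte character is harmless. Very long lines make the carried-over tail grow, which is one of the places where this simple version gets inefficient.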

The problem is, seeking around in a file is done by bytes, not
characters. So if you know for sure that you can resynchronize
(possible with UTF-8, not possible with some other encodings), then
you can do this, but it's probably best to build it yourself (opening
the file in binary mode).
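
To illustrate what resynchronizing means for UTF-8: continuation bytes always have the bit pattern 10xxxxxx, so after landing at an arbitrary byte position you can skip them until you reach the start of a character. The helper name resync_utf8 is invented for this sketch, and it assumes the data is valid UTF-8 to begin with.

```python
def resync_utf8(data: bytes) -> bytes:
    """Drop leading UTF-8 continuation bytes so data starts on a character."""
    i = 0
    # Continuation bytes look like 0b10xxxxxx; lead bytes and ASCII do not.
    while i < len(data) and (data[i] & 0xC0) == 0x80:
        i += 1
    return data[i:]


raw = "naïve café".encode("utf-8")
# Byte 3 is the second byte of "ï", so decoding raw[3:] directly would fail.
print(resync_utf8(raw[3:]).decode("utf-8"))
```

Encodings without a recognizable "start of character" pattern offer no equivalent trick, which is why this cannot be done for text streams in general.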

This is quite inefficient in general. It would be far FAR easier to do
this instead:

1) Read the entire file and decode bytes to text
2) Split into lines
3) Iterate backwards over the lines

Tada! Done. And in Python, quite easy. The downside, of course, is
that you have to store the entire file in memory.
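
For comparison, a minimal sketch of the simple approach (the function name lines_backwards is just for illustration, and UTF-8 is an assumed default):

```python
def lines_backwards(path, encoding="utf-8"):
    """Read the whole file into memory and iterate over its lines in reverse."""
    with open(path, encoding=encoding) as f:
        text = f.read()              # 1) read everything, decoded to text
    # 2) split into lines; note that str.splitlines() also treats a few
    #    other characters (e.g. form feed) as line boundaries
    lines = text.splitlines()
    return reversed(lines)           # 3) iterate backwards
```

The whole file, plus the list of lines, is held in memory at once.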

So it's up to you: pay the memory price, or pay the complexity price.

Personally, unless the file is tremendously large and I know for sure
that I'm not going to end up iterating over it all, I would pay the
memory price.

ChrisA
