Re: tail

<mailman.219.1650747522.20749.python-list@python.org>

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: ros...@gmail.com (Chris Angelico)
Newsgroups: comp.lang.python
Subject: Re: tail
Date: Sun, 24 Apr 2022 06:58:29 +1000
Lines: 53
Message-ID: <mailman.219.1650747522.20749.python-list@python.org>
References: <CABbU2U98YKdcnJkDPfzE3Pqso+6LL72usB8hrSBVR0WbhauRoQ@mail.gmail.com>
<CAPTjJmr3AiCyvxXt=-nqNLrJfyQHmG=pvSsM7nU_XxhSe94zgA@mail.gmail.com>
<CABbU2U8TAvy0zMhUcNtTD0=WpQ6oNYEeZQuKDjnxhG85FVriDg@mail.gmail.com>
<CAPTjJmqnfoPjoNT2CNsrkMVxkzAMHHXHj-G3DuGrJ21SDRNsPA@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
In-Reply-To: <CABbU2U8TAvy0zMhUcNtTD0=WpQ6oNYEeZQuKDjnxhG85FVriDg@mail.gmail.com>
 by: Chris Angelico - Sat, 23 Apr 2022 20:58 UTC

On Sun, 24 Apr 2022 at 06:41, Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
>
> On Sat, 23 Apr 2022 at 20:59, Chris Angelico <rosuav@gmail.com> wrote:
> >
> > On Sun, 24 Apr 2022 at 04:37, Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
> > >
> > > What about introducing a method for text streams that reads the lines
> > > from the bottom? Java has also a ReversedLinesFileReader with Apache
> > > Commons IO.
> >
> > It's fundamentally difficult to get precise. In general, there are
> > three steps to reading the last N lines of a file:
> >
> > 1) Find out the size of the file (currently, if it's being grown)
> > 2) Seek to the end of the file, minus some threshold that you hope
> > will contain a number of lines
> > 3) Read from there to the end of the file, split it into lines, and
> > keep the last N
> >
> > Reading the preceding N lines is basically a matter of repeating the
> > same exercise, but instead of "end of the file", use the byte position
> > of the line you last read.
> >
> > The problem is, seeking around in a file is done by bytes, not
> > characters. So if you know for sure that you can resynchronize
> > (possible with UTF-8, not possible with some other encodings), then
> > you can do this, but it's probably best to build it yourself (opening
> > the file in binary mode).
>
> Well, indeed I have an implementation that does more or less what you
> described for utf8 only. The only difference is that I just started
> from the end of file -1. I'm just wondering if this will be useful in
> the stdlib. I think it's not too difficult to generalise for every
> encoding.
>
> > This is quite inefficient in general.
>
> Why inefficient? I think that readlines() will be much slower, not
> only more time consuming.

It depends on which is more costly: reading the whole file (cost
depends on size of file) or reading chunks and splitting into lines
(cost depends on how well you guess at chunk size). If the lines are
all *precisely* the same number of bytes each, you can pick a chunk
size and step backwards with near-perfect efficiency (it's still
likely to be less efficient than reading a file forwards, on most file
systems, but it'll be close); but if you have to guess, adjust, and
keep going, then you lose efficiency there.
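
To make the trade-off concrete, here is a minimal sketch of both cases,
not a definitive implementation. The names tail_lines, tail_fixed,
chunk_size and record_size are hypothetical and appear nowhere in the
thread. It assumes the file is opened in binary mode and that the
encoding is UTF-8 (or another ASCII-compatible encoding in which the
byte 0x0A only ever means a newline), so splitting on newline bytes
always lands on a character boundary.

```python
import os


def tail_lines(path, n, chunk_size=4096, encoding="utf-8"):
    """Return the last n lines, reading backwards in guessed-size chunks."""
    if n <= 0:
        return []
    with open(path, "rb") as f:           # binary mode: offsets are bytes
        f.seek(0, os.SEEK_END)
        pos = f.tell()                    # size of the file right now
        chunks = []
        newlines = 0
        # Keep stepping back until more than n newlines have been seen,
        # so the (possibly partial) first line can be thrown away.
        while pos > 0 and newlines <= n:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            chunk = f.read(step)
            chunks.append(chunk)
            newlines += chunk.count(b"\n")
    data = b"".join(reversed(chunks))
    # Decode only complete lines; the partial first line is never decoded.
    return [line.decode(encoding) for line in data.splitlines()[-n:]]


def tail_fixed(path, n, record_size, encoding="utf-8"):
    """Return the last n lines when every line is exactly record_size bytes
    (newline included). No guessing: one seek, one read."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        size = f.tell()
        start = max(0, size - n * record_size)
        f.seek(start)
        data = f.read(size - start)
    return [
        data[i:i + record_size].rstrip(b"\n").decode(encoding)
        for i in range(0, len(data), record_size)
    ]
```

In tail_lines the cost is driven by how well chunk_size matches the
real line lengths: a guess that is too small means many seek/read
round trips, and one that is too large reads more than needed. In
tail_fixed the offset is computed exactly, which is the "near-perfect
efficiency" case.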

I don't think this is necessary in the stdlib. If anything, it might
be good on PyPI, but I for one have literally never wanted this.

ChrisA
