Re: tail

https://www.novabbs.com/devel/article-flat.php?id=18026&group=comp.lang.python#18026

From: ros...@gmail.com (Chris Angelico)
Newsgroups: comp.lang.python
Subject: Re: tail
Date: Mon, 25 Apr 2022 01:58:37 +1000
Lines: 50
Message-ID: <mailman.243.1650815931.20749.python-list@python.org>
 by: Chris Angelico - Sun, 24 Apr 2022 15:58 UTC

On Mon, 25 Apr 2022 at 01:47, Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
>
>
>
> On Sat, 23 Apr 2022 at 23:18, Chris Angelico <rosuav@gmail.com> wrote:
>>
>> Ah. Well, then, THAT is why it's inefficient: you're seeking back one
>> single byte at a time, then reading forwards. That is NOT going to
>> play nicely with file systems or buffers.
>>
>> Compare reading line by line over the file with readlines() and you'll
>> see how abysmal this is.
>>
>> If you really only need one line (which isn't what your original post
>> suggested), I would recommend starting with a chunk that is likely to
>> include a full line, and expanding the chunk until you have that
>> newline. Much more efficient than one byte at a time.
>
>
> Well, I would like to have a sort of tail, so as to generalise to more than one line. But I think that once you have a good algorithm for one line, you can repeat it N times.
>

Not always. If you know you want to read 5 lines, it's much more
efficient to read all 5 in one go than to read 1 line, then go back to
the file, five times. Disk reads are the costliest part, with the
possible exception of memory usage (but usually only because it can
cause additional disk *writes*).
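
A minimal sketch of reading N lines in one backwards pass, rather than
repeating a one-line search N times. The name tail_lines and the
4096-byte default chunk size are illustrative assumptions, not anything
from this thread, and it works on raw bytes with b"\n" line endings
(so ASCII-compatible encodings such as UTF-8 only):

```python
import os


def tail_lines(path, n, chunk_size=4096):
    """Return the last n lines of the file as a list of bytes objects.

    Hypothetical helper: assumes lines end in b"\n".
    """
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        chunks = []
        newlines = 0
        # Read whole chunks from the end until more than n newlines have
        # been seen; n + 1 newlines guarantee n complete lines even when
        # the file ends with a newline.
        while pos > 0 and newlines <= n:
            read_size = min(chunk_size, pos)
            pos -= read_size
            f.seek(pos)
            chunk = f.read(read_size)
            newlines += chunk.count(b"\n")
            chunks.append(chunk)
    data = b"".join(reversed(chunks))
    lines = data.split(b"\n")
    # A trailing newline leaves an empty final element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines[-n:]
```

Each extra line costs a read only when it does not fit in the chunks
already fetched, so asking for 5 lines usually costs the same single
read as asking for 1.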

> I understand that you can read a chunk instead of a single byte, so when the newline is found you can return all the cached chunks concatenated. But will this make the search for the start of the line faster? I suppose you always have to read byte by byte (or more, if you're using UTF-16 etc.) and see if there's a newline.
>

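Massively, massively faster. Scanning a chunk for a newline once it's
in memory is cheap; it's the repeated seeks and reads that cost you.
Try it. In particular, try it on an artificially slow file system, so
you can see what it costs.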
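
If no slow file system is at hand, counting read() calls gives a rough
proxy for the cost. This is only a sketch: CountingFile,
last_line_bytewise and last_line_chunked are made-up names, the "file"
is an in-memory io.BytesIO, and a call count is not a real benchmark.

```python
import io


class CountingFile(io.BytesIO):
    """In-memory file that counts read() calls (a stand-in for slow I/O)."""

    def __init__(self, data):
        super().__init__(data)
        self.reads = 0

    def read(self, size=-1):
        self.reads += 1
        return super().read(size)


def last_line_bytewise(f):
    """Find the last line by seeking back one byte per read."""
    f.seek(0, io.SEEK_END)
    end = pos = f.tell()
    # Ignore a single trailing newline, if present.
    if pos > 0:
        f.seek(pos - 1)
        if f.read(1) == b"\n":
            end = pos = pos - 1
    # Step back until the byte before pos is a newline (or start of file).
    while pos > 0:
        f.seek(pos - 1)
        if f.read(1) == b"\n":
            break
        pos -= 1
    f.seek(pos)
    return f.read(end - pos)


def last_line_chunked(f, chunk_size=4096):
    """Find the last line by reading whole chunks from the end."""
    f.seek(0, io.SEEK_END)
    pos = f.tell()
    data = b""
    while pos > 0:
        read_size = min(chunk_size, pos)
        pos -= read_size
        f.seek(pos)
        data = f.read(read_size) + data
        # Ignore a single trailing newline when looking for the line start.
        body = data[:-1] if data.endswith(b"\n") else data
        if b"\n" in body:
            break
    body = data[:-1] if data.endswith(b"\n") else data
    # The in-memory search for the newline is the cheap part.
    return body[body.rfind(b"\n") + 1:]


payload = b"".join(b"x" * 80 + b"\n" for _ in range(1000))
slow, fast = CountingFile(payload), CountingFile(payload)
assert last_line_bytewise(slow) == last_line_chunked(fast)
print("byte-at-a-time reads:", slow.reads, "chunked reads:", fast.reads)
```

On a real file every one of those read() calls is at least a call into
the I/O layer, which is where the time goes.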

But you can't rely on any backwards reads unless you know for sure
that the encoding supports this. UTF-8 does (you have to scan
backwards for a start byte), UTF-16 does (work with pairs of bytes and
check for surrogates), and fixed-width encodings do, but otherwise,
you won't necessarily know when you've found a valid start point. So
any reverse-read algorithm is going to be restricted to specific
encodings.
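
To illustrate what "scan backwards for a start byte" and "check for
surrogates" can look like, here is a hedged sketch. The function names
are invented for this example, the input is assumed to be well-formed,
and the UTF-16 case assumes little-endian data with no BOM and a buffer
that begins on a code-unit boundary.

```python
def utf8_char_start(buf, pos):
    """Return the index of the first byte of the UTF-8 character at buf[pos]."""
    # Continuation bytes look like 10xxxxxx; any other byte starts a character.
    while pos > 0 and (buf[pos] & 0xC0) == 0x80:
        pos -= 1
    return pos


def utf16le_char_start(buf, pos):
    """Return the index of the first byte of the UTF-16-LE character at buf[pos]."""
    pos -= pos % 2  # align to a 2-byte code unit
    unit = buf[pos] | (buf[pos + 1] << 8)
    # A low (trail) surrogate is the second half of a pair, so the
    # character really starts one code unit earlier.
    if 0xDC00 <= unit <= 0xDFFF and pos >= 2:
        pos -= 2
    return pos


euro = "a\u20acb".encode("utf-8")            # the euro sign takes 3 bytes
assert utf8_char_start(euro, 3) == 1         # middle of the euro sign

emoji = "a\U0001F600".encode("utf-16-le")    # 1 unit + a surrogate pair
assert utf16le_char_start(emoji, 5) == 2     # inside the trail surrogate
```

Both checks work because a byte (or code unit) in the middle of a
character can be told apart from one at the start; an encoding without
that property gives a backwards scan nothing to resynchronise on.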

ChrisA
