Re: tail

From: cs...@cskk.id.au (Cameron Simpson)
Newsgroups: comp.lang.python
Subject: Re: tail
Date: Thu, 19 May 2022 07:30:20 +1000
Lines: 48
Message-ID: <mailman.446.1652909432.20749.python-list@python.org>
References: <CABbU2U_DbdMt7578cqAuHjyRpe=3cW29aog0=OJJZRkX8-Xi6g@mail.gmail.com>
<YoVlbKuF62gisDjt@cskk.homeip.net>
 by: Cameron Simpson - Wed, 18 May 2022 21:30 UTC

On 17May2022 22:45, Marco Sulla <Marco.Sulla.Python@gmail.com> wrote:
>Well, I've done a benchmark.
>>>> timeit.timeit("tail('/home/marco/small.txt')", globals={"tail":tail}, number=100000)
>1.5963431186974049
>>>> timeit.timeit("tail('/home/marco/lorem.txt')", globals={"tail":tail}, number=100000)
>2.5240604374557734
>>>> timeit.timeit("tail('/home/marco/lorem.txt', chunk_size=1000)", globals={"tail":tail}, number=100000)
>1.8944984432309866

This suggests that the file size does not dominate your runtime. Ah.
_Or_ that the files have a similar ratio of newlines to text, so you
are reading similar amounts of data from the end. If the "line density"
of the files were similar you would hope that the runtimes would be
similar.
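
To make the "line density" point concrete, here is a minimal sketch of a chunked read from the end of a file. It is not Marco's `tail()`; the name `tail_sketch`, its parameters and the byte counter are made up for this illustration. The amount of data read depends on how far back the last `n` newlines sit, not on the total file size.

```python
import os

def tail_sketch(path, n=10, chunk_size=4096):
    """Return (last n lines as bytes, number of bytes read from the file).

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        bytes_read = 0
        # Read chunks backwards until more than n newlines have been seen
        # (one extra, so the first kept line is complete) or the start of
        # the file is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            chunk = f.read(step)
            bytes_read += len(chunk)
            data = chunk + data
    return data.splitlines(keepends=True)[-n:], bytes_read
```

With short lines the loop stops after a chunk or two regardless of whether the file is 1.3 KB or 1.2 GB, which would be consistent with the similar timings above.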

>small.txt is a text file of 1.3 KB. lorem.txt is a lorem ipsum of 1.2
>GB. It seems the performance is good, thanks to the chunk suggestion.
>
>But the time of Linux tail surprise me:
>
>marco@buzz:~$ time tail lorem.txt
>[text]
>
>real 0m0.004s
>user 0m0.003s
>sys 0m0.001s
>
>It's strange that it's so slow. I thought it was because it decodes
>and print the result, but I timed

You're measuring different things. timeit() tries hard to measure just
the code snippet you provide. It doesn't measure the startup cost of the
whole python interpreter. Try:

time python3 your-tail-prog.py /home/marco/lorem.txt
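
Note also that the timeit figures quoted above are totals for number=100000 calls, so they work out to roughly 16 to 25 microseconds per call, while the shell's `time` reports a single whole-process run. A small sketch of the difference between the two kinds of measurement follows; the snippet being timed (`sum(range(100))`) is an arbitrary stand-in, not the tail code from this thread.

```python
import subprocess
import sys
import time
import timeit

RUNS = 10_000

# In-process: timeit measures only the statement itself. It returns the
# total for all runs, so divide to get the per-call figure.
per_call = timeit.timeit("sum(range(100))", number=RUNS) / RUNS

# Whole process: includes interpreter startup and shutdown, which is what
# the shell's `time` reports.
start = time.perf_counter()
subprocess.run([sys.executable, "-c", "sum(range(100))"], check=True)
whole_process = time.perf_counter() - start

print(f"per call, in-process: {per_call:.9f}s")
print(f"whole process:        {whole_process:.6f}s")
```

The second number is normally dominated by starting the interpreter rather than by the snippet.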

BTW, does your `tail()` print output? If not, again you are not
measuring the same thing.

If you have the source of tail(1) to hand, consider getting to the core
and calling `time()` immediately before and immediately after the
central tail operation and printing the difference.
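
The same bracketing idea on the Python side might look like the sketch below. The names `timed` and `core_operation` are hypothetical; `core_operation` is only a stand-in for whatever the real central operation is. It uses `time.perf_counter()` rather than `time.time()` because it is the clock intended for measuring short intervals.

```python
import sys
import time

def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds) for that call only."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    return result, elapsed

def core_operation(lines, n=10):
    # Stand-in for the real work; swap in your own tail function here.
    return lines[-n:]

sample = [f"line {i}\n" for i in range(1000)]
result, elapsed = timed(core_operation, sample, n=3)

# Report on stderr so the timing does not mix with the tail output.
print(f"core operation took {elapsed:.9f}s", file=sys.stderr)
```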

Also: does tail(1) do character set / encoding stuff? Does your Python
code do that? Might be apples and oranges.
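
A minimal sketch of what the encoding difference looks like in Python, assuming a UTF-8 file: binary mode hands back raw bytes with no decoding, while text mode decodes everything it reads. The sample text and temporary file are made up for the illustration, and this says nothing about what tail(1) itself does.

```python
import os
import tempfile

# Write a small UTF-8 file containing non-ASCII characters.
text = "caf\u00e9 na\u00efve\n" * 1000
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", newline="", suffix=".txt", delete=False
) as tmp:
    tmp.write(text)
    path = tmp.name

# Binary mode: no decoding, lines come back as bytes.
with open(path, "rb") as f:
    raw_tail = f.read().splitlines(keepends=True)[-2:]

# Text mode: every byte read is decoded to str first.
with open(path, encoding="utf-8", newline="") as f:
    text_tail = f.read().splitlines(keepends=True)[-2:]

os.remove(path)

print(raw_tail)
print(text_tail)
```

If one program works on bytes and the other decodes, their timings include different work.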

Cheers,
Cameron Simpson <cs@cskk.id.au>
