Re: Is there a more efficient threading lock?

<mailman.1997.1677362017.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=21943&group=comp.lang.python#21943

From: skip.mon...@gmail.com (Skip Montanaro)
Newsgroups: comp.lang.python
Subject: Re: Is there a more efficient threading lock?
Date: Sat, 25 Feb 2023 15:41:52 -0600
Lines: 63
Message-ID: <mailman.1997.1677362017.20444.python-list@python.org>
References: <CANc-5Uz1GwFrw9Rw134A3maCs-FK06VXFygJD6W7NFGEKy20pA@mail.gmail.com>
<5fec3cb2-52a2-758b-fda9-bcd6608d5681@tompassin.net>
<CANc-5Ux9fg6v0Gjtqi=g=MdB-OJtOeM3u6xPS0c3nyPKp88Acg@mail.gmail.com>
 by: Skip Montanaro - Sat, 25 Feb 2023 21:41 UTC

Thanks for the responses.

Peter wrote:

> Which OS is this?

macOS Ventura 13.1, M1 MacBook Pro (eight cores).

Thomas wrote:

> I'm no expert on locks, but you don't usually want to keep a lock while
> some long-running computation goes on. You want the computation to be
> done by a separate thread, put its results somewhere, and then notify
> the choreographing thread that the result is ready.

In this case I'm extracting the noun phrases from the body of an email
message (returned as a list). I have a collection of email messages
organized by month (typically 1000 to 3000 messages per month). I'm using
concurrent.futures.ThreadPoolExecutor() with the default number of workers (
os.cpu_count() * 1.5, or 12 threads on my system) to process each month, so
12 active threads at a time. Given that the process is pretty much CPU
bound, maybe reducing the number of workers to the CPU count would make
sense. Processing of each email message enters that "with" block once. That's
about as minimal as I can make it. I thought for a bit about pushing the
textblob stuff into a separate worker thread, but it wasn't obvious how to
set up queues to handle the communication between the threads created by
ThreadPoolExecutor() and the worker thread. Maybe I'll think about it
harder. (I have a related problem with SQLite, since an open database can't
be manipulated from multiple threads. That makes much of the program's
end-of-run processing single-threaded.)
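
One way the queue hand-off described above might be set up is sketched below. It is only an illustration under stated assumptions: extract_phrases is a hypothetical stand-in for the TextBlob call, and the names requests, phrase_worker, process_message and process_month are invented for this sketch. Each pool thread sends its text together with a private reply queue, so the single worker knows where to deliver the result. The pool is sized to os.cpu_count() as suggested in the post. (The documented default since Python 3.8 is min(32, os.cpu_count() + 4), which also works out to 12 on an eight-core machine.)

```python
import os
import queue
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for the TextBlob noun-phrase call; not the real extractor.
def extract_phrases(text):
    return [word for word in text.split() if word.istitle()]

# Items are (text, reply_queue) pairs; None tells the worker to stop.
requests = queue.Queue()

def phrase_worker():
    """Only this thread calls extract_phrases, so no lock is needed around it."""
    while True:
        item = requests.get()
        if item is None:
            break
        text, reply = item
        try:
            reply.put((True, extract_phrases(text)))
        except Exception as exc:
            # Pass the failure back so the waiting pool thread does not hang.
            reply.put((False, exc))

def process_message(text):
    """Runs in a pool thread: hand the text over and wait for the phrases."""
    reply = queue.Queue(maxsize=1)
    requests.put((text, reply))
    ok, value = reply.get()
    if not ok:
        raise value
    return len(value)  # placeholder for the per-message work done by the caller

def process_month(messages):
    worker = threading.Thread(target=phrase_worker)
    worker.start()
    try:
        with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
            return list(pool.map(process_message, messages))
    finally:
        requests.put(None)  # shut the worker down even if a message failed
        worker.join()

print(process_month(["Hello World from Python", "no capitals here"]))  # prints [3, 0]
```

This serializes phrase extraction just as the lock does, so it would not be expected to add parallelism by itself; it only moves the work into one dedicated thread. The same shape could plausibly serve a single thread that owns the SQLite connection.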

> This link may be helpful -
>
> https://anandology.com/blog/using-iterators-and-generators/

I don't think that's where my problem is. The lock protects the generation
of the noun phrases. My loop which does the yielding operates outside of
that lock's control. The version of the code is my latest, in which I
tossed out a bunch of phrase-processing code (effectively dead-end ideas
for processing the phrases). Replacing the for loop with a simple return
seems not to have any effect. In any case, the caller which uses the
phrases does a fair amount of extra work with the phrases, populating a
SQLite database, so I don't think the amount of time it takes to process a
single email message is dominated by the phrase generation.

Here's timeit output for the noun_phrases code:

% python -m timeit \
    -s 'text = """`python -m timeit --help`"""' \
    -s 'from textblob import TextBlob' \
    -s 'from textblob.np_extractors import ConllExtractor' \
    -s 'ext = ConllExtractor()' \
    -s 'phrases = TextBlob(text, np_extractor=ext).noun_phrases' \
    'phrases = TextBlob(text, np_extractor=ext).noun_phrases'
5000 loops, best of 5: 98.7 usec per loop

I process the output of timeit's help message which looks to be about the
same length as a typical email message, certainly the same order of
magnitude. Also, note that I call it once in the setup to eliminate the
initial training of the ConllExtractor instance. I don't know if ~100us
qualifies as long running or not.
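
For scale, at 98.7 usec per call, 3000 messages would spend roughly 0.3 seconds in phrase extraction. The same style of measurement can be made from inside Python with the timeit module. The sketch below uses a hypothetical extract_phrases stand-in rather than TextBlob, so the number it prints says nothing about the real extractor; it only shows the warm-up call followed by a best-of-5 timing.

```python
import timeit

# Hypothetical stand-in for TextBlob(text, np_extractor=ext).noun_phrases.
def extract_phrases(text):
    return [word for word in text.split() if word.istitle()]

# Made-up sample text, roughly the size of a short email body.
text = "Some Sample text standing in for an Email body " * 50

# Warm-up call, mirroring the extra call in the command-line setup above.
extract_phrases(text)

number = 1000
timings = timeit.repeat(lambda: extract_phrases(text), number=number, repeat=5)
per_call_us = min(timings) / number * 1e6
print(f"{number} loops, best of 5: {per_call_us:.1f} usec per loop")
```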

I'll keep messing with it.

Skip

Re: Is there a more efficient threading lock?

<87a611s90j.fsf@nightsong.com>

https://www.novabbs.com/devel/article-flat.php?id=21946&group=comp.lang.python#21946

From: no.em...@nospam.invalid (Paul Rubin)
Newsgroups: comp.lang.python
Subject: Re: Is there a more efficient threading lock?
Date: Sat, 25 Feb 2023 13:57:48 -0800
Organization: A noiseless patient Spider
Lines: 10
Message-ID: <87a611s90j.fsf@nightsong.com>
References: <CANc-5Uz1GwFrw9Rw134A3maCs-FK06VXFygJD6W7NFGEKy20pA@mail.gmail.com>
<5fec3cb2-52a2-758b-fda9-bcd6608d5681@tompassin.net>
<CANc-5Ux9fg6v0Gjtqi=g=MdB-OJtOeM3u6xPS0c3nyPKp88Acg@mail.gmail.com>
<mailman.1997.1677362017.20444.python-list@python.org>
 by: Paul Rubin - Sat, 25 Feb 2023 21:57 UTC

Skip Montanaro <skip.montanaro@gmail.com> writes:
> In this case I'm extracting the noun phrases from the body of an email
> message (returned as a list). I have a collection of email messages
> organized by month (typically 1000 to 3000 messages per month).

This is embarrassingly parallel enough that I would probably launch a
bunch of separate command line processes with GNU Parallel, rather than
messing with writing a multi-threaded Python program. That would also
let you distribute the processing across multiple machines on a network,
if the cpu requirements warranted it.
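
A per-month script for that approach might look roughly like the sketch below. The layout it assumes (one directory per month, one file per message) and the name process_month are hypothetical, and the actual phrase extraction and database work are left as a placeholder.

```python
import sys
from pathlib import Path

def process_month(month_dir):
    """Process every message file in one month directory; return how many were handled."""
    count = 0
    for path in sorted(Path(month_dir).iterdir()):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            # Placeholder: noun-phrase extraction and database work would go here.
            _ = len(text)
            count += 1
    return count

def main(argv):
    # Each command-line argument is treated as one month directory.
    for month_dir in argv[1:]:
        print(f"{month_dir}: {process_month(month_dir)} messages")

if __name__ == "__main__":
    main(sys.argv)
```

Saved under a hypothetical name such as process_month.py, it could then be run once per month directory with something like `parallel python process_month.py ::: 2022-*`, with each invocation being an independent process.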

Re: Is there a more efficient threading lock?

<a3jlvhhm479ltklcp8u42t6qffmltokba9@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=21960&group=comp.lang.python#21960

NNTP-Posting-Date: Sun, 26 Feb 2023 03:10:49 +0000
From: wlfr...@ix.netcom.com (Dennis Lee Bieber)
Newsgroups: comp.lang.python
Subject: Re: Is there a more efficient threading lock?
Date: Sat, 25 Feb 2023 22:10:48 -0500
Organization: IISS Elusive Unicorn
Message-ID: <a3jlvhhm479ltklcp8u42t6qffmltokba9@4ax.com>
References: <CANc-5Uz1GwFrw9Rw134A3maCs-FK06VXFygJD6W7NFGEKy20pA@mail.gmail.com> <5fec3cb2-52a2-758b-fda9-bcd6608d5681@tompassin.net> <CANc-5Ux9fg6v0Gjtqi=g=MdB-OJtOeM3u6xPS0c3nyPKp88Acg@mail.gmail.com> <mailman.1997.1677362017.20444.python-list@python.org>
 by: Dennis Lee Bieber - Sun, 26 Feb 2023 03:10 UTC

On Sat, 25 Feb 2023 15:41:52 -0600, Skip Montanaro
<skip.montanaro@gmail.com> declaimed the following:

>concurrent.futures.ThreadPoolExecutor() with the default number of workers (
>os.cpu_count() * 1.5, or 12 threads on my system) to process each month, so
>12 active threads at a time. Given that the process is pretty much CPU
>bound, maybe reducing the number of workers to the CPU count would make

Unless things have improved a lot over the years, the GIL still limits
active threads to the equivalent of a single CPU. The OS may swap among
which CPU as it schedules system processes, but only one thread will be
running at any moment regardless of CPU count.

Common wisdom is that Python threading works well for I/O bound
systems, where each thread spends most of its time waiting for some I/O
operation to complete -- thereby allowing the OS to schedule other threads.

For CPU bound, use of the multiprocessing package may be more suited --
though you'll have to devise a working IPC system to transfer data to/from the
separate processes (no shared objects as possible with threads).
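
As a rough sketch of the process-based route, concurrent.futures.ProcessPoolExecutor (built on multiprocessing) takes care of the transfer by pickling arguments and return values. The prime-counting function below is a made-up CPU-bound workload, not anything from the thread, and the limits passed to it are arbitrary.

```python
import math
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """CPU-bound pure-Python work: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, math.isqrt(n) + 1)):
            count += 1
    return count

def count_in_parallel(limits):
    # Arguments and results are pickled and passed between processes;
    # nothing is shared the way objects are shared between threads.
    with ProcessPoolExecutor() as pool:
        return list(pool.map(count_primes, limits))

if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this module.
    print(count_in_parallel([20_000, 20_000, 20_000, 20_000]))
```

Because everything crossing the process boundary must be picklable, this fits best when each task takes a small input and returns a small result.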

--
Wulfraed Dennis Lee Bieber AF6VN
wlfraed@ix.netcom.com http://wlfraed.microdiversity.freeddns.org/

Re: Is there a more efficient threading lock?

<mailman.2009.1677390651.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=21966&group=comp.lang.python#21966

From: ros...@gmail.com (Chris Angelico)
Newsgroups: comp.lang.python
Subject: Re: Is there a more efficient threading lock?
Date: Sun, 26 Feb 2023 16:50:38 +1100
Lines: 63
Message-ID: <mailman.2009.1677390651.20444.python-list@python.org>
References: <CANc-5Uz1GwFrw9Rw134A3maCs-FK06VXFygJD6W7NFGEKy20pA@mail.gmail.com>
<5fec3cb2-52a2-758b-fda9-bcd6608d5681@tompassin.net>
<CANc-5Ux9fg6v0Gjtqi=g=MdB-OJtOeM3u6xPS0c3nyPKp88Acg@mail.gmail.com>
<mailman.1997.1677362017.20444.python-list@python.org>
<a3jlvhhm479ltklcp8u42t6qffmltokba9@4ax.com>
<CAPTjJmpMu+KDByZ16AfvKLAHSYBiNFrpsrXfXNsrAEqTbD77hg@mail.gmail.com>
 by: Chris Angelico - Sun, 26 Feb 2023 05:50 UTC

On Sun, 26 Feb 2023 at 16:27, Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:
>
> On Sat, 25 Feb 2023 15:41:52 -0600, Skip Montanaro
> <skip.montanaro@gmail.com> declaimed the following:
>
>
> >concurrent.futures.ThreadPoolExecutor() with the default number of workers (
> >os.cpu_count() * 1.5, or 12 threads on my system) to process each month, so
> >12 active threads at a time. Given that the process is pretty much CPU
> >bound, maybe reducing the number of workers to the CPU count would make
>
> Unless things have improved a lot over the years, the GIL still limits
> active threads to the equivalent of a single CPU. The OS may swap among
> which CPU as it schedules system processes, but only one thread will be
> running at any moment regardless of CPU count.

Specifically, a single CPU core *executing Python bytecode*. There are
quite a few libraries that release the GIL during computation. Here's
a script that's quite capable of saturating my CPU entirely - in fact,
typing this email is glitchy due to lack of resources:

import threading
import bcrypt

results = [0, 0]  # results[0] counts failed checks, results[1] successful ones

def thrd():
    for _ in range(10):
        # checkpw does the expensive hashing; the GIL is released meanwhile
        ok = bcrypt.checkpw(b"password",
            b'$2b$15$DGDXMb2zvPotw1rHFouzyOVzSopiLIUSedO5DVGQ1GblAd6L6I8/6')
        results[ok] += 1

threads = [threading.Thread(target=thrd) for _ in range(100)]
for t in threads: t.start()
for t in threads: t.join()
print(results)

I have four cores, eight threads, and yeah, my CPU's not exactly the
latest and greatest (i7 6700k - it was quite good some years ago, but
outstripped now), but feel free to crank the numbers if you want to.

I'm pretty sure bcrypt won't use more than one CPU core for a single
hashpw/checkpw call, but by releasing the GIL during the hard number
crunching, it allows easy parallelization. Same goes for numpy work,
or anything else that can be treated as a separate operation.

So it's more accurate to say that only one CPU core can be
*manipulating Python objects* at a time, although it's hard to pin
down exactly what that means, making it easier to say that there can
only be one executing Python bytecode; it should be possible for any
function call into a C library to be a point where other threads can
take over (most notably, any sort of I/O, but also these kinds of
things).
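
The same effect can be seen without third-party packages. The sketch below uses hashlib, whose documentation says the GIL is released while hashing data larger than 2047 bytes. The buffer size and thread count are arbitrary choices for illustration. On a multi-core machine the threaded run would be expected to finish sooner than the serial one, though the ratio depends on the hardware.

```python
import hashlib
import threading
import time

# 32 MiB of data, far above hashlib's 2047-byte threshold for releasing the GIL.
data = b"x" * (32 * 1024 * 1024)

def digest(results, index):
    results[index] = hashlib.sha256(data).hexdigest()

def run_serial(n):
    results = [None] * n
    start = time.perf_counter()
    for i in range(n):
        digest(results, i)
    return results, time.perf_counter() - start

def run_threaded(n):
    results = [None] * n
    threads = [threading.Thread(target=digest, args=(results, i)) for i in range(n)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, time.perf_counter() - start

serial_results, serial_time = run_serial(4)
threaded_results, threaded_time = run_threaded(4)
print(f"serial:   {serial_time:.2f}s")
print(f"threaded: {threaded_time:.2f}s")
```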

As mentioned, GIL-removal has been under discussion at times, most
recently (and currently) with PEP 703
https://peps.python.org/pep-0703/ - and the benefits in multithreaded
applications always have to be factored against quite significant
performance penalties. It's looking like PEP 703's proposal has the
least-bad performance measurements of any GILectomy I've seen so far,
showing 10% worse performance on average (possibly able to be reduced
to 5%). As it happens, a GIL just makes sense when you want pure, raw
performance, and it's only certain workloads that suffer under it.

ChrisA
