Re: Sieve of Erastosthenes optimized to the max

<uma7i4$2nagg$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2960&group=comp.lang.c%2B%2B#2960

From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Sun, 24 Dec 2023 13:24:19 -0800
In-Reply-To: <um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
 by: Chris M. Thomasson - Sun, 24 Dec 2023 21:24 UTC

On 12/24/2023 2:03 AM, Bonita Montero wrote:
> On 23.12.2023 at 21:52, Chris M. Thomasson wrote:
>
>> On 12/22/2023 8:55 AM, Bonita Montero wrote:
>
>>> False sharing wouldn't hurt my algorithm much, since only the beginning
>>> and the end of each thread-local segment overlap with another thread;
>>> but I do it anyway to have maximum performance.
>
>> That's a good habit to get into. :^)
>
> I experimentally removed the masking of the lower bits of the
> partition bounds according to the cache-line size, and there was
> no measurable performance loss.
>

Still, imvvho, it _is_ a good practice to get into wrt padding and
aligning...
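
For readers unfamiliar with the masking mentioned in the quoted text, here is a minimal sketch of rounding partition bounds down to a cache-line multiple. It is not the code from the thread: the 64-byte line size and the names `kCacheLine` and `aligned_bounds` are assumptions made for illustration, and the buffer is assumed to start on a cache-line boundary.

```cpp
#include <cstddef>
#include <vector>

// Assumed cache-line size in bytes (hypothetical; check your target CPU).
constexpr std::size_t kCacheLine = 64;

// Split the byte range [0, total) into `parts` contiguous segments
// (parts must be > 0). Every bound except the last is rounded down to a
// multiple of kCacheLine by masking off the low bits, so segment i is
// [bounds[i], bounds[i + 1]) and interior bounds never cut a cache line.
std::vector<std::size_t> aligned_bounds(std::size_t total, std::size_t parts) {
    std::vector<std::size_t> bounds(parts + 1);
    for (std::size_t i = 0; i < parts; ++i) {
        std::size_t raw = total / parts * i;   // unaligned split point
        bounds[i] = raw & ~(kCacheLine - 1);   // clear the low 6 bits
    }
    bounds[parts] = total;                     // last segment takes the tail
    return bounds;
}
```

With `aligned_bounds(1000, 3)` the interior bounds land on 320 and 640 rather than 333 and 666, so each worker thread writes only to cache lines it owns.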

Re: Sieve of Erastosthenes optimized to the max

<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2963&group=comp.lang.c%2B%2B#2963

From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Tue, 26 Dec 2023 06:00:00 +0100
In-Reply-To: <uma7i4$2nagg$1@dont-email.me>
 by: Bonita Montero - Tue, 26 Dec 2023 05:00 UTC

On 24.12.2023 at 22:24, Chris M. Thomasson wrote:

> Still, imvvho, it _is_ a good practice to get into wrt padding and
> aligning...

With an upper bound of 2^32 I've got 131070 cache lines per thread.
If I have false sharing at the beginning and end of the range, that
doesn't hurt much.
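
A back-of-the-envelope check of that order of magnitude, as a hedged sketch. The post does not state the sieve layout or the thread count; the values used here (one bit per odd number, 64-byte cache lines, 32 threads) are assumptions chosen only because they give a result close to the quoted figure. The function name is hypothetical.

```cpp
#include <cstdint>

// Cache lines of sieve bitmap each thread owns. All four parameters are
// illustrative assumptions, not values taken from the thread.
constexpr std::uint64_t cache_lines_per_thread(std::uint64_t upper_bound,
                                               std::uint64_t numbers_per_bit,
                                               std::uint64_t line_bytes,
                                               std::uint64_t threads) {
    return upper_bound / numbers_per_bit  // bits in the bitmap
           / 8                            // bytes
           / line_bytes                   // cache lines
           / threads;                     // cache lines per thread
}

// 2^32 / 2 / 8 / 64 / 32 = 2^17
static_assert(cache_lines_per_thread(1ull << 32, 2, 64, 32) == 131072,
              "about 131k cache lines per thread under these assumptions");
```

Under these assumptions at most two of roughly 131,000 lines per thread (the first and the last) can be shared with a neighbouring thread, which is the point being made.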

Re: Sieve of Erastosthenes optimized to the max

<umdovb$3cmi3$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2964&group=comp.lang.c%2B%2B#2964

From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Mon, 25 Dec 2023 21:39:54 -0800
In-Reply-To: <umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
 by: Chris M. Thomasson - Tue, 26 Dec 2023 05:39 UTC

On 12/25/2023 9:00 PM, Bonita Montero wrote:
> On 24.12.2023 at 22:24, Chris M. Thomasson wrote:
>
>> Still, imvvho, it _is_ a good practice to get into wrt padding and
>> aligning...
>
> With an upper bound of 2^32 I've got 131070 cache lines per thread.
> If I have false sharing at the beginning and end of the range, that
> doesn't hurt much.
>

Remember when Intel first started hyperthreading and god damn threads
could false share with each other (on the stacks!) the low and high
64-byte parts of the 128-byte cache lines? A workaround was to
artificially offset the threads' stacks via alloca.
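
A rough sketch of what an alloca-based stack offset can look like. This is not the appcore code and not Intel's recommended sequence; the 128-byte step, the wrap-around at 16 and the names `kStackSkew`, `stack_skew_for` and `run_with_stack_skew` are all hypothetical. `alloca` is non-standard; `<alloca.h>` is assumed here (glibc, BSD, macOS), while MSVC spells it `_alloca` in `<malloc.h>`.

```cpp
#include <alloca.h>  // non-standard header, assumed available
#include <cstddef>

// Hypothetical step between the stack offsets of successive threads.
constexpr std::size_t kStackSkew = 128;

// Bytes of stack to skip for a given thread index; wraps every 16 threads
// so the padding stays small.
constexpr std::size_t stack_skew_for(std::size_t thread_index) {
    return (thread_index % 16 + 1) * kStackSkew;
}

// Call at the top of a thread's entry function. alloca moves the stack
// pointer down, so every frame created by body() starts at a different
// offset than in threads that use a different index.
template <class Fn>
void run_with_stack_skew(std::size_t thread_index, Fn body) {
    volatile char* pad =
        static_cast<volatile char*>(alloca(stack_skew_for(thread_index)));
    pad[0] = 0;  // touch the block so the allocation is not optimized away
    body();      // the padding is released when this function returns
}
```

Each thread would pass its own index, for example `run_with_stack_skew(i, [&] { worker(i); })`, where `worker` stands for whatever the thread actually runs.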

Re: Sieve of Erastosthenes optimized to the max

<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2965&group=comp.lang.c%2B%2B#2965

From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Tue, 26 Dec 2023 10:27:54 +0100
In-Reply-To: <umdovb$3cmi3$2@dont-email.me>
 by: Bonita Montero - Tue, 26 Dec 2023 09:27 UTC

On 26.12.2023 at 06:39, Chris M. Thomasson wrote:

> Remember when Intel first started hyperthreading and god damn threads
> could false share with each other (on the stacks!) the low and high
> 64-byte parts of the 128-byte cache lines? A workaround was to
> artificially offset the threads' stacks via alloca.

If you have one core and two threads there's no false sharing.
It doesn't matter whether the "conflicting" accesses come from
both threads or from just one thread. False sharing is only
possible with two or more cores.
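
For reference, the multi-core case that nobody in the thread disputes can be sketched as two threads updating neighbouring counters. This is a minimal illustration, not code from the thread; the 64-byte line size and the names `Packed`, `Padded` and `hammer` are assumptions.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>

constexpr std::size_t kLine = 64;  // assumed cache-line size in bytes

// Both counters will usually sit in the same cache line, so two cores
// writing a and b keep stealing the line from each other.
struct Packed {
    std::atomic<long> a{0};
    std::atomic<long> b{0};
};

// Each counter is aligned to its own cache line.
struct Padded {
    alignas(kLine) std::atomic<long> a{0};
    alignas(kLine) std::atomic<long> b{0};
};

// One thread increments a, the other increments b; returns the total.
// The threads never touch the same variable, only (possibly) the same line.
template <class Counters>
long hammer(Counters& c, long iters) {
    std::thread t1([&] {
        for (long i = 0; i < iters; ++i)
            c.a.fetch_add(1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        for (long i = 0; i < iters; ++i)
            c.b.fetch_add(1, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return c.a.load() + c.b.load();
}
```

Timing `hammer` on a `Packed` and on a `Padded` object with a large iteration count is the usual way to see the difference; whether and how much it shows up depends on the machine and on where the scheduler places the two threads.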

Re: Sieve of Erastosthenes optimized to the max

<umfcqi$3jktj$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2966&group=comp.lang.c%2B%2B#2966

From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Tue, 26 Dec 2023 12:24:50 -0800
In-Reply-To: <ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
 by: Chris M. Thomasson - Tue, 26 Dec 2023 20:24 UTC

On 12/26/2023 1:27 AM, Bonita Montero wrote:
> On 26.12.2023 at 06:39, Chris M. Thomasson wrote:
>
>> Remember when Intel first started hyperthreading and god damn threads
>> could false share with each other (on the stacks!) the low and high
>> 64-byte parts of the 128-byte cache lines? A workaround was to
>> artificially offset the threads' stacks via alloca.
>
> If you have one core and two threads there's no false sharing.

So, are you familiar with Intel's early hyperthreading problem? There
was false sharing between the hyperthreads. The workaround did improve
performance by quite a bit. IIRC, my older appcore project had this
workaround incorporated into its logic. I wrote that sucker back in the
very early 2000s. Humm... I will try to find the exact line.

> It doesn't matter whether the "conflicting" accesses come from
> both threads or from just one thread. False sharing is only
> possible with two or more cores.

Re: Sieve of Erastosthenes optimized to the max

<20231226152712.582@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=2967&group=comp.lang.c%2B%2B#2967

From: 433-929-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Tue, 26 Dec 2023 23:35:11 -0000 (UTC)
 by: Kaz Kylheku - Tue, 26 Dec 2023 23:35 UTC

On 2023-12-26, Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
> On 12/26/2023 1:27 AM, Bonita Montero wrote:
>> On 26.12.2023 at 06:39, Chris M. Thomasson wrote:
>>
>>> Remember when Intel first started hyperthreading and god damn threads
>>> could false share with each other (on the stacks!) the low and high
>>> 64-byte parts of the 128-byte cache lines? A workaround was to
>>> artificially offset the threads' stacks via alloca.
>>
>> If you have one core and two threads there's no false sharing.
>
> So, are you familiar with Intel's early hyperthreading problem? There
> was false sharing between the hyperthreads. The workaround did improve
> performance by quite a bit. IIRC, my older appcore project had this
> workaround incorporated into its logic. I wrote that sucker back in the
> very early 2000s. Humm... I will try to find the exact line.

Could you both be right? The performance problem was real, but maybe
it wasn't "false sharing"? The hyperthreads are on the same core; they
share the same caches.

Might you be describing a cache collision rather than false sharing?

A single processor can trigger degenerate cache use (at any level
of a cache hierarchy). For instance, if pages of virtual memory
are accessed in certain patterns, they can thrash the same TLB entry.

Was it perhaps the case that these thread stacks were allocated at such
a stride that their addresses clashed on the same cache line set?

That could be a problem even on one processor, but it's obvious that
hyperthreading could exacerbate it, because switches between hyperthreads
happen at a fine granularity. They don't get to run a full
scheduler-driven time quantum. A low-level processor event like a
pipeline stall (or other resource issue) can drive a hyperthread
switch.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.
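
The set-collision scenario raised in this post can be made concrete with a little index arithmetic. The sketch below is illustrative only: it treats the cache as indexed directly by the address given, ignores the virtual-versus-physical distinction, and the geometry (64-byte lines, 64 sets, as in a 32 KiB 8-way cache) and the 1 MiB stack stride are assumptions, not facts about any particular Intel part.

```cpp
#include <cstddef>
#include <cstdint>

// Index of the cache set an address maps to in a simple set-associative
// cache: drop the offset within the line, then wrap at the number of sets.
constexpr std::size_t cache_set_index(std::uintptr_t addr,
                                      std::size_t line_bytes,
                                      std::size_t num_sets) {
    return static_cast<std::size_t>((addr / line_bytes) % num_sets);
}

// Two hypothetical thread stacks placed 1 MiB apart map to the same set...
static_assert(cache_set_index(0x10000000, 64, 64) ==
              cache_set_index(0x10000000 + (1u << 20), 64, 64),
              "equal set index for stacks at a power-of-two stride");

// ...while a 128-byte skew on the second one moves it to another set.
static_assert(cache_set_index(0x10000000 + (1u << 20) + 128, 64, 64) == 2,
              "skewed stack lands two sets further on");
```

If every thread's hot stack frames sit at the same offset from such evenly strided bases, they all compete for the few ways of one set, which would explain why a per-thread offset helps without any false sharing being involved.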

Re: Sieve of Erastosthenes optimized to the max

<umfo3c$3l1oh$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2968&group=comp.lang.c%2B%2B#2968

From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Tue, 26 Dec 2023 15:37:16 -0800
In-Reply-To: <20231226152712.582@kylheku.com>
 by: Chris M. Thomasson - Tue, 26 Dec 2023 23:37 UTC

On 12/26/2023 3:35 PM, Kaz Kylheku wrote:
> On 2023-12-26, Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
>> On 12/26/2023 1:27 AM, Bonita Montero wrote:
>>> On 26.12.2023 at 06:39, Chris M. Thomasson wrote:
>>>
>>>> Remember when Intel first started hyperthreading and god damn threads
>>>> could false share with each other (on the stacks!) the low and high
>>>> 64-byte parts of the 128-byte cache lines? A workaround was to
>>>> artificially offset the threads' stacks via alloca.
>>>
>>> If you have one core and two threads there's no false sharing.
>>
>> So, are you familiar with Intel's early hyperthreading problem? There
>> was false sharing between the hyperthreads. The workaround did improve
>> performance by quite a bit. IIRC, my older appcore project had this
>> workaround incorporated into its logic. I wrote that sucker back in the
>> very early 2000s. Humm... I will try to find the exact line.
>
> Could you both be right? The performance problem was real, but maybe
> it wasn't "false sharing"? The hyperthreads are on the same core; they
> share the same caches.
>
> Might you be describing a cache collision rather than false sharing?

IIRC, when I read the docs from Intel, it was the low 64 bytes being
falsely shared with the high 64 bytes of a 128-byte L2 cache line?

>
> A single processor can trigger degenerate cache use (at any level
> of a cache hierarchy). For instance, if pages of virtual memory
> are accessed in certain patterns, they can thrash the same TLB entry.
>
> Was it perhaps the case that these thread stacks were allocated at such
> a stride that their addresses clashed on the same cache line set?
>
> That could be a problem even on one processor, but it's obvious that
> hyperthreading could exacerbate it, because switches between hyperthreads
> happen at a fine granularity. They don't get to run a full
> scheduler-driven time quantum. A low-level processor event like a
> pipeline stall (or other resource issue) can drive a hyperthread
> switch.
>
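
The layout recalled in this post, a 128-byte line made of two 64-byte halves, can be expressed as a small address test. This is only a sketch of that recollection; the sizes and the helper names are assumptions, and it says nothing about how a given CPU actually tracks ownership of the halves.

```cpp
#include <cstdint>

// Geometry as recalled in the post (assumed, not verified here).
constexpr std::uintptr_t kHalf = 64;     // one half of the line
constexpr std::uintptr_t kSector = 128;  // the full 128-byte line

constexpr bool same_half(std::uintptr_t a, std::uintptr_t b) {
    return a / kHalf == b / kHalf;
}

constexpr bool same_sector(std::uintptr_t a, std::uintptr_t b) {
    return a / kSector == b / kSector;
}

// True when two addresses look independent at 64-byte granularity but
// still live in the same 128-byte line (low half versus high half).
constexpr bool shares_only_sector(std::uintptr_t a, std::uintptr_t b) {
    return same_sector(a, b) && !same_half(a, b);
}

static_assert(shares_only_sector(0x1000, 0x1040), "low half vs high half");
static_assert(!shares_only_sector(0x1000, 0x1020), "same 64-byte half");
static_assert(!shares_only_sector(0x1040, 0x1080), "different 128-byte lines");
```

Data padded to 64 bytes passes the first check and fails the second, which is why padding to 128 bytes is sometimes suggested for hardware with this kind of line.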

Re: Sieve of Erastosthenes optimized to the max

<umgbci$3qpao$1@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2969&group=comp.lang.c%2B%2B#2969

From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Wed, 27 Dec 2023 06:06:28 +0100
In-Reply-To: <umfcqi$3jktj$2@dont-email.me>
 by: Bonita Montero - Wed, 27 Dec 2023 05:06 UTC

On 26.12.2023 at 21:24, Chris M. Thomasson wrote:

> So, are you familiar with Intel's early hyper threading problem?
> There was false sharing between the ...

False sharing can only happen between different cores.

Re: Sieve of Erastosthenes optimized to the max

<umgego$3r29p$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2970&group=comp.lang.c%2B%2B#2970

From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Tue, 26 Dec 2023 21:59:51 -0800
In-Reply-To: <umfo3c$3l1oh$1@dont-email.me>
 by: Chris M. Thomasson - Wed, 27 Dec 2023 05:59 UTC

On 12/26/2023 3:37 PM, Chris M. Thomasson wrote:
> On 12/26/2023 3:35 PM, Kaz Kylheku wrote:
>> On 2023-12-26, Chris M. Thomasson <chris.m.thomasson.1@gmail.com> wrote:
>>> On 12/26/2023 1:27 AM, Bonita Montero wrote:
>>>> Am 26.12.2023 um 06:39 schrieb Chris M. Thomasson:
>>>>
>>>>> Remember when Intel first started hyperthreading and god damn threads
>>>>> could false share with each other (on the stacks!) the low and high 64
>>>>> byte parts of the 128 byte cache lines? A workaround was to
>>>>> artificially offset the threads stacks via alloca.
>>>>
>>>> If you have one core and two threads there's no false sharing.
>>>
>>> So, are you familiar with Intel's early hyper threading problem? There
>>> was false sharing between the hyperthreads. The workaround did improve
>>> performance by quite a bit. IIRC, my older appcore project had this
>>> workaround incorporated into its logic. I wrote that sucker back in the
>>> very early 2000s. Hmm... I will try to find the exact line.
>>
>> Could you both be right? The performance problem was real, but maybe
>> it wasn't "false sharing"? The hyper-threads are on the same core; they
>> share the same caches.
>>
>> Might you be describing a cache collision rather than false sharing?
>
> IIRC, when I read the docs from Intel, it was the low 64 bytes being
> falsely shared with the high 64 bytes of a 128-byte L2 cache line?

Something about false interference between threads that should not even
be interacting with one another to begin with. It was a problem. The fix
was based on alloca, so that is something to ponder on.

>
>
>
>>
>> A single processor can trigger degenerate cache uses (at any level
>> of a cache hierarchy). For instance, if pages of virtual memory
>> are accessed in certain patterns, they can thrash the same TLB entry.
>>
>> Was it perhaps the case that these thread stacks were allocated at such
>> a stride, that their addresses clashed on the same cache line set?
>>
>> That could be a problem even on one processor, but it's obvious that
>> hyperthreading could exacerbate it because switches between hyperthreads
>> happen on a fine granularity. They don't get to run a full
>> scheduler-driven time quantum. A low-level processor event like a
>> pipeline stall (or other resource issue) can drive a hyper thread
>> switch.
>>
>

Re: Sieve of Erastosthenes optimized to the max

<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2971&group=comp.lang.c%2B%2B#2971

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Wed, 27 Dec 2023 10:23:19 +0100
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 27 Dec 2023 09:23:17 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="8354d6394616f4ec794902376f74a897";
logging-data="4075665"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ddGqeKXI4S/S69JSRxf+UlXEMPCzbCB4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Gw/0wIf4wSlEuTEGEDUIGcrBeFc=
Content-Language: de-DE
In-Reply-To: <umgego$3r29p$1@dont-email.me>
 by: Bonita Montero - Wed, 27 Dec 2023 09:23 UTC

Am 27.12.2023 um 06:59 schrieb Chris M. Thomasson:

> Something about false interference between threads that should not even
> be interacting with one another to begin with. It was a problem. The fix
> was based on alloca, so that is something to ponder on.

Like with any SMT core, there could be cache thrashing between the
cores. The L1 data cache was only 8kB and maybe only two-way associative,
so that could lead to thrashing between the cores. But I've no clue what
this would have to do with alloca().

Re: Sieve of Erastosthenes optimized to the max

<20231227124453.126@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=2972&group=comp.lang.c%2B%2B#2972

Path: i2pn2.org!rocksolid2!i2pn.org!news.chmurka.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 433-929-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Wed, 27 Dec 2023 20:49:06 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 28
Message-ID: <20231227124453.126@kylheku.com>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
Injection-Date: Wed, 27 Dec 2023 20:49:06 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f355253f546b693613fe055263939d7a";
logging-data="65805"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+7i3Fx8Xzc6irlNyVNC79FizwWyNMg07s="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:AAeeFv4vUSWJ2+cZpzShfgZo/7o=
 by: Kaz Kylheku - Wed, 27 Dec 2023 20:49 UTC

On 2023-12-27, Bonita Montero <Bonita.Montero@gmail.com> wrote:
> Am 27.12.2023 um 06:59 schrieb Chris M. Thomasson:
>
>> Something about false interference between threads that should not even
>> be interacting with one another to begin with. It was a problem. The fix
>> was based on alloca, so that is something to ponder on.
>
> Like with any SMT core, there could be cache thrashing between the
> cores. The L1 data cache was only 8kB and maybe only two-way associative,
> so that could lead to thrashing between the cores. But I've no clue what
> this would have to do with alloca().

Say you have thread stacks that are offset by some power of two amount,
like a megabyte. Stack addresses at the same depth (like the local
variables of threads executing the same function) are likely to collide
on the same cache set (at different levels of the caching hierarchy: L1,
L2, translation caches).
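
The collision can be made concrete with a little arithmetic. This is only a sketch: the cache geometry (8 KiB, 4-way, 64-byte lines) and the two addresses are made-up illustrative values, not figures taken from Intel's documentation, and the names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative cache geometry. These numbers are assumptions chosen for
// the example, not a statement about any particular processor.
struct CacheGeometry {
    std::size_t size_bytes;  // total capacity
    std::size_t ways;        // associativity
    std::size_t line_bytes;  // cache line size
};

// Number of sets = capacity / (ways * line size).
inline std::size_t num_sets(const CacheGeometry& g) {
    assert(g.ways != 0 && g.line_bytes != 0);
    return g.size_bytes / (g.ways * g.line_bytes);
}

// Set selected by an address: drop the offset within the line, then wrap
// around the number of sets.
inline std::size_t set_index(std::uint64_t addr, const CacheGeometry& g) {
    return static_cast<std::size_t>((addr / g.line_bytes) % num_sets(g));
}

// Hypothetical 8 KiB, 4-way cache with 64-byte lines: 32 sets.
const CacheGeometry kExampleL1 = {8 * 1024, 4, 64};

// Two hypothetical stack addresses at the same depth, in stacks that are
// exactly 1 MiB apart. Both map to the same set.
const std::uint64_t kAddrThreadA = 0x10000F40;
const std::uint64_t kAddrThreadB = kAddrThreadA + (1u << 20);

// The same address in thread B after its stack was shifted by 192 bytes.
// It maps to a different set.
const std::uint64_t kAddrThreadBShifted = kAddrThreadB - 192;
```

Any stride that is a multiple of (number of sets * line size), here 2 KiB, leaves the set index unchanged, which is why power-of-two stack spacing is the bad case.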

With alloca, since it moves the stack pointer, we can carve variable
amounts of stack space to randomize the stack offsets (before calling
the work functions). In different threads, we use differently-sized
alloca allocations.
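
A minimal sketch of that, under some assumptions: thread_index is a hypothetical number handed to each thread by whoever starts it, the 128-byte step and the 16 steps are arbitrary choices, and alloca is reached through __builtin_alloca (GCC/Clang) or _alloca (MSVC), since it is not standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(_MSC_VER)
#  include <malloc.h>
#  define SKETCH_ALLOCA(n) _alloca(n)
#  define SKETCH_NOINLINE __declspec(noinline)
#else
#  define SKETCH_ALLOCA(n) __builtin_alloca(n)
#  define SKETCH_NOINLINE __attribute__((noinline))
#endif

// Stand-in for the real per-thread work. It only records where one of its
// locals ended up, so the effect of the padding can be observed.
SKETCH_NOINLINE void work(std::uintptr_t* where) {
    volatile int local = 0;
    *where = reinterpret_cast<std::uintptr_t>(&local);
}

// Entry point each thread would run. Kept out of line so the padding stays
// a real stack-pointer adjustment in its own frame.
SKETCH_NOINLINE void thread_entry(std::size_t thread_index,
                                  std::uintptr_t* where) {
    assert(where != nullptr);
    // Assumed padding scheme: 128 bytes per step, 16 distinct steps.
    const std::size_t pad = (thread_index % 16 + 1) * 128;
    // alloca only moves the stack pointer; the block is not used for data.
    volatile char* unused = static_cast<volatile char*>(SKETCH_ALLOCA(pad));
    unused[0] = 0;        // a store, so the allocation is not dropped
    work(where);          // frames below here are shifted by about pad bytes
    unused[pad - 1] = 0;  // keeps the padding live across the call
}
```

Each thread would be started with something like std::thread(thread_entry, i, &slot[i]), so that threads running the same code no longer have their locals at the same offset within their stacks.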

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

Re: Sieve of Erastosthenes optimized to the max

<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2973&group=comp.lang.c%2B%2B#2973

Path: i2pn2.org!i2pn.org!nntp.comgw.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 28 Dec 2023 12:00:27 +0100
Organization: A noiseless patient Spider
Lines: 19
Message-ID: <umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 28 Dec 2023 11:00:25 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="0c4b36094dca328cd048e78c8d9dbce0";
logging-data="387803"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18dCXlEjpJa5mm/lnBdjNIKj4OcEDbICjY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:KSNKnCKekKVB/1tGu7NPEdshiAM=
Content-Language: de-DE
In-Reply-To: <20231227124453.126@kylheku.com>
 by: Bonita Montero - Thu, 28 Dec 2023 11:00 UTC

Am 27.12.2023 um 21:49 schrieb Kaz Kylheku:

> Say you have thread stacks that are offset by some power of two amount,
> like a megabyte. Stack addresses at the same depth (like the local
> variables of threads executing the same function) are likely to collide
> on the same cache set (at different levels of the caching hierarchy: L1,
> L2, translation caches).
>
> With alloca, since it moves the stack pointer, we can carve variable
> amounts of stack space to randomize the stack offsets (before calling
> the work functions). In different threads, we use differently-sized
> alloca allocations.

That's not something alloca() could alleviate. The startup code inside the
userspace part of the thread should randomize the starting address of
the stack within the size of a set.
Most resources on the net say that the Pentium 4's L1 data cache
associativity is eight, so there's not much chance of aliasing inside the
L1D cache.

Re: Sieve of Erastosthenes optimized to the max

<uml0sp$i3j7$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2974&group=comp.lang.c%2B%2B#2974

Path: i2pn2.org!i2pn.org!news.niel.me!news.gegeweb.eu!gegeweb.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 28 Dec 2023 15:38:01 -0800
Organization: A noiseless patient Spider
Lines: 24
Message-ID: <uml0sp$i3j7$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 28 Dec 2023 23:38:02 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3b9c8277d0ce293bbef9695132362a05";
logging-data="593511"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19COa8/LzKamiDxlRKcS3Yu87yjvtaZOAg="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:vrZtrUMlDf2CfyaXOi9reHiaUk4=
In-Reply-To: <umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
Content-Language: en-US
 by: Chris M. Thomasson - Thu, 28 Dec 2023 23:38 UTC

On 12/28/2023 3:00 AM, Bonita Montero wrote:
> Am 27.12.2023 um 21:49 schrieb Kaz Kylheku:
>
>> Say you have thread stacks that are offset by some power of two amount,
>> like a megabyte. Stack addresses at the same depth (like the local
>> variables of threads executing the same function) are likely to collide
>> on the same cache set (at different levels of the caching hierarchy: L1,
>> L2, translation caches).
>>
>> With alloca, since it moves the stack pointer, we can carve variable
>> amounts of stack space to randomize the stack offsets (before calling
>> the work functions). In different threads, we use differently-sized
>> alloca allocations.
>
> That's not something alloca() could alleviate. The startup code inside the
> userspace part of the thread should randomize the starting address of
> the stack within the size of a set.
> Most resources on the net say that the Pentium 4's L1 data cache
> associativity is eight, so there's not much chance of aliasing inside the
> L1D cache.
>

The use of alloca to try to get around the problem in their (Intel's)
early hyperthreaded processors was real, and actually helped. It was in
the Intel docs.

Re: Sieve of Erastosthenes optimized to the max

<umldni$n9ob$1@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2975&group=comp.lang.c%2B%2B#2975

Path: i2pn2.org!i2pn.org!news.swapon.de!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 04:17:09 +0100
Organization: A noiseless patient Spider
Lines: 8
Message-ID: <umldni$n9ob$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 03:17:06 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="303961f0045fd5f77a8ff2f562213bc9";
logging-data="763659"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18w+rP+1fPFNCcLKiuWxpb4u+gtNSMhWU4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:fwpqlNoaDa3VVmZuZB57F5zgbtQ=
Content-Language: de-DE
In-Reply-To: <uml0sp$i3j7$1@dont-email.me>
 by: Bonita Montero - Fri, 29 Dec 2023 03:17 UTC

Am 29.12.2023 um 00:38 schrieb Chris M. Thomasson:

> The use of alloca to try to get around the problem in their (Intel's)
> early hyperthreaded processors was real, and actually helped. It was in
> the Intel docs.

I don't believe it.

Re: Sieve of Erastosthenes optimized to the max

<umljmb$nrkt$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2976&group=comp.lang.c%2B%2B#2976

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Thu, 28 Dec 2023 20:58:51 -0800
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <umljmb$nrkt$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 04:58:51 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3b9c8277d0ce293bbef9695132362a05";
logging-data="781981"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+gdux2haN4VS6JtV7a9JBm5BxZkSeGBIg="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:kiiAFpv336bPy5ztvhZg33Z0Umo=
Content-Language: en-US
In-Reply-To: <umldni$n9ob$1@raubtier-asyl.eternal-september.org>
 by: Chris M. Thomasson - Fri, 29 Dec 2023 04:58 UTC

On 12/28/2023 7:17 PM, Bonita Montero wrote:
> Am 29.12.2023 um 00:38 schrieb Chris M. Thomasson:
>
>> The use of alloca to try to get around the problem in their (Intel's)
>> early hyperthreaded processors was real, and actually helped. It was
>> in the Intel docs.
>
> I don't believe it.
>

Why not?

Re: Sieve of Erastosthenes optimized to the max

<umm582$pqb4$1@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2977&group=comp.lang.c%2B%2B#2977

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 10:58:30 +0100
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <umm582$pqb4$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
<umljmb$nrkt$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 09:58:27 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="303961f0045fd5f77a8ff2f562213bc9";
logging-data="846180"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/i92vj+0l9oh4KHnpGddB/dIfIluFNKhU="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:g2rGEBHH/2r9PKQqpv4J+eRvps8=
Content-Language: de-DE
In-Reply-To: <umljmb$nrkt$1@dont-email.me>
 by: Bonita Montero - Fri, 29 Dec 2023 09:58 UTC

Am 29.12.2023 um 05:58 schrieb Chris M. Thomasson:

> On 12/28/2023 7:17 PM, Bonita Montero wrote:

>> Am 29.12.2023 um 00:38 schrieb Chris M. Thomasson:

>>> The use of alloca to try to get around the problem in their (Intel's)
>>> early hyperthreaded processors was real, and actually helped. It was
>>> in the Intel docs.

>> I don't believe it.

> Why not?

Because I don't understand what's different with the access pattern
of alloca() and usual stack allocation. And if I google for "alloca
Pentium 4 site:intel.com" I can't find anything that fits.

Re: Sieve of Erastosthenes optimized to the max

<ummfmm$r4ol$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2978&group=comp.lang.c%2B%2B#2978

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 13:56:54 +0100
Organization: A noiseless patient Spider
Lines: 28
Message-ID: <ummfmm$r4ol$1@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
<umljmb$nrkt$1@dont-email.me>
<umm582$pqb4$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 12:56:54 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="e9539489de686d5d1af102aae405ff52";
logging-data="889621"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+0eoDtPEYNvNcpHeh5+2B7hqpvYqDcS8s="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:cUt5UopInrUCFcoNgFfxUr7roOM=
In-Reply-To: <umm582$pqb4$1@raubtier-asyl.eternal-september.org>
Content-Language: en-GB
 by: David Brown - Fri, 29 Dec 2023 12:56 UTC

On 29/12/2023 10:58, Bonita Montero wrote:
> Am 29.12.2023 um 05:58 schrieb Chris M. Thomasson:
>
>> On 12/28/2023 7:17 PM, Bonita Montero wrote:
>
>>> Am 29.12.2023 um 00:38 schrieb Chris M. Thomasson:
>
>>>> The use of alloca to try to get around the problem in their
>>>> (Intel's) early hyperthreaded processors was real, and actually
>>>> helped. It was in the Intel docs.
>
>>> I don't believe it.
>
>> Why not?
>
> Because I don't understand what's different with the access pattern
> of alloca() and usual stack allocation. And if I google for "alloca
> Pentium 4 site:intel.com" I can't find anything that fits.
>

There is no difference between alloca() and ordinary stack allocations.
But alloca() makes it quick and easy to make large allocations, and to
do so with random sizes (or at least sizes that differ significantly
between threads, even if they are running the same code). Without
alloca(), you'd need to do something like arranging to call a recursive
function a random number of times before it then calls the next bit of
code in your thread. alloca() is simply far easier and faster.

Re: Sieve of Erastosthenes optimized to the max

<L3CjN.109558$p%Mb.3353@fx15.iad>

https://www.novabbs.com/devel/article-flat.php?id=2979&group=comp.lang.c%2B%2B#2979

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx15.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: sco...@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: Sieve of Erastosthenes optimized to the max
Newsgroups: comp.lang.c++
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org> <umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com> <umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me> <umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org> <20231227124453.126@kylheku.com> <umjkg8$bqmr$1@raubtier-asyl.eternal-september.org> <uml0sp$i3j7$1@dont-email.me> <umldni$n9ob$1@raubtier-asyl.eternal-september.org> <umljmb$nrkt$1@dont-email.me> <umm582$pqb4$1@raubtier-asyl.eternal-september.org>
Lines: 22
Message-ID: <L3CjN.109558$p%Mb.3353@fx15.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Fri, 29 Dec 2023 16:01:47 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Fri, 29 Dec 2023 16:01:47 GMT
X-Received-Bytes: 1747
 by: Scott Lurndal - Fri, 29 Dec 2023 16:01 UTC

Bonita Montero <Bonita.Montero@gmail.com> writes:
>Am 29.12.2023 um 05:58 schrieb Chris M. Thomasson:
>
>> On 12/28/2023 7:17 PM, Bonita Montero wrote:
>
>>> Am 29.12.2023 um 00:38 schrieb Chris M. Thomasson:
>
>>>> The use of alloca to try to get around the problem in their (Intel's)
>>>> early hyperthreaded processors was real, and actually helped. It was
>>>> in the Intel docs.
>
>>> I don't believe it.
>
>> Why not?
>
>Because I don't understand what's different with the access pattern
>of alloca() and usual stack allocation. And if I google for "alloca
>Pentium 4 site:intel.com" I can't find anything that fits.
>

See page 2-35 in the _Intel Pentium 4 Processor Optimization_
manual.

Re: Sieve of Erastosthenes optimized to the max

<ummqmc$slfn$1@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2980&group=comp.lang.c%2B%2B#2980

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 17:04:31 +0100
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <ummqmc$slfn$1@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
<umljmb$nrkt$1@dont-email.me>
<umm582$pqb4$1@raubtier-asyl.eternal-september.org>
<ummfmm$r4ol$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 29 Dec 2023 16:04:28 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="303961f0045fd5f77a8ff2f562213bc9";
logging-data="939511"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+nr8atpgK1wLTdV+9GTZAKk4NUlCyEmqA="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:mxY0nqdF4GQ18mo7VRIWGdcBQcs=
In-Reply-To: <ummfmm$r4ol$1@dont-email.me>
Content-Language: de-DE
 by: Bonita Montero - Fri, 29 Dec 2023 16:04 UTC

Am 29.12.2023 um 13:56 schrieb David Brown:

> There is no difference between alloca() and ordinary stack allocations.
> But alloca() makes it quick and easy to make large allocations, and to
> do so with random sizes (or at least sizes that differ significantly
> between threads, even if they are running the same code).

I've got my own class overflow_array<>, which is a cross between an
array<> and a vector<>. If you append more than the internal array<> can
hold, the objects are moved to an internal vector. I think Boost's
small_vector is similar to that.
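
To illustrate the idea only, here is a rough sketch of such a container. It is not the actual overflow_array<> (which is not shown in this thread); the name overflow_array_sketch and its members are hypothetical, and it assumes T is default-constructible and move-assignable.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical sketch: up to N elements live in an embedded std::array;
// on the first overflow everything moves to a std::vector.
template <typename T, std::size_t N>
class overflow_array_sketch {
public:
    void push_back(T value) {
        if (!spilled_ && size_ < N) {
            inline_[size_] = std::move(value);  // still fits in the array
        } else {
            if (!spilled_) {
                // First overflow: move the inline elements to the vector.
                heap_.reserve(2 * N + 1);
                for (std::size_t i = 0; i < size_; ++i)
                    heap_.push_back(std::move(inline_[i]));
                spilled_ = true;
            }
            heap_.push_back(std::move(value));
        }
        ++size_;
    }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return spilled_ ? heap_[i] : inline_[i];
    }

    std::size_t size() const { return size_; }
    bool on_heap() const { return spilled_; }

private:
    std::array<T, N> inline_{};  // fixed-size storage inside the object
    std::vector<T> heap_;        // used only after the first overflow
    std::size_t size_ = 0;
    bool spilled_ = false;
};
```

With overflow_array_sketch<int, 2>, the first two push_back calls stay in the embedded array and the third moves everything to the vector.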

> Without alloca(), you'd need to do something like arranging to call a
> recursive function a random number of times before it then calls the
> next bit of code in your thread.  alloca() is simply far easier and
> faster.

You've got strange ideas. alloca() has been completely removed from the
Linux kernel. The point is that if there's a fixed upper limit on what you
would allocate, you could always allocate it statically.

Re: Sieve of Erastosthenes optimized to the max

<ummqqh$slfn$2@raubtier-asyl.eternal-september.org>

https://www.novabbs.com/devel/article-flat.php?id=2981&group=comp.lang.c%2B%2B#2981

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!raubtier-asyl.eternal-september.org!.POSTED!not-for-mail
From: Bonita.M...@gmail.com (Bonita Montero)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 17:06:44 +0100
Organization: A noiseless patient Spider
Lines: 6
Message-ID: <ummqqh$slfn$2@raubtier-asyl.eternal-september.org>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
<umljmb$nrkt$1@dont-email.me>
<umm582$pqb4$1@raubtier-asyl.eternal-september.org>
<L3CjN.109558$p%Mb.3353@fx15.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 16:06:41 -0000 (UTC)
Injection-Info: raubtier-asyl.eternal-september.org; posting-host="303961f0045fd5f77a8ff2f562213bc9";
logging-data="939511"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/Fe1geLCvCbh5i5tuZGGRzdHUHxoBw5s0="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:B1BJ5q6IzAwh/5MdsRgGOM2R4Ng=
Content-Language: de-DE
In-Reply-To: <L3CjN.109558$p%Mb.3353@fx15.iad>
 by: Bonita Montero - Fri, 29 Dec 2023 16:06 UTC

Am 29.12.2023 um 17:01 schrieb Scott Lurndal:

> See page 2-35 in the _Intel Pentium 4 Processor Optimization_
> manual.

Where can I find something referring to alloca() there?

Re: Sieve of Erastosthenes optimized to the max

<20231229092738.13@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=2982&group=comp.lang.c%2B%2B#2982

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 433-929-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 17:29:00 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 30
Message-ID: <20231229092738.13@kylheku.com>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
<umljmb$nrkt$1@dont-email.me>
<umm582$pqb4$1@raubtier-asyl.eternal-september.org>
Injection-Date: Fri, 29 Dec 2023 17:29:00 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="52fd9de859b81d17f176357c065ce2c2";
logging-data="968706"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ccrsJBCKDs5YrvC8ltSka2anxAxO8MeY="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:4p4xuEmyST4Xlt/LQNuehD/0H4w=
 by: Kaz Kylheku - Fri, 29 Dec 2023 17:29 UTC

On 2023-12-29, Bonita Montero <Bonita.Montero@gmail.com> wrote:
> On 29.12.2023 at 05:58, Chris M. Thomasson wrote:
>
>> On 12/28/2023 7:17 PM, Bonita Montero wrote:
>
>>> On 29.12.2023 at 00:38, Chris M. Thomasson wrote:
>
>>>> The use of alloca to try to get around the problem in their (Intel's)
>>>> early hyperthreaded processors was real, and actually helped. It was
>>>> in the Intel docs.
>
>>> I don't believe it.
>
>> Why not?
>
> Because I don't understand what's different with the access pattern
> of alloca() and usual stack allocation. And if I google for "alloca
> Pentium 4 site:intel.com" I can't find anything that fits.

I explained it. The allocated memory itself is never used. When you call
alloca(n), the stack pointer moves by n bytes. If you then call a function,
its stack frame will be offset by that much (plus any alignment padding if
n is not aligned).
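
A rough way to observe this, offered as a sketch and not as anything posted in the thread: reserve some bytes with alloca(), call a function, and compare where that function's local variable ends up with and without the reservation. The helper names are made up for illustration, and the code assumes GCC or Clang (for `[[gnu::noinline]]`), a glibc-style `<alloca.h>`, and a stack that grows toward lower addresses.

```cpp
#include <alloca.h>  // assumption: glibc-style header; other platforms differ
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reports where one of the callee's locals lives.
[[gnu::noinline]] std::uintptr_t callee_local_address() {
    volatile char marker = 0;
    return reinterpret_cast<std::uintptr_t>(&marker);
}

// Hypothetical helper: moves the stack pointer with alloca(), never stores
// real data in that memory, then calls a function.
[[gnu::noinline]] std::uintptr_t callee_address_after_padding(std::size_t pad) {
    volatile char* unused = static_cast<char*>(alloca(pad));
    if (pad != 0) unused[0] = 0;  // one touch so the allocation is not dropped
    // The volatile store after the call keeps this frame alive across it,
    // so the call cannot be turned into a tail call.
    volatile std::uintptr_t address = callee_local_address();
    return address;
}

// How far the callee's frame moved when `pad` bytes were reserved first.
std::size_t observed_shift(std::size_t pad) {
    volatile std::size_t none = 0;  // opaque zero: both calls use the same code path
    const std::uintptr_t plain = callee_address_after_padding(none);
    const std::uintptr_t shifted = callee_address_after_padding(none + pad);
    assert(plain >= shifted && "sketch assumes a downward-growing stack");
    return static_cast<std::size_t>(plain - shifted);
}
```

Under those assumptions, `observed_shift(4096)` reports a shift of at least 4096 bytes, even though the reserved memory never holds any data.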

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

Re: Sieve of Erastosthenes optimized to the max

<umn1ma$svun$7@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2983&group=comp.lang.c%2B%2B#2983

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: vir.camp...@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 18:03:54 +0000
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <umn1ma$svun$7@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<ul5bn0$2qsht$2@dont-email.me>
<ul5ut3$31a17$1@raubtier-asyl.eternal-september.org>
<ul7ft1$3853g$1@dont-email.me>
<ul7gal$3885r$1@raubtier-asyl.eternal-september.org>
<ulchs1$t3ph$1@dont-email.me> <ulfa02$1er97$1@redfloyd.dont-email.me>
<86edfcvkjm.fsf@linuxsc.com> <um7iva$27ikh$1@dont-email.me>
<86a5q0uhd1.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 18:03:54 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f7a3892e219a1937d0607003ea21972a";
logging-data="950231"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19J04gP+66Q/nA78YrtEywC1kbqGW+nZ2M="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:cdKV5Upho3edjGMxZj1XVliW3jU=
In-Reply-To: <86a5q0uhd1.fsf@linuxsc.com>
Content-Language: en-GB
 by: Vir Campestris - Fri, 29 Dec 2023 18:03 UTC

On 24/12/2023 08:36, Tim Rentsch wrote:
> Depending on how the code is written, no modulo operations need
> to be done, because they will be optimized away and done at
> compile time. If we look at multiplying two numbers represented
> by bits in bytes i and j, the two numbers are
>
> i*30 + a
> j*30 + b
>
> for some a and b in { 1, 7, 11, 13, 17, 19, 23, 29 }.
>
> The values we're interested in are the index of the product and
> the residue of the product, namely
>
> (i*30+a) * (j*30+b) / 30 (for the index)
> (i*30+a) * (j*30+b) % 30 (for the residue)
>
> Any term with a *30 in the numerator doesn't contribute to
> the residue, and also can be combined with the by-30 divide
> for computing the index. Thus these expressions can be
> rewritten as
>
> i*30*j + i*b + j*a + (a*b/30) (for the index)
> a*b%30 (for the residue)
>
> When a and b have values that are known at compile time,
> neither the divide nor the remainder result in run-time
> operations being done; all of that heavy lifting is
> optimized away and done at compile time. Of course there
> are some multiplies, but they are cheaper than divides, and
> also can be done in parallel. (The multiplication a*b also
> can be done at compile time.)
>
> The residue needs to be turned into a bit mask to do the
> logical operation on the byte of bits, but here again that
> computation can be optimized away and done at compile time.
>
> Does that all make sense?

Right now, no. But that's me. I'll flag it to read again when I've had a
better night's sleep.

Andy
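
For anyone else rereading the quoted explanation, here is one possible way to write it down, as a sketch and not Tim's actual code. With a and b as template parameters, a*b/30, a*b%30 and the bit mask are all compile-time constants, and only multiplies and adds on i and j remain for run time. The names `kResidues`, `bit_for_residue`, `Slot` and `product_slot` are invented for this example, the bit order within a byte is an assumption (the post does not fix one), and C++17 is assumed.

```cpp
#include <array>
#include <cstdint>

// Residues mod 30 that are coprime to 30. Assumption: bit k of a sieve
// byte stands for kResidues[k]; the quoted post does not fix a bit order.
constexpr std::array<int, 8> kResidues{1, 7, 11, 13, 17, 19, 23, 29};

// Position of residue r in kResidues, or -1 if r is not one of the eight.
constexpr int bit_for_residue(int r) {
    for (int k = 0; k < 8; ++k) {
        if (kResidues[k] == r) return k;
    }
    return -1;
}

struct Slot {
    std::uint64_t index;  // which byte holds the product
    std::uint8_t mask;    // which bit inside that byte
};

// Slot of (30*i + A) * (30*j + B), using
//   index   = 30*i*j + i*B + j*A + (A*B)/30
//   residue = (A*B) % 30
template <int A, int B>
constexpr Slot product_slot(std::uint64_t i, std::uint64_t j) {
    constexpr int bit = bit_for_residue((A * B) % 30);
    static_assert(bit >= 0, "A and B must both be coprime to 30");
    constexpr std::uint64_t carry = (A * B) / 30;
    return Slot{30 * i * j + i * B + j * A + carry,
                static_cast<std::uint8_t>(1u << bit)};
}

// Worked check: 41 * 73 = 2993 = 99 * 30 + 23.
static_assert(product_slot<11, 13>(1, 2).index == 2993 / 30, "index");
static_assert(product_slot<11, 13>(1, 2).mask ==
                  (1u << bit_for_residue(2993 % 30)), "mask");
```

The two `static_assert` lines compare the rewritten expressions against a direct divide and remainder for one pair of numbers. A real sieve would presumably instantiate the template for all 64 (a, b) pairs.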

Re: Sieve of Erastosthenes optimized to the max

<umnel9$ve1t$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2984&group=comp.lang.c%2B%2B#2984

Path: i2pn2.org!i2pn.org!news.chmurka.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 13:45:12 -0800
Organization: A noiseless patient Spider
Lines: 9
Message-ID: <umnel9$ve1t$2@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
<umljmb$nrkt$1@dont-email.me>
<umm582$pqb4$1@raubtier-asyl.eternal-september.org>
<L3CjN.109558$p%Mb.3353@fx15.iad>
<ummqqh$slfn$2@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 21:45:13 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3b9c8277d0ce293bbef9695132362a05";
logging-data="1030205"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18oqZXHqdcbYh1GtCWuK4SdvIKXQ1IhaMY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:5q6SCvvO/PvVmBsmCjzBz02q68c=
In-Reply-To: <ummqqh$slfn$2@raubtier-asyl.eternal-september.org>
Content-Language: en-US
 by: Chris M. Thomasson - Fri, 29 Dec 2023 21:45 UTC

On 12/29/2023 8:06 AM, Bonita Montero wrote:
> On 29.12.2023 at 17:01, Scott Lurndal wrote:
>
>> See page 2-35 in the _Intel Pentium 4 Processor Optimization_
>> manual.
>
> Where can I find something referring to alloca() there?

Huh?

Re: Sieve of Erastosthenes optimized to the max

<umnf3q$ve1t$3@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2985&group=comp.lang.c%2B%2B#2985

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 13:52:57 -0800
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <umnf3q$ve1t$3@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<um2dsb$17vgg$2@dont-email.me>
<um2vpr$1e7pn$1@raubtier-asyl.eternal-september.org>
<um31o0$1edq3$1@dont-email.me>
<um4f1l$1l1c2$1@raubtier-asyl.eternal-september.org>
<um7haa$274oh$3@dont-email.me>
<um8vlp$2h8oc$1@raubtier-asyl.eternal-september.org>
<uma7i4$2nagg$1@dont-email.me>
<umdmkf$3clht$1@raubtier-asyl.eternal-september.org>
<umdovb$3cmi3$2@dont-email.me>
<ume6ap$3ecjk$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 21:52:58 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3b9c8277d0ce293bbef9695132362a05";
logging-data="1030205"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19jkxEpPR6xryL8KOZYypS3Z4eHtYN5i1E="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:8WetPGBnEVIqyK4UgpqfoSispqw=
Content-Language: en-US
In-Reply-To: <umldni$n9ob$1@raubtier-asyl.eternal-september.org>
 by: Chris M. Thomasson - Fri, 29 Dec 2023 21:52 UTC

On 12/28/2023 7:17 PM, Bonita Montero wrote:
> On 29.12.2023 at 00:38, Chris M. Thomasson wrote:
>
>> The use of alloca to try to get around the problem in their (Intel's)
>> early hyperthreaded processors was real, and actually helped. It was
>> in the Intel docs.
>
> I don't believe it.
>

I actually found an early version of my old code on the wayback machine
that does this. Take note of the following function. It did improve
performance way back on those early hyperthreaded processors.

https://web.archive.org/web/20060214112446/http://appcore.home.comcast.net/appcore/src/ac_thread_c.html
___________________
void* AC_CDECL
prv_thread_entry
( void *state )
{
  int ret;
  void *uret;
  ac_thread_t *_this = state;

  ret = pthread_setspecific
    ( g_tls_key,
      _this );
  if ( ret ) { assert( ! ret ); abort(); }

  /* For thread ids below 64, move the stack pointer by 2048 bytes
     per id before calling the user's entry function. */
  if ( _this->id < 64 )
  {
    AC_OS_ALLOCA( 2048 * _this->id );
    uret = _this->fp_entry( (void*)_this->state );
  }

  else
  {
    uret = _this->fp_entry( (void*)_this->state );
  }

  return uret;
}
___________________
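
For readers without the AppCore headers, the same idea can be sketched in C++ along these lines. This is an illustration and not the original library code: `stagger_bytes`, `staggered_entry` and `EntryFn` are invented names, the 2048-byte stride and the limit of 64 ids are simply copied from the function above, and GCC or Clang with a glibc-style `<alloca.h>` is assumed.

```cpp
#include <alloca.h>  // assumption: glibc-style header; other platforms differ
#include <cassert>
#include <cstddef>

constexpr std::size_t kStrideBytes = 2048;  // taken from the function above
constexpr unsigned kStaggeredIds = 64;      // ids at or above this get no offset

// Bytes of stack to skip for a given thread id.
constexpr std::size_t stagger_bytes(unsigned id) {
    return id < kStaggeredIds ? kStrideBytes * id : 0;
}

using EntryFn = void* (*)(void*);

// Meant to be called first thing in a thread's start routine. The entry
// function is reached through a pointer from a non-inlined wrapper, so it
// gets its own frame below the skipped region.
[[gnu::noinline]] void* staggered_entry(unsigned id, EntryFn entry, void* state) {
    assert(entry != nullptr);
    const std::size_t pad_bytes = stagger_bytes(id);
    volatile char* pad =
        pad_bytes != 0 ? static_cast<char*>(alloca(pad_bytes)) : nullptr;
    void* result = entry(state);
    // Touching the padding after the call keeps the allocation, and this
    // frame, alive for the whole duration of entry().
    if (pad != nullptr) pad[0] = 0;
    return result;
}
```

Each thread would call `staggered_entry(id, fn, arg)` from its own start routine, so thread 1 runs 2048 bytes lower in its stack than thread 0, thread 2 runs 4096 bytes lower, and so on up to id 63.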

Re: Sieve of Erastosthenes optimized to the max

<umng1u$ve1t$4@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=2986&group=comp.lang.c%2B%2B#2986

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c++
Subject: Re: Sieve of Erastosthenes optimized to the max
Date: Fri, 29 Dec 2023 14:09:02 -0800
Organization: A noiseless patient Spider
Lines: 6
Message-ID: <umng1u$ve1t$4@dont-email.me>
References: <ul41d4$2koct$1@raubtier-asyl.eternal-september.org>
<umfcqi$3jktj$2@dont-email.me> <20231226152712.582@kylheku.com>
<umfo3c$3l1oh$1@dont-email.me> <umgego$3r29p$1@dont-email.me>
<umgqe5$3sc4h$1@raubtier-asyl.eternal-september.org>
<20231227124453.126@kylheku.com>
<umjkg8$bqmr$1@raubtier-asyl.eternal-september.org>
<uml0sp$i3j7$1@dont-email.me>
<umldni$n9ob$1@raubtier-asyl.eternal-september.org>
<umljmb$nrkt$1@dont-email.me>
<umm582$pqb4$1@raubtier-asyl.eternal-september.org>
<L3CjN.109558$p%Mb.3353@fx15.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Dec 2023 22:09:03 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="3b9c8277d0ce293bbef9695132362a05";
logging-data="1030205"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+utWa+SdXXVcNC0evtJPIuuwkDPkfl9Ow="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:i9TNF3axMP+Cy9F5sRYy+QEXmCk=
In-Reply-To: <L3CjN.109558$p%Mb.3353@fx15.iad>
Content-Language: en-US
 by: Chris M. Thomasson - Fri, 29 Dec 2023 22:09 UTC

On 12/29/2023 8:01 AM, Scott Lurndal wrote:
> page 2-35 in the _Intel Pentium 4 Processor Optimization_
> manual.

I think it was in chapter 5 of _Developing Multithreaded Applications: A
Platform Consistent Approach_. I cannot remember the damn section right now.

