Rocksolid Light

Welcome to novaBBS (click a section below)

mail  files  register  newsreader  groups  login

Message-ID:  

The truth of a proposition has nothing to do with its credibility. And vice versa.


devel / comp.arch / Re: Ill-advised use of CMOVE

SubjectAuthor
* Ill-advised use of CMOVEStefan Monnier
+* Re: Ill-advised use of CMOVEThomas Koenig
|`* Re: Ill-advised use of CMOVEStephen Fuld
| `* Re: Ill-advised use of CMOVEThomas Koenig
|  +* Re: Ill-advised use of CMOVEStephen Fuld
|  |`- Re: Ill-advised use of CMOVEaph
|  `* Re: Ill-advised use of CMOVEAnton Ertl
|   +* Re: Ill-advised use of CMOVEMichael S
|   |`* Re: Ill-advised use of CMOVEAnton Ertl
|   | `* Re: Ill-advised use of CMOVEMichael S
|   |  +* Re: Ill-advised use of CMOVEIvan Godard
|   |  |`* Re: Ill-advised use of CMOVEMichael S
|   |  | `* Re: Ill-advised use of CMOVEIvan Godard
|   |  |  `- Re: Ill-advised use of CMOVEMichael S
|   |  `* Re: Ill-advised use of CMOVEAnton Ertl
|   |   +- Re: Ill-advised use of CMOVEMichael S
|   |   `* branchless binary search (was: Ill-advised use of CMOVE)Anton Ertl
|   |    +* Re: branchless binary searchStefan Monnier
|   |    |`- Re: branchless binary searchAnton Ertl
|   |    +- Re: branchless binary searchTerje Mathisen
|   |    `* Re: branchless binary searchEricP
|   |     +* Re: branchless binary searchMichael S
|   |     |+* Re: branchless binary searchStephen Fuld
|   |     ||`* Re: branchless binary searchMichael S
|   |     || +- Re: branchless binary searchThomas Koenig
|   |     || `* Spectre fix (was: branchless binary search)Anton Ertl
|   |     ||  `- Re: Spectre fix (was: branchless binary search)Michael S
|   |     |+* Re: branchless binary searchStefan Monnier
|   |     ||`- Re: branchless binary searchMitchAlsup
|   |     |`* Re: branchless binary searchAndy Valencia
|   |     | `- Re: branchless binary searchTerje Mathisen
|   |     `* Re: branchless binary searchAnton Ertl
|   |      `* Re: branchless binary searchMitchAlsup
|   |       `* Spectre and resource comtention (was: branchless binary search)Anton Ertl
|   |        +* Re: Spectre and resource comtentionStefan Monnier
|   |        |+- Re: Spectre and resource comtentionMitchAlsup
|   |        |`- Re: Spectre and resource comtentionAnton Ertl
|   |        `* Re: Spectre and resource comtention (was: branchless binary search)MitchAlsup
|   |         `* Re: Spectre and resource comtention (was: branchless binary search)Anton Ertl
|   |          `- Re: Spectre and resource comtention (was: branchless binary search)MitchAlsup
|   +* Re: Ill-advised use of CMOVEStephen Fuld
|   |`* binary search vs. hash tables (was: Ill-advised use of CMOVE)Anton Ertl
|   | +* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Stephen Fuld
|   | |`* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)John Levine
|   | | `* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Michael S
|   | |  `* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Michael S
|   | |   `* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Michael S
|   | |    `* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Michael S
|   | |     `* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Michael S
|   | |      `* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Michael S
|   | |       `* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Michael S
|   | |        `* Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Brett
|   | |         `- Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)Michael S
|   | `- Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)John Levine
|   +* Re: Ill-advised use of CMOVEEricP
|   |`* Re: Ill-advised use of CMOVEBGB
|   | +* Re: Ill-advised use of CMOVEMitchAlsup
|   | |+* Re: Ill-advised use of CMOVEBGB
|   | ||`* Re: Ill-advised use of CMOVEMitchAlsup
|   | || +* Re: Ill-advised use of CMOVEBGB
|   | || |`- Re: Ill-advised use of CMOVEMitchAlsup
|   | || `- Re: Ill-advised use of CMOVEIvan Godard
|   | |`* Re: Ill-advised use of CMOVEIvan Godard
|   | | `- Re: Ill-advised use of CMOVEMitchAlsup
|   | `- Re: Ill-advised use of CMOVEIvan Godard
|   `- Re: Ill-advised use of CMOVEThomas Koenig
+* Re: Ill-advised use of CMOVETerje Mathisen
|+* Re: Ill-advised use of CMOVEMitchAlsup
||`* Re: Ill-advised use of CMOVETerje Mathisen
|| `* Re: Ill-advised use of CMOVEMitchAlsup
||  `* Re: Ill-advised use of CMOVEMarcus
||   `* Re: Ill-advised use of CMOVEMitchAlsup
||    `* Re: Ill-advised use of CMOVEBGB
||     `* Re: Ill-advised use of CMOVEMitchAlsup
||      `- Re: Ill-advised use of CMOVEBGB
|+- Re: Ill-advised use of CMOVEIvan Godard
|`* Re: Ill-advised use of CMOVEStephen Fuld
| +* Re: Ill-advised use of CMOVEMichael S
| |`* Re: Ill-advised use of CMOVEStephen Fuld
| | `- Re: Ill-advised use of CMOVEAnton Ertl
| +* Re: Ill-advised use of CMOVEStefan Monnier
| |`* Re: Ill-advised use of CMOVEStephen Fuld
| | `* Re: Ill-advised use of CMOVEAnton Ertl
| |  `* Re: Ill-advised use of CMOVEBGB
| |   `* Re: Ill-advised use of CMOVEMitchAlsup
| |    +- Re: Ill-advised use of CMOVEBGB
| |    `* Re: Ill-advised use of CMOVEThomas Koenig
| |     +- Re: Ill-advised use of CMOVEAnton Ertl
| |     `- Re: Ill-advised use of CMOVEMitchAlsup
| +* Re: Ill-advised use of CMOVEAnton Ertl
| |`* Re: Ill-advised use of CMOVEMitchAlsup
| | `* Re: Ill-advised use of CMOVEEricP
| |  +* Re: Ill-advised use of CMOVEMichael S
| |  |`* Re: Ill-advised use of CMOVEEricP
| |  | +- Re: Ill-advised use of CMOVEMitchAlsup
| |  | +* Re: Ill-advised use of CMOVETerje Mathisen
| |  | |`* Re: Ill-advised use of CMOVEEricP
| |  | | `* Re: Ill-advised use of CMOVEMitchAlsup
| |  | |  `- Re: Ill-advised use of CMOVEBGB
| |  | `- Re: Ill-advised use of CMOVEAnton Ertl
| |  `* Re: Ill-advised use of CMOVEAnton Ertl
| `* Re: Ill-advised use of CMOVETerje Mathisen
+- Re: Ill-advised use of CMOVEAnton Ertl
`* Re: Ill-advised use of CMOVEScott Michel

Pages:123456
Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)

<30054530-21ac-4bd3-88b4-a0e74ea5af87n@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25381&group=comp.arch#25381

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:622a:3d3:b0:2f3:ba0b:ee5b with SMTP id k19-20020a05622a03d300b002f3ba0bee5bmr18327569qtx.365.1652775218463;
Tue, 17 May 2022 01:13:38 -0700 (PDT)
X-Received: by 2002:a54:4e92:0:b0:325:224c:8ff7 with SMTP id
c18-20020a544e92000000b00325224c8ff7mr9450547oiy.154.1652775218288; Tue, 17
May 2022 01:13:38 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 01:13:38 -0700 (PDT)
In-Reply-To: <69eee9e4-1f34-4994-88c7-65d756baf30an@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t5t8r4$qdi$1@dont-email.me>
<2022May16.180830@mips.complang.tuwien.ac.at> <t5u35g$fo4$1@dont-email.me>
<t5u5c4$2n6r$1@gal.iecc.com> <677f55fc-8833-4830-bacc-9c3925c7b1adn@googlegroups.com>
<6985cff9-13dd-4b59-8a22-0d418f156c12n@googlegroups.com> <69eee9e4-1f34-4994-88c7-65d756baf30an@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <30054530-21ac-4bd3-88b4-a0e74ea5af87n@googlegroups.com>
Subject: Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 17 May 2022 08:13:38 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1459
 by: Michael S - Tue, 17 May 2022 08:13 UTC

test

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)

<62e6eb85-26ef-45fb-9b22-fbe787447f1en@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25382&group=comp.arch#25382

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a37:7ca:0:b0:69f:7733:27b9 with SMTP id 193-20020a3707ca000000b0069f773327b9mr15512849qkh.493.1652775972279;
Tue, 17 May 2022 01:26:12 -0700 (PDT)
X-Received: by 2002:a05:6870:79a:b0:e9:109a:1391 with SMTP id
en26-20020a056870079a00b000e9109a1391mr11492426oab.105.1652775972092; Tue, 17
May 2022 01:26:12 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 01:26:11 -0700 (PDT)
In-Reply-To: <30054530-21ac-4bd3-88b4-a0e74ea5af87n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t5t8r4$qdi$1@dont-email.me>
<2022May16.180830@mips.complang.tuwien.ac.at> <t5u35g$fo4$1@dont-email.me>
<t5u5c4$2n6r$1@gal.iecc.com> <677f55fc-8833-4830-bacc-9c3925c7b1adn@googlegroups.com>
<6985cff9-13dd-4b59-8a22-0d418f156c12n@googlegroups.com> <69eee9e4-1f34-4994-88c7-65d756baf30an@googlegroups.com>
<30054530-21ac-4bd3-88b4-a0e74ea5af87n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <62e6eb85-26ef-45fb-9b22-fbe787447f1en@googlegroups.com>
Subject: Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 17 May 2022 08:26:12 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1518
 by: Michael S - Tue, 17 May 2022 08:26 UTC

test

Re: Ill-advised use of CMOVE

<c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25383&group=comp.arch#25383

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:620a:2552:b0:67b:32e2:2400 with SMTP id s18-20020a05620a255200b0067b32e22400mr14548097qko.768.1652776386177;
Tue, 17 May 2022 01:33:06 -0700 (PDT)
X-Received: by 2002:a05:6870:e249:b0:f1:7fba:ce67 with SMTP id
d9-20020a056870e24900b000f17fbace67mr8380504oac.269.1652776385980; Tue, 17
May 2022 01:33:05 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 01:33:05 -0700 (PDT)
In-Reply-To: <2022May16.175019@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de>
<t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com>
<2022May16.175019@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
Subject: Re: Ill-advised use of CMOVE
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 17 May 2022 08:33:06 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 4351
 by: Michael S - Tue, 17 May 2022 08:33 UTC

On Monday, May 16, 2022 at 7:07:29 PM UTC+3, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >On Monday, May 16, 2022 at 9:22:21 AM UTC+3, Anton Ertl wrote:
> >> And then there's the question of why one uses binary search in the
> >> first place.
> >
> >Sorted array + binary search is a reasonable option for data structure+algorithm
> >when [past initialization phase] the dictionary is either totally or almost totally static.
> Slow and complex.
> >My intuition tells me nothing about advantage/disadvantage of cmove vs branch for
> >such variation of binary search.
> The question is how predictable your searches are. If they are
> unpredictable, branches cost ~10 cycles (~20 per misprediction) on
> average on a typical desktop CPU,

6 cycles on the CPU that was used in this project.

> and a "branchless" variant is much faster.

The CPU in question has no CMOVE/Select. Emulating it will take 4 instruction.
However my point is that the most of the time is consumed in strcmp().
bsearch loop itself is relatively insignificant. Except that early access to the next key,
which is likely not in L1D cache, (and not in L2 cache, because CPU has no L2) can
be of benefit. Even if core fetches wrong data 50% of time it is still right another 50%.
And memory bandwidth in this case was, to the 1st order of approximation, free.

> >As to optimality of bsearch vs. other options, I'd guess, with decent certainty, that in my
> >case hash-based lookup would be faster, but also more complicated to code and
> >a little less robust in the unlikely worst case.
> A simple hash table implementation is much less complicated than
> keeping an array sorted

No need "to keep it sorted", since after program initialization it never was changed.

> and binary search (which is easy to get wrong,
> and not so easy to notice that you got it wrong).

Both qsort() and bsearch() are parts of standard C library, so, as far as application programmer
is concerned, there is no complexity and no bugs.
Hash-based containers, on the other hand, are not part of standard C library.
And we had several good reasons to use C as our implementation language.

> The worst case is
> worse, but it's much more likely that the simple hash table beats the
> binary search by a lot.

Yes, it is likely. But it does not matter.
The difference between O(1) of hash vs O(logN) of bsearch is often insignificant.
On the other hand, the difference between O(N) of linear search vs O(logN) is
insignificant less often.
In this particular case, at the beginning of the project the tables were of order of dozens
of entries, but throughout the years they grew to several hundreds, so, considering the
relative slowness of the CPU, it was not totally insignificant.

> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)

<736a42b0-8608-49be-aa94-e3800e1a0836n@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25384&group=comp.arch#25384

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a37:695:0:b0:69f:b916:2d8e with SMTP id 143-20020a370695000000b0069fb9162d8emr15220780qkg.680.1652776701211;
Tue, 17 May 2022 01:38:21 -0700 (PDT)
X-Received: by 2002:a05:6870:b68d:b0:de:9da7:9615 with SMTP id
cy13-20020a056870b68d00b000de9da79615mr11996186oab.117.1652776700832; Tue, 17
May 2022 01:38:20 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 01:38:20 -0700 (PDT)
In-Reply-To: <62e6eb85-26ef-45fb-9b22-fbe787447f1en@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t5t8r4$qdi$1@dont-email.me>
<2022May16.180830@mips.complang.tuwien.ac.at> <t5u35g$fo4$1@dont-email.me>
<t5u5c4$2n6r$1@gal.iecc.com> <677f55fc-8833-4830-bacc-9c3925c7b1adn@googlegroups.com>
<6985cff9-13dd-4b59-8a22-0d418f156c12n@googlegroups.com> <69eee9e4-1f34-4994-88c7-65d756baf30an@googlegroups.com>
<30054530-21ac-4bd3-88b4-a0e74ea5af87n@googlegroups.com> <62e6eb85-26ef-45fb-9b22-fbe787447f1en@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <736a42b0-8608-49be-aa94-e3800e1a0836n@googlegroups.com>
Subject: Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 17 May 2022 08:38:21 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1794
 by: Michael S - Tue, 17 May 2022 08:38 UTC

Sorry for all 'test' posts.
I am trying to figure out if google, in their eternal wisdom, had broken a posting on Google Groups completely,
or if it did it selectively, just on Firefox.
So far it looks like the later.

Re: Ill-advised use of CMOVE

<t5vn47$ffi$1@dont-email.me>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25385&group=comp.arch#25385

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
Date: Tue, 17 May 2022 01:41:44 -0700
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <t5vn47$ffi$1@dont-email.me>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me>
<t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at>
<a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com>
<2022May16.175019@mips.complang.tuwien.ac.at>
<c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 17 May 2022 08:41:43 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="39efa1ed15711b99ca2ad034714b3dd8";
logging-data="15858"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19HsQabJ/8b3WbpgRq43rqk"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.9.0
Cancel-Lock: sha1:e8B/2lSpo146H0TgjryBfwRHy5c=
In-Reply-To: <c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Tue, 17 May 2022 08:41 UTC

On 5/17/2022 1:33 AM, Michael S wrote:
> On Monday, May 16, 2022 at 7:07:29 PM UTC+3, Anton Ertl wrote:
>> Michael S <already...@yahoo.com> writes:
>>> On Monday, May 16, 2022 at 9:22:21 AM UTC+3, Anton Ertl wrote:
>>>> And then there's the question of why one uses binary search in the
>>>> first place.
>>>
>>> Sorted array + binary search is a reasonable option for data structure+algorithm
>>> when [past initialization phase] the dictionary is either totally or almost totally static.
>> Slow and complex.
>>> My intuition tells me nothing about advantage/disadvantage of cmove vs branch for
>>> such variation of binary search.
>> The question is how predictable your searches are. If they are
>> unpredictable, branches cost ~10 cycles (~20 per misprediction) on
>> average on a typical desktop CPU,
>
> 6 cycles on the CPU that was used in this project.
>
>> and a "branchless" variant is much faster.
>
> The CPU in question has no CMOVE/Select. Emulating it will take 4 instruction.
> However my point is that the most of the time is consumed in strcmp().
> bsearch loop itself is relatively insignificant. Except that early access to the next key,
> which is likely not in L1D cache, (and not in L2 cache, because CPU has no L2) can
> be of benefit. Even if core fetches wrong data 50% of time it is still right another 50%.
> And memory bandwidth in this case was, to the 1st order of approximation, free.
>
>>> As to optimality of bsearch vs. other options, I'd guess, with decent certainty, that in my
>>> case hash-based lookup would be faster, but also more complicated to code and
>>> a little less robust in the unlikely worst case.
>> A simple hash table implementation is much less complicated than
>> keeping an array sorted
>
> No need "to keep it sorted", since after program initialization it never was changed.

If never changed, why not build a perfectHash table?

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)

<0f19cd66-c04b-406c-8e79-4565b1013e45n@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25386&group=comp.arch#25386

 copy link   Newsgroups: comp.arch
X-Received: by 2002:ac8:5ad4:0:b0:2f3:e0fb:df1c with SMTP id d20-20020ac85ad4000000b002f3e0fbdf1cmr18609150qtd.267.1652776987320;
Tue, 17 May 2022 01:43:07 -0700 (PDT)
X-Received: by 2002:a05:6808:16ac:b0:2f9:52e5:da90 with SMTP id
bb44-20020a05680816ac00b002f952e5da90mr15392584oib.5.1652776986564; Tue, 17
May 2022 01:43:06 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 01:43:06 -0700 (PDT)
In-Reply-To: <736a42b0-8608-49be-aa94-e3800e1a0836n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t5t8r4$qdi$1@dont-email.me>
<2022May16.180830@mips.complang.tuwien.ac.at> <t5u35g$fo4$1@dont-email.me>
<t5u5c4$2n6r$1@gal.iecc.com> <677f55fc-8833-4830-bacc-9c3925c7b1adn@googlegroups.com>
<6985cff9-13dd-4b59-8a22-0d418f156c12n@googlegroups.com> <69eee9e4-1f34-4994-88c7-65d756baf30an@googlegroups.com>
<30054530-21ac-4bd3-88b4-a0e74ea5af87n@googlegroups.com> <62e6eb85-26ef-45fb-9b22-fbe787447f1en@googlegroups.com>
<736a42b0-8608-49be-aa94-e3800e1a0836n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <0f19cd66-c04b-406c-8e79-4565b1013e45n@googlegroups.com>
Subject: Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 17 May 2022 08:43:07 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1950
 by: Michael S - Tue, 17 May 2022 08:43 UTC

On Tuesday, May 17, 2022 at 11:38:22 AM UTC+3, Michael S wrote:
> Sorry for all 'test' posts.
> I am trying to figure out if google, in their eternal wisdom, had broken a posting on Google Groups completely,
> or if it did it selectively, just on Firefox.
> So far it looks like the later.

More the strangerer

Re: Ill-advised use of CMOVE

<1764536f-dad8-4a44-983b-87843568f56fn@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25387&group=comp.arch#25387

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a37:a90e:0:b0:69f:9b8f:86b4 with SMTP id s14-20020a37a90e000000b0069f9b8f86b4mr14511264qke.513.1652777361294;
Tue, 17 May 2022 01:49:21 -0700 (PDT)
X-Received: by 2002:a05:6830:1d92:b0:606:a1e:946a with SMTP id
y18-20020a0568301d9200b006060a1e946amr7869323oti.294.1652777361078; Tue, 17
May 2022 01:49:21 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!2.eu.feeder.erje.net!feeder.erje.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 01:49:20 -0700 (PDT)
In-Reply-To: <t5vn47$ffi$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de>
<t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com>
<2022May16.175019@mips.complang.tuwien.ac.at> <c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
<t5vn47$ffi$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1764536f-dad8-4a44-983b-87843568f56fn@googlegroups.com>
Subject: Re: Ill-advised use of CMOVE
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 17 May 2022 08:49:21 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Michael S - Tue, 17 May 2022 08:49 UTC

On Tuesday, May 17, 2022 at 11:41:46 AM UTC+3, Ivan Godard wrote:
> On 5/17/2022 1:33 AM, Michael S wrote:
> > On Monday, May 16, 2022 at 7:07:29 PM UTC+3, Anton Ertl wrote:
> >> Michael S <already...@yahoo.com> writes:
> >>> On Monday, May 16, 2022 at 9:22:21 AM UTC+3, Anton Ertl wrote:
> >>>> And then there's the question of why one uses binary search in the
> >>>> first place.
> >>>
> >>> Sorted array + binary search is a reasonable option for data structure+algorithm
> >>> when [past initialization phase] the dictionary is either totally or almost totally static.
> >> Slow and complex.
> >>> My intuition tells me nothing about advantage/disadvantage of cmove vs branch for
> >>> such variation of binary search.
> >> The question is how predictable your searches are. If they are
> >> unpredictable, branches cost ~10 cycles (~20 per misprediction) on
> >> average on a typical desktop CPU,
> >
> > 6 cycles on the CPU that was used in this project.
> >
> >> and a "branchless" variant is much faster.
> >
> > The CPU in question has no CMOVE/Select. Emulating it will take 4 instruction.
> > However my point is that the most of the time is consumed in strcmp().
> > bsearch loop itself is relatively insignificant. Except that early access to the next key,
> > which is likely not in L1D cache, (and not in L2 cache, because CPU has no L2) can
> > be of benefit. Even if core fetches wrong data 50% of time it is still right another 50%.
> > And memory bandwidth in this case was, to the 1st order of approximation, free.
> >
> >>> As to optimality of bsearch vs. other options, I'd guess, with decent certainty, that in my
> >>> case hash-based lookup would be faster, but also more complicated to code and
> >>> a little less robust in the unlikely worst case.
> >> A simple hash table implementation is much less complicated than
> >> keeping an array sorted
> >
> > No need "to keep it sorted", since after program initialization it never was changed.
> If never changed, why not build a perfectHash table?

It never changes *after program initialization*. But it does not have to be identical on
each activation of the program.

Re: Ill-advised use of CMOVE

<VbidnVY1UNkP-h7_nZ2dnUU7-LednZ2d@supernews.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25388&group=comp.arch#25388

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!buffer2.nntp.dca1.giganews.com!buffer1.nntp.dca1.giganews.com!nntp.supernews.com!news.supernews.com.POSTED!not-for-mail
NNTP-Posting-Date: Tue, 17 May 2022 04:14:58 -0500
Sender: Andrew Haley <aph@zarquon.pink>
From: aph...@littlepinkcloud.invalid
Subject: Re: Ill-advised use of CMOVE
Newsgroups: comp.arch
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de> <t5sbrt$jut$1@dont-email.me>
User-Agent: tin/1.9.2-20070201 ("Dalaruan") (UNIX) (Linux/4.18.0-348.12.2.el8_5.x86_64 (x86_64))
Message-ID: <VbidnVY1UNkP-h7_nZ2dnUU7-LednZ2d@supernews.com>
Date: Tue, 17 May 2022 04:14:58 -0500
Lines: 20
X-Trace: sv3-dwX8+RgQff/ZZByiVMUbYBNs+5TiWOkGykyD7cxXqlnHFVIOulEzsQ2JWUAaUqSmilkaXIwRxzcQaqY!eLIYRx2jltS/Y6wuEBi8DlFXlsZOoyYFgd9jszteSMKe++kqAWQbrALPLVM3dr+t2HOmeJCWIOdE!pSg4eIum
X-Complaints-To: www.supernews.com/docs/abuse.html
X-DMCA-Complaints-To: www.supernews.com/docs/dmca.html
X-Abuse-and-DMCA-Info: Please be sure to forward a copy of ALL headers
X-Abuse-and-DMCA-Info: Otherwise we will be unable to process your complaint properly
X-Postfilter: 1.3.40
X-Original-Bytes: 2015
 by: aph...@littlepinkcloud.invalid - Tue, 17 May 2022 09:14 UTC

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
> On 5/15/2022 11:19 AM, Thomas Koenig wrote:
>>
>> Ideally, a compiler would contain heuristics when a branch, or a
>> CMOVE, is better. How could one predict predictability of a branch?
>
> That was Anton's and my point. Compilers aren't very good at doing
> this, so my suggestion is don't ask them to.

It could work if a JIT had an accurate model of the branch predictor,
or (better) a way to ask a branch predictor for its opinion. But
branch predictors are the most secret kind of secret sauce, so that
won't happen.

On the other hand, a decoder could just convert CMOVE into the
micro-ops for a test-and-branch and two MOVEs, and the problem would
go away. Security devs who are trying to write constant-time code
would hate that, though, so it won't happen.

Andrew.

Re: Ill-advised use of CMOVE

<2022May17.110354@mips.complang.tuwien.ac.at>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25389&group=comp.arch#25389

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
Date: Tue, 17 May 2022 09:03:54 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 40
Message-ID: <2022May17.110354@mips.complang.tuwien.ac.at>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de> <2022May16.080143@mips.complang.tuwien.ac.at> <a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com> <2022May16.175019@mips.complang.tuwien.ac.at> <c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="a6c93b3a07299dfb70710467893880db";
logging-data="3602"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18Rayo+NqGA6QVhBrsUyFQg"
Cancel-Lock: sha1:gJR1gATFT/tEeSg1iK2kaSmAcvo=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 17 May 2022 09:03 UTC

Michael S <already5chosen@yahoo.com> writes:
>On Monday, May 16, 2022 at 7:07:29 PM UTC+3, Anton Ertl wrote:
>> and a "branchless" variant is much faster.
>
>The CPU in question has no CMOVE/Select.

For binary search it's relatively cheap to do without (at least
someone showed code here many years ago that looked convincing).

>However my point is that the most of the time is consumed in strcmp().

Well, that's the other problem with binary search: In a hash table you
typically notice inequality by looking at the first character, with
binary search you have to look further after the first few iterations;
plus, you need </>= information, not just =/!=, so comparing multiple
chars at a time is more involved.

>bsearch loop itself is relatively insignificant. Except that early access to the next key,
>which is likely not in L1D cache, (and not in L2 cache, because CPU has no L2) can
>be of benefit. Even if core fetches wrong data 50% of time it is still right another 50%.
>And memory bandwidth in this case was, to the 1st order of approximation, free.

Then you could even prefetch both sides (at least if you have
resources that would otherwise be idle).

>Yes, it is likely. But it does not matter.
>The difference between O(1) of hash vs O(logN) of bsearch is often insignificant.

ld N typically is in the range 10-30, which is often significant.

>In this particular case, at the beginning of the project the tables were of order of dozens
>of entries, but throughout the years they grew to several hundreds, so, considering the
>relative slowness of the CPU, it was not totally insignificant.

So ld N<10, which makes binary search more competetive.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Ill-advised use of CMOVE

<t5vpu4$4u0$1@dont-email.me>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25390&group=comp.arch#25390

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
Date: Tue, 17 May 2022 02:29:39 -0700
Organization: A noiseless patient Spider
Lines: 44
Message-ID: <t5vpu4$4u0$1@dont-email.me>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me>
<t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at>
<a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com>
<2022May16.175019@mips.complang.tuwien.ac.at>
<c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
<t5vn47$ffi$1@dont-email.me>
<1764536f-dad8-4a44-983b-87843568f56fn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 17 May 2022 09:29:40 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="39efa1ed15711b99ca2ad034714b3dd8";
logging-data="5056"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/f3YYKaCchuUEZS/P7B6k3"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.9.0
Cancel-Lock: sha1:9Jm7ZWkyjmf2v1i1GnnSB+x94QI=
In-Reply-To: <1764536f-dad8-4a44-983b-87843568f56fn@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Tue, 17 May 2022 09:29 UTC

On 5/17/2022 1:49 AM, Michael S wrote:
> On Tuesday, May 17, 2022 at 11:41:46 AM UTC+3, Ivan Godard wrote:
>> On 5/17/2022 1:33 AM, Michael S wrote:
>>> On Monday, May 16, 2022 at 7:07:29 PM UTC+3, Anton Ertl wrote:
>>>> Michael S <already...@yahoo.com> writes:
>>>>> On Monday, May 16, 2022 at 9:22:21 AM UTC+3, Anton Ertl wrote:
>>>>>> And then there's the question of why one uses binary search in the
>>>>>> first place.
>>>>>
>>>>> Sorted array + binary search is a reasonable option for data structure+algorithm
>>>>> when [past initialization phase] the dictionary is either totally or almost totally static.
>>>> Slow and complex.
>>>>> My intuition tells me nothing about advantage/disadvantage of cmove vs branch for
>>>>> such variation of binary search.
>>>> The question is how predictable your searches are. If they are
>>>> unpredictable, branches cost ~10 cycles (~20 per misprediction) on
>>>> average on a typical desktop CPU,
>>>
>>> 6 cycles on the CPU that was used in this project.
>>>
>>>> and a "branchless" variant is much faster.
>>>
>>> The CPU in question has no CMOVE/Select. Emulating it will take 4 instruction.
>>> However my point is that the most of the time is consumed in strcmp().
>>> bsearch loop itself is relatively insignificant. Except that early access to the next key,
>>> which is likely not in L1D cache, (and not in L2 cache, because CPU has no L2) can
>>> be of benefit. Even if core fetches wrong data 50% of time it is still right another 50%.
>>> And memory bandwidth in this case was, to the 1st order of approximation, free.
>>>
>>>>> As to optimality of bsearch vs. other options, I'd guess, with decent certainty, that in my
>>>>> case hash-based lookup would be faster, but also more complicated to code and
>>>>> a little less robust in the unlikely worst case.
>>>> A simple hash table implementation is much less complicated than
>>>> keeping an array sorted
>>>
>>> No need "to keep it sorted", since after program initialization it never was changed.
>> If never changed, why not build a perfectHash table?
>
> It never changes *after program initialization*. But it does not have to be identical on
> each activation of the program.
>

And not worth building a perfectHash on the fly? There are dynamic
construction algorithms that are O(n) IIRC.

Re: Ill-advised use of CMOVE

<fb745e5a-a527-4359-b8aa-dbd3b1fe371fn@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25391&group=comp.arch#25391

 copy link   Newsgroups: comp.arch
X-Received: by 2002:ad4:4208:0:b0:461:d262:7842 with SMTP id k8-20020ad44208000000b00461d2627842mr6143564qvp.113.1652780599205;
Tue, 17 May 2022 02:43:19 -0700 (PDT)
X-Received: by 2002:a05:6830:2705:b0:606:c10:e4b5 with SMTP id
j5-20020a056830270500b006060c10e4b5mr7907817otu.74.1652780598941; Tue, 17 May
2022 02:43:18 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 02:43:18 -0700 (PDT)
In-Reply-To: <t5vpu4$4u0$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de>
<t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com>
<2022May16.175019@mips.complang.tuwien.ac.at> <c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
<t5vn47$ffi$1@dont-email.me> <1764536f-dad8-4a44-983b-87843568f56fn@googlegroups.com>
<t5vpu4$4u0$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <fb745e5a-a527-4359-b8aa-dbd3b1fe371fn@googlegroups.com>
Subject: Re: Ill-advised use of CMOVE
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 17 May 2022 09:43:19 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1850
 by: Michael S - Tue, 17 May 2022 09:43 UTC

> And not worth building a perfectHash on the fly? There are dynamic
> construction algorithms that are O(n) IIRC.

Not worth a complexity of it.
I don't mean "complexity" in "computation complexity" sense, but in "programmer's effort" sense.
Also in a sense of learning new discipline, non-related to application domain.

Re: Ill-advised use of CMOVE

<01dae4a9-cb7a-4fc3-80ac-faf1c3dab2ffn@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25392&group=comp.arch#25392

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:622a:1195:b0:2f3:b8bf:46ab with SMTP id m21-20020a05622a119500b002f3b8bf46abmr18914019qtk.190.1652781086187;
Tue, 17 May 2022 02:51:26 -0700 (PDT)
X-Received: by 2002:a05:6808:13d6:b0:326:c5b0:14c2 with SMTP id
d22-20020a05680813d600b00326c5b014c2mr10580440oiw.37.1652781085913; Tue, 17
May 2022 02:51:25 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.mixmin.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 02:51:25 -0700 (PDT)
In-Reply-To: <2022May17.110354@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de>
<t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com>
<2022May16.175019@mips.complang.tuwien.ac.at> <c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
<2022May17.110354@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <01dae4a9-cb7a-4fc3-80ac-faf1c3dab2ffn@googlegroups.com>
Subject: Re: Ill-advised use of CMOVE
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 17 May 2022 09:51:26 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Michael S - Tue, 17 May 2022 09:51 UTC

On Tuesday, May 17, 2022 at 12:26:13 PM UTC+3, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >On Monday, May 16, 2022 at 7:07:29 PM UTC+3, Anton Ertl wrote:
> >> and a "branchless" variant is much faster.
> >
> >The CPU in question has no CMOVE/Select.
> For binary search it's relatively cheap to do without (at least
> someone showed code here many years ago that looked convincing).
> >However my point is that the most of the time is consumed in strcmp().
> Well, that's the other problem with binary search: In a hash table you
> typically notice inequality by looking at the first character, with
> binary search you have to look further after the first few iterations;
> plus, you need </>= information, not just =/!=, so comparing multiple
> chars at a time is more involved.
> >bsearch loop itself is relatively insignificant. Except that early access to the next key,
> >which is likely not in L1D cache, (and not in L2 cache, because CPU has no L2) can
> >be of benefit. Even if core fetches wrong data 50% of time it is still right another 50%.
> >And memory bandwidth in this case was, to the 1st order of approximation, free.
> Then you could even prefetch both sides (at least if you have
> resources that would otherwise be idle).

That does not fit well within into typical bsearch implementation.
Besides, on this particular CPU it would be harmful, because it can
process only 1 Dcache miss at time.

> >Yes, it is likely. But it does not matter.
> >The difference between O(1) of hash vs O(logN) of bsearch is often insignificant.
> ld N typically is in the range 10-30, which is often significant.
> >In this particular case, at the beginning of the project the tables were of order of dozens
> >of entries, but throughout the years they grew to several hundreds, so, considering the
> >relative slowness of the CPU, it was not totally insignificant.
> So ld N<10, which makes binary search more competetive.

You're beginning to see the light!

> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Ill-advised use of CMOVE

<t609ln$fb6$1@newsreader4.netcologne.de>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25393&group=comp.arch#25393

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-5c6-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
Date: Tue, 17 May 2022 13:58:15 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <t609ln$fb6$1@newsreader4.netcologne.de>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me>
<t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at>
Injection-Date: Tue, 17 May 2022 13:58:15 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-5c6-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:5c6:0:7285:c2ff:fe6c:992d";
logging-data="15718"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 17 May 2022 13:58 UTC

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
> Thomas Koenig <tkoenig@netcologne.de> writes:
>>Stephen Fuld <sfuld@alumni.cmu.edu.invalid> schrieb:
>>
>>> So, I think a good solution is for the compiler to essentially never
>>> decide to issue CMOV as a replacement for a conditional branch for
>>> architectures with good branch predictors (not saying anything here
>>> about predication.).
>>
>>Binary search was also recently quoted here as an example where
>>CMOV makes a lot of sense
>
> As is often the case, this depends very much on the inputs.
>
> And then there's the question of why one uses binary search in the
> first place. The only time I remember using it is when searching for
> the try-catch block of an exception in a JavaVM implementation. In
> that case it is likely that the path through the binary search can be
> predicted from the branch history before the throw.

Splines (or, more general, interpolation) are a big field.

Re: Ill-advised use of CMOVE

<JQPgK.19868$L_b6.3325@fx33.iad>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25397&group=comp.arch#25397

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx33.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58n6r$1riu$1@gioia.aioe.org> <t5e1ek$s7c$1@dont-email.me> <2022May10.194427@mips.complang.tuwien.ac.at> <0555194d-edc8-48fb-b304-7f78d62255d3n@googlegroups.com> <hlPeK.4263$pqKf.2583@fx12.iad> <2022May14.100411@mips.complang.tuwien.ac.at> <IZ8gK.6876$j0D5.5549@fx09.iad> <2022May15.194522@mips.complang.tuwien.ac.at> <atygK.48016$JSxf.35968@fx11.iad> <6c37df11-28cc-42a6-a509-2d807ce4dcf8n@googlegroups.com>
In-Reply-To: <6c37df11-28cc-42a6-a509-2d807ce4dcf8n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 26
Message-ID: <JQPgK.19868$L_b6.3325@fx33.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Tue, 17 May 2022 16:13:29 UTC
Date: Tue, 17 May 2022 12:13:16 -0400
X-Received-Bytes: 1857
 by: EricP - Tue, 17 May 2022 16:13 UTC

MitchAlsup wrote:
> On Monday, May 16, 2022 at 3:27:53 PM UTC-5, EricP wrote:
>> Anton Ertl wrote:
>
>>> OoO ARM CPUs exist.
>> Most papers I have seen who write about ARMv7 OoO predication
>> uArch says something to the effect of how difficult it is.
>> I haven't been able to find any actual details.
> <
> Perhaps because their predication model was not well though out !!
> <
> This does not mean that all forms of predication are difficult in all
> kinds of instruction queue devices and mechanisms.

I'm not suggesting otherwise.
ARM has had at least 4 major implementations of OoO conditional execution
and predication, and they would have learned a thing or two along the way.
It might be nice to see where the warts are, and any ideas on how
to minimize them.

Re: Ill-advised use of CMOVE

<b5bdadae-8ddf-4148-8fcf-0e5dfc902748n@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25401&group=comp.arch#25401

 copy link   Newsgroups: comp.arch
X-Received: by 2002:ac8:7f43:0:b0:2f3:d55d:7296 with SMTP id g3-20020ac87f43000000b002f3d55d7296mr20565239qtk.635.1652806041730;
Tue, 17 May 2022 09:47:21 -0700 (PDT)
X-Received: by 2002:a9d:75d0:0:b0:606:19e2:47e with SMTP id
c16-20020a9d75d0000000b0060619e2047emr8423665otl.298.1652806041463; Tue, 17
May 2022 09:47:21 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 09:47:21 -0700 (PDT)
In-Reply-To: <JQPgK.19868$L_b6.3325@fx33.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:b906:a655:bfcf:bf2f;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:b906:a655:bfcf:bf2f
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58n6r$1riu$1@gioia.aioe.org>
<t5e1ek$s7c$1@dont-email.me> <2022May10.194427@mips.complang.tuwien.ac.at>
<0555194d-edc8-48fb-b304-7f78d62255d3n@googlegroups.com> <hlPeK.4263$pqKf.2583@fx12.iad>
<2022May14.100411@mips.complang.tuwien.ac.at> <IZ8gK.6876$j0D5.5549@fx09.iad>
<2022May15.194522@mips.complang.tuwien.ac.at> <atygK.48016$JSxf.35968@fx11.iad>
<6c37df11-28cc-42a6-a509-2d807ce4dcf8n@googlegroups.com> <JQPgK.19868$L_b6.3325@fx33.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b5bdadae-8ddf-4148-8fcf-0e5dfc902748n@googlegroups.com>
Subject: Re: Ill-advised use of CMOVE
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 17 May 2022 16:47:21 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 2626
 by: MitchAlsup - Tue, 17 May 2022 16:47 UTC

On Tuesday, May 17, 2022 at 11:13:32 AM UTC-5, EricP wrote:
> MitchAlsup wrote:
> > On Monday, May 16, 2022 at 3:27:53 PM UTC-5, EricP wrote:
> >> Anton Ertl wrote:
> >
> >>> OoO ARM CPUs exist.
> >> Most papers I have seen who write about ARMv7 OoO predication
> >> uArch says something to the effect of how difficult it is.
> >> I haven't been able to find any actual details.
> > <
> > Perhaps because their predication model was not well though out !!
> > <
> > This does not mean that all forms of predication are difficult in all
> > kinds of instruction queue devices and mechanisms.
> I'm not suggesting otherwise.
> ARM has had at least 4 major implementations of OoO conditional execution
> and predication, and they would have learned a thing or two along the way.
> It might be nice to see where the warts are, and any ideas on how
> to minimize them.
<
A) do not do predication with condition codes !
B) do not waste bits in each instruction to support predication !
C) is anything more needed ?!?

Re: Ill-advised use of CMOVE

<t60p72$el5$1@dont-email.me>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25403&group=comp.arch#25403

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
Date: Tue, 17 May 2022 13:23:28 -0500
Organization: A noiseless patient Spider
Lines: 53
Message-ID: <t60p72$el5$1@dont-email.me>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me>
<t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <OQrgK.3334$6y5f.1197@fx40.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 17 May 2022 18:23:30 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="407bfcdffd720520a9d83d643dadfa29";
logging-data="15013"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/DXCfi9JUWVK0jJh+LwasW"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.9.0
Cancel-Lock: sha1:gFNl5Rw01SoGTVyan1md+Gstb1Y=
In-Reply-To: <OQrgK.3334$6y5f.1197@fx40.iad>
Content-Language: en-US
 by: BGB - Tue, 17 May 2022 18:23 UTC

On 5/16/2022 7:54 AM, EricP wrote:
> Anton Ertl wrote:
>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>> Stephen Fuld <sfuld@alumni.cmu.edu.invalid> schrieb:
>>>
>>>> So, I think a good solution is for the compiler to essentially never
>>>> decide to issue CMOV as a replacement for a conditional branch for
>>>> architectures with good branch predictors (not saying anything here
>>>> about predication.).
>>> Binary search was also recently quoted here as an example where
>>> CMOV makes a lot of sense
>>
>> As is often the case, this depends very much on the inputs.
>>
>> And then there's the question of why one uses binary search in the
>> first place.  The only time I remember using it is when searching for
>> the try-catch block of an exception in a JavaVM implementation.  In
>> that case it is likely that the path through the binary search can be
>> predicted from the branch history before the throw.
>
> Binary search is one of the decompositions for SWITCH statements
> with discontiguous blocks.
>

Agreed.

Before I added jump-tables to BGBCC, binary search was the primary form
of "switch()".

Say:
1..5 case labels:
Use if-Else
6+: Divide space in half, recurse on left and right halves.
Compare-and-branch to right half.

But, with jump tables:
1..5 case labels:
Use if-Else
6..511 case labels, density>=50%:
Use jump table.
Else: Divide space in half, recurse on left and right halves.
Compare-and-branch to right half.

This method deals acceptably with things like switch'ing based on a
bunch of FOURCC values or similar.

Generally, switch needs to gather up a list of all the case labels, sort
them, and then build the dispatch structure, followed by compiling the
switch body.

branchless binary search (was: Ill-advised use of CMOVE)

<2022May17.205327@mips.complang.tuwien.ac.at>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25404&group=comp.arch#25404

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: branchless binary search (was: Ill-advised use of CMOVE)
Date: Tue, 17 May 2022 18:53:27 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 65
Message-ID: <2022May17.205327@mips.complang.tuwien.ac.at>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de> <2022May16.080143@mips.complang.tuwien.ac.at> <a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com> <2022May16.175019@mips.complang.tuwien.ac.at> <c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com> <2022May17.110354@mips.complang.tuwien.ac.at>
Injection-Info: reader02.eternal-september.org; posting-host="a6c93b3a07299dfb70710467893880db";
logging-data="19255"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+3rQjbqLxNQzOTGCP1VaJj"
Cancel-Lock: sha1:h9GytJrD7urddNQTvENmU3lLPD4=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 17 May 2022 18:53 UTC

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>Michael S <already5chosen@yahoo.com> writes:
>>On Monday, May 16, 2022 at 7:07:29 PM UTC+3, Anton Ertl wrote:
>>> and a "branchless" variant is much faster.
>>
>>The CPU in question has no CMOVE/Select.
>
>For binary search it's relatively cheap to do without (at least
>someone showed code here many years ago that looked convincing).

I searched around a bit, and found the following code in
<https://stackoverflow.com/questions/11360831/about-the-branchless-binary-search>
by BeeOnRope (Travis Downs):

int needle; // value we are searching for
int *base = ...; // base pointer
int n; // number of elements in the current region
while (n > 1) {
int middle = n/2;
base += (needle < *base[middle]) ? 0 : middle;
n -= middle;
}

(The context indicates that you should verify this code before use.)

And of course (x<y)?0:middle is equivalent to ((x<y)-1)&middle, but
the compiler probably knows this.

The more interesting part is that the posting then goes on to point
out that this branchless code is slower than branching code in the
cases where the loads miss the first few cache levels, even if the
branches are unpredictable. That's because the branchless code makes
every load depend on the previous load, while in the branching code
there is no such dependency if the prediction is correct (50% of the
time in the unpredictable case, more for better predictability).

Downsides of the branching code in the unpredictable case:

* lots of fetching down the predicted path, most of which is probably
canceled.

* in the mispredict case an iteration takes the load latency (in order
to be able to resolce the prediction) plus the mispredict penalty
(plus some minor change).

* in the mispredict case you have speculatively loaded a cache line
that you don't need, and evicted a probably more useful cache line.

You can add prefetches or actual loads to the branchless variant:
E.g., always prefetch the whole left spine down to the leaf (if you
view it like a search tree). In the unpredictable case this is as
accurate (and as wasteful) as the branching code, and the performance
may be a little better (no mispredict penalty). Alternatively,
prefetch one level of both sides. This has the same effect of halving
the recurrence latency, but the load unit and cache eviction cost is
probably lower. You could prefetch further, but the costs rise
exponentially, while the benefits shrink with each additional level.

Of course, if the searches are even moderately predictable, the
branching code wins hands-down.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Ill-advised use of CMOVE

<a2fe9cd2-cacf-4c4e-ac89-d68dcbb4bb4dn@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25405&group=comp.arch#25405

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:6214:c29:b0:45a:fedd:7315 with SMTP id a9-20020a0562140c2900b0045afedd7315mr21363055qvd.59.1652816801670;
Tue, 17 May 2022 12:46:41 -0700 (PDT)
X-Received: by 2002:a05:6808:6d7:b0:325:67ff:a21b with SMTP id
m23-20020a05680806d700b0032567ffa21bmr11066882oih.105.1652816801230; Tue, 17
May 2022 12:46:41 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 12:46:40 -0700 (PDT)
In-Reply-To: <t60p72$el5$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:b906:a655:bfcf:bf2f;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:b906:a655:bfcf:bf2f
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de>
<t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <OQrgK.3334$6y5f.1197@fx40.iad> <t60p72$el5$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <a2fe9cd2-cacf-4c4e-ac89-d68dcbb4bb4dn@googlegroups.com>
Subject: Re: Ill-advised use of CMOVE
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 17 May 2022 19:46:41 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Tue, 17 May 2022 19:46 UTC

On Tuesday, May 17, 2022 at 1:23:33 PM UTC-5, BGB wrote:
> On 5/16/2022 7:54 AM, EricP wrote:
> > Anton Ertl wrote:
> >> Thomas Koenig <tko...@netcologne.de> writes:
> >>> Stephen Fuld <sf...@alumni.cmu.edu.invalid> schrieb:
> >>>
> >>>> So, I think a good solution is for the compiler to essentially never
> >>>> decide to issue CMOV as a replacement for a conditional branch for
> >>>> architectures with good branch predictors (not saying anything here
> >>>> about predication.).
> >>> Binary search was also recently quoted here as an example where
> >>> CMOV makes a lot of sense
> >>
> >> As is often the case, this depends very much on the inputs.
> >>
> >> And then there's the question of why one uses binary search in the
> >> first place. The only time I remember using it is when searching for
> >> the try-catch block of an exception in a JavaVM implementation. In
> >> that case it is likely that the path through the binary search can be
> >> predicted from the branch history before the throw.
> >
> > Binary search is one of the decompositions for SWITCH statements
> > with discontiguous blocks.
> >
>
> Agreed.
>
> Before I added jump-tables to BGBCC, binary search was the primary form
> of "switch()".
>
>
> Say:
> 1..5 case labels:
> Use if-Else
> 6+: Divide space in half, recurse on left and right halves.
> Compare-and-branch to right half.
>
> But, with jump tables:
> 1..5 case labels:
> Use if-Else
> 6..511 case labels, density>=50%:
> Use jump table.
> Else: Divide space in half, recurse on left and right halves.
> Compare-and-branch to right half.
<
My 66000 has tabularized jumps directly (and PIC, too)
For a short switch clause (less than 256 instructions in the switch clause)
The tabularized jump is ALWAYS smaller than a series of if-elses.
<
So, the instruction looks like::
<
JTT<size> Rcase, #Bounds
<
if( Rcase < 0 or Rcase > Brounds ) IP = IP + 4 + (Bounds<<2)
else IP = IP + memory.unsigned.size[ IP + 4 + Rcase << size ] << 2;
<
some of the time these offsets are bytes so each entry in the table is ¼
the size of a branch or ⅛th the label of an address. But almost all switch
statements are encoded as 16-bit displacements (unless they fit a byte).
<
The table in in code space, is fetched through the instruction cache,
needs no intermediate register, and is position independent.
<
Typical code sequences look like:
<
JTT Rt,#bounds
.byte label0-.,label1-.,label2-.,default-.
........ // until all the labels have been enumerated
label0: // case[0]
BR end_switch
label1: // case[1]
BR end_switch
...
default: // default clause
...
end_switch:
// normal code from here on out.
>
>
> This method deals acceptably with things like switch'ing based on a
> bunch of FOURCC values or similar.
>
>
> Generally, switch needs to gather up a list of all the case labels, sort
> them, and then build the dispatch structure, followed by compiling the
> switch body.

Re: Ill-advised use of CMOVE

<t6116l$cgc$1@dont-email.me>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25407&group=comp.arch#25407

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
Date: Tue, 17 May 2022 15:39:47 -0500
Organization: A noiseless patient Spider
Lines: 157
Message-ID: <t6116l$cgc$1@dont-email.me>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me>
<t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <OQrgK.3334$6y5f.1197@fx40.iad>
<t60p72$el5$1@dont-email.me>
<a2fe9cd2-cacf-4c4e-ac89-d68dcbb4bb4dn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 17 May 2022 20:39:49 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="407bfcdffd720520a9d83d643dadfa29";
logging-data="12812"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+uLykzrqGh3wqFFjYb2rvJ"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.9.0
Cancel-Lock: sha1:MFx2BINERpfiYxVuk+eMfrvaXpk=
In-Reply-To: <a2fe9cd2-cacf-4c4e-ac89-d68dcbb4bb4dn@googlegroups.com>
Content-Language: en-US
 by: BGB - Tue, 17 May 2022 20:39 UTC

On 5/17/2022 2:46 PM, MitchAlsup wrote:
> On Tuesday, May 17, 2022 at 1:23:33 PM UTC-5, BGB wrote:
>> On 5/16/2022 7:54 AM, EricP wrote:
>>> Anton Ertl wrote:
>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>> Stephen Fuld <sf...@alumni.cmu.edu.invalid> schrieb:
>>>>>
>>>>>> So, I think a good solution is for the compiler to essentially never
>>>>>> decide to issue CMOV as a replacement for a conditional branch for
>>>>>> architectures with good branch predictors (not saying anything here
>>>>>> about predication.).
>>>>> Binary search was also recently quoted here as an example where
>>>>> CMOV makes a lot of sense
>>>>
>>>> As is often the case, this depends very much on the inputs.
>>>>
>>>> And then there's the question of why one uses binary search in the
>>>> first place. The only time I remember using it is when searching for
>>>> the try-catch block of an exception in a JavaVM implementation. In
>>>> that case it is likely that the path through the binary search can be
>>>> predicted from the branch history before the throw.
>>>
>>> Binary search is one of the decompositions for SWITCH statements
>>> with discontiguous blocks.
>>>
>>
>> Agreed.
>>
>> Before I added jump-tables to BGBCC, binary search was the primary form
>> of "switch()".
>>
>>
>> Say:
>> 1..5 case labels:
>> Use if-Else
>> 6+: Divide space in half, recurse on left and right halves.
>> Compare-and-branch to right half.
>>
>> But, with jump tables:
>> 1..5 case labels:
>> Use if-Else
>> 6..511 case labels, density>=50%:
>> Use jump table.
>> Else: Divide space in half, recurse on left and right halves.
>> Compare-and-branch to right half.
> <
> My 66000 has tabularized jumps directly (and PIC, too)
> For a short switch clause (less than 256 instructions in the switch clause)
> The tabularized jump is ALWAYS smaller than a series of if-elses.
> <
> So, the instruction looks like::
> <
> JTT<size> Rcase, #Bounds
> <
> if( Rcase < 0 or Rcase > Brounds ) IP = IP + 4 + (Bounds<<2)
> else IP = IP + memory.unsigned.size[ IP + 4 + Rcase << size ] << 2;
> <
> some of the time these offsets are bytes so each entry in the table is ¼
> the size of a branch or ⅛th the label of an address. But almost all switch
> statements are encoded as 16-bit displacements (unless they fit a byte).
> <
> The table in in code space, is fetched through the instruction cache,
> needs no intermediate register, and is position independent.
> <
> Typical code sequences look like:
> <
> JTT Rt,#bounds
> .byte label0-.,label1-.,label2-.,default-.
> ........ // until all the labels have been enumerated
> label0: // case[0]
> BR end_switch
> label1: // case[1]
> BR end_switch
> ...
> default: // default clause
> ...
> end_switch:
> // normal code from here on out.

OK.

In my case, it would be something like:
SUB Ri, Base, Rt
CMPHI Limit, Rt //unsigned Rt>Limit
BT .default
ADD Rt, Rt //(assuming a table of 32-bit branch ops)
BRA (PC, Rt) //AKA: "BRAF Rt"
... table of branch ops ...

Would take around 16 clock cycles.

But, for 5 or less, eg:
CMPEQ Val1, Ri
BT .lbl1
CMPEQ Val2, Ri
BT .lbl2
CMPEQ Val3, Ri
BT .lbl3
CMPEQ Val4, Ri
BT .lbl4
CMPEQ Val5, Ri
BT .lbl5
BRA .default

Will (on average) get to the destination in less than 16 clock-cycles
(or less for shorter sequences).

For the binary-tree case, the binary branches will tend to have a higher
mispredict rate than the linear-search branches (so a cut-off of 5 makes
more sense than 3 or similar).

Say:
CMPGE Val6, Ri
BT .sub1

CMPEQ Val1, Ri
BT .lbl1
CMPEQ Val2, Ri
BT .lbl2
CMPEQ Val3, Ri
BT .lbl3
CMPEQ Val4, Ri
BT .lbl4
CMPEQ Val5, Ri
BT .lbl5
BRA .default

.sub1:
CMPEQ Val6, Ri
BT .lbl6
CMPEQ Val7, Ri
BT .lbl7
CMPEQ Val8, Ri
BT .lbl8
CMPEQ Val9, Ri
BT .lbl9
BRA .default

Makes sense if the value space is sparse.

>>
>>
>> This method deals acceptably with things like switch'ing based on a
>> bunch of FOURCC values or similar.
>>
>>
>> Generally, switch needs to gather up a list of all the case labels, sort
>> them, and then build the dispatch structure, followed by compiling the
>> switch body.

Re: Ill-advised use of CMOVE

<6f006c4d-143b-424d-8aa5-d15e86dc0496n@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25408&group=comp.arch#25408

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:620a:31a2:b0:6a0:1d82:8907 with SMTP id bi34-20020a05620a31a200b006a01d828907mr17741356qkb.408.1652822223450;
Tue, 17 May 2022 14:17:03 -0700 (PDT)
X-Received: by 2002:a05:6830:2467:b0:606:b036:b15f with SMTP id
x39-20020a056830246700b00606b036b15fmr8839212otr.213.1652822223042; Tue, 17
May 2022 14:17:03 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 14:17:02 -0700 (PDT)
In-Reply-To: <t6116l$cgc$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:b906:a655:bfcf:bf2f;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:b906:a655:bfcf:bf2f
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de>
<t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <OQrgK.3334$6y5f.1197@fx40.iad>
<t60p72$el5$1@dont-email.me> <a2fe9cd2-cacf-4c4e-ac89-d68dcbb4bb4dn@googlegroups.com>
<t6116l$cgc$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6f006c4d-143b-424d-8aa5-d15e86dc0496n@googlegroups.com>
Subject: Re: Ill-advised use of CMOVE
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 17 May 2022 21:17:03 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Tue, 17 May 2022 21:17 UTC

On Tuesday, May 17, 2022 at 3:39:52 PM UTC-5, BGB wrote:
> On 5/17/2022 2:46 PM, MitchAlsup wrote:
> > On Tuesday, May 17, 2022 at 1:23:33 PM UTC-5, BGB wrote:
> >> On 5/16/2022 7:54 AM, EricP wrote:
> >>> Anton Ertl wrote:
> >>>> Thomas Koenig <tko...@netcologne.de> writes:
> >>>>> Stephen Fuld <sf...@alumni.cmu.edu.invalid> schrieb:
> >>>>>
> >>>>>> So, I think a good solution is for the compiler to essentially never
> >>>>>> decide to issue CMOV as a replacement for a conditional branch for
> >>>>>> architectures with good branch predictors (not saying anything here
> >>>>>> about predication.).
> >>>>> Binary search was also recently quoted here as an example where
> >>>>> CMOV makes a lot of sense
> >>>>
> >>>> As is often the case, this depends very much on the inputs.
> >>>>
> >>>> And then there's the question of why one uses binary search in the
> >>>> first place. The only time I remember using it is when searching for
> >>>> the try-catch block of an exception in a JavaVM implementation. In
> >>>> that case it is likely that the path through the binary search can be
> >>>> predicted from the branch history before the throw.
> >>>
> >>> Binary search is one of the decompositions for SWITCH statements
> >>> with discontiguous blocks.
> >>>
> >>
> >> Agreed.
> >>
> >> Before I added jump-tables to BGBCC, binary search was the primary form
> >> of "switch()".
> >>
> >>
> >> Say:
> >> 1..5 case labels:
> >> Use if-Else
> >> 6+: Divide space in half, recurse on left and right halves.
> >> Compare-and-branch to right half.
> >>
> >> But, with jump tables:
> >> 1..5 case labels:
> >> Use if-Else
> >> 6..511 case labels, density>=50%:
> >> Use jump table.
> >> Else: Divide space in half, recurse on left and right halves.
> >> Compare-and-branch to right half.
> > <
> > My 66000 has tabularized jumps directly (and PIC, too)
> > For a short switch clause (less than 256 instructions in the switch clause)
> > The tabularized jump is ALWAYS smaller than a series of if-elses.
> > <
> > So, the instruction looks like::
> > <
> > JTT<size> Rcase, #Bounds
> > <
> > if( Rcase < 0 or Rcase > Brounds ) IP = IP + 4 + (Bounds<<2)
> > else IP = IP + memory.unsigned.size[ IP + 4 + Rcase << size ] << 2;
> > <
> > some of the time these offsets are bytes so each entry in the table is ¼
> > the size of a branch or ⅛th the label of an address. But almost all switch
> > statements are encoded as 16-bit displacements (unless they fit a byte)..
> > <
> > The table in in code space, is fetched through the instruction cache,
> > needs no intermediate register, and is position independent.
> > <
> > Typical code sequences look like:
> > <
> > JTT Rt,#bounds
> > .byte label0-.,label1-.,label2-.,default-.
> > ........ // until all the labels have been enumerated
> > label0: // case[0]
> > BR end_switch
> > label1: // case[1]
> > BR end_switch
> > ...
> > default: // default clause
> > ...
> > end_switch:
> > // normal code from here on out.
> OK.
>
>
> In my case, it would be something like:
> SUB Ri, Base, Rt
> CMPHI Limit, Rt //unsigned Rt>Limit
> BT .default
> ADD Rt, Rt //(assuming a table of 32-bit branch ops)
> BRA (PC, Rt) //AKA: "BRAF Rt"
> ... table of branch ops ...
>
> Would take around 16 clock cycles.
<
Which is 12 too many........
>
>
>
> But, for 5 or less, eg:
> CMPEQ Val1, Ri
> BT .lbl1
> CMPEQ Val2, Ri
> BT .lbl2
> CMPEQ Val3, Ri
> BT .lbl3
> CMPEQ Val4, Ri
> BT .lbl4
> CMPEQ Val5, Ri
> BT .lbl5
> BRA .default
>
> Will (on average) get to the destination in less than 16 clock-cycles
> (or less for shorter sequences).
<
Mine takes 4 cycles uniformly.
>
>
> For the binary-tree case, the binary branches will tend to have a higher
> mispredict rate than the linear-search branches (so a cut-off of 5 makes
> more sense than 3 or similar).
<
Note: I was only talking about switches in general and not participating in the
search portions of this thread.
>
>
> Say:
> CMPGE Val6, Ri
> BT .sub1
>
> CMPEQ Val1, Ri
> BT .lbl1
> CMPEQ Val2, Ri
> BT .lbl2
> CMPEQ Val3, Ri
> BT .lbl3
> CMPEQ Val4, Ri
> BT .lbl4
> CMPEQ Val5, Ri
> BT .lbl5
> BRA .default
>
> .sub1:
> CMPEQ Val6, Ri
> BT .lbl6
> CMPEQ Val7, Ri
> BT .lbl7
> CMPEQ Val8, Ri
> BT .lbl8
> CMPEQ Val9, Ri
> BT .lbl9
> BRA .default
>
>
> Makes sense if the value space is sparse.
<
By encoding labels as word displacements off the IP, one can allow
much greater sparseness without impacting code size or cycles.
> >>
> >>
> >> This method deals acceptably with things like switch'ing based on a
> >> bunch of FOURCC values or similar.
> >>
> >>
> >> Generally, switch needs to gather up a list of all the case labels, sort
> >> them, and then build the dispatch structure, followed by compiling the
> >> switch body.

Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)

<t614s3$6ch$1@dont-email.me>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25409&group=comp.arch#25409

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: binary search vs. hash tables (was: Ill-advised use of CMOVE)
Date: Tue, 17 May 2022 21:42:27 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 13
Message-ID: <t614s3$6ch$1@dont-email.me>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t5t8r4$qdi$1@dont-email.me>
<2022May16.180830@mips.complang.tuwien.ac.at>
<t5u35g$fo4$1@dont-email.me>
<t5u5c4$2n6r$1@gal.iecc.com>
<677f55fc-8833-4830-bacc-9c3925c7b1adn@googlegroups.com>
<6985cff9-13dd-4b59-8a22-0d418f156c12n@googlegroups.com>
<69eee9e4-1f34-4994-88c7-65d756baf30an@googlegroups.com>
<30054530-21ac-4bd3-88b4-a0e74ea5af87n@googlegroups.com>
<62e6eb85-26ef-45fb-9b22-fbe787447f1en@googlegroups.com>
<736a42b0-8608-49be-aa94-e3800e1a0836n@googlegroups.com>
<0f19cd66-c04b-406c-8e79-4565b1013e45n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 17 May 2022 21:42:27 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="864feb8eadbbedee2962f56dbb6930d3";
logging-data="6545"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19ZihovK6KoYleYQbK3A9iC"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:35qAa9POEyl8de8oQzAmyiytioA=
sha1:kIRPwx63tMlISkQuu5F2qytZ0VU=
 by: Brett - Tue, 17 May 2022 21:42 UTC

Michael S <already5chosen@yahoo.com> wrote:
> On Tuesday, May 17, 2022 at 11:38:22 AM UTC+3, Michael S wrote:
>> Sorry for all 'test' posts.
>> I am trying to figure out if google, in their eternal wisdom, had broken
>> a posting on Google Groups completely,
>> or if it did it selectively, just on Firefox.
>> So far it looks like the later.
>
> More the strangerer

There was a corrupt spam post since deleted that broke the NewTap reader on
iOS, had to delete comp.arch, re-add and restart the app.

Re: branchless binary search

<jwvpmkb7q1h.fsf-monnier+comp.arch@gnu.org>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25410&group=comp.arch#25410

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: branchless binary search
Date: Tue, 17 May 2022 18:28:53 -0400
Organization: A noiseless patient Spider
Lines: 8
Message-ID: <jwvpmkb7q1h.fsf-monnier+comp.arch@gnu.org>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me>
<t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at>
<a52349f8-ace0-4f2c-af38-540713b6574en@googlegroups.com>
<2022May16.175019@mips.complang.tuwien.ac.at>
<c17493ff-4a3d-4e90-bbbd-00f1ec658d23n@googlegroups.com>
<2022May17.110354@mips.complang.tuwien.ac.at>
<2022May17.205327@mips.complang.tuwien.ac.at>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="5699735ead36ff3bc62dd7f83ae394c2";
logging-data="17868"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/DWJcIiUkxv/NUlTrMiRT7"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Cancel-Lock: sha1:eFbmFX4UUE1efXIKDQGFOHLYi4s=
sha1:7XPxNjiEdAIxwKPDrVI8z5ySL1k=
 by: Stefan Monnier - Tue, 17 May 2022 22:28 UTC

> * in the mispredict case you have speculatively loaded a cache line
> that you don't need, and evicted a probably more useful cache line.

If your CPU is careful to avoid Spectre attacks, this should not be the
case, right?

Stefan

Re: Ill-advised use of CMOVE

<t617me$q56$1@dont-email.me>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25411&group=comp.arch#25411

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
Date: Tue, 17 May 2022 17:30:36 -0500
Organization: A noiseless patient Spider
Lines: 217
Message-ID: <t617me$q56$1@dont-email.me>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me>
<t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <OQrgK.3334$6y5f.1197@fx40.iad>
<t60p72$el5$1@dont-email.me>
<a2fe9cd2-cacf-4c4e-ac89-d68dcbb4bb4dn@googlegroups.com>
<t6116l$cgc$1@dont-email.me>
<6f006c4d-143b-424d-8aa5-d15e86dc0496n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 17 May 2022 22:30:38 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="65abca4d26331c3457a4e36355643824";
logging-data="26790"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Pmu3cccMGCiTRdNYpj8oy"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.9.0
Cancel-Lock: sha1:hTdCZxIXcXFzCWoVOvCbwPyDpX4=
In-Reply-To: <6f006c4d-143b-424d-8aa5-d15e86dc0496n@googlegroups.com>
Content-Language: en-US
 by: BGB - Tue, 17 May 2022 22:30 UTC

On 5/17/2022 4:17 PM, MitchAlsup wrote:
> On Tuesday, May 17, 2022 at 3:39:52 PM UTC-5, BGB wrote:
>> On 5/17/2022 2:46 PM, MitchAlsup wrote:
>>> On Tuesday, May 17, 2022 at 1:23:33 PM UTC-5, BGB wrote:
>>>> On 5/16/2022 7:54 AM, EricP wrote:
>>>>> Anton Ertl wrote:
>>>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>>>> Stephen Fuld <sf...@alumni.cmu.edu.invalid> schrieb:
>>>>>>>
>>>>>>>> So, I think a good solution is for the compiler to essentially never
>>>>>>>> decide to issue CMOV as a replacement for a conditional branch for
>>>>>>>> architectures with good branch predictors (not saying anything here
>>>>>>>> about predication.).
>>>>>>> Binary search was also recently quoted here as an example where
>>>>>>> CMOV makes a lot of sense
>>>>>>
>>>>>> As is often the case, this depends very much on the inputs.
>>>>>>
>>>>>> And then there's the question of why one uses binary search in the
>>>>>> first place. The only time I remember using it is when searching for
>>>>>> the try-catch block of an exception in a JavaVM implementation. In
>>>>>> that case it is likely that the path through the binary search can be
>>>>>> predicted from the branch history before the throw.
>>>>>
>>>>> Binary search is one of the decompositions for SWITCH statements
>>>>> with discontiguous blocks.
>>>>>
>>>>
>>>> Agreed.
>>>>
>>>> Before I added jump-tables to BGBCC, binary search was the primary form
>>>> of "switch()".
>>>>
>>>>
>>>> Say:
>>>> 1..5 case labels:
>>>> Use if-Else
>>>> 6+: Divide space in half, recurse on left and right halves.
>>>> Compare-and-branch to right half.
>>>>
>>>> But, with jump tables:
>>>> 1..5 case labels:
>>>> Use if-Else
>>>> 6..511 case labels, density>=50%:
>>>> Use jump table.
>>>> Else: Divide space in half, recurse on left and right halves.
>>>> Compare-and-branch to right half.
>>> <
>>> My 66000 has tabularized jumps directly (and PIC, too)
>>> For a short switch clause (less than 256 instructions in the switch clause)
>>> The tabularized jump is ALWAYS smaller than a series of if-elses.
>>> <
>>> So, the instruction looks like::
>>> <
>>> JTT<size> Rcase, #Bounds
>>> <
>>> if( Rcase < 0 or Rcase > Brounds ) IP = IP + 4 + (Bounds<<2)
>>> else IP = IP + memory.unsigned.size[ IP + 4 + Rcase << size ] << 2;
>>> <
>>> some of the time these offsets are bytes so each entry in the table is ¼
>>> the size of a branch or ⅛th the label of an address. But almost all switch
>>> statements are encoded as 16-bit displacements (unless they fit a byte).
>>> <
>>> The table in in code space, is fetched through the instruction cache,
>>> needs no intermediate register, and is position independent.
>>> <
>>> Typical code sequences look like:
>>> <
>>> JTT Rt,#bounds
>>> .byte label0-.,label1-.,label2-.,default-.
>>> ........ // until all the labels have been enumerated
>>> label0: // case[0]
>>> BR end_switch
>>> label1: // case[1]
>>> BR end_switch
>>> ...
>>> default: // default clause
>>> ...
>>> end_switch:
>>> // normal code from here on out.
>> OK.
>>
>>
>> In my case, it would be something like:
>> SUB Ri, Base, Rt
>> CMPHI Limit, Rt //unsigned Rt>Limit
>> BT .default
>> ADD Rt, Rt //(assuming a table of 32-bit branch ops)
>> BRA (PC, Rt) //AKA: "BRAF Rt"
>> ... table of branch ops ...
>>
>> Would take around 16 clock cycles.
> <
> Which is 12 too many........

Just the "BRA (PC, Rt)" by itself takes 8 cycles, since it effectively
flushes the pipeline...

>>
>>
>>
>> But, for 5 or less, eg:
>> CMPEQ Val1, Ri
>> BT .lbl1
>> CMPEQ Val2, Ri
>> BT .lbl2
>> CMPEQ Val3, Ri
>> BT .lbl3
>> CMPEQ Val4, Ri
>> BT .lbl4
>> CMPEQ Val5, Ri
>> BT .lbl5
>> BRA .default
>>
>> Will (on average) get to the destination in less than 16 clock-cycles
>> (or less for shorter sequences).
> <
> Mine takes 4 cycles uniformly.

The variability in this case is due to which branch it goes down and
whether it hits or misses.

My emulator doesn't really model the branch predictor though, mostly
because it doesn't contribute much to the overall value, so works with a
few simplifying assumptions:
BRA/BSR (direct): 2c
BT/BF (direct): 3c (~ average case)
Actual:
1c: Not Taken, Correctly predicted
2c: Taken, Correctly predicted
8c: Mispredicted
RTS: 2c | 8c (if LR was modified recently)
Indirect: 8c | 2c (if R1 and R1 was not modified recently)

There are also Jumbo-Abs48 branches, but these will cost 8 cycles mostly
because the branch predictor ignores them (not really used often enough
to justify the cost of handling them).

The cost of RTS and Indirect branches (via R1) also depends some on
whether or not it would be an Inter-ISA branch, as this case will always
require a pipeline flush.

>>
>>
>> For the binary-tree case, the binary branches will tend to have a higher
>> mispredict rate than the linear-search branches (so a cut-off of 5 makes
>> more sense than 3 or similar).
> <
> Note: I was only talking about switches in general and not participating in the
> search portions of this thread.
>>
>>
>> Say:
>> CMPGE Val6, Ri
>> BT .sub1
>>
>> CMPEQ Val1, Ri
>> BT .lbl1
>> CMPEQ Val2, Ri
>> BT .lbl2
>> CMPEQ Val3, Ri
>> BT .lbl3
>> CMPEQ Val4, Ri
>> BT .lbl4
>> CMPEQ Val5, Ri
>> BT .lbl5
>> BRA .default
>>
>> .sub1:
>> CMPEQ Val6, Ri
>> BT .lbl6
>> CMPEQ Val7, Ri
>> BT .lbl7
>> CMPEQ Val8, Ri
>> BT .lbl8
>> CMPEQ Val9, Ri
>> BT .lbl9
>> BRA .default
>>
>>
>> Makes sense if the value space is sparse.
> <
> By encoding labels as word displacements off the IP, one can allow
> much greater sparseness without impacting code size or cycles.

Possibly.

For typical functions, one could use a table of 16-bit displacements and
or similar.

However, this requires the compiler doing a word "dst=dst+(lbl2-lbl1)"
reloc, and would likely end up being on-average slower than the
double-jump method (if, albeit, slightly more compact).

It is possible to use 16-bit branch ops as well, but this would only be
applicable to small functions (which typically don't contain a switch
large enough to justify using a jump table).

>>>>
>>>>
>>>> This method deals acceptably with things like switch'ing based on a
>>>> bunch of FOURCC values or similar.
>>>>
>>>>
>>>> Generally, switch needs to gather up a list of all the case labels, sort
>>>> them, and then build the dispatch structure, followed by compiling the
>>>> switch body.

Re: Ill-advised use of CMOVE

<13a8920f-dee2-44bf-a0e7-3d2553eda186n@googlegroups.com>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25412&group=comp.arch#25412

 copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:622a:1210:b0:2f3:da26:3778 with SMTP id y16-20020a05622a121000b002f3da263778mr21949899qtx.173.1652827723872;
Tue, 17 May 2022 15:48:43 -0700 (PDT)
X-Received: by 2002:a9d:5888:0:b0:606:10d2:2fc1 with SMTP id
x8-20020a9d5888000000b0060610d22fc1mr9026620otg.29.1652827723599; Tue, 17 May
2022 15:48:43 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 17 May 2022 15:48:43 -0700 (PDT)
In-Reply-To: <t617me$q56$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:b906:a655:bfcf:bf2f;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:b906:a655:bfcf:bf2f
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org> <t58mug$ref$2@newsreader4.netcologne.de>
<t5rbct$rbe$1@dont-email.me> <t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <OQrgK.3334$6y5f.1197@fx40.iad>
<t60p72$el5$1@dont-email.me> <a2fe9cd2-cacf-4c4e-ac89-d68dcbb4bb4dn@googlegroups.com>
<t6116l$cgc$1@dont-email.me> <6f006c4d-143b-424d-8aa5-d15e86dc0496n@googlegroups.com>
<t617me$q56$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <13a8920f-dee2-44bf-a0e7-3d2553eda186n@googlegroups.com>
Subject: Re: Ill-advised use of CMOVE
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 17 May 2022 22:48:43 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 3189
 by: MitchAlsup - Tue, 17 May 2022 22:48 UTC

On Tuesday, May 17, 2022 at 5:30:41 PM UTC-5, BGB wrote:
> On 5/17/2022 4:17 PM, MitchAlsup wrote:

> >>
> >> Makes sense if the value space is sparse.
> > <
> > By encoding labels as word displacements off the IP, one can allow
> > much greater sparseness without impacting code size or cycles.
> Possibly.
>
> For typical functions, one could use a table of 16-bit displacements and
> or similar.
>
> However, this requires the compiler doing a word "dst=dst+(lbl2-lbl1)"
> reloc, and would likely end up being on-average slower than the
> double-jump method (if, albeit, slightly more compact).
<
My 66000 implementations avoid the LD-Align stage as the instruction
buffer can absorb cache-access-width every cycle, avoids forwarding
since it knows the displacement will be added to IP (twice), so doing
this stuff in FETCH greatly simplifies and thins down the cycle count.
<
Given that the table immediately follows the JTT instruction, the
displacements which follow have already been consumed by the IB
and are ready for select-out when Rt resolves (small and medium tables).
>
>
> It is possible to use 16-bit branch ops as well, but this would only be
> applicable to small functions (which typically don't contain a switch
> large enough to justify using a jump table).
> >>>>
> >>>>
> >>>> This method deals acceptably with things like switch'ing based on a
> >>>> bunch of FOURCC values or similar.
> >>>>
> >>>>
> >>>> Generally, switch needs to gather up a list of all the case labels, sort
> >>>> them, and then build the dispatch structure, followed by compiling the
> >>>> switch body.

Re: Ill-advised use of CMOVE

<t61alp$c7p$1@dont-email.me>

 copy mid

https://www.novabbs.com/devel/article-flat.php?id=25413&group=comp.arch#25413

 copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Ill-advised use of CMOVE
Date: Tue, 17 May 2022 16:21:28 -0700
Organization: A noiseless patient Spider
Lines: 33
Message-ID: <t61alp$c7p$1@dont-email.me>
References: <jwv1qx4xe9z.fsf-monnier+comp.arch@gnu.org>
<t58mug$ref$2@newsreader4.netcologne.de> <t5rbct$rbe$1@dont-email.me>
<t5rg76$hp1$1@newsreader4.netcologne.de>
<2022May16.080143@mips.complang.tuwien.ac.at> <OQrgK.3334$6y5f.1197@fx40.iad>
<t60p72$el5$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 17 May 2022 23:21:29 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="368e5f2b90e3d951fb5f2b7df8e7b0e2";
logging-data="12537"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/J2PK/ibeXgLOljyoHUb7a"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.9.0
Cancel-Lock: sha1:T85BAcxEiThjSQPCgBLDcefD5dY=
In-Reply-To: <t60p72$el5$1@dont-email.me>
Content-Language: en-US
 by: Ivan Godard - Tue, 17 May 2022 23:21 UTC

On 5/17/2022 11:23 AM, BGB wrote:
> On 5/16/2022 7:54 AM, EricP wrote:
>> Anton Ertl wrote:
>>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>>> Stephen Fuld <sfuld@alumni.cmu.edu.invalid> schrieb:
>>>>
>>>>> So, I think a good solution is for the compiler to essentially
>>>>> never decide to issue CMOV as a replacement for a conditional
>>>>> branch for architectures with good branch predictors (not saying
>>>>> anything here about predication.).
>>>> Binary search was also recently quoted here as an example where
>>>> CMOV makes a lot of sense
>>>
>>> As is often the case, this depends very much on the inputs.
>>>
>>> And then there's the question of why one uses binary search in the
>>> first place.  The only time I remember using it is when searching for
>>> the try-catch block of an exception in a JavaVM implementation.  In
>>> that case it is likely that the path through the binary search can be
>>> predicted from the branch history before the throw.
>>
>> Binary search is one of the decompositions for SWITCH statements
>> with discontiguous blocks.
>>
>
> Agreed.
>
> Before I added jump-tables to BGBCC, binary search was the primary form
> of "switch()".

Still is on Mill, but then Mill uses exit prediction, not branch
prediction, and so has many fewer predictions to make - and misses to
pay for - than BGBCC. Sort of a static version of trace caches.

Pages:123456
server_pubkey.txt

rocksolid light 0.9.7
clearnet tor