Re: Compact representation for common integer constants

<s8522o$tol$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16953&group=comp.arch#16953

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 09:11:20 +0200
Organization: A noiseless patient Spider
Lines: 54
Message-ID: <s8522o$tol$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<s83h0k$jaj$1@newsreader4.netcologne.de>
<2021May19.214832@mips.complang.tuwien.ac.at> <s8414d$okc$1@dont-email.me>
<2021May20.001137@mips.complang.tuwien.ac.at> <s846ou$s25$1@dont-email.me>
<35c1d9c9-d12a-4812-bcac-07979e0fcaccn@googlegroups.com>
<s84b0a$lht$1@dont-email.me>
<5c8aa7d4-ae7d-465e-bcb4-230ee64b2bfbn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 20 May 2021 07:11:20 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="0f4709d2d2405951cc9d574e4d2a4ec7";
logging-data="30485"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+xGROOBNOGQZNsehXI5M5vxSblDMvrANQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
Thunderbird/68.10.0
Cancel-Lock: sha1:IUC6B/0IyfOCA4qKxglrvKf7R9s=
In-Reply-To: <5c8aa7d4-ae7d-465e-bcb4-230ee64b2bfbn@googlegroups.com>
Content-Language: en-GB
 by: David Brown - Thu, 20 May 2021 07:11 UTC

On 20/05/2021 04:12, MitchAlsup wrote:
> On Wednesday, May 19, 2021 at 7:37:33 PM UTC-5, David Brown wrote:
>> On 20/05/2021 01:54, MitchAlsup wrote:
>>> On Wednesday, May 19, 2021 at 6:25:20 PM UTC-5, David Brown wrote:
>>>> On 20/05/2021 00:11, Anton Ertl wrote:
>>>>> David Brown <david...@hesbynett.no> writes:
>>
>>>>
>>>> Perhaps it should also give the suggestion "you should be using unsigned
>>>> types for this kind of thing". As a general rule, anything involving
>>>> bit patterns should be unsigned types - signed types are for abstract
>>>> numbers only.
>>> <
>>> All integral containers that do not explicitly need to contain negative
>>> numbers should be unsigned.
>> That also works.
>>
>> But if you are saying this because overflow is defined for unsigned
>> types in C, then I would disagree with the point - as overflow is
>> generally (but not always) a mistake, whether signed or unsigned, and
>> should not occur in normal code.
> <
> I am saying this because unsigned arithmetic is much better defined
> than signed arithmetic in C. The surprise coefficient is way lower.
> <

Yes, that's what I thought. I disagree. When I use types for
arithmetic (outside of occasional very special cases, and bugs in my
code), the arithmetic does not overflow. Overflows are errors. It
doesn't matter what the language says - if I have 4294967295 apples in a
pile, put another apple on, and you tell me I now have 0 apples, it is
all nonsense.

So IMHO correctly written arithmetic code does not overflow. No
overflow, no surprises.

It's fine to use unsigned types for data that is naturally non-negative.
It makes sense to the reader. And sometimes the extra bit of range is
useful.

But if you think unsigned types are better for general purpose numbers
because they are "better defined", you are using the wrong types for the
arithmetic you are doing, or you have failed to check your values before
using them.
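
A minimal sketch of what "check your values before using them" can look like for addition, assuming plain "int" operands. The name checked_add is hypothetical and does not come from the thread; the point is only that the test happens before the addition, so the addition itself never overflows:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true only when the
   mathematical sum fits in an int. The checks run before the addition,
   so no signed overflow ever occurs. */
bool checked_add(int a, int b, int *out)
{
    if (b > 0 && a > INT_MAX - b) return false;  /* sum would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b) return false;  /* sum would go below INT_MIN */
    *out = a + b;
    return true;
}
```

A caller would treat a false return as the error case rather than as a value to carry on with.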

I lost my appetite decades ago for smart-arse code that relies on
overflows to see the end of a loop and that kind of thing. I write my
code to make sense - it says what it does, and does what it says,
instead of playing "bit twiddling tricks". Usually the result is not
only simpler and clearer, safer and more portable, but is at least as
efficient. In the extremely few cases where you need to do something
odd to squeeze the last clock cycle out of critical code, then you put
in the extra "unsigned" casts and other ugly details.

Re: Compact representation for common integer constants

<s855l9$7fi$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16956&group=comp.arch#16956

Path: i2pn2.org!i2pn.org!aioe.org!/FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 10:12:25 +0200
Organization: Aioe.org NNTP Server
Lines: 138
Message-ID: <s855l9$7fi$1@gioia.aioe.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s7qlup$dm0$1@newsreader4.netcologne.de>
<e29a79f4-80ba-4dcb-8079-cf2f87a86b3en@googlegroups.com>
<s7svtl$qlt$1@newsreader4.netcologne.de>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<d5edb86e-0166-4c44-83eb-19e1cd8eb2c4n@googlegroups.com>
<nqapI.385883$2A5.264183@fx45.iad> <s83eir$irf$1@newsreader4.netcologne.de>
<U2cpI.62291$od.35116@fx15.iad>
NNTP-Posting-Host: /FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Thu, 20 May 2021 08:12 UTC

EricP wrote:
> Thomas Koenig wrote:
>> EricP <ThatWouldBeTelling@thevillage.com> schrieb:
>>> MitchAlsup wrote:
>>>> On Wednesday, May 19, 2021 at 7:23:04 AM UTC-5, Anton Ertl wrote:
>>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>>> Anton Ertl <an...@mips.complang.tuwien.ac.at> schrieb:
>>>>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>>>>> Unsigned loops can cause a headache, because it is often not
>>>>>>>> possible to prove they actually terminate...
>>>>>>> And the difference to signed loops is? Note that you wrote
>>>>>>> "actually", so pretending that signed overflow does not happen
>>>>>>> does not count.
>>>>>> According to many language definitions, it does indeed not count.
>>>>> This sentence makes no sense.
>>>> <
>>>> How about this::
>>>> <
>>>> Languages, that want to be ported across multiple architectures, and
>>>> have high performing implementations, cannot specify that signed
>>>> integers overflow.
>>>> <
>>> The language spec should:
>>> - allow the programmer to choose between bounds-range checked,
>>>    unchecked, modulo, or saturating data types
>>
>> That is not going to fly for an established, general-purpose
>> language.  Languages have traditionally tried to stay away from
>> this kind of thing because of the very many different implementations
>> there are.
>
> I didn't say an established language.
> I have no illusions that C is fixable.
>
>> Fortran is a bit of an exception in that it supports IEEE floating
>> point via intrinsic modules.
>>
>> And face it - a numeric type that is not supported by hardware is
>> going to suck rocks through a straw, performance-wise.  Think about
>> the floating-point performance in the Intel days before the numeric
>> coprocessors.
>
> Checked or saturating integer operations are not remotely
> comparable to floating point.
>
> For example, on x64 signed and unsigned overflow check
> are a "jo" or "jc" instruction.
>
> Saturating unsigned and signed add on x64:
>
> // rsi+rdi->rax
> sat_addu64b:
>     add    %rsi, %rdi
>     sbb    %rax, %rax
>     or     %rdi, %rax

That is the easy & sane way to handle unsigned, which only have to
saturate in one direction.
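
For readers who prefer C to assembly, a rough equivalent of the add/sbb/or sequence might look like the sketch below. The function name sat_addu64 is an assumption made for illustration; the carry is recovered by comparing the wrapped sum against one operand, which is what the sbb does with the carry flag:

```c
#include <stdint.h>

/* Unsigned saturating add: clamps to UINT64_MAX instead of wrapping. */
uint64_t sat_addu64(uint64_t a, uint64_t b)
{
    uint64_t sum = a + b;                    /* wraps modulo 2^64, well defined */
    uint64_t carry = -(uint64_t)(sum < a);   /* all ones if the add carried out */
    return sum | carry;                      /* all ones on carry, else the sum */
}
```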

>
> long_max:
>     .quad 0x7fffffffffffffff
> long_min:
>     .quad 0x8000000000000000
>
> // rsi+rdi->rax
> sat_adds64b:
>     mov    %rdi, %rax
>     shr    $0x3f, %rdi
>     add    long_max(%rip), %rdi
>     add    %rsi, %rax
>     cmovo  %rdi, %rax

This is effectively

int64_t adds(int64_t a, int64_t b)
{ uint64_t bsign = (uint64_t) b >> 63; // logical shift, like shr: 0 or 1
int64_t saturate = (int64_t) (bsign + 0x7fffffffffffffff); // 0x80.. or 0x7f..
int64_t sum = (int64_t) ((uint64_t) a + (uint64_t) b); // wraps like the HW add
// overflow happens when sign(a) == sign(b) and sign(sum) != sign(a)
// overflow = ~(a ^ b) & (a ^ sum)
int64_t overflow = (~(a ^ b) & (a ^ sum)) >> 63; // OF mask: 0 or -1 (arithmetic shift)
sum = (saturate & overflow) | (sum & ~overflow);
return sum;
}

The compiler/HW can schedule this as

first cycle:
bsign = b >> 63;
sum = a + b;
t_a_xor_b = a ^ b;

second cycle:
saturate = bsign + 0x7fffffffffffffff;
t_a_xor_sum = a ^ sum;
t_a_eq_b = ~t_a_xor_b;

third cycle:
overflow = t_a_xor_sum & t_a_eq_b;

fourth cycle:
overflow >>= 63;

fifth cycle:
saturate &= overflow;
not_overflow = ~overflow;

sixth cycle:
sum &= not_overflow;

seventh cycle:
sum |= saturate;

If the compiler knows about CMOV or similar then we would be better off
writing it as

int64_t adds(int64_t a, int64_t b)
{ uint64_t bsign = (uint64_t) b >> 63; // logical shift, like shr: 0 or 1
int64_t saturate = (int64_t) (bsign + 0x7fffffffffffffff); // 0x80.. or 0x7f..
int64_t sum = (int64_t) ((uint64_t) a + (uint64_t) b); // wraps like the HW add
// overflow happens when sign(a) == sign(b) and sign(sum) != sign(a)
// overflow = ~(a ^ b) & (a ^ sum)
int64_t overflow = (~(a ^ b) & (a ^ sum)); // sign bit set on overflow
return (overflow < 0) ? saturate : sum;
}

This would save ~3 of those 7 cycles, at which point performance isn't
really that bad.
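
As a point of comparison, GCC and Clang provide a __builtin_add_overflow builtin that exposes the overflow flag directly, which lets the compiler pick the add/cmovo style sequence itself. The sketch below assumes one of those compilers; adds_builtin is a hypothetical name and the code is not portable ISO C:

```c
#include <stdint.h>

/* Signed saturating add using the GCC/Clang overflow builtin (assumed
   available; not part of ISO C). */
int64_t adds_builtin(int64_t a, int64_t b)
{
    int64_t sum;
    if (__builtin_add_overflow(a, b, &sum)) {
        /* A signed add can only overflow when both operands have the
           same sign, so the sign of b selects the saturation value. */
        return (b < 0) ? INT64_MIN : INT64_MAX;
    }
    return sum;
}
```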

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<s855nt$7fi$2@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16957&group=comp.arch#16957

Path: i2pn2.org!i2pn.org!aioe.org!/FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 10:13:49 +0200
Organization: Aioe.org NNTP Server
Lines: 17
Message-ID: <s855nt$7fi$2@gioia.aioe.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s7a8u7$mui$1@dont-email.me>
<9f36daff-8b8f-4550-80ad-2f75dd98f319n@googlegroups.com>
<s7bo65$9gq$1@dont-email.me> <s7h9kc$q16$1@dont-email.me>
<1b0596d8-5d9a-4a33-a4e3-ff9d34fd0fc2n@googlegroups.com>
<s7qlup$dm0$1@newsreader4.netcologne.de>
<e29a79f4-80ba-4dcb-8079-cf2f87a86b3en@googlegroups.com>
<s7svtl$qlt$1@newsreader4.netcologne.de>
<2021May17.144318@mips.complang.tuwien.ac.at> <s80sg0$6ot$1@dont-email.me>
<2021May19.001113@mips.complang.tuwien.ac.at>
<s83kic$lqi$1@newsreader4.netcologne.de>
NNTP-Posting-Host: /FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Thu, 20 May 2021 08:13 UTC

Thomas Koenig wrote:
> Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
>
>> The major downside is that the world has adapted to the unwise
>> decision to go I32LP64 for ~30 years, so any change now incurs
>> significant switching cost.
>
> You don't care about cache or memory bandwidth, then?
>
We all care about that, which is why all 64-bit cpus have to make sure
that 32-bit ops are at least as fast as the register-sized 64-bit ones.
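
To put a rough number on the cache argument, the sketch below counts how many cache lines an array occupies for a given element size. The 64-byte line size and the name lines_needed are assumptions for illustration only; real line sizes vary by CPU:

```c
#include <stddef.h>
#include <stdint.h>

enum { CACHE_LINE_BYTES = 64 };  /* assumed line size, not universal */

/* Number of cache lines touched by count contiguous, line-aligned
   elements of elem_size bytes each. */
size_t lines_needed(size_t count, size_t elem_size)
{
    return (count * elem_size + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Under that assumption, 1024 values take 64 lines as int32_t and 128 lines as int64_t, so the wider type doubles the memory traffic for the same data.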

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<s85993$1tao$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16959&group=comp.arch#16959

Path: i2pn2.org!i2pn.org!aioe.org!/FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 11:14:11 +0200
Organization: Aioe.org NNTP Server
Lines: 97
Message-ID: <s85993$1tao$1@gioia.aioe.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<e29a79f4-80ba-4dcb-8079-cf2f87a86b3en@googlegroups.com>
<s7svtl$qlt$1@newsreader4.netcologne.de>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<s83h0k$jaj$1@newsreader4.netcologne.de>
<2021May19.214832@mips.complang.tuwien.ac.at> <s8414d$okc$1@dont-email.me>
NNTP-Posting-Host: /FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Thu, 20 May 2021 09:14 UTC

David Brown wrote:
> On 19/05/2021 21:48, Anton Ertl wrote:
>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>> Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
>>>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>>>> Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
>>>>>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>>>>>> Unsigned loops can cause a headache, because it is often
>>>>>>> not possible to prove they actually terminate...
>>>>>>
>>>>>> And the difference to signed loops is? Note that you wrote
>>>>>> "actually", so pretending that signed overflow does not happen does
>>>>>> not count.
>>>>>
>>>>> According to many language definitions, it does indeed not
>>>>> count.
>>>>
>>>> This sentence makes no sense.
>>>>
>>>>> I could quote you chapter and verse from the Fortran standard, but
>>>>> I hope you'll take my word for it.
>>>>
>>>> What do "many language definitions" and "the Fortran standard" have to
>>>> do with proving what *actually* happens?
>>>
>>> I am not sure that I follow.
>>>
>>> Language standards usually describe the behavior of a program at an
>>> abstract level. They specify rules for programs. Compilers ( or
>>> interpreters, or... ) may or may not be required to catch violations
>>> of said rules, depending on the specific rule.
>>
>> And if they don't, the language standard does not say what *actually*
>> happens. And therefore a standard for a language like C and Fortran
>> that includes undefined behaviour cannot be used to prove what
>> *actually* happens in a program.
>>
>>> Let's look at
>>>
>>> void foo (int *a, int from, int to)
>>> {
>>> for (i=from; i<to; i++)
>>> a[i] = 0;
>>> }
>>>
>>> Should the compiler be allowed to conclude that this loop
>>> terminates? The write to a[i] could clobber the memory location of
>>> i, and this could go into an endless loop. At higher optimization
>>> level, i could be kept in a register, and the loop could terminate.
>>
>> Which does not support your claim that you can prove whether a program
>> actually terminates by looking at the language standard. It also does
>> not support your claim that proving termination is harder for unsigned
>> numbers.
>>
>>>> but my experience is that gcc tends towards "optimizing"
>>>> intended-to-be-bounded loops into endless loops, not the reverse.
>>>
>>> Do you have an example for that?
>
> Do you have an example of sensible code where this happens, rather than
> something that would be rejected by any code review as clearly wrong?
>
> "int" in C models mathematical integers, with size limitations. Any
> mathematician will tell you "i < i + 1" is obviously true, regardless of
> the value of "i".

I actually agree with this, even though it hurts my asm-encoded heart.
What is really missing is an auto form of MININT/MAXINT so that you
could check for that, even though it makes the code more complicated:

for (auto i = from; ; i++) {
...
if (i == (auto) MAXINT) break;
}
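
In today's C the same shape can be written for a fixed type by testing the bound before the increment. This is only a sketch for plain "int"; sum_inclusive is a hypothetical name and the summing body stands in for whatever the loop really does:

```c
#include <limits.h>

/* Sums every i in the inclusive range [from, to]; to may be INT_MAX.
   The exit test runs before i++, so the increment is never executed
   at the upper bound and i cannot overflow. */
long long sum_inclusive(int from, int to)
{
    long long total = 0;
    if (from > to) return 0;      /* empty range */
    for (int i = from; ; i++) {
        total += i;
        if (i == to) break;
    }
    return total;
}
```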

>
> I really cannot see the problem with gcc's code here. But I would
> greatly prefer a warning about the code - not even "-Wstrict-overflow=5"
> gives a warning, which I find disappointing.

This is IMHO the real issue: There should be no way for the "all UB
allows anything, so optimize it away" reasoning to silently do this
with no warnings issued.

I've read all the arguments about how heavily templated code depends on
the compiler identifying and removing huge parts of the intermediate
code in order to perform well, and I somewhat sympathize with the
argument, but here warnings should still be generated and require an
explicit #define/#pragma/etc to suppress, only to be used on very well
tested library code.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<s859t7$6vr$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16960&group=comp.arch#16960

Path: i2pn2.org!i2pn.org!aioe.org!/FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 11:24:55 +0200
Organization: Aioe.org NNTP Server
Lines: 105
Message-ID: <s859t7$6vr$1@gioia.aioe.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<s83h0k$jaj$1@newsreader4.netcologne.de>
<2021May19.214832@mips.complang.tuwien.ac.at> <s8414d$okc$1@dont-email.me>
<2021May20.001137@mips.complang.tuwien.ac.at>
NNTP-Posting-Host: /FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Thu, 20 May 2021 09:24 UTC

Anton Ertl wrote:
> David Brown <david.brown@hesbynett.no> writes:
> ["optimize" bounded loop into an endless loop]
>> Do you have an example of sensible code where this happens, rather than
>> something that would be rejected by any code review as clearly wrong?
>
> The classical example is the SATD function, which was "optimized" into
> an endless loop by a prerelease of gcc-4.8; but the final version and
> later versions of gcc do not perform this "optimization", a fact that
> supposedly has nothing to do with SATD being in SPEC.
>
> int d[16];
>
> int SATD (void)
> {
>
> int satd = 0, dd, k;
> for (dd=d[k=0]; k<16; dd=d[++k]) {
>
> satd += (dd < 0 ? -dd : dd);
> }
> return satd;
> }

That code does generate an access to d[16] before it realizes that k=16
stops the loop, so they really should have at least generated a warning.

Generally, I much prefer warnings to eliminating code based on UB.
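
For reference, one way to write the loop so that d[16] is never read is to load the element inside the body, after the bound check. This is a sketch, not the SPEC source; the name SATD_fixed is made up here:

```c
int d[16];

/* Sum of absolute values of d[0..15]. The element is loaded only after
   k < 16 has been checked, so there is no access to d[16].
   Assumes no element is INT_MIN, since -INT_MIN would overflow. */
int SATD_fixed(void)
{
    int satd = 0;
    for (int k = 0; k < 16; k++) {
        int dd = d[k];
        satd += (dd < 0 ? -dd : dd);
    }
    return satd;
}
```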
>
> Another example is a loop for filling an array with bit patterns,
> something similar to
>
> int a[16];
>
> void bar()
> {
> int i;
> int *p=a;
> for (i=0; i!=-1; i+=0x11111111)
> *p++=i;
> }

The programmer probably intended to write "int32_t i" instead of "int
i", since a 64-bit int would make for a fairly short wait before you got
a runtime out of bounds error.

Fixing it the way Mitch & I prefer is easy:

int32_t a[16];

void bar()
{ uint32_t i;
int32_t *p=a;
for (i=0; i!=(uint32_t) -1; i+=0x11111111)
*p++= (int32_t) i;
}

Although I do suspect the programmer intended to fill all 16 slots and not
stop after 15?
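
If all 16 slots were indeed the intent, counting the slots and letting the pattern be a separate unsigned value avoids both the early stop and any signed overflow. A sketch under that assumption, with the hypothetical name bar_all16:

```c
#include <stdint.h>

int32_t a[16];

/* Fills a[0..15] with 0x00000000, 0x11111111, ..., 0xffffffff.
   The loop is bounded by the slot count, and the pattern is kept in an
   unsigned variable, where wraparound is well defined. */
void bar_all16(void)
{
    uint32_t pattern = 0;
    for (int n = 0; n < 16; n++) {
        a[n] = (int32_t) pattern;   /* conversion is implementation-defined
                                       above INT32_MAX; wraps on gcc */
        pattern += 0x11111111u;
    }
}
```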

>
> gcc-4.9.2 -Os produces a warning:
>
> xxx.c: In function ‘bar’:
> xxx.c:7:21: warning: iteration 7u invokes undefined behavior [-Waggressive-loop-optimizations]
> for (i=0; i!=-1; i+=0x11111111)
> ^
> xxx.c:7:3: note: containing loop
> for (i=0; i!=-1; i+=0x11111111)
> ^
>
> and the following "optimized" endless loop:
>
> .L4:
> addq $4, %rdx
> movl %eax, -4(%rdx)
> addl $286331153, %eax
> jmp .L4
>
> Well, at least it warns.
>
>> "int" in C models mathematical integers, with size limitations. Any
>> mathematician will tell you "i < i + 1" is obviously true, regardless of
>> the value of "i".
>
> Any mathematician will tell you that mathematical integers have no
> size limitations. And that the fact that for mathematical integers
> i<i+1 holds regardless of the value of i depends on that property.
>
> As soon as you have some limit, it's no longer the mathematical
> integers, and i<i+1 does not hold for all of them; in particular, it
> does not hold for i=int_max. And that's independent of whether you
> use 2s-complement, 1s-complement, or sign-magnitude representation for
> negative integers, and whether you handle overflow by modulo
> arithmetic, by saturation, by trapping, or anything else.

Right.
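
The point is easy to demonstrate with unsigned arithmetic, where the wraparound is defined, and to guard against for signed values. Both function names below are hypothetical and serve only as a sketch:

```c
#include <limits.h>
#include <stdbool.h>

/* True when i + 1 is representable as an int, i.e. when "i < i + 1"
   can be evaluated at all without overflow. */
bool has_successor(int i)
{
    return i < INT_MAX;
}

/* With unsigned wraparound, UINT_MAX + 1 is 0, so the comparison is
   false for exactly one value of i. */
bool unsigned_less_than_next(unsigned i)
{
    return i < i + 1u;
}
```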

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<s85c6j$v26$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16961&group=comp.arch#16961

From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 12:04:03 +0200
 by: David Brown - Thu, 20 May 2021 10:04 UTC

On 20/05/2021 11:24, Terje Mathisen wrote:
>
> Generally, I much prefer warnings to eliminating code based on UB.

I prefer warnings from the compiler (or other tools, or code reviews) so
that I can eliminate the UB from the code.

Re: Compact representation for common integer constants

<s85flm$njk$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16963&group=comp.arch#16963

From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 13:03:18 +0200
 by: David Brown - Thu, 20 May 2021 11:03 UTC

On 20/05/2021 11:14, Terje Mathisen wrote:
> David Brown wrote:

>> Do you have an example of sensible code where this happens, rather than
>> something that would be rejected by any code review as clearly wrong?
>>
>> "int" in C models mathematical integers, with size limitations.  Any
>> mathematician will tell you "i < i + 1" is obviously true, regardless of
>> the value of "i".
>
> I actually agree with this, even though it hurts my asm-encoded heart.

Assembly experience helps you write efficient C code - but thinking of C
in terms of assembly is rarely helpful. It leads people to make
unwarranted assumptions (such as the behaviour of signed overflows), or
to write code that is in a convoluted and unclear "assembly" style when
a simpler and more obvious style can let the compiler generate better code.
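
One concrete case of such an assumption, as a sketch: an assembly-minded overflow test like "a + b < a" already performs the signed overflow it is trying to detect, so the compiler is allowed to assume it never fires. Comparing against the limits before adding stays within defined behaviour. The helper name add_would_overflow is hypothetical.

```c
#include <limits.h>

/* Returns 1 if a + b would overflow int, 0 otherwise.
   No addition is performed, so no overflow can occur here. */
int add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* INT_MAX - b cannot overflow when b > 0 */
    if (b < 0)
        return a < INT_MIN - b;   /* INT_MIN - b cannot overflow when b < 0 */
    return 0;
}
```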

> What is really missing is an auto form of MININT/MAXINT so that you
> could check for that, even though it makes the code more complicated:
>
>   for (auto i = from; ; i++) {
>     ...
>     if (i == (auto) MAXINT) break;
>   }

"auto" in this manner is C++, not C (unless you count gcc's
"__auto_type" extension).

And if you have C++, you have:

std::numeric_limits<decltype(i)>::max()

Ugly, but it works. (If <limits> were redesigned in current C++, it
would probably use template constexpr variables for these range values,
which would be marginally less ugly.)

I have always thought that if C++ were to copy anything from Ada,
attributes would be a good start - "i'max" is definitely a better syntax
here.

If you want to stick to C, you could make a _Generic macro that returned
the max value of the type of its operand.

So it is all possible, but not part of the standard library.
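
A sketch of what such a _Generic macro could look like in C11, covering only the standard integer types. MAX_OF and count_up_to_max are hypothetical names; nothing like them exists in the standard library.

```c
#include <limits.h>

/* Hypothetical macro: yields the maximum value of the type of x. */
#define MAX_OF(x) _Generic((x),            \
    signed char:        SCHAR_MAX,         \
    unsigned char:      UCHAR_MAX,         \
    short:              SHRT_MAX,          \
    unsigned short:     USHRT_MAX,         \
    int:                INT_MAX,           \
    unsigned int:       UINT_MAX,          \
    long:               LONG_MAX,          \
    unsigned long:      ULONG_MAX,         \
    long long:          LLONG_MAX,         \
    unsigned long long: ULLONG_MAX)

/* Example use: visit every value from 'from' up to and including the
   maximum of the type, without ever incrementing past it. */
unsigned long count_up_to_max(unsigned char from)
{
    unsigned long n = 0;
    for (unsigned char i = from; ; i++) {
        n++;
        if (i == MAX_OF(i)) break;   /* exit before the increment would wrap */
    }
    return n;
}
```

The loop has the shape of the quoted "auto" example, with the macro standing in for "(auto) MAXINT".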

>
>>
>> I really cannot see the problem with gcc's code here.  But I would
>> greatly prefer a warning about the code - not even "-Wstrict-overflow=5"
>> gives a warning, which I find disappointing.
>
> This is IMHO the real issue: There should be no way for the "all UB
> allows anything, so optimize it away" to silently do this with no
> warnings issued.

From some brief testing, it looks like there was a warning up to about
gcc 7, then no warning for newer versions. Sometimes these things
change depending on the orders of the passes and code re-arrangements,
but I would definitely count that as a regression in gcc.

>
> I've read all the arguments about how heavily templated code depends on
> the compiler identifying and removing huge parts of the intermediate
> code in order to perform well, and I somewhat sympathize with the
> argument, but here warnings should still be generated and require an
> explicit #define/#pragma/etc to suppress, only to be used on very well
> tested library code.
>

There is a balance to be found in that kind of thing, and it is not
always easy - certainly different people will want different levels
here. In this case, however, there are no warning flags (that I can
find) that give a warning on newer gcc, so the balance is definitely not
ideal.

Re: Compact representation for common integer constants

<s85g75$133a$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16964&group=comp.arch#16964

From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 13:12:38 +0200
 by: Terje Mathisen - Thu, 20 May 2021 11:12 UTC

David Brown wrote:
> On 20/05/2021 11:24, Terje Mathisen wrote:
>>
>> Generally, I much prefer warnings to eliminating code based on UB.
>
> I prefer warnings from the compiler (or other tools, or code reviews) so
> that I can eliminate the UB from the code.
>
I agree 100%!

I just want those warnings (and my default is to compile with warnings
== error) so that I can find them.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<s85i3s$929$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16965&group=comp.arch#16965

From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 13:45:00 +0200
 by: Marcus - Thu, 20 May 2021 11:45 UTC

Den 2021-05-20 kl. 09:11, skrev David Brown:
> On 20/05/2021 04:12, MitchAlsup wrote:
>> On Wednesday, May 19, 2021 at 7:37:33 PM UTC-5, David Brown wrote:
>>> On 20/05/2021 01:54, MitchAlsup wrote:
>>>> On Wednesday, May 19, 2021 at 6:25:20 PM UTC-5, David Brown wrote:
>>>>> On 20/05/2021 00:11, Anton Ertl wrote:
>>>>>> David Brown <david...@hesbynett.no> writes:
>>>
>>>>>
>>>>> Perhaps it should also give the suggestion "you should be using unsigned
>>>>> types for this kind of thing". As a general rule, anything involving
>>>>> bit patterns should be unsigned types - signed types are for abstract
>>>>> numbers only.
>>>> <
>>>> All integral containers that do not explicitly need to contain negative
>>>> numbers should be unsigned.
>>> That also works.
>>>
>>> But if you are saying this because overflow is defined for unsigned
>>> types in C, then I would disagree with the point - as overflow is
>>> generally (but not always) a mistake, whether signed or unsigned, and
>>> should not occur in normal code.
>> <
>> I am saying this because; unsigned arithmetic is much better defined
>> than signed arithmetic in C. The surprise coefficient is way lower.
>> <
>
> Yes, that's what I thought. I disagree. When I use types for
> arithmetic (outside of occasional very special cases, and bugs in my
> code), the arithmetic does not overflow. Overflows are errors. It
> doesn't matter what the language says - if I have 4294967295 apples in a
> pile, put another apple on, and you tell me I now have 0 apples, it is
> all nonsense.
>
> So IMHO correctly written arithmetic code does not overflow. No
> overflow, no surprises.
>
> It's fine to use unsigned types for data that is naturally non-negative.
> It makes sense to the reader. And sometimes the extra bit of range is
> useful.
>
> But if you think unsigned types are better for general purpose numbers
> because they are "better defined", you are using the wrong types for the
> arithmetic you are doing, or you have failed to check your values before
> using them.
>
> I lost my appetite decades ago for smart-arse code that relies on
> overflows to see the end of a loop and that kind of thing. I write my
> code to make sense - it says what it does, and does what it says,
> instead of playing "bit twiddling tricks". Usually the result is not
> only simpler and clearer, safer and more portable, but is at least as
> efficient. In the extremely few cases where you need to do something
> odd to squeeze the last clock cycle out of critical code, then you put
> in the extra "unsigned" casts and other ugly details.
>

For some time now I have been in the camp that prefers signed integers
over unsigned integers for general arithmetic operations.

I think that for most situations where you need to use numbers for
arithmetic operations, it is wise to use a type that can represent
negative numbers (e.g. to support various combinations of subtractions
and lt/gt comparisons without surprises).
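
A small sketch of the kind of surprise meant here, assuming 32-bit operands. The function names are hypothetical. With unsigned operands the difference wraps, so "a - b > 0" is true whenever a and b differ; with signed operands it matches ordinary arithmetic as long as the subtraction itself does not overflow.

```c
#include <stdint.h>

/* Unsigned: a - b wraps modulo 2^32, so the result is > 0 whenever a != b. */
int diff_positive_unsigned(uint32_t a, uint32_t b)
{
    return (uint32_t)(a - b) > 0;
}

/* Signed: matches arithmetic intuition, provided a - b does not overflow. */
int diff_positive_signed(int32_t a, int32_t b)
{
    return a - b > 0;
}
```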

OTOH, for bitwise operations (and/or/shift/...), I prefer unsigned
integers. I think of "unsigned integer" as "bit vector".

That extra bit of range that you get with unsigned integers is usually
just a fallacy: if you need it, you're so close to the limit that you
should probably just double the integer size instead (e.g. go from
int32_t to int64_t).

Finally, I really don't like type-casting back and forth between
different integer types (neither explicitly nor implicitly), so I tend
to use a single type as far as possible - and that usually happens to
be signed integers.

/Marcus

Re: Compact representation for common integer constants

<s85iv4$et4$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16967&group=comp.arch#16967

From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 13:59:31 +0200
 by: David Brown - Thu, 20 May 2021 11:59 UTC

On 20/05/2021 10:12, Terje Mathisen wrote:
> EricP wrote:
>> Thomas Koenig wrote:
>>> EricP <ThatWouldBeTelling@thevillage.com> schrieb:
>>>> MitchAlsup wrote:
>>>>> On Wednesday, May 19, 2021 at 7:23:04 AM UTC-5, Anton Ertl wrote:
>>>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>>>> Anton Ertl <an...@mips.complang.tuwien.ac.at> schrieb:
>>>>>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>>>>>> Unsigned loops can cause a headache, because it is often not
>>>>>>>>> possible to prove they actually terminate...
>>>>>>>> And the difference to signed loops is? Note that you wrote
>>>>>>>> "actually", so pretending that signed overflow does not happen
>>>>>>>> does not count.
>>>>>>> According to many language definitions, it does indeed not count.
>>>>>> This sentence makes no sense.
>>>>> <
>>>>> How about this::
>>>>> <
>>>>> Languages, that want to be ported across multiple architectures, and
>>>>> have high performing implementations, cannot specify that signed
>>>>> integers overflow.
>>>>> <
>>>> The language spec should:
>>>> - allow the programmer to choose between bounds-range checked,
>>>>    unchecked, modulo, or saturating data types
>>>
>>> That is not going to fly for an established, general-purpose
>>> language.  Languages have traditionally tried to stay away from
>>> this kind of thing because of the very many different implementations
>>> there are.
>>
>> I didn't say an established language.
>> I have no illusions that C is fixable.
>>
>>> Fortran is a bit of an exception in that it supports IEEE floating
>>> point via intrinsic modules.
>>>
>>> And face it - a numeric type that is not supported by hardware is
>>> going to suck rocks through a straw, performance-wise.  Think about
>>> the floating-point performance in the Intel days before the numeric
>>> coprocessors.
>>
>> Checked or saturating integer operations are not remotely
>> comparable to floating point.
>>
>> For example, on x64 signed and unsigned overflow check
>> are a "jo" or "jc" instruction.
>>
>> Saturating unsigned and signed add on x64:
>>
>> // rsi+rdi->rax
>> sat_addu64b:
>>      add    %rsi, %rdi
>>      sbb    %rax, %rax
>>      or     %rdi, %rax
>
> That is the easy & sane way to handle unsigned, which only have to
> saturate in one direction.
>

uint64_t sat_addu64_2(uint64_t a, uint64_t b) {
    uint64_t c = a + b;
    if (c < a) return -1;
    return c;
}

uint64_t sat_addu64_3(uint64_t a, uint64_t b) {
    uint64_t c = a + b;
    uint64_t d = 0;
    if (c < a) d = -1;
    return c | d;
}

With gcc, that gives:

sat_addu64_2:
        addq    %rsi, %rdi
        movq    $-1, %rax
        cmovnc  %rdi, %rax
        ret

sat_addu64_3:
        addq    %rsi, %rdi
        sbbq    %rax, %rax
        orq     %rdi, %rax
        ret

I don't know which would be more efficient.

>>
>> long_max:
>>      .quad 0x7fffffffffffffff
>> long_min:
>>      .quad 0x8000000000000000
>>
>> // rsi+rdi->rax
>> sat_adds64b:
>>      mov    %rdi, %rax
>>      shr    $0x3f, %rdi
>>      add    long_max(%rip), %rdi
>>      add    %rsi, %rax
>>      cmovo  %rdi, %rax
>
> This is effectively
>
> int64_t adds(int64_t a, int64_t b)
> {
>   int64_t bsign = (b >> 63);
>   int64_t saturate = bsign + 0x7fffffffffffffff; // 0x80.. or 0x7f..
>   int64_t sum = a + b;

These should be done using uint64_t copies of "a" and "b" - otherwise
your overflows are UB.

>   // overflow happens when sign(a) == sign(b) and sign(sum) != sign(a)
>   // overflow = ~(a ^ b) & (a ^ sum)
>   bool overflow = (~(a ^ b) & (a ^sum)) >> 63; // OF mask;
>   sum = (saturate & overflow) | (sum & ~overflow);

Something's not right here - you didn't mean to take the complement of a
bool?

>   return sum;
> }
>
>
> The compiler/HW can schedule this as
>
> first cycle:
>  bsign = b >> 63;
>  sum = a + b;
>  t_a_xor_b = a ^ b;
>
> second cycle:
>  saturate =  bsign + 0x7fffffffffffffff;
>  t_a_xor_sum = a ^ sum;
>  t_a_eq_b = ~t_a_xor_b;
>
> third cycle:
>  overflow = t_a_xor_sum & t_a_eq_b;
>
> fourth cycle:
>  overflow >>= 63;
>
> fifth cycle:
>  saturate &= overflow;
>  not_overflow = ~overflow;
>
> sixth cycle:
>  sum &= not_overflow;
>
> seventh cycle:
>  sum |= saturate;
>
> If the compiler knows about CMOV or similar then we would be better off
> writing it as
>
> int64_t adds(int64_t a, int64_t b)
> {
>   int64_t bsign = (b >> 63);
>   int64_t saturate = bsign + 0x7fffffffffffffff; // 0x80.. or 0x7f..
>   int64_t sum = a + b;
>   // overflow happens when sign(a) == sign(b) and sign(sum) != sign(a)
>   // overflow = ~(a ^ b) & (a ^ sum)
>   bool overflow = (~(a ^ b) & (a ^sum));
>   return (overflow < 0) ? saturate: sum;

"overflow" is a bool, and therefore never negative.

> }
>
> This would save ~3 of those 7 cycles, at which point performance isn't
> really that bad.
>
> Terje
>
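
Folding the objections above into the quoted mask-based version gives something like the following sketch. It does the arithmetic on uint64_t copies so that nothing overflows, takes the sign bit with a logical shift (as the "shr" in the quoted assembly does), and keeps the overflow flag in a full-width mask instead of a bool. The name adds_portable is hypothetical, and the final conversion back to int64_t is an assumption spelled out in the comment.

```c
#include <stdint.h>

int64_t adds_portable(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;
    uint64_t sum = ua + ub;                               /* wraps; no UB */

    /* 0x7fff... if b >= 0, 0x8000... if b < 0 (logical shift of the sign bit) */
    uint64_t saturate = (ub >> 63) + UINT64_C(0x7fffffffffffffff);

    /* overflow iff sign(a) == sign(b) and sign(sum) != sign(a) */
    uint64_t overflow = (~(ua ^ ub) & (ua ^ sum)) >> 63;  /* 0 or 1 */
    uint64_t mask = UINT64_C(0) - overflow;               /* all zeros or all ones */

    uint64_t r = (saturate & mask) | (sum & ~mask);

    /* Assumption: converting an out-of-range uint64_t to int64_t is
       implementation-defined before C23; this relies on the usual
       two's-complement wrap. */
    return (int64_t)r;
}
```

Overflow can only happen when both operands have the same sign, so picking the saturation value from the sign of b is equivalent to picking it from the sign of a.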

Taking advantage of gcc's builtins:

int64_t adds(int64_t a, int64_t b) {
    int64_t c;
    if (__builtin_saddl_overflow(a, b, &c)) {
        return (a < 0) ? -0x8000000000000000 : 0x7fffffffffffffffull;
    } else {
        return c;
    }
}

adds:
        addq    %rdi, %rsi
        jo      .L17
        movq    %rsi, %rax
        ret
.L17:
        movabsq $9223372036854775807, %rdx
        testq   %rdi, %rdi
        movabsq $-9223372036854775808, %rax
        cmovns  %rdx, %rax
        ret

I don't know how that compares in speed to the other versions (I have
long experience of assembly, but not on x86). It seems reasonable to me
to have a faster path for the non-saturating case.
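
The same idea can also be written with the type-generic __builtin_add_overflow (GCC 5 and later, also Clang) and the <stdint.h> limit macros. That avoids depending on int64_t being "long" and on how the unsigned hex constants convert. This is only a sketch: adds_generic is a hypothetical name, and the code it generates has not been compared with the listing above.

```c
#include <stdint.h>

int64_t adds_generic(int64_t a, int64_t b)
{
    int64_t c;
    /* Returns nonzero if the mathematically correct sum does not fit in c. */
    if (__builtin_add_overflow(a, b, &c)) {
        /* Overflow requires both operands to share a sign, so a decides. */
        return (a < 0) ? INT64_MIN : INT64_MAX;
    }
    return c;
}
```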

Re: Compact representation for common integer constants

<s85j8h$hce$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16968&group=comp.arch#16968

From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 14:04:33 +0200
 by: David Brown - Thu, 20 May 2021 12:04 UTC

On 20/05/2021 13:45, Marcus wrote:
> Den 2021-05-20 kl. 09:11, skrev David Brown:
>> On 20/05/2021 04:12, MitchAlsup wrote:
>>> On Wednesday, May 19, 2021 at 7:37:33 PM UTC-5, David Brown wrote:
>>>> On 20/05/2021 01:54, MitchAlsup wrote:
>>>>> On Wednesday, May 19, 2021 at 6:25:20 PM UTC-5, David Brown wrote:
>>>>>> On 20/05/2021 00:11, Anton Ertl wrote:
>>>>>>> David Brown <david...@hesbynett.no> writes:
>>>>
>>>>>>
>>>>>> Perhaps it should also give the suggestion "you should be using
>>>>>> unsigned
>>>>>> types for this kind of thing". As a general rule, anything involving
>>>>>> bit patterns should be unsigned types - signed types are for abstract
>>>>>> numbers only.
>>>>> <
>>>>> All integral containers that do not explicitly need to contain negative
>>>>> numbers should be unsigned.
>>>> That also works.
>>>>
>>>> But if you are saying this because overflow is defined for unsigned
>>>> types in C, then I would disagree with the point - as overflow is
>>>> generally (but not always) a mistake, whether signed or unsigned, and
>>>> should not occur in normal code.
>>> <
>>> I am saying this because; unsigned arithmetic is much better defined
>>> than signed arithmetic in C. The surprise coefficient is way lower.
>>> <
>>
>> Yes, that's what I thought.  I disagree.  When I use types for
>> arithmetic (outside of occasional very special cases, and bugs in my
>> code), the arithmetic does not overflow.  Overflows are errors.  It
>> doesn't matter what the language says - if I have 4294967295 apples in a
>> pile, put another apple on, and you tell me I now have 0 apples, it is
>> all nonsense.
>>
>> So IMHO correctly written arithmetic code does not overflow.  No
>> overflow, no surprises.
>>
>> It's fine to use unsigned types for data that is naturally non-negative.
>>   It makes sense to the reader.  And sometimes the extra bit of range is
>> useful.
>>
>> But if you think unsigned types are better for general purpose numbers
>> because they are "better defined", you are using the wrong types for the
>> arithmetic you are doing, or you have failed to check your values before
>> using them.
>>
>> I lost my appetite decades ago for smart-arse code that relies on
>> overflows to see the end of a loop and that kind of thing.  I write my
>> code to make sense - it says what it does, and does what it says,
>> instead of playing "bit twiddling tricks".  Usually the result is not
>> only simpler and clearer, safer and more portable, but is at least as
>> efficient.  In the extremely few cases where you need to do something
>> odd to squeeze the last clock cycle out of critical code, then you put
>> in the extra "unsigned" casts and other ugly details.
>>
>
> For some time now I have been in the camp that prefers signed integers
> over unsigned integers for general arithmetic operations.
>
> I think that for most situations where you need to use numbers for
> arithmetic operations, it is wise to use a type that can represent
> negative numbers (e.g. to support various combinations of subtractions
> and lt/gt comparisons without surprises).
>
> OTOH, for bitwise operations (and/or/shift/...), I prefer unsigned
> integers. I think of "unsigned integer" as "bit vector".

Agreed.

>
> That extra bit of range that you get with unsigned integers is usually
> just a fallacy: if you need it, you're so close to the limit that you
> should probably just double the integer size instead (e.g. go from
> int32_t to int64_t).

I work with microcontrollers, so that extra bit is often needed to
correctly interact with hardware registers. But I agree that it is rare
in normal usage to have a need to store numbers bigger than about 2
billion but never need to go above around 4 billion.
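
As a sketch of the hardware-register case, assuming a 32-bit register image: the top bit is an ordinary bit in an unsigned type, whereas "1 << 31" on a 32-bit int shifts into the sign bit and is undefined in C (at least through C17). The helper names are hypothetical and do not refer to any real device header.

```c
#include <stdint.h>

/* Hypothetical helper: set bit n (0..31) in a 32-bit register image. */
uint32_t reg_set_bit(uint32_t reg, unsigned n)
{
    return reg | (UINT32_C(1) << n);   /* unsigned shift: bit 31 is fine */
}

/* Hypothetical helper: test whether bit n (0..31) is set. */
int reg_bit_is_set(uint32_t reg, unsigned n)
{
    return (reg >> n) & 1u;
}
```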

>
> Finally, I really don't like type-casting back and forth between
> different integer types (neither explicitly nor implicitly), so I tend
> to use a single type as far as possible - and that usually happens to
> be signed integers.
>

It is always nicest when you don't need conversions - but getting code
correct is more important than getting it nice!

(Nitpick on the terminology, if you are interested - in C, there is no
such thing as "typecasting". That's something that happens to actors in
film and theatre. What you are talking about is /conversions/.
Implicit conversions happen at assignment, in expressions, and function
calls. Explicit conversions - such as "(uint64_t) x" - are called
"casts". If you are not interested, that's okay - it's obvious what you
meant.)
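
To make the distinction concrete, a minimal sketch with hypothetical function names. Both functions perform the same conversion from int32_t to uint64_t; only the second one contains a cast.

```c
#include <stdint.h>

/* Implicit conversion: the initialization converts int32_t to uint64_t. */
uint64_t widen_implicit(int32_t x)
{
    uint64_t y = x;
    return y;
}

/* Explicit conversion, i.e. a cast. */
uint64_t widen_cast(int32_t x)
{
    return (uint64_t) x;
}
```

For a negative argument both return the value reduced modulo 2^64, so -1 becomes UINT64_MAX either way.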

Re: Compact representation for common integer constants

<s85mdc$3d4$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16969&group=comp.arch#16969

From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 14:58:21 +0200
 by: Terje Mathisen - Thu, 20 May 2021 12:58 UTC

David Brown wrote:
> On 20/05/2021 13:45, Marcus wrote:
>> Den 2021-05-20 kl. 09:11, skrev David Brown:
>>> On 20/05/2021 04:12, MitchAlsup wrote:
>>>> On Wednesday, May 19, 2021 at 7:37:33 PM UTC-5, David Brown wrote:
>>>>> On 20/05/2021 01:54, MitchAlsup wrote:
>>>>>> On Wednesday, May 19, 2021 at 6:25:20 PM UTC-5, David Brown wrote:
>>>>>>> On 20/05/2021 00:11, Anton Ertl wrote:
>>>>>>>> David Brown <david...@hesbynett.no> writes:
>>>>>
>>>>>>>
>>>>>>> Perhaps it should also give the suggestion "you should be using
>>>>>>> unsigned
>>>>>>> types for this kind of thing". As a general rule, anything involving
>>>>>>> bit patterns should be unsigned types - signed types are for abstract
>>>>>>> numbers only.
>>>>>> <
>>>>>> All integral containers that do not explicitly need to contain negative
>>>>>> numbers should be unsigned.
>>>>> That also works.
>>>>>
>>>>> But if you are saying this because overflow is defined for unsigned
>>>>> types in C, then I would disagree with the point - as overflow is
>>>>> generally (but not always) a mistake, whether signed or unsigned, and
>>>>> should not occur in normal code.
>>>> <
>>>> I am saying this because; unsigned arithmetic is much better defined
>>>> than signed arithmetic in C. The surprise coefficient is way lower.
>>>> <
>>>
>>> Yes, that's what I thought.  I disagree.  When I use types for
>>> arithmetic (outside of occasional very special cases, and bugs in my
>>> code), the arithmetic does not overflow.  Overflows are errors.  It
>>> doesn't matter what the language says - if I have 4294967295 apples in a
>>> pile, put another apple on, and you tell me I now have 0 apples, it is
>>> all nonsense.
>>>
>>> So IMHO correctly written arithmetic code does not overflow.  No
>>> overflow, no surprises.
>>>
>>> It's fine to use unsigned types for data that is naturally non-negative.
>>>   It makes sense to the reader.  And sometimes the extra bit of range is
>>> useful.
>>>
>>> But if you think unsigned types are better for general purpose numbers
>>> because they are "better defined", you are using the wrong types for the
>>> arithmetic you are doing, or you have failed to check your values before
>>> using them.
>>>
>>> I lost my appetite decades ago for smart-arse code that relies on
>>> overflows to see the end of a loop and that kind of thing.  I write my
>>> code to make sense - it says what it does, and does what it says,
>>> instead of playing "bit twiddling tricks".  Usually the result is not
>>> only simpler and clearer, safer and more portable, but is at least as
>>> efficient.  In the extremely few cases where you need to do something
>>> odd to squeeze the last clock cycle out of critical code, then you put
>>> in the extra "unsigned" casts and other ugly details.
>>>
>>
>> For some time now I have been in the camp that prefers signed integers
>> over unsigned integers for general arithmetic operations.
>>
>> I think that for most situations where you need to use numbers for
>> arithmetic operations, it is wise to use a type that can represent
>> negative numbers (e.g. to support various combinations of subtractions
>> and lt/gt comparisons without surprises).
>>
>> OTOH, for bitwise operations (and/or/shift/...), I prefer unsigned
>> integers. I think of "unsigned integer" as "bit vector".
>
> Agreed.
>
>>
>> That extra bit of range that you get with unsigned integers is usually
>> just a fallacy: if you need it, you're so close to the limit that you
>> should probably just double the integer size instead (e.g. go from
>> int32_t to int64_t).
>
> I work with microcontrollers, so that extra bit is often needed to
> correctly interact with hardware registers. But I agree that it is rare
> in normal usage to have a need to store numbers bigger than about 2
> billion but never need to go above around 4 billion.

Rather the opposite:

Pretty much all large 32-bit programs needed to handle near-maximum
memory pools (i.e. ~3GB), and that meant doing it right with 32-bit ops,
since using 64-bit would be far slower.
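
A minimal sketch of what "doing it right with 32-bit ops" can look like for one common case, a range check inside a pool larger than 2 GB. The function name and parameters are hypothetical and not taken from any program mentioned in this thread. The point is only that the naive test `offset + len <= pool_size` can wrap modulo 2^32, while the rearranged comparison cannot.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: does [offset, offset + len) fit inside a pool of
 * pool_size bytes?  All quantities are 32-bit and may exceed 2 GB. */
bool range_fits_u32(uint32_t offset, uint32_t len, uint32_t pool_size)
{
    /* Reject offsets past the end first, so the subtraction below
     * cannot wrap around. */
    if (offset > pool_size)
        return false;

    /* Compare against the space that is left instead of forming
     * offset + len, which could wrap modulo 2^32. */
    return len <= pool_size - offset;
}
```

With a 3 GB pool (0xC0000000), an offset of 0xB0000000 and a length of 0x60000000, the naive sum wraps to 0x10000000 and would pass the check; the version above rejects it.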

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<s85mdh$3d4$2@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=16970&group=comp.arch#16970

Path: i2pn2.org!i2pn.org!aioe.org!/FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 14:58:26 +0200
Organization: Aioe.org NNTP Server
Lines: 171
Message-ID: <s85mdh$3d4$2@gioia.aioe.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s7qlup$dm0$1@newsreader4.netcologne.de>
<e29a79f4-80ba-4dcb-8079-cf2f87a86b3en@googlegroups.com>
<s7svtl$qlt$1@newsreader4.netcologne.de>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<d5edb86e-0166-4c44-83eb-19e1cd8eb2c4n@googlegroups.com>
<nqapI.385883$2A5.264183@fx45.iad> <s83eir$irf$1@newsreader4.netcologne.de>
<U2cpI.62291$od.35116@fx15.iad> <s855l9$7fi$1@gioia.aioe.org>
<s85iv4$et4$1@dont-email.me>
NNTP-Posting-Host: /FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Thu, 20 May 2021 12:58 UTC

David Brown wrote:
> On 20/05/2021 10:12, Terje Mathisen wrote:
>> That is the easy & sane way to handle unsigned, which only have to
>> saturate in one direction.
>>
>
> uint64_t sat_addu64_2(uint64_t a, uint64_t b) {
>     uint64_t c = a + b;
>     if (c < a) return -1;
>     return c;
> }
>
> uint64_t sat_addu64_3(uint64_t a, uint64_t b) {
>     uint64_t c = a + b;
>     uint64_t d = 0;
>     if (c < a) d = -1;
>     return c | d;
> }
>
>
> With gcc, that gives:
>
> sat_addu64_2:
>         addq    %rsi, %rdi
>         movq    $-1, %rax
>         cmovnc  %rdi, %rax
>         ret
>
> sat_addu64_3:
>         addq    %rsi, %rdi
>         sbbq    %rax, %rax
>         orq     %rdi, %rax
>         ret
>
>
> I don't know which would be most efficient.
>
>>>
>>> long_max:
>>>      .quad 0x7fffffffffffffff
>>> long_min:
>>>      .quad 0x8000000000000000
>>>
>>> // rsi+rdi->rax
>>> sat_adds64b:
>>>      mov    %rdi, %rax
>>>      shr    $0x3f, %rdi
>>>      add    long_max(%rip), %rdi
>>>      add    %rsi, %rax
>>>      cmovo  %rdi, %rax
>>
>> This is effectively
>>
>> int64_t adds(int64_t a, int64_t b)
>> {

I want an arithmetic shift right here, to turn the sign bit into a 64-bit
-1/0 flag.
>>   int64_t bsign = (b >> 63);
>>   int64_t saturate = bsign + 0x7fffffffffffffff; // 0x80.. or 0x7f..
>>   int64_t sum = a + b;
>
> These should be done using uint64_t copies of "a" and "b" - otherwise
> your overflows are UB.

Yeah, I'm thinking in asm but limiting myself to C syntax. :-(

>
>>   // overflow happens when sign(a) == sign(b) and sign(sum) != sign(a)
>>   // overflow = ~(a ^ b) & (a ^ sum)
>>   bool overflow = (~(a ^ b) & (a ^sum)) >> 63; // OF mask;
>>   sum = (saturate & overflow) | (sum & ~overflow);
>
> Something's not right here - you didn't mean to take the complement of a
> bool?

Oops, I changed the algorithm while typing but forgot to update the
definition. I.e. I realized that I could treat
>
>>   return sum;
>> }
>>
>>
>> The compiler/HW can schedule this as
>>
>> first cycle:
>>  bsign = b >> 63;
>>  sum = a + b;
>>  t_a_xor_b = a ^ b;
>>
>> second cycle:
>>  saturate =  bsign + 0x7fffffffffffffff;
>>  t_a_xor_sum = a ^ sum;
>>  t_a_eq_b = ~t_a_xor_b;
>>
>> third cycle:
>>  overflow = t_a_xor_sum & t_a_eq_b;
>>
>> fourth cycle:
>>  overflow >>= 63;
>>
>> fifth cycle:
>>  saturate &= overflow;
>>  not_overflow = ~overflow;
>>
>> sixth cycle:
>>  sum &= not_overflow;
>>
>> seventh cycle:
>>  sum |= saturate;
>>
>> If the compiler knows about CMOV or similar then we would be better off
>> writing it as
>>
>> int64_t adds(int64_t a, int64_t b)
>> {
>>   int64_t bsign = (b >> 63);
>>   int64_t saturate = bsign + 0x7fffffffffffffff; // 0x80.. or 0x7f..
>>   int64_t sum = a + b;
>>   // overflow happens when sign(a) == sign(b) and sign(sum) != sign(a)
>>   // overflow = ~(a ^ b) & (a ^ sum)
>>   bool overflow = (~(a ^ b) & (a ^sum));
>>   return (overflow < 0) ? saturate: sum;
>
> "overflow" is a bool, and therefore never negative.
>
>> }
>>
>> This would save ~3 of those 7 cycles, at which point performance isn't
>> really that bad.
>>
>> Terje
>>
>
> Taking advantage of gcc's builtins:
>
> int64_t adds(int64_t a, int64_t b) {
>     int64_t c;
>     if (__builtin_saddl_overflow(a, b, &c)) {
>         return (a < 0) ? -0x8000000000000000 : 0x7fffffffffffffffull;
-0x8000000000000000ll probably

>     } else {
>         return c;
>     }
> }
>
> adds:
>         addq    %rdi, %rsi
>         jo      .L17
>         movq    %rsi, %rax
>         ret
> .L17:
>         movabsq $9223372036854775807, %rdx
>         testq   %rdi, %rdi
>         movabsq $-9223372036854775808, %rax
>         cmovns  %rdx, %rax
>         ret
>
> I don't know how that compares in speed to the other versions (I have
> long experience of assembly, but not on x86). It seems reasonable to me
> to have a faster path for the non-saturating line.

That is probably a good idea, except for inline code which we would like
to be short & branch-free.
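
The builtin-based version quoted above uses `__builtin_saddl_overflow`, whose operands are `long`. What follows is a sketch of the same idea with the type-generic `__builtin_add_overflow` (GCC 5 and later, also Clang), so that it does not depend on `int64_t` being `long`. The name `adds_generic` is made up for this sketch, and the branch on overflow is kept, so this is not the short branch-free form wished for above.

```c
#include <stdint.h>

/* Saturating signed 64-bit add using the type-generic overflow builtin.
 * Assumes a GCC- or Clang-compatible compiler. */
int64_t adds_generic(int64_t a, int64_t b)
{
    int64_t c;

    if (__builtin_add_overflow(a, b, &c)) {
        /* Signed overflow needs operands of equal sign, so the sign of
         * a alone selects the saturation value. */
        return (a < 0) ? INT64_MIN : INT64_MAX;
    }
    return c;
}
```

Using `INT64_MIN` and `INT64_MAX` from `<stdint.h>` also sidesteps the question of how the hexadecimal constants are typed.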

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<s85n1t$urq$1@newsreader4.netcologne.de>


https://www.novabbs.com/devel/article-flat.php?id=16971&group=comp.arch#16971

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-4fe8-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 13:09:17 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <s85n1t$urq$1@newsreader4.netcologne.de>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<e29a79f4-80ba-4dcb-8079-cf2f87a86b3en@googlegroups.com>
<s7svtl$qlt$1@newsreader4.netcologne.de>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<s83h0k$jaj$1@newsreader4.netcologne.de>
<2021May19.214832@mips.complang.tuwien.ac.at> <s8414d$okc$1@dont-email.me>
<s85993$1tao$1@gioia.aioe.org>
Injection-Date: Thu, 20 May 2021 13:09:17 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-4fe8-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:4fe8:0:7285:c2ff:fe6c:992d";
logging-data="31610"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Thu, 20 May 2021 13:09 UTC

Terje Mathisen <terje.mathisen@tmsw.no> schrieb:

> What is really missing is an auto form of MININT/MAXINT so that you
> could check for that, even though it makes the code more complicated:
>
> for (auto i = from; ; i++) {
>     ...
>     if (i == (auto) MAXINT) break;
> }

Can I recommend Fortran? :-)

You can use huge(i) to get the maximum value for the
variable i. You have to be careful about using
this as an upper limit for a DO loop, though.
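
In C, when the type of the loop variable is known, the pattern quoted above can be sketched as follows, with `INT_MAX` from `<limits.h>` standing in for the wished-for `(auto) MAXINT`. The function and its counting body are hypothetical; the point is that the exit test comes before the increment, so `i++` never runs when `i` is already `INT_MAX`.

```c
#include <limits.h>

/* Hypothetical example: visit every value in [from, INT_MAX], including
 * INT_MAX itself, and return how many values were visited. */
unsigned long count_up_to_int_max(int from)
{
    unsigned long visited = 0;

    for (int i = from; ; i++) {
        visited++;            /* stands in for the loop body */
        if (i == INT_MAX)     /* exit test before the increment */
            break;
    }
    return visited;
}
```

A conventional `for (i = from; i <= INT_MAX; i++)` cannot do this, because its condition is always true for an `int` and the final increment overflows.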

Re: Compact representation for common integer constants

<jwv5yzdh70t.fsf-monnier+comp.arch@gnu.org>


https://www.novabbs.com/devel/article-flat.php?id=16972&group=comp.arch#16972

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 09:27:32 -0400
Organization: A noiseless patient Spider
Lines: 13
Message-ID: <jwv5yzdh70t.fsf-monnier+comp.arch@gnu.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<s83h0k$jaj$1@newsreader4.netcologne.de>
<2021May19.214832@mips.complang.tuwien.ac.at>
<s8414d$okc$1@dont-email.me>
<2021May20.001137@mips.complang.tuwien.ac.at>
<s84tt1$ftk$2@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="db17ff49512a31551d0d5a25d081e9ff";
logging-data="16680"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1832xndfVrUlc/eeO9huTnH"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:U8ZBbhBRyesnNtsmCcI32JlAPYY=
sha1:b0LJe87a1HiANxpMuq0dYELS+/E=
 by: Stefan Monnier - Thu, 20 May 2021 13:27 UTC

> satd.c: In function 'SATD':
> satd.c:7:28: warning: iteration 15 invokes undefined behavior [-Waggressive-loop-optimizations]
> 7 | for (dd=d[k=0]; k<16; dd=d[++k]) {

The warning is pretty funny: the "[-W...]" part seems to say that this
warns about places where GCC applies "aggressive optimizations", which
in turn suggests that GCC has optimizations that target code which it
considers broken enough to warrant a warning.

Since when do we care about the speed of broken code?

Stefan
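
For context, a sketch of why the quoted loop draws that warning, assuming `d` has 16 elements (which the message "iteration 15 invokes undefined behavior" implies): the update expression `dd=d[++k]` reads `d[16]` on the last pass, before `k<16` is tested again. The function below is a hypothetical stand-in, not the original SATD code; it reads each element only after the bound has been checked.

```c
/* Hypothetical stand-in for the loop in the warning: sum of the
 * absolute values of 16 elements, with no out-of-bounds read. */
int sum_abs16(const int d[16])
{
    int total = 0;

    for (int k = 0; k < 16; k++) {
        int dd = d[k];                 /* read only after k < 16 holds */
        total += (dd < 0) ? -dd : dd;
    }
    return total;
}
```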

Re: Compact representation for common integer constants

<s85rhk$g8s$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=16973&group=comp.arch#16973

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 16:25:56 +0200
Organization: A noiseless patient Spider
Lines: 191
Message-ID: <s85rhk$g8s$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s7qlup$dm0$1@newsreader4.netcologne.de>
<e29a79f4-80ba-4dcb-8079-cf2f87a86b3en@googlegroups.com>
<s7svtl$qlt$1@newsreader4.netcologne.de>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<d5edb86e-0166-4c44-83eb-19e1cd8eb2c4n@googlegroups.com>
<nqapI.385883$2A5.264183@fx45.iad> <s83eir$irf$1@newsreader4.netcologne.de>
<U2cpI.62291$od.35116@fx15.iad> <s855l9$7fi$1@gioia.aioe.org>
<s85iv4$et4$1@dont-email.me> <s85mdh$3d4$2@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 20 May 2021 14:25:56 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="0f4709d2d2405951cc9d574e4d2a4ec7";
logging-data="16668"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18nCbMyAMIm2e+HDCpLWPAJ/iPIJuZVDnQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
Thunderbird/68.10.0
Cancel-Lock: sha1:hoqHarFK+bPtLyZphk5fbf5ZfbM=
In-Reply-To: <s85mdh$3d4$2@gioia.aioe.org>
Content-Language: en-GB
 by: David Brown - Thu, 20 May 2021 14:25 UTC

On 20/05/2021 14:58, Terje Mathisen wrote:
> David Brown wrote:
>> On 20/05/2021 10:12, Terje Mathisen wrote:
>>> That is the easy & sane way to handle unsigned, which only have to
>>> saturate in one direction.
>>>
>>
>> uint64_t sat_addu64_2(uint64_t a, uint64_t b) {
>>      uint64_t c = a + b;
>>      if (c < a) return -1;
>>      return c;
>> }
>>
>> uint64_t sat_addu64_3(uint64_t a, uint64_t b) {
>>      uint64_t c = a + b;
>>      uint64_t d = 0;
>>      if (c < a) d = -1;
>>      return c | d;
>> }
>>
>>
>> With gcc, that gives:
>>
>> sat_addu64_2:
>>          addq    %rsi, %rdi
>>          movq    $-1, %rax
>>          cmovnc  %rdi, %rax
>>          ret
>>
>> sat_addu64_3:
>>          addq    %rsi, %rdi
>>          sbbq    %rax, %rax
>>          orq     %rdi, %rax
>>          ret
>>
>>
>> I don't know which would be most efficient.
>>
>>>>
>>>> long_max:
>>>>       .quad 0x7fffffffffffffff
>>>> long_min:
>>>>       .quad 0x8000000000000000
>>>>
>>>> // rsi+rdi->rax
>>>> sat_adds64b:
>>>>       mov    %rdi, %rax
>>>>       shr    $0x3f, %rdi
>>>>       add    long_max(%rip), %rdi
>>>>       add    %rsi, %rax
>>>>       cmovo  %rdi, %rax
>>>
>>> This is effectively
>>>
>>> int64_t adds(int64_t a, int64_t b)
>>> {
>
> I want an arithmetic shift right here, to turn the sign bit into a 64-bit
> -1/0 flag.

Fair enough - but that is implementation dependent. (It's unlikely that
an x86 implementation would define it in any other way, however.)

uint64_t bsign = (b < 0) ? -1 : 0;

is fully defined and portable, and gives the same code on gcc.
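
A small sketch of the portable mask and of how it can select the saturation constant; the function names are hypothetical. One detail worth stating: with an all-ones/zero mask, XOR with 0x7fffffffffffffff yields 0x8000000000000000 or 0x7fffffffffffffff, whereas adding the mask would give 0x7ffffffffffffffe for negative `b`. The addition form fits a 0/1 flag, which is what the logical `shr` in the quoted assembly produces.

```c
#include <stdint.h>

/* All-ones when b is negative, zero otherwise.  No shift of a signed
 * value is involved, so nothing here is implementation-defined. */
uint64_t sign_mask64(int64_t b)
{
    return (b < 0) ? UINT64_MAX : 0;
}

/* Bit pattern of the saturation value selected by the sign of b:
 * 0x8000000000000000 for negative b, 0x7fffffffffffffff otherwise. */
uint64_t saturation_bits64(int64_t b)
{
    return sign_mask64(b) ^ UINT64_C(0x7fffffffffffffff);
}
```

The result is kept as `uint64_t` on purpose: it is a bit pattern, and converting 0x8000000000000000 to `int64_t` would be an implementation-defined conversion.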

>>>    int64_t bsign = (b >> 63);
>>>    int64_t saturate = bsign + 0x7fffffffffffffff; // 0x80.. or 0x7f..
>>>    int64_t sum = a + b;
>>
>> These should be done using uint64_t copies of "a" and "b" - otherwise
>> your overflows are UB.
>
> Yeah, I'm thinking in asm but limiting myself to C syntax. :-(
>
>>
>>>    // overflow happens when sign(a) == sign(b) and sign(sum) != sign(a)
>>>    // overflow = ~(a ^ b) & (a ^ sum)
>>>    bool overflow = (~(a ^ b) & (a ^sum)) >> 63; // OF mask;
>>>    sum = (saturate & overflow) | (sum & ~overflow);
>>
>> Something's not right here - you didn't mean to take the complement of a
>> bool?
>
> Oops, I changed the algorithm while typing but forgot to update the
> definition. I.e. I realized that I could treat

<https://godbolt.org> is a great place to test code before posting on
Usenet :-)

>>
>>>    return sum;
>>> }
>>>
>>>
>>> The compiler/HW can schedule this as
>>>
>>> first cycle:
>>>   bsign = b >> 63;
>>>   sum = a + b;
>>>   t_a_xor_b = a ^ b;
>>>
>>> second cycle:
>>>   saturate =  bsign + 0x7fffffffffffffff;
>>>   t_a_xor_sum = a ^ sum;
>>>   t_a_eq_b = ~t_a_xor_b;
>>>
>>> third cycle:
>>>   overflow = t_a_xor_sum & t_a_eq_b;
>>>
>>> fourth cycle:
>>>   overflow >>= 63;
>>>
>>> fifth cycle:
>>>   saturate &= overflow;
>>>   not_overflow = ~overflow;
>>>
>>> sixth cycle:
>>>   sum &= not_overflow;
>>>
>>> seventh cycle:
>>>   sum |= saturate;
>>>
>>> If the compiler knows about CMOV or similar then we would be better off
>>> writing it as
>>>
>>> int64_t adds(int64_t a, int64_t b)
>>> {
>>>    int64_t bsign = (b >> 63);
>>>    int64_t saturate = bsign + 0x7fffffffffffffff; // 0x80.. or 0x7f..
>>>    int64_t sum = a + b;
>>>    // overflow happens when sign(a) == sign(b) and sign(sum) != sign(a)
>>>    // overflow = ~(a ^ b) & (a ^ sum)
>>>    bool overflow = (~(a ^ b) & (a ^sum));
>>>    return (overflow < 0) ? saturate: sum;
>>
>> "overflow" is a bool, and therefore never negative.
>>
>>> }
>>>
>>> This would save ~3 of those 7 cycles, at which point performance isn't
>>> really that bad.
>>>
>>> Terje
>>>
>>
>> Taking advantage of gcc's builtins:
>>
>> int64_t adds(int64_t a, int64_t b) {
>>      int64_t c;
>>      if (__builtin_saddl_overflow(a, b, &c)) {
>>          return (a < 0) ? -0x8000000000000000 : 0x7fffffffffffffffull;

> -0x8000000000000000ll probably

No need for that, as 0x8000000000000000 is already an unsigned 64-bit
constant (ul or ull, as needed), unless you have a platform with signed
int at 128-bit.

>
>>      } else {
>>          return c;
>>      }
>> }
>>
>> adds:
>>          addq    %rdi, %rsi
>>          jo      .L17
>>          movq    %rsi, %rax
>>          ret
>> .L17:
>>          movabsq $9223372036854775807, %rdx
>>          testq   %rdi, %rdi
>>          movabsq $-9223372036854775808, %rax
>>          cmovns  %rdx, %rax
>>          ret
>>
>> I don't know how that compares in speed to the other versions (I have
>> long experience of assembly, but not on x86).  It seems reasonable to me
>> to have a faster path for the non-saturating line.
>
> That is probably a good idea, except for inline code which we would like
> to be short & branch-free.
>

Short, yes - branch-free is not as critical, AFAIUI. Either the code is
run often, in which case it will be predicted (and speculatively
executed) efficiently, or it is run rarely and the speed details don't
matter.
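
A sketch that pulls together the corrections from this exchange, under the assumption that only standard C is wanted: the wrapping sum is formed on `uint64_t` copies so nothing overflows in a signed type, and the overflow test is kept as an integer rather than a `bool`. The name `sat_add_s64` is made up for this sketch. Whether a given compiler turns the selection into a `cmov` or a branch is not something the code guarantees.

```c
#include <stdint.h>

/* Saturating signed 64-bit add without signed-overflow UB. */
int64_t sat_add_s64(int64_t a, int64_t b)
{
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    uint64_t usum = ua + ub;        /* wraps modulo 2^64, never UB */

    /* Overflow happens when sign(a) == sign(b) and sign(sum) != sign(a).
     * Shifting the unsigned value leaves exactly 0 or 1. */
    uint64_t overflow = (~(ua ^ ub) & (ua ^ usum)) >> 63;

    if (overflow)
        return (a < 0) ? INT64_MIN : INT64_MAX;

    /* No overflow: the mathematical sum is representable, so the
     * signed addition is well defined. */
    return a + b;
}
```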

Re: Compact representation for common integer constants

<s85rth$ho3$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=16974&group=comp.arch#16974

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 16:32:17 +0200
Organization: A noiseless patient Spider
Lines: 39
Message-ID: <s85rth$ho3$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<s83h0k$jaj$1@newsreader4.netcologne.de>
<2021May19.214832@mips.complang.tuwien.ac.at> <s8414d$okc$1@dont-email.me>
<2021May20.001137@mips.complang.tuwien.ac.at> <s846ou$s25$1@dont-email.me>
<35c1d9c9-d12a-4812-bcac-07979e0fcaccn@googlegroups.com>
<s84b0a$lht$1@dont-email.me>
<5c8aa7d4-ae7d-465e-bcb4-230ee64b2bfbn@googlegroups.com>
<s8522o$tol$1@dont-email.me> <s85i3s$929$1@dont-email.me>
<s85j8h$hce$1@dont-email.me> <s85mdc$3d4$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 20 May 2021 14:32:17 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="0f4709d2d2405951cc9d574e4d2a4ec7";
logging-data="18179"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/XYVPWa8i884oxrcwEaHIeQxLkKlAd6RI="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
Thunderbird/68.10.0
Cancel-Lock: sha1:IvcRMXQy0eH8DvKwEcQu9Js9If8=
In-Reply-To: <s85mdc$3d4$1@gioia.aioe.org>
Content-Language: en-GB
 by: David Brown - Thu, 20 May 2021 14:32 UTC

On 20/05/2021 14:58, Terje Mathisen wrote:
> David Brown wrote:
>> On 20/05/2021 13:45, Marcus wrote:

>>>
>>> That extra bit of range that you get with unsigned integers is usually
>>> just a fallacy: if you need it, you're so close to the limit that you
>>> should probably just double the integer size instead (e.g. go from
>>> int32_t to int64_t).
>>
>> I work with microcontrollers, so that extra bit is often needed to
>> correctly interact with hardware registers.  But I agree that it is rare
>> in normal usage to have a need to store numbers bigger than about 2
>> billion but never need to go above around 4 billion.
>
> Rather the opposite:
>
> Pretty much all large 32-bit programs needed to handle near-maximum
> memory pools (i.e. ~3GB), and that meant doing it right with 32-bit ops,
> since using 64-bit would be far slower.
>

Really?

There are some programs that would want to push the limits and use as
much space as possible, but surely they are not common. After all, when
32-bit cpus were the norm then 4 GB memory was rare.

And the default user/kernel split on 32-bit Windows was 2GB/2GB, IIRC,
so on that platform you could not normally access more than 2 GB anyway.

By the time you have a 64-bit processor and a program that needs lots of
memory, you'll have more than 4 GB and you'll use 64-bit pointers and
sizes in order to access large data blocks.

I'd believe having 32-bit size_t values of more than 2 GB would be used
sometimes, I just find it hard to imagine that it happens often.

But perhaps it's all a matter of the kind of programs involved.

Re: Compact representation for common integer constants

<s85u1e$2nb$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=16975&group=comp.arch#16975

Path: i2pn2.org!i2pn.org!news.niel.me!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Thu, 20 May 2021 08:08:29 -0700
Organization: A noiseless patient Spider
Lines: 99
Message-ID: <s85u1e$2nb$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s7qlup$dm0$1@newsreader4.netcologne.de>
<e29a79f4-80ba-4dcb-8079-cf2f87a86b3en@googlegroups.com>
<s7svtl$qlt$1@newsreader4.netcologne.de>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at>
<d5edb86e-0166-4c44-83eb-19e1cd8eb2c4n@googlegroups.com>
<nqapI.385883$2A5.264183@fx45.iad> <s84b5i$mal$1@dont-email.me>
<igmeioFr49pU1@mid.individual.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 20 May 2021 15:08:30 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="f146a5fae9cf371bf9b3358d40b3ee42";
logging-data="2795"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18kDTtca4VcPymnnGXXn9nt3//QVp+Elxs="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.2
Cancel-Lock: sha1:wcv4wLRx++QVDHdgGQPVppP4qz4=
In-Reply-To: <igmeioFr49pU1@mid.individual.net>
Content-Language: en-US
 by: Stephen Fuld - Thu, 20 May 2021 15:08 UTC

On 5/19/2021 11:40 PM, Niklas Holsti wrote:
> On 2021-05-20 3:40, Stephen Fuld wrote:
>> On 5/19/2021 8:47 AM, EricP wrote:
>>> MitchAlsup wrote:
>>>> On Wednesday, May 19, 2021 at 7:23:04 AM UTC-5, Anton Ertl wrote:
>>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>>> Anton Ertl <an...@mips.complang.tuwien.ac.at> schrieb:
>>>>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>>>>> Unsigned loops can cause a headache, because it is often not
>>>>>>>> possible to prove they actually terminate...
>>>>>>> And the difference to signed loops is? Note that you wrote
>>>>>>> "actually", so pretending that signed overflow does not happen
>>>>>>> does not count.
>>>>>> According to many language definitions, it does indeed not count.
>>>>> This sentence makes no sense.
>>>> <
>>>> How about this::
>>>> <
>>>> Languages, that want to be ported across multiple architectures, and
>>>> have high performing implementations, cannot specify that signed
>>>> integers overflow.
>>>> <
>>>
>>> The language spec should:
>>> - allow the programmer to choose between bounds-range checked,
>>>    unchecked, modulo, or saturating data types
>>
>> Ada allows range checked, i.e. Range 20-100, unchecked, at least
>> within the limits of the size, and Modulo variables.
>
>
> Yes. But "unchecked" is not required; there are pragmas to _permit_ the
> Ada compiler to omit the range checks and overflow checks locally or
> globally, but the compiler does not have to obey them. The intent of
> these pragmas is that the programmer knows that the checks would not
> fail, and therefore they can be omitted. The intent is not to get some
> specific behaviour on range violation or overflow.

Fair enough. I didn't state that well. I was thinking of a variable
that you know will never exceed say 150, but don't declare it with a
range attribute, so it is effectively unchecked, at least up to 255 (if
it is an eight bit wide variable).

>
>> I suspect adding saturated wouldn't be a big stretch if it were
>> popular enough.
>
> I agree.
>
> The next Ada standard, now in final editorial review in ISO WG9, will
> also have standard libraries for "big numbers", that is, numbers with
> unbounded range and precision, both integer and real. These libraries of
> course provide the usual infix and prefix operators, so you can write
> I+J even for big-integers I and J.
>
> I suspect that saturating arithmetic would be added in the same way, as
> standard libraries rather than as a new core feature of the Ada type
> system.

Fair enough. This sort of gets back to the question of whether to add a
feature like this as an attribute of the variable (e.g. range) or of the
operation (e.g. bignum). I think there are arguments for each.

BTW, I am unsure of what the requirement is here. Suppose you have a
variable with a range of 10-90. Do you want saturation at 90? That is,
if its value is 85 and you add 10 to it, do you want 90 rather than a
range violation? What about if you subtract 3 from a variable that
contains a value of 12? Do you want 9? Or is the requirement only to
support saturation at variable size boundaries, i.e. 255 for byte-wide
variables?
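
To make the two readings concrete, here is a sketch in C (not Ada, and not a statement of what any Ada library does) of an add that saturates at caller-supplied bounds. The function name and signature are hypothetical. Passing 10 and 90 gives saturation at the declared range; passing 0 and 255 gives saturation at the byte-wide size boundary.

```c
#include <stdint.h>

/* Hypothetical helper: value + delta, clamped to [lo, hi].
 * Assumes lo <= hi.  The sum is formed in 64 bits, so it cannot
 * overflow for any 32-bit inputs. */
int32_t sat_add_range(int32_t value, int32_t delta, int32_t lo, int32_t hi)
{
    int64_t sum = (int64_t)value + delta;

    if (sum > hi)
        return hi;
    if (sum < lo)
        return lo;
    return (int32_t)sum;   /* lo <= sum <= hi, so this fits */
}
```

Under the range reading, 85 + 10 yields 90 and 12 - 3 yields 10 rather than 9; a range-checked type would instead report a violation in both cases.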

>>> - specify that range and array bounds checks can be enabled or disabled
>>>    at compile time by command line options or inline pragmas,
>>>    for blocks of code, individual data types, or individual objects.
>>
>> Since Ada supports Range attributes, and you can use a range
>> restricted variable as a subscript or loop limit, I think you can do
>> all of that.
>
>
> Yes, but see the note on check-disabling pragmas above.

Right. I was thinking that if you didn't want range checks, you would
simply specify a variable without them.

>> But Nick Holsti is the Ada expert around here, so perhaps he will
>> chime in.
>
>
> You got it right, Stephen.

Thank you.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Compact representation for common integer constants

<ec6f372d-d575-4cca-a595-65b81fe868e9n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=16977&group=comp.arch#16977

X-Received: by 2002:ac8:7c50:: with SMTP id o16mr4154765qtv.153.1621527039782;
Thu, 20 May 2021 09:10:39 -0700 (PDT)
X-Received: by 2002:a54:4794:: with SMTP id o20mr3680775oic.99.1621527039522;
Thu, 20 May 2021 09:10:39 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.mixmin.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 20 May 2021 09:10:39 -0700 (PDT)
In-Reply-To: <s84ohr$plc$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:e8f5:6d8b:7a01:6d6c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:e8f5:6d8b:7a01:6d6c
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s7qlup$dm0$1@newsreader4.netcologne.de> <e29a79f4-80ba-4dcb-8079-cf2f87a86b3en@googlegroups.com>
<s7svtl$qlt$1@newsreader4.netcologne.de> <bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com>
<s7u338$cek$1@newsreader4.netcologne.de> <s7v6j1$qns$1@dont-email.me>
<s7vn9p$bfu$1@newsreader4.netcologne.de> <2021May18.183723@mips.complang.tuwien.ac.at>
<s80rjn$1ci$2@newsreader4.netcologne.de> <2021May19.124934@mips.complang.tuwien.ac.at>
<d5edb86e-0166-4c44-83eb-19e1cd8eb2c4n@googlegroups.com> <nqapI.385883$2A5.264183@fx45.iad>
<s83eir$irf$1@newsreader4.netcologne.de> <s84ohr$plc$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ec6f372d-d575-4cca-a595-65b81fe868e9n@googlegroups.com>
Subject: Re: Compact representation for common integer constants
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Thu, 20 May 2021 16:10:39 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Thu, 20 May 2021 16:10 UTC

On Wednesday, May 19, 2021 at 11:28:46 PM UTC-5, Stephen Fuld wrote:
> On 5/19/2021 9:32 AM, Thomas Koenig wrote:
> > EricP <ThatWould...@thevillage.com> schrieb:
> >> MitchAlsup wrote:
> >>> On Wednesday, May 19, 2021 at 7:23:04 AM UTC-5, Anton Ertl wrote:
> >>>> Thomas Koenig <tko...@netcologne.de> writes:
> >>>>> Anton Ertl <an...@mips.complang.tuwien.ac.at> schrieb:
> >>>>>> Thomas Koenig <tko...@netcologne.de> writes:
> >>>>>>> Unsigned loops can cause a headache, because it is often
> >>>>>>> not possible to prove they actually terminate...
> >>>>>> And the difference to signed loops is? Note that you wrote
> >>>>>> "actually", so pretending that signed overflow does not happen does
> >>>>>> not count.
> >>>>> According to many language definitions, it does indeed not
> >>>>> count.
> >>>> This sentence makes no sense.
> >>> <
> >>> How about this::
> >>> <
> >>> Languages, that want to be ported across multiple architectures, and
> >>> have high performing implementations, cannot specify that signed integers
> >>> overflow.
> >>> <
> >>
> >> The language spec should:
> >> - allow the programmer to choose between bounds-range checked,
> >> unchecked, modulo, or saturating data types
> >
> > That is not going to fly for an established, general-purpose
> > language.
<
> A counter example is Ada, which implements most of those things.
<
And Ada has taken over what corner of programming/programmers?
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Compact representation for common integer constants

<fe350e43-65c2-41a6-9695-5e42b4dca1a3n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=16979&group=comp.arch#16979

X-Received: by 2002:a05:620a:574:: with SMTP id p20mr6415290qkp.70.1621527870024;
Thu, 20 May 2021 09:24:30 -0700 (PDT)
X-Received: by 2002:a9d:3bcb:: with SMTP id k69mr4743408otc.206.1621527869796;
Thu, 20 May 2021 09:24:29 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 20 May 2021 09:24:29 -0700 (PDT)
In-Reply-To: <s8522o$tol$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:e8f5:6d8b:7a01:6d6c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:e8f5:6d8b:7a01:6d6c
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<bcd3dd9a-cfcd-4691-9163-842ddf1f483dn@googlegroups.com> <s7u338$cek$1@newsreader4.netcologne.de>
<s7v6j1$qns$1@dont-email.me> <s7vn9p$bfu$1@newsreader4.netcologne.de>
<2021May18.183723@mips.complang.tuwien.ac.at> <s80rjn$1ci$2@newsreader4.netcologne.de>
<2021May19.124934@mips.complang.tuwien.ac.at> <s83h0k$jaj$1@newsreader4.netcologne.de>
<2021May19.214832@mips.complang.tuwien.ac.at> <s8414d$okc$1@dont-email.me>
<2021May20.001137@mips.complang.tuwien.ac.at> <s846ou$s25$1@dont-email.me>
<35c1d9c9-d12a-4812-bcac-07979e0fcaccn@googlegroups.com> <s84b0a$lht$1@dont-email.me>
<5c8aa7d4-ae7d-465e-bcb4-230ee64b2bfbn@googlegroups.com> <s8522o$tol$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <fe350e43-65c2-41a6-9695-5e42b4dca1a3n@googlegroups.com>
Subject: Re: Compact representation for common integer constants
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Thu, 20 May 2021 16:24:30 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 71
 by: MitchAlsup - Thu, 20 May 2021 16:24 UTC

On Thursday, May 20, 2021 at 2:11:23 AM UTC-5, David Brown wrote:
> On 20/05/2021 04:12, MitchAlsup wrote:
> > On Wednesday, May 19, 2021 at 7:37:33 PM UTC-5, David Brown wrote:
> >> On 20/05/2021 01:54, MitchAlsup wrote:
> >>> On Wednesday, May 19, 2021 at 6:25:20 PM UTC-5, David Brown wrote:
> >>>> On 20/05/2021 00:11, Anton Ertl wrote:
> >>>>> David Brown <david...@hesbynett.no> writes:
> >>
> >>>>
> >>>> Perhaps it should also give the suggestion "you should be using unsigned
> >>>> types for this kind of thing". As a general rule, anything involving
> >>>> bit patterns should be unsigned types - signed types are for abstract
> >>>> numbers only.
> >>> <
> >>> All integral containers that do not explicitly need to contain negative
> >>> numbers should be unsigned.
> >> That also works.
> >>
> >> But if you are saying this because overflow is defined for unsigned
> >> types in C, then I would disagree with the point - as overflow is
> >> generally (but not always) a mistake, whether signed or unsigned, and
> >> should not occur in normal code.
> > <
> > I am saying this because unsigned arithmetic is much better defined
> > than signed arithmetic in C. The surprise coefficient is way lower.
> > <
> Yes, that's what I thought. I disagree. When I use types for
> arithmetic (outside of occasional very special cases, and bugs in my
> code), the arithmetic does not overflow. Overflows are errors. It
> doesn't matter what the language says - if I have 4294967295 apples in a
> pile, put another apple on, and you tell me I now have 0 apples, it is
> all nonsense.
<
It would not be nonsense on a 64-bit integer machine, and this is similar to
the reason we no longer use 16-bit machines: the containers are no
longer big enough and we can afford bigger containers.
>
> So IMHO correctly written arithmetic code does not overflow. No
> overflow, no surprises.
>
> It's fine to use unsigned types for data that is naturally non-negative.
> It makes sense to the reader. And sometimes the extra bit of range is
> useful.
>
> But if you think unsigned types are better for general purpose numbers
> because they are "better defined", you are using the wrong types for the
> arithmetic you are doing, or you have failed to check your values before
> using them.
<
I write CPU simulators; 99% of what these simulators do is push
piles of typeless bits around. Types are implied only sparingly along
the path*. Nor is the index to any CPU resource ever negative. The only
place I need signed integers is when calling a library function that
requires a signed argument or delivers a signed result.
<
(*) For example, a value read from the register file is typeless until it
gets to the calculation unit and is typeless through the rest of the
pipeline. It falls into the "just a bunch of bits" category. It is not
signed, it is not unsigned either--it is signless:: a concept not yet
supported by implementation languages. Unsigned is the closest
type to the desired type.
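
The "signless" idea can be sketched in C by storing register values as raw
unsigned patterns and letting each operation choose its interpretation. This
is only an illustration, not code from any real simulator: the names
exec_slt and exec_sltu are hypothetical (borrowed from the usual
"set less than" mnemonics), and the signed compare is done by flipping the
sign bit so that no signed type is needed at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SIGN_BIT (UINT64_C(1) << 63)

/* Unsigned "set less than": compares the raw bit patterns. */
bool exec_sltu(uint64_t a, uint64_t b)
{
    return a < b;
}

/* Signed "set less than" on the same raw patterns. Flipping the sign
   bit maps two's-complement order onto unsigned order, so the operands
   never have to be converted to a signed type. */
bool exec_slt(uint64_t a, uint64_t b)
{
    return (a ^ SIGN_BIT) < (b ^ SIGN_BIT);
}

/* The register file itself carries no signedness; only the operation does. */
void demo_signless(void)
{
    uint64_t regs[4] = { 0, UINT64_MAX, 1, 0 };  /* regs[1] is all ones */

    assert(exec_slt(regs[1], regs[2]));    /* read as -1 < 1: true        */
    assert(!exec_sltu(regs[1], regs[2]));  /* read as 2^64-1 < 1: false   */
}
```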
>
> I lost my appetite decades ago for smart-arse code that relies on
> overflows to see the end of a loop and that kind of thing. I write my
> code to make sense - it says what it does, and does what it says,
> instead of playing "bit twiddling tricks". Usually the result is not
> only simpler and clearer, safer and more portable, but is at least as
> efficient. In the extremely few cases where you need to do something
> odd to squeeze the last clock cycle out of critical code, then you put
> in the extra "unsigned" casts and other ugly details.
<
BTW I don't write code that overflows, in either sense.
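
The apple-pile example from this exchange can be made concrete in C. A minimal
sketch, with hypothetical function names: the first add uses a 32-bit unsigned
container, where C defines the result modulo 2^32, and the second uses a
64-bit container, where the same sum fits.

```c
#include <assert.h>
#include <stdint.h>

/* Add in a 32-bit unsigned container. C defines the result modulo 2^32,
   so the sum silently wraps. The cast keeps this true even on a
   platform where int is wider than 32 bits. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
    return (uint32_t)(a + b);
}

/* The same add in a 64-bit container: two 32-bit inputs cannot
   overflow it. */
uint64_t add_u64(uint32_t a, uint32_t b)
{
    return (uint64_t)a + b;
}

/* The apple pile from the discussion above. */
void demo_apples(void)
{
    uint32_t pile = UINT32_MAX;                        /* 4294967295 apples */

    assert(add_u32(pile, 1) == 0);                     /* 32-bit pile wraps */
    assert(add_u64(pile, 1) == UINT64_C(4294967296));  /* 64-bit pile fits  */
}
```

Both results are fully defined by the language; whether the wrapped one is
ever the intended answer is exactly what the two posters disagree about.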

Re: Compact representation for common integer constants

<s862i1$6hd$2@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=16980&group=comp.arch#16980

 by: Thomas Koenig - Thu, 20 May 2021 16:25 UTC

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> schrieb:
> On 5/19/2021 11:42 AM, Thomas Koenig wrote:
>> EricP <ThatWouldBeTelling@thevillage.com> schrieb:
>>
>> Coming back.
>>> The language spec should:
>>> - allow the programmer to choose between bounds-range checked,
>>> unchecked, modulo, or saturating data types
>>
>> That would be an interesting language design...
>>
>> Using Fortran's attribute syntax here, because it
>> could easily be extended:
>>
>> integer, unchecked :: i
>> integer, modulo :: j
>> integer, saturating :: n
>>
>> Just a few questions about this...
>>
>> Should integer constants also have these properties?
>
> I don't see why they should. Most of these properties are only useful
> if the item is the target of an assign statement, which obviously, a
> constant can't be.

The question is if they should also apply to expressions.
Let's say you have

n = huge(n) ! Assign maximum value to saturating variable
write (*,*) n + 10

what should happen?
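
One way to see why the question matters: the answer differs for each policy.
A minimal C sketch, with int standing in for the Fortran integer and INT_MAX
for huge(n). The function names are invented for illustration and do not come
from any language proposal in this thread.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Saturating: clamp the result to the representable range. */
int add_saturating(int a, int b)
{
    if (b > 0 && a > INT_MAX - b) return INT_MAX;
    if (b < 0 && a < INT_MIN - b) return INT_MIN;
    return a + b;
}

/* Modulo: add in unsigned arithmetic, where wraparound is defined, then
   map the bit pattern back to int without relying on an out-of-range
   conversion. Assumes the usual UINT_MAX == 2 * INT_MAX + 1. */
int add_modulo(int a, int b)
{
    unsigned int u = (unsigned int)a + (unsigned int)b;
    if (u <= (unsigned int)INT_MAX)
        return (int)u;
    return -(int)(UINT_MAX - u) - 1;
}

/* Checked: report whether a + b would leave the representable range,
   so the caller can treat it as an error. */
bool add_would_overflow(int a, int b)
{
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

/* "n = huge(n); write n + 10" under each policy. */
void demo_policies(void)
{
    assert(add_saturating(INT_MAX, 10) == INT_MAX);  /* stays at the top   */
    assert(add_modulo(INT_MAX, 10) == INT_MIN + 9);  /* wraps around       */
    assert(add_would_overflow(INT_MAX, 10));         /* flagged as a fault */
}
```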

And if you declare a named constant in a safe way (not
C's #define), what about it?

And should the evaluation of an expression on the RHS of an
assignment depend on what is on the LHS?

I would consider that a design mistake. Perl is the only
programming language I use that does this, but Perl is
weird by design.

>> What about
>> named constants (parameters in Fortran parlance)?
>
> Same arguments as above.
>
>
>> What about mixed
>> expressions, how would they work (or not)?
>
> If you disallow side effects in expressions,

That is a big if.

Fortran compiler writers regularly get flak if they dare to do only
one function evaluation for

a = foo(x) + foo(x)

even though the Fortran standard is crystal clear that this may
be evaluated as

tmp = foo(x)
a = tmp + tmp
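
Why users notice the difference can be shown with a function that has a side
effect. A minimal C sketch with hypothetical names (foo, call_count,
count_calls): both forms compute the same value, but the number of calls, and
therefore the side effect, differs. Note that C itself does not let the
compiler drop the second call unless it can prove foo has no side effects;
the sketch only illustrates the two evaluation strategies described above.

```c
#include <assert.h>

int call_count = 0;   /* hypothetical observable side effect */

int foo(int x)
{
    call_count++;     /* every evaluation leaves a trace */
    return 2 * x;
}

/* The expression as written: two evaluations of foo. */
int sum_two_calls(int x)
{
    return foo(x) + foo(x);
}

/* The single-evaluation form: one call, result reused. */
int sum_one_call(int x)
{
    int tmp = foo(x);
    return tmp + tmp;
}

/* Run one of the forms and report how many times foo was called. */
int count_calls(int (*form)(int), int x)
{
    call_count = 0;
    (void)form(x);
    return call_count;
}
```

The returned values agree (sum_two_calls(3) and sum_one_call(3) are both 12),
while count_calls reports 2 for the first form and 1 for the second.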

> then no need to. You can
> effect some of what you might need with various simple functions that
> could probably be inlined.

What I wrote about expressions above should probably come
below here, so I left it in.

>
>
>> What about expressions
>> using different integer sizes? What about calling a procedure with
>> a saturating argument, should it also require a saturating argument
>> as actual argument or should that be rejected or warned about?
>
> I think it should be required, unless you have some extra syntax saying
> "I know this is unusual, but I know what I am doing, so trust me."

A big step for any language. I am not aware of anything that
does so at the moment (Ada seems to have some different rules),
so you might actually need to invent your own language for that.

But maybe extending / modifying Ada would still be your best
bet, I don't know the language well enough to judge.

Re: Compact representation for common integer constants

<s862oo$jgt$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16981&group=comp.arch#16981

 by: Terje Mathisen - Thu, 20 May 2021 16:29 UTC

David Brown wrote:
> On 20/05/2021 14:58, Terje Mathisen wrote:
>>> I don't know how that compares in speed to the other versions (I have
>>> long experience of assembly, but not on x86).  It seems reasonable to me
>>> to have a faster path for the non-saturating line.
>>
>> That is probably a good idea, except for inline code which we would like
>> to be short & branch-free.
>>
>
> Short, yes - branch-free is not as critical, AFAIUI. Either the code is
> run often, in which case it will be predicted (and speculatively
> executed) efficiently, or it is run rarely and the speed details don't
> matter.
>
Even if you do decide that you need an inline branch, you should make it
a branch around the fixup code instead of having to branch twice: Even
with perfect prediction, using twice as many BTB slots will hurt.

I.e.

add rax,rbx
jno done

;; If we got an overflow and RBX is positive, then
;; saturate at 0x7fff..., else at 0x8000...
shr rbx,63 ;; Isolate the sign bit
mov rax,0x7fffffffffffffff
add rax,rbx ;; Inc from 0x7fff to 0x8000

done:

This is a single cycle with a correctly predicted no-overflow add, 3
cycles when predicted to overflow. One branch miss penalty otherwise,
but still likely to be much faster than the branchless CMOV version.
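
The same shape (fast path first, saturation fixup only on overflow) can be
written in portable C. This is a sketch under assumptions, not a translation
the poster supplied: sat_add_i64 is a hypothetical name, the overflow test is
done before the add because signed overflow is undefined in C, and there is no
guarantee that a given compiler turns it into the add/jno sequence above.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating signed 64-bit add. */
int64_t sat_add_i64(int64_t a, int64_t b)
{
    /* Fast path: the sum is representable, so just add. */
    if (!(b > 0 ? a > INT64_MAX - b : a < INT64_MIN - b))
        return a + b;

    /* Fixup path: overflow can only happen in the direction of b's
       sign, so saturate at INT64_MAX for positive b and at INT64_MIN
       for negative b. The assembly gets the same value by adding b's
       sign bit to 0x7fff...ffff. */
    return b < 0 ? INT64_MIN : INT64_MAX;
}

void demo_sat_add(void)
{
    assert(sat_add_i64(1, 2) == 3);                    /* fast path      */
    assert(sat_add_i64(INT64_MAX, 1) == INT64_MAX);    /* clamps high    */
    assert(sat_add_i64(INT64_MIN, -1) == INT64_MIN);   /* clamps low     */
}
```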

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<622fea9b-dfb5-4f89-8908-c03f7b40da31n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16982&group=comp.arch#16982

 by: MitchAlsup - Thu, 20 May 2021 16:30 UTC

On Thursday, May 20, 2021 at 7:58:24 AM UTC-5, Terje Mathisen wrote:
> David Brown wrote:
> > On 20/05/2021 13:45, Marcus wrote:
> >> Den 2021-05-20 kl. 09:11, skrev David Brown:
> >
> > I work with microcontrollers, so that extra bit is often needed to
> > correctly interact with hardware registers. But I agree that it is rare
> > in normal usage to have a need to store numbers bigger than about 2
> > billion but never need to go above around 4 billion.
> Rather the opposite:
>
> Pretty much all large 32-bit programs needed to handle near-maximum
> memory pools (i.e. ~3GB), and that meant doing it right with 32-bit ops,
> since using 64-bit would be far slower.
<
We should have made the transition to 64-bit machines nearly 2
decades ago (2003?).
<
> Terje
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

Re: Compact representation for common integer constants

<51fddd24-666f-4914-9161-30e010e0ef92n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16983&group=comp.arch#16983

 by: MitchAlsup - Thu, 20 May 2021 16:34 UTC

On Thursday, May 20, 2021 at 10:08:32 AM UTC-5, Stephen Fuld wrote:
> On 5/19/2021 11:40 PM, Niklas Holsti wrote:
> > On 2021-05-20 3:40, Stephen Fuld wrote:
> >> On 5/19/2021 8:47 AM, EricP wrote:

> Fair enough. This sort of gets back to the question of whether to add a
> feature like this as an attribute of the variable (e.g. range) or of the
> operation (e.g. bignum). I think there are arguments for each.
>
> BTW, I am unsure of what the requirement is here. Suppose you have a
> variable with a range of 10-90. Do you want saturation at 90? That is
<
If the value to be written is outside of the range [10..90] you raise an
exception. The raised exception decides what to do.
<
> if its value is 85 and you add 10 to it, do you want 90, not a range
> violation? What about if you subtract 3 from the variable that contains
> a value of 12? Do you want 9? Or is the requirement only to support
> saturation at variable size boundaries, i.e. 255 for byte wide variables?
> >>> - specify that range and array bounds checks can be enabled or disabled
> >>> at compile time by command line options or inline pragmas,
> >>> for blocks of code, individual data types, or individual objects.
> >>
> >> Since Ada supports Range attributes, and you can use a range
> >> restricted variable as a subscript or loop limit, I think you can do
> >> all of that.
> >
> >
> > Yes, but see the note on check-disabling pragmas above.
> Right. I was thinking that if you didn't want range checks, you would
> simply specify a variable without them.
> >> But Nick Holsti is the Ada expert around here, so perhaps he will
> >> chime in.
> >
> >
> > You got it right, Stephen.
> Thank you.
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)
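
The [10..90] case discussed above, where an out-of-range write raises an
exception and the handler decides what to do, can be approximated in C with a
callback. A minimal sketch only: C has no exceptions, so the "handler" here is
a plain function pointer, and all names (ranged_store, saturate,
keep_current) are hypothetical.

```c
#include <assert.h>

/* Policy callback: given the declared range, the current value and the
   out-of-range value that was about to be stored, return what to store. */
typedef int (*range_handler)(int lo, int hi, int current, int attempted);

/* Store with a range check. In-range values pass straight through;
   anything else is handed to the handler, which decides the outcome. */
int ranged_store(int lo, int hi, int current, int attempted,
                 range_handler on_violation)
{
    if (attempted < lo || attempted > hi)
        return on_violation(lo, hi, current, attempted);
    return attempted;
}

/* Policy 1: saturate at the nearest bound. */
int saturate(int lo, int hi, int current, int attempted)
{
    (void)current;
    return attempted < lo ? lo : hi;
}

/* Policy 2: reject the store and keep the old value. */
int keep_current(int lo, int hi, int current, int attempted)
{
    (void)lo; (void)hi; (void)attempted;
    return current;
}

void demo_range(void)
{
    assert(ranged_store(10, 90, 85, 85 + 10, saturate) == 90);     /* 95 clamps to 90 */
    assert(ranged_store(10, 90, 12, 12 - 3, saturate) == 10);      /* 9 clamps to 10  */
    assert(ranged_store(10, 90, 85, 85 + 10, keep_current) == 85); /* store rejected  */
}
```

With this split, saturation is just one handler among several, which is one
answer to the "saturate or range violation?" question in the quoted text.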

Re: Compact representation for common integer constants

<s86354$pv4$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16984&group=comp.arch#16984

 by: Terje Mathisen - Thu, 20 May 2021 16:35 UTC

David Brown wrote:
> On 20/05/2021 14:58, Terje Mathisen wrote:
>> David Brown wrote:
>>> On 20/05/2021 13:45, Marcus wrote:
>
>>>>
>>>> That extra bit of range that you get with unsigned integers is usually
>>>> just a fallacy: if you need it, you're so close to the limit that you
>>>> should probably just double the integer size instead (e.g. go from
>>>> int32_t to int64_t).
>>>
>>> I work with microcontrollers, so that extra bit is often needed to
>>> correctly interact with hardware registers.  But I agree that it is rare
>>> in normal usage to have a need to store numbers bigger than about 2
>>> billion but never need to go above around 4 billion.
>>
>> Rather the opposite:
>>
>> Pretty much all large 32-bit programs needed to handle near-maximum
>> memory pools (i.e. ~3GB), and that meant doing it right with 32-bit ops,
>> since using 64-bit would be far slower.
>>
>
> Really?
>
> There are some programs that would want to push the limits and use as
> much space as possible, but surely they are not common. After all, when
> 32-bit cpus were the norm then 4 GB memory was rare.

Again, that is wrong: We had 4GB as the default memory on 32-bit systems
for a very long time (this was in the industrial/energy sector).
Remember that we had 32-bit Windows OS for many years after the CPUs
supported 64-bit.

>
> And the default user/kernel split on 32-bit Windows was 2GB/2GB, IIRC,
> so on that platform you could not normally access more than 2 GB anyway.

This is _definitely_ bogus: The only real 2/2 split happened due to unix
code that had been ported, and using all "negative" addresses returned
by malloc to indicate an allocation failure, instead of testing for the
exact (typically uint32_t (-1)) error value.

It was however common enough that you needed to set a flag in the EXE
header to tell the OS that your program did not suffer from this
particular bug and could safely be trusted with up to about 3.5 GB of
memory allocations.
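
The bug described above can be modelled without a 32-bit machine by treating
an address as a plain uint32_t. A sketch with hypothetical names: the buggy
test rejects every address whose top bit is set (what a signed view of the
pointer would call "negative"), while the exact test rejects only the single
sentinel value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The "(uint32_t)-1" failure sentinel mentioned in the post. */
#define ALLOC_FAILED UINT32_MAX

/* Buggy check: any address at or above 2 GiB has its top bit set, so a
   signed interpretation sees it as negative and reports failure. */
bool failed_signed_view(uint32_t addr)
{
    return addr >= UINT32_C(0x80000000);
}

/* Correct check: only the exact sentinel means failure. */
bool failed_exact(uint32_t addr)
{
    return addr == ALLOC_FAILED;
}

void demo_alloc_check(void)
{
    uint32_t high = UINT32_C(0xC0000000);   /* a valid block at 3 GiB */

    assert(failed_signed_view(high));       /* wrongly treated as failed */
    assert(!failed_exact(high));            /* correctly accepted        */
    assert(failed_exact(ALLOC_FAILED));     /* real failure still caught */
}
```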
>
> By the time you have a 64-bit processor and a program that needs lots of
> memory, you'll have more than 4 GB and you'll use 64-bit pointers and
> sizes in order to access large data blocks.
>
> I'd believe having 32-bit size_t values of more than 2 GB would be used
> sometimes, I just find it hard to imagine that it happens often.

Like I said, this happened all the time, particularly on Win2008 server
installations, but also on end-user PCs.
>
> But perhaps it's all a matter of the kind of programs involved.
>
Yeah. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
