Why separate 32-bit arithmetic on a 64-bit architecture?

<sso6aq$37b$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23092&group=comp.arch#23092

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 06:45:46 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sso6aq$37b$1@newsreader4.netcologne.de>
Injection-Date: Tue, 25 Jan 2022 06:45:46 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:238:0:7285:c2ff:fe6c:992d";
logging-data="3307"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 25 Jan 2022 06:45 UTC

Looking at Alpha, and also now at LoongArch, I find separate
instructions for 32-bit and 64-bit addition for a 64-bit
architecture. For example, Alpha has both addl (add longword)
and addq (add quadword).

What is not quite clear to me is the rationale behind this - a
64-bit addition is (blindingly obviously) just a 32-bit addition
when ignoring the high bits.
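
A minimal C sketch of the premise in the question, assuming two's-complement registers and wrapping arithmetic. The function names are made up for illustration. It shows that the low 32 bits of a 64-bit sum do not depend on the upper halves of the inputs; what the upper half of the result should contain is the part the rest of the thread is about.

```c
#include <stdint.h>

/* 32-bit addition; the cast makes the wrap modulo 2^32 explicit. */
uint32_t add32(uint32_t a, uint32_t b) {
    return (uint32_t)(a + b);
}

/* 64-bit addition on full registers, keeping only the low 32 bits. */
uint32_t add64_low32(uint64_t ra, uint64_t rb) {
    return (uint32_t)(ra + rb);
}
```

Calling add64_low32 with arbitrary upper halves gives the same value as add32 on the lower halves.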

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssoarg$jip$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23093&group=comp.arch#23093

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 02:02:54 -0600
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <ssoarg$jip$1@dont-email.me>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 25 Jan 2022 08:02:56 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="00ec3774dd4bc1698573d4078f83b292";
logging-data="20057"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+PIn8sXHusu+kwEYudj9+I"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:mTdj6bEguZSpRrMteejUH6FUffQ=
In-Reply-To: <sso6aq$37b$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: BGB - Tue, 25 Jan 2022 08:02 UTC

On 1/25/2022 12:45 AM, Thomas Koenig wrote:
> Looking at Alpha, and also now at LoongArch, I find separate
> instructions for 32-bit and 64-bit addition for a 64-bit
> architecture. For example, Alpha has both addl (add longword)
> and addq (add quadword).
>
> What is not quite clear to me is the rationale behind this - a
> 64-bit addition is (blindingly obviously) just a 32-bit addition
> when ignoring the high bits.

Can't answer for them, but I ended up with 3 ADD major variants in BJX2:
  ADD     //64-bit
  ADDS.L  //32-bit sign-extend
  ADDU.L  //32-bit zero-extend

Along with SUB variants (excluding Immediate forms, where SUB is encoded
as an ADD with a negative immediate).

Well, and "ADD Imm16s, Rn" is only available in 64-bit form, because the
dominant use case for this instruction is adjusting SP based on the size
of the stack-frame during function call/return. Normal arithmetic far
more often ends up needing an "ADDx Rm, Imm9, Rn" encoding or similar
(eg, something like "j=i+15;" tends to be more common than "j+=0x1234;"
or similar).

Why:
There are places in the ISA where there are *not* separate 32 and 64-bit
instructions;
Bad stuff can happen if "int" or "unsigned int" is allowed to go outside
of its nominal range (and the high bits are not ignored);
Sticking in a bunch of extra sign or zero extension instructions is
undesirable;
...

But, one doesn't need duplicate ops for AND/OR/XOR because, unlike ADD
or SUB, these do not produce out-of-range results when given in-range
inputs.
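
A small C sketch of that last point, assuming "in-range" means the 64-bit register holds the sign extension of a 32-bit int. The helper names is_sext32 and to_reg are hypothetical. Bitwise operations on two such values again yield such a value, while a plain 64-bit ADD can leave that form.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if the register value equals the sign extension of its low 32 bits,
   i.e. it holds an in-range "int". The unsigned-to-signed cast relies on
   the usual two's-complement conversion. */
bool is_sext32(uint64_t r) {
    return r == (uint64_t)(int64_t)(int32_t)(uint32_t)r;
}

/* Place a 32-bit int into a 64-bit register in sign-extended form. */
uint64_t to_reg(int32_t v) {
    return (uint64_t)(int64_t)v;
}
```

For example, with the registers holding INT32_MAX and 1, the results of &, | and ^ all pass is_sext32, but the 64-bit sum is 0x0000000080000000, which does not.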

...

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssobot$oqn$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23094&group=comp.arch#23094

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 09:18:36 +0100
Organization: A noiseless patient Spider
Lines: 49
Message-ID: <ssobot$oqn$1@dont-email.me>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<ssoarg$jip$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 25 Jan 2022 08:18:37 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8981ca717f4c56865f2b105b3b551e10";
logging-data="25431"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19iJUex1YlbuOM2+CQCn9QNIUaaGJhvsX4="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:3k23Wk/1x+wnqSISPIAziqXZweI=
In-Reply-To: <ssoarg$jip$1@dont-email.me>
Content-Language: en-GB
 by: David Brown - Tue, 25 Jan 2022 08:18 UTC

On 25/01/2022 09:02, BGB wrote:
> On 1/25/2022 12:45 AM, Thomas Koenig wrote:
>> Looking at Alpha, and also now at LoongArch, I find separate
>> instructions for 32-bit and 64-bit addition for a 64-bit
>> architecture.  For example, Alpha has both addl (add longword)
>> and addq (add quadword).
>>
>> What is not quite clear to me is the rationale behind this - a
>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>> when ignoring the high bits.
>
> Can't answer for them, but I ended up with 3 ADD major variants in BJX2:
>   ADD     //64-bit
>   ADDS.L  //32-bit sign-extend
>   ADDU.L  //32-bit zero-extend
>
>
> Along with SUB variants (excluding Immediate forms, where SUB is encoded
> as an ADD with a negative immediate).
>
> Well, and "ADD Imm16s, Rn" is only available in 64-bit form, because the
> dominant use case for this instruction is adjusting SP based on the size
> of the stack-frame during function call/return. Normal arithmetic far
> more often ends up needing an "ADDx Rm, Imm9, Rn" encoding or similar
> (eg, something like "j=i+15;" tends to be more common than "j+=0x1234;"
> or similar).
>
>
> Why:
> There are places in the ISA where there are *not* separate 32 and 64-bit
> instructions;
> Bad stuff can happen if "int" or "unsigned int" is allowed to go outside
> of its nominal range (and the high bits are not ignored);
> Sticking in a bunch of extra sign or zero extension instructions is
> undesirable;
> ...
>
> But, one doesn't need duplicate ops for AND/OR/XOR because, unlike ADD
> or SUB, these do not produce out-of-range results when given in-range
> inputs.
>

If I understand you correctly, you are saying it is not the addition
itself that is differentiated between 32-bit and 64-bit, but the way the
operands are treated and extended. That makes sense to me.

For smaller implementations, division (and possibly multiplication)
could be slower for larger operands, so having alternative sizes here
makes sense.

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssoghd$jrh$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23095&group=comp.arch#23095

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 03:39:55 -0600
Organization: A noiseless patient Spider
Lines: 72
Message-ID: <ssoghd$jrh$1@dont-email.me>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<ssoarg$jip$1@dont-email.me> <ssobot$oqn$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 25 Jan 2022 09:39:57 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="00ec3774dd4bc1698573d4078f83b292";
logging-data="20337"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19SGtEH0kGXVv1qQFlAtY7g"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:g2oP+zuZqSDM5iHmgcWUZdUDOrk=
In-Reply-To: <ssobot$oqn$1@dont-email.me>
Content-Language: en-US
 by: BGB - Tue, 25 Jan 2022 09:39 UTC

On 1/25/2022 2:18 AM, David Brown wrote:
> On 25/01/2022 09:02, BGB wrote:
>> On 1/25/2022 12:45 AM, Thomas Koenig wrote:
>>> Looking at Alpha, and also now at LoongArch, I find separate
>>> instructions for 32-bit and 64-bit addition for a 64-bit
>>> architecture.  For example, Alpha has both addl (add longword)
>>> and addq (add quadword).
>>>
>>> What is not quite clear to me is the rationale behind this - a
>>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>>> when ignoring the high bits.
>>
>> Can't answer for them, but I ended up with 3 ADD major variants in BJX2:
>>   ADD     //64-bit
>>   ADDS.L  //32-bit sign-extend
>>   ADDU.L  //32-bit zero-extend
>>
>>
>> Along with SUB variants (excluding Immediate forms, where SUB is encoded
>> as an ADD with a negative immediate).
>>
>> Well, and "ADD Imm16s, Rn" is only available in 64-bit form, because the
>> dominant use case for this instruction is adjusting SP based on the size
>> of the stack-frame during function call/return. Normal arithmetic far
>> more often ends up needing an "ADDx Rm, Imm9, Rn" encoding or similar
>> (eg, something like "j=i+15;" tends to be more common than "j+=0x1234;"
>> or similar).
>>
>>
>> Why:
>> There are places in the ISA where there are *not* separate 32 and 64-bit
>> instructions;
>> Bad stuff can happen if "int" or "unsigned int" is allowed to go outside
>> of its nominal range (and the high bits are not ignored);
>> Sticking in a bunch of extra sign or zero extension instructions is
>> undesirable;
>> ...
>>
>> But, one doesn't need duplicate ops for AND/OR/XOR because, unlike ADD
>> or SUB, these do not produce out-of-range results when given in-range
>> inputs.
>>
>
> If I understand you correctly, you are saying it is not the addition
> itself that is differentiated between 32-bit and 64-bit, but the way the
> operands are treated and extended. That makes sense to me.
>

Yeah, pretty much.

It is mostly a special case of sign or zero extending the result, which
doesn't cost that much (and can save clock cycles that might have
otherwise been spent on explicit sign or zero extensions).
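
As a rough C model of this, assuming the only difference from the 64-bit ADD is how the upper half of the result is filled: the function names adds_l and addu_l are borrowed from the mnemonics above, and the exact BJX2 semantics beyond "extend the result" are an assumption here.

```c
#include <stdint.h>

/* 32-bit add with the result sign-extended to 64 bits. */
uint64_t adds_l(uint64_t rm, uint64_t rn) {
    uint32_t low = (uint32_t)(rm + rn);        /* keep the low 32 bits */
    return (uint64_t)(int64_t)(int32_t)low;    /* replicate bit 31 upward */
}

/* 32-bit add with the result zero-extended to 64 bits. */
uint64_t addu_l(uint64_t rm, uint64_t rn) {
    return (uint64_t)(uint32_t)(rm + rn);      /* clear the upper 32 bits */
}
```

In both cases the adder itself is the ordinary 64-bit one; only the upper 32 result bits are replaced, which is why folding the extension into the instruction is cheap compared with issuing a separate extension afterwards.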

> For smaller implementations, division (and possibly multiplication)
> could be slower for larger operands, so having alternative sizes here
> makes sense.

In my case, there is only a 32-bit multiplier (in hardware) and no
divide instructions. This was mostly for cost reasons.

So, 64-bit multiply is done in software (via multiple 32-bit
multiplies). Integer division generally turns into a function call to a
"binary long division" loop (except for things like divide-by-constant
and similar).
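
A sketch in C of what such software routines can look like. This is not BGB's runtime code: it assumes a widening 32x32->64 unsigned multiply as the hardware primitive, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed hardware primitive: 32x32 -> 64-bit unsigned multiply. */
uint64_t mul32x32(uint32_t a, uint32_t b) {
    return (uint64_t)a * b;
}

/* Low 64 bits of a 64x64 product, built from three 32-bit multiplies.
   The ah*bh term only affects bits 64 and up, so it is dropped. */
uint64_t mul64_soft(uint64_t a, uint64_t b) {
    uint32_t al = (uint32_t)a, ah = (uint32_t)(a >> 32);
    uint32_t bl = (uint32_t)b, bh = (uint32_t)(b >> 32);
    uint64_t low   = mul32x32(al, bl);
    uint64_t cross = mul32x32(al, bh) + mul32x32(ah, bl); /* wrap is harmless */
    return low + (cross << 32);
}

/* Unsigned binary long division: one quotient bit per iteration.
   Returns the quotient and stores the remainder in *rem. */
uint64_t udiv64_soft(uint64_t n, uint64_t d, uint64_t *rem) {
    assert(d != 0);
    uint64_t q = 0, r = 0;
    for (int i = 63; i >= 0; i--) {
        r = (r << 1) | ((n >> i) & 1);   /* bring down the next dividend bit */
        if (r >= d) {                    /* divisor fits: subtract, set bit */
            r -= d;
            q |= 1ull << i;
        }
    }
    *rem = r;
    return q;
}
```

The division loop always runs 64 iterations here; a real routine would likely skip leading zero bits of the dividend first.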

For most things, this isn't a serious issue...

A 32-bit-only integer multiply is sufficient for maybe 95% of use-cases.

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<2022Jan25.103539@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=23096&group=comp.arch#23096

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 09:35:39 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 40
Distribution: world
Message-ID: <2022Jan25.103539@mips.complang.tuwien.ac.at>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
Injection-Info: reader02.eternal-september.org; posting-host="aa604b7d5db7b2c1d90eafe053592a2a";
logging-data="27450"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19anINbF6xO4xsIstENxpcO"
Cancel-Lock: sha1:5Kbbql/N1x89dNsqUpGAlRIYUgM=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 25 Jan 2022 09:35 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:
>What is not quite clear to me is the rationale behind this - a
>64-bit addition is (blindingly obviously) just a 32-bit addition
>when ignoring the high bits.

addl is also used for pure sign-extension on Alpha.

The rationale is unclear to me. Maybe they looked for ways to make
use of the 32 bits of an instruction and integrating a 32->64 bit sign
extension into every integer instruction seemed worthwhile to them.

You want the sign-extended rather than garbage-extended version when
performing address arithmetic, and probably also when passing 32-bit
values in the ABI (yes, IIRC Alpha even passes unsigned 32-bit values
in sign-extended form).

For comparison:

ARM A64 has addressing modes that zero-extend or sign-extend one of
the involved registers. Not sure how many 32-bit instructions they
have.

RV64I has ADDIW, ADDW, SUBW, SLLIW, and SLLW (plus SRLIW, SRLW, SRAIW,
SRAW but here the result is not just the same as the 64-bit variant
followed by sign extension). I find sign extension pretty perverse
for SRLW and SRLIW, but for shift amounts >0 sign-extension and zero
extension have the same result anyway.
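
To illustrate the remark about shift amounts, here is a C model of a 32-bit logical right shift with a sign-extended result (the SRLW behaviour) next to a hypothetical zero-extending variant. The function names are invented for this sketch.

```c
#include <stdint.h>

/* Logical right shift of the low 32 bits by the low 5 bits of rs2,
   result sign-extended to 64 bits. */
uint64_t srlw_sext(uint64_t rs1, uint64_t rs2) {
    uint32_t r = (uint32_t)rs1 >> (unsigned)(rs2 & 31);
    return (uint64_t)(int64_t)(int32_t)r;
}

/* Same shift, but with a zero-extended result (for comparison only). */
uint64_t srlw_zext(uint64_t rs1, uint64_t rs2) {
    uint32_t r = (uint32_t)rs1 >> (unsigned)(rs2 & 31);
    return (uint64_t)r;
}
```

Any nonzero shift amount clears bit 31 of the 32-bit result, so the two functions agree; they differ only for a shift amount of 0 with bit 31 of the input set.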

Given RISC-V's dogma of using instruction fusion instead of adding
more instructions, the five instructions that could be replaced by the
64-bit variant followed by a sign-extension instruction are a
surprise. And the shift-right instructions could be replaced by a
sign-extension or zero-extension instruction followed by the 64-bit
shift instruction, followed for SRA by a sign-extension instruction
(if the shift amount can be =0).
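
The replacement sequences can be modelled in C as follows. This is only a sketch: the function names are hypothetical, the shift amount is assumed to be in 0..31 already (a 64-bit shift would otherwise look at six bits of it), and right-shifting a negative signed value is assumed to be an arithmetic shift, as on mainstream compilers. In this sketch the trailing sign extension sits on the logical right shift, where it changes the result only for a shift amount of 0.

```c
#include <stdint.h>

/* Stand-ins for separate sign- and zero-extension instructions. */
uint64_t sext_w(uint64_t x) { return (uint64_t)(int64_t)(int32_t)(uint32_t)x; }
uint64_t zext_w(uint64_t x) { return (uint64_t)(uint32_t)x; }

/* Reference behaviour of the 32-bit ("W") operations. */
uint64_t addw(uint64_t a, uint64_t b) {
    return sext_w((uint32_t)a + (uint32_t)b);
}
uint64_t srlw(uint64_t a, uint64_t sh) {
    return sext_w((uint32_t)a >> (unsigned)(sh & 31));
}
uint64_t sraw(uint64_t a, uint64_t sh) {
    return (uint64_t)(int64_t)((int32_t)(uint32_t)a >> (unsigned)(sh & 31));
}

/* 64-bit add, then sign-extend the result. */
uint64_t addw_seq(uint64_t a, uint64_t b) {
    return sext_w(a + b);
}
/* Zero-extend the input, 64-bit logical shift, then sign-extend. */
uint64_t srlw_seq(uint64_t a, uint64_t sh) {
    return sext_w(zext_w(a) >> (unsigned)(sh & 31));
}
/* Sign-extend the input, then 64-bit arithmetic shift. */
uint64_t sraw_seq(uint64_t a, uint64_t sh) {
    return (uint64_t)((int64_t)sext_w(a) >> (unsigned)(sh & 31));
}
```

Each *_seq function gives the same result as its reference counterpart for any register contents, including garbage in the upper 32 bits of the inputs.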

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<2022Jan25.110002@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=23097&group=comp.arch#23097

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 10:00:02 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 38
Message-ID: <2022Jan25.110002@mips.complang.tuwien.ac.at>
References: <sso6aq$37b$1@newsreader4.netcologne.de> <ssoarg$jip$1@dont-email.me> <ssobot$oqn$1@dont-email.me>
Injection-Info: reader02.eternal-september.org; posting-host="aa604b7d5db7b2c1d90eafe053592a2a";
logging-data="27450"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Y1xoFuQ+vh/7skTcIPdy7"
Cancel-Lock: sha1:d1NA+6Ft7TO60Oh+3IQhM5JCV4E=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 25 Jan 2022 10:00 UTC

David Brown <david.brown@hesbynett.no> writes:
>If I understand you correctly, you are saying it is not the addition
>itself that is differentiated between 32-bit and 64-bit, but the way the
>operands are treated and extended.

On Alpha, only the results. They are sign-extended.

>For smaller implementations, division (and possibly multiplication)
>could be slower for larger operands, so having alternative sizes here
>makes sense.

Alpha does not have an integer division instruction.

For integer division, the 32-bit variant is not equivalent to the
64-bit variant in the lower 32 bits (unlike for addition, subtraction,
multiplication, shift left, and, or, xor). You can get the same
result by first zero-extending or sign-extending the operands, and
then performing the 64-bit division.
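
A C sketch of this, with hypothetical function names and a nonzero divisor assumed. The "naive" version divides the full registers and keeps the low 32 bits, which goes wrong as soon as the upper halves contain anything; the other two extend the operands first.

```c
#include <stdint.h>

/* Divides the full 64-bit registers, keeps the low 32 bits.
   Not a correct 32-bit division if the upper halves are garbage. */
uint32_t udiv32_naive(uint64_t ra, uint64_t rb) {
    return (uint32_t)(ra / rb);
}

/* Unsigned 32-bit division: zero-extend both operands, divide in 64 bits. */
uint32_t udiv32_via64(uint64_t ra, uint64_t rb) {
    uint64_t a = (uint32_t)ra;
    uint64_t b = (uint32_t)rb;   /* low 32 bits must be nonzero */
    return (uint32_t)(a / b);
}

/* Signed 32-bit division: sign-extend both operands, divide in 64 bits.
   INT32_MIN / -1 is excluded here; its quotient does not fit in 32 bits. */
int32_t sdiv32_via64(uint64_t ra, uint64_t rb) {
    int64_t a = (int32_t)(uint32_t)ra;
    int64_t b = (int32_t)(uint32_t)rb;
    return (int32_t)(a / b);
}
```

With ra = 0x100000009 and rb = 3, the low halves are 9 and 3, so the 32-bit quotient is 3, but the naive version returns 0x55555558.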

As for performance, on at least some Intel and some ARM division
implementations, performance depends on the dynamic values, not on the
static instruction used. On Cortex-A53 and A73, the time seems to
depend on the size of the result (what is the highest bit set in the
result); on Intel (at least Skylake), 128/64 division has a step
function: If the dividend's higher half is just the zero extension
(for unsigned division) or sign extension (for signed division) of the
lower half, it takes ~26 cycles, otherwise IIRC ~80 cycles.

A cool algorithm for implementing floored division on top of unsigned
division runs afoul of this whenever you pass it a negative dividend:
the unsigned division then needs ~80 cycles even if you just divide -7
by 3, whereas the more pedestrian algorithm based on the signed
division instruction does not invoke the slow case for such operands.
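
For reference, one common way to get floored division from a truncating signed division looks like this in C. It is only a sketch of the "pedestrian" approach and not necessarily the formulation meant above; the function name is made up, and d != 0 and no INT64_MIN / -1 are assumed.

```c
#include <stdint.h>

/* Floored division (quotient rounded toward negative infinity) built on
   C's truncating signed division. */
int64_t floored_div(int64_t n, int64_t d) {
    int64_t q = n / d;          /* truncates toward zero */
    int64_t r = n % d;          /* remainder has the sign of n */
    if (r != 0 && ((r < 0) != (d < 0))) {
        q -= 1;                 /* inexact and signs differ: round down */
    }
    return q;
}
```

For -7 divided by 3 the truncated quotient is -2 with remainder -1, so the adjustment yields -3.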

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<2022Jan25.114951@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=23098&group=comp.arch#23098

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 10:49:51 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 16
Message-ID: <2022Jan25.114951@mips.complang.tuwien.ac.at>
References: <sso6aq$37b$1@newsreader4.netcologne.de> <ssoarg$jip$1@dont-email.me> <ssobot$oqn$1@dont-email.me> <2022Jan25.110002@mips.complang.tuwien.ac.at>
Injection-Info: reader02.eternal-september.org; posting-host="aa604b7d5db7b2c1d90eafe053592a2a";
logging-data="27450"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/bnhRj07tU7XVfy9zk+KWr"
Cancel-Lock: sha1:u6vjkmCH/+mhGaBx4pAxFYBhLx8=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 25 Jan 2022 10:49 UTC

anton@mips.complang.tuwien.ac.at (Anton Ertl) writes:
>For integer division, the 32-bit variant is not equivalent to the
>64-bit variant in the lower 32 bits (unlike for addition, subtraction,
>multiplication, shift left, and, or, xor). You can get the same
>result by first zero-extending or sign-extending the operands, and
>then performing the 64-bit division.

One special case is minint/-1. If the 64-bit instruction traps in
that case or produces a special result, and you want the same
behaviour for the 32-bit operation, some additional twists may be
necessary.
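
A short C sketch of why this case needs attention, with hypothetical function names. When the 32-bit operands are sign-extended and divided in 64 bits, minint/-1 neither traps nor overflows; it quietly produces +2^31, which does not fit in a 32-bit int, so the case has to be detected separately if trap-like behaviour is wanted.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit signed division carried out in 64 bits (b must be nonzero). */
int64_t sdiv32_wide(int32_t a, int32_t b) {
    return (int64_t)a / (int64_t)b;
}

/* The one operand pair whose quotient is out of range for int32_t. */
bool sdiv32_overflows(int32_t a, int32_t b) {
    return a == INT32_MIN && b == -1;
}
```

What to do once sdiv32_overflows reports true (trap, saturate, wrap) is a policy decision this sketch leaves open.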

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssp7js$ns5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23100&group=comp.arch#23100

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 10:13:47 -0600
Organization: A noiseless patient Spider
Lines: 76
Message-ID: <ssp7js$ns5$1@dont-email.me>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<2022Jan25.103539@mips.complang.tuwien.ac.at>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 25 Jan 2022 16:13:49 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="00ec3774dd4bc1698573d4078f83b292";
logging-data="24453"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19CrIifwazdiTvdfWLGf6Us"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:XEUszjZe9tSbTGAY7EdZWttT1Rs=
In-Reply-To: <2022Jan25.103539@mips.complang.tuwien.ac.at>
Content-Language: en-US
 by: BGB - Tue, 25 Jan 2022 16:13 UTC

On 1/25/2022 3:35 AM, Anton Ertl wrote:
> Thomas Koenig <tkoenig@netcologne.de> writes:
>> What is not quite clear to me is the rationale behind this - a
>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>> when ignoring the high bits.
>
> addl is also used for pure sign-extension on Alpha.
>
> The rationale is unclear to me. Maybe they looked for ways to make
> use of the 32 bits of an instruction and integrating a 32->64 bit sign
> extension into every integer instruction seemed worthwhile to them.
>
> You want the sign-extended rather than garbage-extended version when
> performing address arithmetic, and probably also when passing 32-bit
> values in the ABI (yes, IIRC Alpha even passes unsigned 32-bit values
> in sign-extended form).
>
> For comparison:
>
> ARM A64 has addressing modes that zero-extend or sign-extend one of
> the involved registers. Not sure how many 32-bit instructions they
> have.
>

This sort of thing is the alternative.

If every instruction is capable of ignoring the high bits, then one does
not need sign or zero extension.

Otherwise, having a few sign- and zero-extending forms is useful.

Going the "neither" route means likely manually sign or zero extending
values either on the output from arithmetic expressions, or on the
inputs to instructions which do not ignore the high bits.

I was tempted at one point to go with defining all of the memory access
instructions as using 32-bit displacements (and ignoring the high order
bits), but had instead ended up with 33-bits (to match the 24+9 jumbo
displacements, and also the pipelines' use of a 33-bit value for passing
immediate values).

Though, on the positive side, requiring the displacements to be (either)
sign or zero extended (based on whether it is signed or unsigned), does
leave open the option of later expanding displacements to 48 bits (this
option would be closed off if these were defined in terms of ignoring
the high order bits).

> RV64I has ADDIW, ADDW, SUBW, SLLIW, and SLLW (plus SRLIW, SRLW, SRAIW,
> SRAW but here the result is not just the same as the 64-bit variant
> followed by sign extension). I find sign extension pretty perverse
> for SRLW and SRLIW, but for shift amounts >0 sign-extension and zero
> extension have the same result anyway.
>

Yeah. BJX2 is similar, just with a different naming convention here.

For shifts, one also has to (internally) sign or zero extend the input
values to get the expected results (so, in effect, the shift
instructions are double-extended).

> Given RISC-V's dogma of using instruction fusion instead of adding
> more instructions, the five instructions that could be replaced by the
> 64-bit variant followed by a sign-extension instruction are a
> surprise. And the shift-right instructions could be replaced by a
> sign-extension or zero-extension instruction followed by the 64-bit
> shift instruction, followed for SRA by a sign-extension instruction
> (if the shift amount can be =0).
>

Dunno though, but it could be helpful for implementations which don't
perform instruction fusion in this case.

....

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<06d93522-9312-4c73-8c4f-8fc29e305b81n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23103&group=comp.arch#23103

Newsgroups: comp.arch
X-Received: by 2002:a37:9e8e:: with SMTP id h136mr6479419qke.561.1643133872339;
Tue, 25 Jan 2022 10:04:32 -0800 (PST)
X-Received: by 2002:a9d:6254:: with SMTP id i20mr2805405otk.94.1643133872039;
Tue, 25 Jan 2022 10:04:32 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!1.us.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 25 Jan 2022 10:04:31 -0800 (PST)
In-Reply-To: <sso6aq$37b$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:8cce:80ce:44ea:2dbd;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:8cce:80ce:44ea:2dbd
References: <sso6aq$37b$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <06d93522-9312-4c73-8c4f-8fc29e305b81n@googlegroups.com>
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 25 Jan 2022 18:04:32 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 20
 by: MitchAlsup - Tue, 25 Jan 2022 18:04 UTC

On Tuesday, January 25, 2022 at 12:45:50 AM UTC-6, Thomas Koenig wrote:
> Looking at Alpha, and also now at LoongArch, I find separate
> instructions for 32-bit and 64-bit addition for a 64-bit
> architecture. For example, Alpha has both addl (add longword)
> and addq (add quadword).
>
> What is not quite clear to me is the rationale behind this - a
> 64-bit addition is (blindingly obviously) just a 32-bit addition
> when ignoring the high bits.
<
For My 66000 architecture, arithmetic is only register size (except
an accommodation for single precision floats).
<
On the integer side: if one wants a particular number of bits after
arithmetic has been performed, I have both signed and unsigned
versions of extract (any number of bits from 1..64) not just std
sizes.
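
A generic signed/unsigned bit-field extract of the kind described above can be sketched in C as follows. This is only an illustration: the function names and argument order are assumptions and say nothing about the actual My 66000 encoding. The caller is assumed to pass offset < 64, width in 1..64, and offset + width <= 64.

```c
#include <stdint.h>

/* Unsigned extract: `width` bits starting at bit `offset`, zero-extended. */
uint64_t extract_unsigned(uint64_t value, unsigned offset, unsigned width) {
    uint64_t shifted = value >> offset;
    if (width >= 64) {
        return shifted;                 /* whole register, nothing to mask */
    }
    return shifted & ((UINT64_C(1) << width) - 1);
}

/* Signed extract: the same field, sign-extended from its top bit. */
int64_t extract_signed(uint64_t value, unsigned offset, unsigned width) {
    uint64_t field = extract_unsigned(value, offset, width);
    if (width >= 64) {
        return (int64_t)field;
    }
    uint64_t sign = UINT64_C(1) << (width - 1);
    /* (field XOR sign) - sign propagates the field's top bit upward. */
    return (int64_t)((field ^ sign) - sign);
}
```

With width = 32 and offset = 0 these reduce to the usual zero and sign extension of a 32-bit result; other widths cover non-standard sizes.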
<
On the floating point side: There is a full complement (49) of insts
that convert one {S, U, F, D} to the other with any of the rounding
modes.

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<UPXHJ.6202$9O.4300@fx12.iad>

https://www.novabbs.com/devel/article-flat.php?id=23106&group=comp.arch#23106

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx12.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
References: <sso6aq$37b$1@newsreader4.netcologne.de>
In-Reply-To: <sso6aq$37b$1@newsreader4.netcologne.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 22
Message-ID: <UPXHJ.6202$9O.4300@fx12.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Tue, 25 Jan 2022 19:03:16 UTC
Date: Tue, 25 Jan 2022 14:03:00 -0500
X-Received-Bytes: 1630
 by: EricP - Tue, 25 Jan 2022 19:03 UTC

Thomas Koenig wrote:
> Looking at Alpha, and also now at LoongArch, I find separate
> instructions for 32-bit and 64-bit addition for a 64-bit
> architecture. For example, Alpha has both addl (add longword)
> and addq (add quadword).
>
> What is not quite clear to me is the rationale behind this - a
> 64-bit addition is (blindingly obviously) just a 32-bit addition
> when ignoring the high bits.

Alpha was porting a lot of legacy 32-bit VAX code to 64-bits.
ADDL gives similar signed wrapping/overflow behavior when
comparing to 32-bit constants and load-with-sign-extend values.

I would imagine that code analysis showed that without ADDL
there were lots of sign extend instructions.
Alpha didn't have SEXT/ZEXT sign/zero extend instructions
so it would have required a pair of shifts.

MIPS probably faced similar situation going from MIPS32 to MIPS64.
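
The "pair of shifts" mentioned above can be written out in C as a minimal sketch. The function names are invented for illustration, and the code assumes that right-shifting a negative signed value is an arithmetic shift, which holds on mainstream compilers but is implementation-defined in older C standards.

```c
#include <stdint.h>

/* Sign-extend the low 32 bits: shift left by 32, then arithmetic shift right by 32. */
uint64_t sext32_by_shifts(uint64_t r) {
    return (uint64_t)((int64_t)(r << 32) >> 32);
}

/* Zero-extend the low 32 bits: shift left by 32, then logical shift right by 32. */
uint64_t zext32_by_shifts(uint64_t r) {
    return (r << 32) >> 32;
}
```

Every 32-bit result that later feeds a 64-bit compare would need such a two-instruction sequence if no 32-bit add or dedicated extension instruction were available.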

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssplm7$1sv$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23110&group=comp.arch#23110

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 20:13:59 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ssplm7$1sv$1@newsreader4.netcologne.de>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<UPXHJ.6202$9O.4300@fx12.iad>
Injection-Date: Tue, 25 Jan 2022 20:13:59 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:238:0:7285:c2ff:fe6c:992d";
logging-data="1951"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 25 Jan 2022 20:13 UTC

EricP <ThatWouldBeTelling@thevillage.com> schrieb:
> Thomas Koenig wrote:
>> Looking at Alpha, and also now at LoongArch, I find separate
>> instructions for 32-bit and 64-bit addition for a 64-bit
>> architecture. For example, Alpha has both addl (add longword)
>> and addq (add quadword).
>>
>> What is not quite clear to me is the rationale behind this - a
>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>> when ignoring the high bits.
>
> Alpha was porting a lot of legacy 32-bit VAX code to 64-bits.
> ADDL gives similar signed wrapping/overflow behavior when
> comparing to 32-bit constants and load-with-sign-extend values.

> I would imagine that code analysis showed that without ADDL
> there were lots of sign extend instructions.

That sounds like a good explanation.

IIRC, Alpha also used their CALL_PAL instruction to emulate complex
VAX instructions like POLY, at least on VMS, so assembler programs
were somewhat easier to port.

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<sspnd8$3d6$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23111&group=comp.arch#23111

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 20:43:20 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sspnd8$3d6$1@newsreader4.netcologne.de>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<UPXHJ.6202$9O.4300@fx12.iad> <ssplm7$1sv$1@newsreader4.netcologne.de>
Injection-Date: Tue, 25 Jan 2022 20:43:20 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:238:0:7285:c2ff:fe6c:992d";
logging-data="3494"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 25 Jan 2022 20:43 UTC

Thomas Koenig <tkoenig@netcologne.de> schrieb:
> EricP <ThatWouldBeTelling@thevillage.com> schrieb:
>> Thomas Koenig wrote:
>>> Looking at Alpha, and also now at LoongArch, I find separate
>>> instructions for 32-bit and 64-bit addition for a 64-bit
>>> architecture. For example, Alpha has both addl (add longword)
>>> and addq (add quadword).
>>>
>>> What is not quite clear to me is the rationale behind this - a
>>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>>> when ignoring the high bits.
>>
>> Alpha was porting a lot of legacy 32-bit VAX code to 64-bits.
>> ADDL gives similar signed wrapping/overflow behavior when
>> comparing to 32-bit constants and load-with-sign-extend values.
>
>> I would imagine that code analysis showed that without ADDL
>> there were lots of sign extend instructions.
>
> That sounds like a good explanation.
>
> IIRC, Alpha also used their CALL_PAL instruction to emulate complex
> VAX instructions like POLY, at least on VMS, so assembler programs
> were somewhat easier to port.

Before somebody corrects me, let me do it myself: POLY wasn't
in there, but the queue instructions like INSQUEL were.

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<tJZHJ.10626$8Q.353@fx19.iad>

https://www.novabbs.com/devel/article-flat.php?id=23112&group=comp.arch#23112

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!news.uzoreto.com!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx19.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
References: <sso6aq$37b$1@newsreader4.netcologne.de> <UPXHJ.6202$9O.4300@fx12.iad> <ssplm7$1sv$1@newsreader4.netcologne.de> <sspnd8$3d6$1@newsreader4.netcologne.de>
In-Reply-To: <sspnd8$3d6$1@newsreader4.netcologne.de>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 35
Message-ID: <tJZHJ.10626$8Q.353@fx19.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Tue, 25 Jan 2022 21:12:57 UTC
Date: Tue, 25 Jan 2022 16:12:43 -0500
X-Received-Bytes: 2341
 by: EricP - Tue, 25 Jan 2022 21:12 UTC

Thomas Koenig wrote:
> Thomas Koenig <tkoenig@netcologne.de> schrieb:
>> EricP <ThatWouldBeTelling@thevillage.com> schrieb:
>>> Thomas Koenig wrote:
>>>> Looking at Alpha, and also now at LoongArch, I find separate
>>>> instructions for 32-bit and 64-bit addition for a 64-bit
>>>> architecture. For example, Alpha has both addl (add longword)
>>>> and addq (add quadword).
>>>>
>>>> What is not quite clear to me is the rationale behind this - a
>>>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>>>> when ignoring the high bits.
>>> Alpha was porting a lot of legacy 32-bit VAX code to 64-bits.
>>> ADDL gives similar signed wrapping/overflow behavior when
>>> comparing to 32-bit constants and load-with-sign-extend values.
>>> I would imagine that code analysis showed that without ADDL
>>> there were lots of sign extend instructions.
>> That sounds like a good explanation.
>>
>> IIRC, Alpha also used their CALL_PAL instruction to emulate complex
>> VAX instructions like POLY, at least on VMS, so assembler programs
>> were somewhat easier to port.
>
> Before somebody corrects me, let me do it myself: POLY wasn't
> in there, but the queue instructions like INSQUEL were.

Apparently the various Vaxen buggered up POLY so badly
that they had to remove it from VAX HW too and use emulation.

"How the VAX Lost Its POLY (and EMOD and ACB_floating too)", by Bob Supnik
http://simh.trailing-edge.com/docs/vax_poly.pdf

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<cfc24dc2-495b-409b-a52c-a9a942d745b7n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23114&group=comp.arch#23114

Newsgroups: comp.arch
X-Received: by 2002:a05:622a:14c8:: with SMTP id u8mr18272520qtx.197.1643147335932;
Tue, 25 Jan 2022 13:48:55 -0800 (PST)
X-Received: by 2002:a05:6808:1155:: with SMTP id u21mr157657oiu.133.1643147335565;
Tue, 25 Jan 2022 13:48:55 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!1.us.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 25 Jan 2022 13:48:55 -0800 (PST)
In-Reply-To: <sso6aq$37b$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:2d69:e22:a021:ec33;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:2d69:e22:a021:ec33
References: <sso6aq$37b$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <cfc24dc2-495b-409b-a52c-a9a942d745b7n@googlegroups.com>
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Tue, 25 Jan 2022 21:48:55 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 15
 by: Quadibloc - Tue, 25 Jan 2022 21:48 UTC

On Monday, January 24, 2022 at 11:45:50 PM UTC-7, Thomas Koenig wrote:
> Looking at Alpha, and also now at LoongArch, I find separate
> instructions for 32-bit and 64-bit addition for a 64-bit
> architecture. For example, Alpha has both addl (add longword)
> and addq (add quadword).
>
> What is not quite clear to me is the rationale behind this - a
> 64-bit addition is (blindingly obviously) just a 32-bit addition
> when ignoring the high bits.

Surely it's obvious. One can obtain the result of a 32-bit addition
in fewer nanoseconds than one can obtain the result of a 64-bit
addition. Therefore, one wants to have a faster instruction available
in order to get the answer one needs sooner.

John Savard

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<jwvlez3zc7i.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=23115&group=comp.arch#23115

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 17:26:19 -0500
Organization: A noiseless patient Spider
Lines: 17
Message-ID: <jwvlez3zc7i.fsf-monnier+comp.arch@gnu.org>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<cfc24dc2-495b-409b-a52c-a9a942d745b7n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="ca3be2731a3224b19f4b8cf593c70e1b";
logging-data="20877"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18J4NoIzNojbRRFgssOoXck"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Cancel-Lock: sha1:dP3MBuqroQ7Cmz0rV0sMS3MfZJM=
sha1:KK5psjRLpZDHdvTb6Yp00Ad7/ak=
 by: Stefan Monnier - Tue, 25 Jan 2022 22:26 UTC

Quadibloc [2022-01-25 13:48:55] wrote:
> On Monday, January 24, 2022 at 11:45:50 PM UTC-7, Thomas Koenig wrote:
>> What is not quite clear to me is the rationale behind this - a
>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>> when ignoring the high bits.
> Surely it's obvious. One can obtain the result of a 32-bit addition
> in fewer nanoseconds than one can obtain the result of a 64-bit
> addition. Therefore, one wants to have a faster instruction available
> in order to get the answer one needs sooner.

Hmm... I'm not aware of a processor that offers a 64bit addition
instruction and whose latency is higher than 1 cycle (at least not in
the Alpha family, don't know about Loongson), so the 32bit
addition is not faster :-(

Stefan

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<2022Jan25.235313@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=23116&group=comp.arch#23116

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 22:53:13 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 18
Message-ID: <2022Jan25.235313@mips.complang.tuwien.ac.at>
References: <sso6aq$37b$1@newsreader4.netcologne.de> <UPXHJ.6202$9O.4300@fx12.iad>
Injection-Info: reader02.eternal-september.org; posting-host="aa604b7d5db7b2c1d90eafe053592a2a";
logging-data="27259"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19cODC3OkemoimwGism50/2"
Cancel-Lock: sha1:esmKY+waD//4KY57/Z2H0joj5po=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 25 Jan 2022 22:53 UTC

EricP <ThatWouldBeTelling@thevillage.com> writes:
>Alpha didn't have SEXT/ZEXT sign/zero extend instructions
>so it would have required a pair of shifts.

Alpha uses addl for sign extension and some zap instruction for zero
extension. If they had not added addl, they could have added an sext
instruction instead.
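
The two idioms mentioned above can be modelled in C roughly as follows: an ADDL-style add with a zero operand acts as a sign extension, and a ZAPNOT-style byte mask with the low four bytes selected acts as a zero extension. The function names are hypothetical and the code is a behavioural sketch only.

```c
#include <stdint.h>

/* ADDL-like: add, keep the low 32 bits of the sum, sign-extend to 64 bits.
   Adding zero therefore sign-extends the low 32 bits of `a`. */
uint64_t addl_sketch(uint64_t a, uint64_t b) {
    uint32_t sum = (uint32_t)(a + b);
    return (uint64_t)(int64_t)(int32_t)sum;
}

/* ZAPNOT-like: keep byte i of `value` when bit i of `byte_mask` is set,
   clear it otherwise. A mask of 0x0F keeps the low four bytes. */
uint64_t zapnot_sketch(uint64_t value, unsigned byte_mask) {
    uint64_t result = 0;
    for (unsigned i = 0; i < 8; i++) {
        if (byte_mask & (1u << i)) {
            result |= value & (UINT64_C(0xFF) << (8 * i));
        }
    }
    return result;
}
```

Usage: addl_sketch(x, 0) gives the sign-extended low word of x, and zapnot_sketch(x, 0x0F) gives the zero-extended low word.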

>MIPS probably faced similar situation going from MIPS32 to MIPS64.

MIPS already had a 32-bit software base, and AFAIK wanted to have (and
succeeded in having) a modeless extension to 64 bits. Which makes
RISC-V's moded approach quite surprising.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<49f8f03b-8859-4898-8741-ea10fa0b1adan@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23117&group=comp.arch#23117

Newsgroups: comp.arch
X-Received: by 2002:ad4:5968:: with SMTP id eq8mr3651136qvb.68.1643153486503;
Tue, 25 Jan 2022 15:31:26 -0800 (PST)
X-Received: by 2002:aca:f102:: with SMTP id p2mr1950048oih.325.1643153486277;
Tue, 25 Jan 2022 15:31:26 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 25 Jan 2022 15:31:26 -0800 (PST)
In-Reply-To: <jwvlez3zc7i.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:90e4:4827:3d9d:a2f8;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:90e4:4827:3d9d:a2f8
References: <sso6aq$37b$1@newsreader4.netcologne.de> <cfc24dc2-495b-409b-a52c-a9a942d745b7n@googlegroups.com>
<jwvlez3zc7i.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <49f8f03b-8859-4898-8741-ea10fa0b1adan@googlegroups.com>
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 25 Jan 2022 23:31:26 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 31
 by: MitchAlsup - Tue, 25 Jan 2022 23:31 UTC

On Tuesday, January 25, 2022 at 4:26:22 PM UTC-6, Stefan Monnier wrote:
> Quadibloc [2022-01-25 13:48:55] wrote:
> > On Monday, January 24, 2022 at 11:45:50 PM UTC-7, Thomas Koenig wrote:
> >> What is not quite clear to me is the rationale behind this - a
> >> 64-bit addition is (blindingly obviously) just a 32-bit addition
> >> when ignoring the high bits.
> > Surely it's obvious. One can obtain the result of a 32-bit addition
> > in fewer nanoseconds than one can obtain the result of a 64-bit
> > addition. Therefore, one wants to have a faster instruction available
> > in order to get the answer one needs sooner.
<
> Hmm... I'm not aware of a processor that offers a 64bit addition
> instruction and whose latency is higher than 1 cycle (at least not in
> the Alpha family, don't know about Loongson), so the 32bit
> addition is not faster :-(
<
64-bit addition is 11-gates of delay
32-bit addition is 9 gates of delay
Using Carry Select Adder technology (like Alpha).
<
For your typical 16-gate per cycle design point this makes no difference
whatsoever. For the 8-9 gate cycle of P4 it does--but not as you would expect:
<
Let us postulate a 9-gate cycle time, and a 9-gate adder. You still do not have
time to perform forwarding, so you still do not get back-to-back additions!
Logic cannot take more than 3/4 of a cycle for back-to-back calculations
in a 1-wide machine (forwarding from 3 places) and cannot take more than
5/8 of a cycle for a 3-wide superscalar design point (more delay in forwarding
due to fan-in).
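
The arithmetic behind the 9-gate example can be checked with a few lines of C. The gate counts and the 3/4 and 5/8 fractions are taken from the text above; the function names are made up, and treating the budget as a simple product is a simplifying assumption.

```c
/* Gate delays left for logic when only num/den of the cycle is usable
   (the rest goes to forwarding). */
double logic_budget_gates(double gates_per_cycle, double num, double den) {
    return gates_per_cycle * num / den;
}

/* Nonzero if an adder of `adder_gates` delay fits in that budget,
   i.e. back-to-back dependent additions are possible. */
int fits_back_to_back(double adder_gates, double gates_per_cycle,
                      double num, double den) {
    return adder_gates <= logic_budget_gates(gates_per_cycle, num, den);
}
```

For a 9-gate cycle the budget is 6.75 gates at 3/4 and 5.625 gates at 5/8, so under these numbers the 9-gate 32-bit adder does not fit in either case.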
>
>
> Stefan

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssq34l$t88$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23119&group=comp.arch#23119

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Tue, 25 Jan 2022 16:03:32 -0800
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <ssq34l$t88$1@dont-email.me>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<cfc24dc2-495b-409b-a52c-a9a942d745b7n@googlegroups.com>
<jwvlez3zc7i.fsf-monnier+comp.arch@gnu.org>
<49f8f03b-8859-4898-8741-ea10fa0b1adan@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 26 Jan 2022 00:03:33 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="cd57e7c70d5a7bb44bf64aeacecac6e7";
logging-data="29960"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19jovgpOw3i6axwmbB6Cyu4"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:jIIRMdL+q3h0Yom0uCQ/bPm729Y=
In-Reply-To: <49f8f03b-8859-4898-8741-ea10fa0b1adan@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Wed, 26 Jan 2022 00:03 UTC

On 1/25/2022 3:31 PM, MitchAlsup wrote:
> On Tuesday, January 25, 2022 at 4:26:22 PM UTC-6, Stefan Monnier wrote:
>> Quadibloc [2022-01-25 13:48:55] wrote:
>>> On Monday, January 24, 2022 at 11:45:50 PM UTC-7, Thomas Koenig wrote:
>>>> What is not quite clear to me is the rationale behind this - a
>>>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>>>> when ignoring the high bits.
>>> Surely it's obvious. One can obtain the result of a 32-bit addition
>>> in fewer nanoseconds than one can obtain the result of a 64-bit
>>> addition. Therefore, one wants to have a faster instruction available
>>> in order to get the answer one needs sooner.
> <
>> Hmm... I'm not aware of a processor that offers a 64bit addition
>> instruction and whose latency is higher than 1 cycle (at least not in
>> the Alpha family, don't know about Loongson), so the 32bit
>> addition is not faster :-(
> <
> 64-bit addition is 11-gates of delay
> 32-bit addition is 9 gates of delay
> Using Carry Select Adder technology (like Alpha).
> <
> For your typical 16-gate per cycle design point this makes no difference
> whatsoever. For the 8-9 gate cycle of P4 it does--but not as you would expect:
> <
> Let us postulate a 9-gate cycle time, and a 9-gate adder. You still do not have
> time to perform forwarding, so you still do not get back-to-back additions!
> Logic cannot take more than 3/4 of a cycle for back-to-back calculations
> in a 1-wide machine (forwarding from 3 places) and cannot take more than
> 5/8 of a cycle for a 3-wide superscalar design point (more delay in forwarding
> due to fan-in).

This is almost exactly the logic we follow in deciding what the clock
rate should be. The 1-lat crossbar is like your forwarding, only wider
(8+3 on Silver), and that has to fit into whatever is left after however
many gates are needed for the integer add of the chosen size. We wound
up setting the crosspoint at 32 bits for Silver: w-width add is one
cycle, d and q widths are two ("int" is 32 bits). But this is purely a
market choice and varies: slot count (crossbar size) vs raw clock rate
vs relative market importance of speed for different ALU data
widths.

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssqrr7$ptr$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23120&group=comp.arch#23120

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Wed, 26 Jan 2022 07:05:11 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ssqrr7$ptr$1@newsreader4.netcologne.de>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<UPXHJ.6202$9O.4300@fx12.iad> <ssplm7$1sv$1@newsreader4.netcologne.de>
<sspnd8$3d6$1@newsreader4.netcologne.de> <tJZHJ.10626$8Q.353@fx19.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 26 Jan 2022 07:05:11 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:238:0:7285:c2ff:fe6c:992d";
logging-data="26555"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Wed, 26 Jan 2022 07:05 UTC

EricP <ThatWouldBeTelling@thevillage.com> schrieb:
> Thomas Koenig wrote:
>> Thomas Koenig <tkoenig@netcologne.de> schrieb:
>>> EricP <ThatWouldBeTelling@thevillage.com> schrieb:
>>>> Thomas Koenig wrote:
>>>>> Looking at Alpha, and also now at LoongArch, I find separate
>>>>> instructions for 32-bit and 64-bit addition for a 64-bit
>>>>> architecture. For example, Alpha has both addl (add longword)
>>>>> and addq (add quadword).
>>>>>
>>>>> What is not quite clear to me is the rationale behind this - a
>>>>> 64-bit addition is (blindingly obviously) just a 32-bit addition
>>>>> when ignoring the high bits.
>>>> Alpha was porting a lot of legacy 32-bit VAX code to 64-bits.
>>>> ADDL gives similar signed wrapping/overflow behavior when
>>>> comparing to 32-bit constants and load-with-sign-extend values.
>>>> I would imagine that code analysis showed that without ADDL
>>>> there were lots of sign extend instructions.
>>> That sounds like a good explanation.
>>>
>>> IIRC, Alpha also used their CALL_PAL instruction to emulate complex
>>> VAX instructions like POLY, at least on VMS, so assembler programs
>>> were somewhat easier to port.
>>
>> Before somebody corrects me, let me do it myself: POLY wasn't
>> in there, but the queue instructions like INSQUEL were.
>
> Apparently the various Vaxen buggered up POLY so badly
> that they had to remove it from VAX HW too and use emulation.
>
> "How the VAX Lost Its POLY (and EMOD and ACB_floating too)", by Bob Supnik
> http://simh.trailing-edge.com/docs/vax_poly.pdf

A cautionary tale of how many things you can get wrong in floating
point arithmetic.

One remark, right at the beginning, struck me: "[Mary Payne] was not
pleased to discover that, due to lack of algorithmic understanding,
the PDP-11 floating point unit could be off by as much as a full
LSB (least significant bit)."

Hmm... while googling around, I found another amusing story
about gradual underflow, and how it made it into IEEE 754:
http://people.eecs.berkeley.edu/~wkahan/19July10.pdf

I especially like the sentence

DEC’s main advocate on the IEEE p754 Committee was a Mathematician
and Numerical Analyst Dr. Mary H. Payne. She was experienced,
fully competent,

and misled by DEC’s hardware engineers

when they assured her that Gradual Underflow was unimplementable
as the default (no trap) of any computer arithmetic aspiring to
high performance.

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssqsck$ptr$2@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23121&group=comp.arch#23121

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Wed, 26 Jan 2022 07:14:28 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ssqsck$ptr$2@newsreader4.netcologne.de>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<UPXHJ.6202$9O.4300@fx12.iad> <2022Jan25.235313@mips.complang.tuwien.ac.at>
Injection-Date: Wed, 26 Jan 2022 07:14:28 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-238-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:238:0:7285:c2ff:fe6c:992d";
logging-data="26555"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Wed, 26 Jan 2022 07:14 UTC

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
> EricP <ThatWouldBeTelling@thevillage.com> writes:
>>Alpha didn't have SEXT/ZEXT sign/zero extend instructions
>>so it would have required a pair of shifts.
>
> Alpha uses addl for sign extension and some zap instruction for zero
> extension. If they had not added addl, they could have added an sext
> instruction instead.

Or a load unsigned / load signed.

Hmm... some random googling found that Alpha was supposed to have
had a signed int (for VAX compatibility?) Early versions also
did not have the sign extension instructions.

Sign extending a char in C must have been painful.

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<0cf5023d-3458-46d2-ad3d-fa0e6ecb18dfn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23122&group=comp.arch#23122

Newsgroups: comp.arch
X-Received: by 2002:a05:6214:e47:: with SMTP id o7mr23074427qvc.118.1643184800193;
Wed, 26 Jan 2022 00:13:20 -0800 (PST)
X-Received: by 2002:a05:6830:1b62:: with SMTP id d2mr3445725ote.142.1643184799850;
Wed, 26 Jan 2022 00:13:19 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 26 Jan 2022 00:13:19 -0800 (PST)
In-Reply-To: <ssqrr7$ptr$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:2d69:e22:a021:ec33;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:2d69:e22:a021:ec33
References: <sso6aq$37b$1@newsreader4.netcologne.de> <UPXHJ.6202$9O.4300@fx12.iad>
<ssplm7$1sv$1@newsreader4.netcologne.de> <sspnd8$3d6$1@newsreader4.netcologne.de>
<tJZHJ.10626$8Q.353@fx19.iad> <ssqrr7$ptr$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <0cf5023d-3458-46d2-ad3d-fa0e6ecb18dfn@googlegroups.com>
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Wed, 26 Jan 2022 08:13:20 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 34
 by: Quadibloc - Wed, 26 Jan 2022 08:13 UTC

On Wednesday, January 26, 2022 at 12:05:13 AM UTC-7, Thomas Koenig wrote:

> I especially like the sentence
>
> DEC’s main advocate on the IEEE p754 Committee was a Mathematician
> and Numerical Analyst Dr. Mary H. Payne. She was experienced,
> fully competent,
>
> and misled by DEC’s hardware engineers
>
> when they assured her that Gradual Underflow was unimplementable
> as the default (no trap) of any computer arithmetic aspiring to
> high performance.

That is true, but, although I may be mistaken, it appeared to me that
while one certainly _could_ have a high-performance floating-point
implementation which fully supported IEEE 754 gradual underflow,

one would have to do it by representing floating-point numbers in an
"internal form" when they're in registers, like the way the 8087 did it,
where the internal form was like an old-style plain floating-point format,
but with a range larger than the architectural floating-point format, so
that it covered all the gradual underflow territory.

And, of course, if you do it that way, you can't properly and accurately
trap when a register-to-register calculation underflows, because that isn't
easily visible any longer. Instead, you find out when you store the result
in memory.

So if you _exclude_ the idea of doing calculations on an internal form
of numbers, because it's problematic, then indeed gradual underflow will
obstruct high performance.
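
For readers who want to see what gradual underflow buys at the language level, here is a minimal C sketch. It assumes IEEE 754 doubles and that flush-to-zero is not enabled (so no -ffast-math or similar); the function name is made up for illustration. It says nothing about how the hardware gets there, only what the result looks like.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical helper: returns nonzero when x - y lands in the
 * subnormal range, i.e. below DBL_MIN in magnitude but not zero. */
int difference_is_subnormal(double x, double y)
{
    assert(isfinite(x) && isfinite(y));

    /* volatile forces the result to be rounded to a real double,
     * even on targets that compute in a wider internal format. */
    volatile double d = x - y;

    return fpclassify(d) == FP_SUBNORMAL;
}
```

With gradual underflow, difference_is_subnormal(1.5 * DBL_MIN, DBL_MIN) is true: the difference is 0.5 * DBL_MIN rather than zero, so x != y still implies x - y != 0.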

John Savard

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<89dcb0a5-99c1-4de6-a1d1-0d7e3fb05c89n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23123&group=comp.arch#23123

X-Received: by 2002:ac8:90:: with SMTP id c16mr10459100qtg.349.1643185066045;
Wed, 26 Jan 2022 00:17:46 -0800 (PST)
X-Received: by 2002:a05:6808:1707:: with SMTP id bc7mr3088402oib.179.1643185065679;
Wed, 26 Jan 2022 00:17:45 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 26 Jan 2022 00:17:45 -0800 (PST)
In-Reply-To: <jwvlez3zc7i.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:2d69:e22:a021:ec33;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:2d69:e22:a021:ec33
References: <sso6aq$37b$1@newsreader4.netcologne.de> <cfc24dc2-495b-409b-a52c-a9a942d745b7n@googlegroups.com>
<jwvlez3zc7i.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <89dcb0a5-99c1-4de6-a1d1-0d7e3fb05c89n@googlegroups.com>
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Wed, 26 Jan 2022 08:17:46 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 26
 by: Quadibloc - Wed, 26 Jan 2022 08:17 UTC

On Tuesday, January 25, 2022 at 3:26:22 PM UTC-7, Stefan Monnier wrote:

> Hmm... I'm not aware of a processor that offers a 64bit addition
> instruction and whose latency is higher than 1 cycle (at least not in
> the Alpha family, don't know about Loongson), so the 32bit
> addition is not faster :-(

That's a good point.

Silly me, though, I imagine that any proper specification of an ISA must
be suitable for a wide range of implementations.

An out-of-order implementation is one possibility.
An in-order implementation is another.
A microcoded implementation with an 8-bit ALU is *still* another.

I keep forgetting that we're not living in the era of solid logic
technology any more.

And in that last case, a 32-bit add will be faster than a 64-bit add.

But others raised the point that there will, even in fast implementations,
be a speed difference between a 64-bit divide and a 32-bit divide - and
if you don't have a 32-bit add, you will need type conversion instructions
to properly set up for divides and multiplies.
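
A rough C model of the 8-bit-ALU case, as a sketch only: it assumes one ALU pass per byte and ignores fetch, decode and register-file overhead. The function name and the step counter are illustrative, not taken from any real microcode.

```c
#include <assert.h>
#include <stdint.h>

/* Add two operands that are width_bytes wide, one byte per ALU pass,
 * the way a byte-serial microcoded datapath might. The number of
 * passes is reported through *steps. The final carry-out is dropped,
 * so the result wraps at the operand width. */
uint64_t add_byte_serial(uint64_t a, uint64_t b,
                         unsigned width_bytes, unsigned *steps)
{
    assert(width_bytes >= 1 && width_bytes <= 8);

    uint64_t sum = 0;
    unsigned carry = 0;
    *steps = 0;

    for (unsigned i = 0; i < width_bytes; i++) {
        unsigned t = (unsigned)((a >> (8 * i)) & 0xFF)
                   + (unsigned)((b >> (8 * i)) & 0xFF)
                   + carry;
        sum |= (uint64_t)(t & 0xFF) << (8 * i);
        carry = t >> 8;          /* carry into the next byte */
        (*steps)++;
    }
    return sum;
}
```

Under these assumptions a 32-bit add takes 4 passes and a 64-bit add takes 8, which is the whole argument in miniature.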

John Savard

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssr045$1lfk$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=23124&group=comp.arch#23124

Path: i2pn2.org!i2pn.org!aioe.org!To5nvU/sTaigmVbgRJ05pQ.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Wed, 26 Jan 2022 09:18:20 +0100
Organization: Aioe.org NNTP Server
Message-ID: <ssr045$1lfk$1@gioia.aioe.org>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<UPXHJ.6202$9O.4300@fx12.iad> <2022Jan25.235313@mips.complang.tuwien.ac.at>
<ssqsck$ptr$2@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="54772"; posting-host="To5nvU/sTaigmVbgRJ05pQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10.2
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Wed, 26 Jan 2022 08:18 UTC

Thomas Koenig wrote:
> Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
>> EricP <ThatWouldBeTelling@thevillage.com> writes:
>>> Alpha didn't have SEXT/ZEXT sign/zero extend instructions
>>> so it would have required a pair of shifts.
>>
>> Alpha uses addl for sign extension and some zap instruction for zero
>> extension. If they had not added addl, they could have added an sext
>> instruction instead.
>
> Or a load unsigned / load signed.
>
> Hmm... some random googling found that Alpha was supposed to have
> had a signed int (for VAX compatibility?) Early versions also
> did not have the sign extension instructions.
>
> Sign extending a char in C must have been painful.
>
AFAIR, the most painful part of the original Alpha was the missing
8/16-bit load/store ops:

The official sequence for an arbitrarily aligned 16-bit store needed to
load two 32-bit words, shift & mask, or, align and store back. 11
instructions?
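
To give a feel for the work involved, here is a C sketch of a 16-bit store done with nothing but whole, aligned word accesses. This is not the official Alpha sequence: it assumes 64-bit words and little-endian byte numbering, uses a branch where real code would be branch-free, and the function name is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 16-bit value at an arbitrary byte offset, touching memory
 * only through aligned 64-bit words: load, mask, shift, OR, store. */
void store16_via_words(uint64_t *mem, size_t nwords,
                       size_t byte_off, uint16_t value)
{
    size_t   idx   = byte_off / 8;               /* word with the low byte */
    unsigned shift = (unsigned)(byte_off % 8) * 8;

    assert(byte_off + 2 <= nwords * 8);

    /* Low word: clear the target bits, then merge the value in.
     * At shift == 56 only the low byte of the value fits here. */
    uint64_t lo = mem[idx];
    lo &= ~((uint64_t)0xFFFF << shift);
    lo |= (uint64_t)value << shift;
    mem[idx] = lo;

    /* The high byte spills into the next word when the store
     * straddles a word boundary. */
    if (shift == 56) {
        uint64_t hi = mem[idx + 1];
        hi &= ~(uint64_t)0xFF;
        hi |= (uint64_t)(value >> 8);
        mem[idx + 1] = hi;
    }
}
```

Even in this simplified form the straddling case needs two loads, two mask/merge steps and two stores, all for what is a single instruction on most other machines.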

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<ssr9sm$c55$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=23125&group=comp.arch#23125

Path: i2pn2.org!i2pn.org!aioe.org!To5nvU/sTaigmVbgRJ05pQ.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
Date: Wed, 26 Jan 2022 12:05:02 +0100
Organization: Aioe.org NNTP Server
Message-ID: <ssr9sm$c55$1@gioia.aioe.org>
References: <sso6aq$37b$1@newsreader4.netcologne.de>
<UPXHJ.6202$9O.4300@fx12.iad> <ssplm7$1sv$1@newsreader4.netcologne.de>
<sspnd8$3d6$1@newsreader4.netcologne.de> <tJZHJ.10626$8Q.353@fx19.iad>
<ssqrr7$ptr$1@newsreader4.netcologne.de>
<0cf5023d-3458-46d2-ad3d-fa0e6ecb18dfn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="12453"; posting-host="To5nvU/sTaigmVbgRJ05pQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10.2
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Wed, 26 Jan 2022 11:05 UTC

Quadibloc wrote:
> On Wednesday, January 26, 2022 at 12:05:13 AM UTC-7, Thomas Koenig wrote:
>
>> I especially like the sentence
>>
>> DEC’s main advocate on the IEEE p754 Committee was a Mathematician
>> and Numerical Analyst Dr. Mary H. Payne. She was experienced,
>> fully competent,
>>
>> and misled by DEC’s hardware engineers
>>
>> when they assured her that Gradual Underflow was unimplementable
>> as the default (no trap) of any computer arithmetic aspiring to
>> high performance.
>
> That is true, but, although I may be mistaken, it appeared to me that
> while one certainly _could_ have a high-performance floating-point
> implementation which fully supported IEEE 754 gradual underflow,
>
> one would have to do it by representing floating-point numbers in an
> "internal form" when they're in registers, like the way the 8087 did it,
> where the internal form was like an old-style plain floating-point format,
> but with a range larger than the architectural floating-point format, so
> that it covered all the gradual underflow territory.
>
> And, of course, if you do it that way, you can't properly and accurately
> trap when a register-to-register calculation underflows, because that isn't
> easily visible any longer. Instead, you find out when you store the result
> in memory.
>
> So if you _exclude_ the idea of doing calculations on an internal form
> of numbers, because it's problematic, then indeed gradual underflow will
> obstruct high performance.

This is simply wrong.

Please go back to the several rounds where primarily Mitch and I have
discussed how zero-cycle handling of subnormal numbers can fall out of
having the large normalization network needed to support FMAC, which you
need to do anyway.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Why separate 32-bit arithmetic on a 64-bit architecture?

<5NdIJ.15873$mS1.13257@fx10.iad>

https://www.novabbs.com/devel/article-flat.php?id=23127&group=comp.arch#23127

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx10.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Why separate 32-bit arithmetic on a 64-bit architecture?
References: <sso6aq$37b$1@newsreader4.netcologne.de> <UPXHJ.6202$9O.4300@fx12.iad> <2022Jan25.235313@mips.complang.tuwien.ac.at>
In-Reply-To: <2022Jan25.235313@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 29
Message-ID: <5NdIJ.15873$mS1.13257@fx10.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 26 Jan 2022 15:29:05 UTC
Date: Wed, 26 Jan 2022 10:28:28 -0500
X-Received-Bytes: 2144
 by: EricP - Wed, 26 Jan 2022 15:28 UTC

Anton Ertl wrote:
> EricP <ThatWouldBeTelling@thevillage.com> writes:
>> Alpha didn't have SEXT/ZEXT sign/zero extend instructions
>> so it would have required a pair of shifts.
>
> Alpha uses addl for sign extension and some zap instruction for zero
> extension. If they had not added addl, they could have added an sext
> instruction instead.

There were lots of instructions they could have had, but they seemed
almost obsessive that the R in RISC means reduced instruction _count_.

They left out the byte and word load and stores, supposedly because
that would require the byte shifter network on the critical path.
I think that was a disastrous decision that contributed greatly
to porting difficulties and lack of wide market acceptance.
Once it is established in people's minds that it is difficult to work with
or more trouble than it's worth, it is tough to come back from.
In effect, they created a market barrier to themselves.

And when they finally did add byte and word load and store,
load byte and word only did zero extend not sign extend
supposedly because they did not want to put the sign extension
logic into the load critical path. But still, WTF!

Anyway, SEXT and ZEXT to sign or zero extend a register from
a bit position would not have affected the cycle time.
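
For reference, the operations under discussion look roughly like this in C. This is a sketch with made-up function names: sext and zext use the pair-of-shifts approach mentioned earlier in the thread, and sext32_via_add mimics what a 32-bit add with a sign-extended result gives you. It assumes a compiler where right-shifting a negative signed value is an arithmetic shift and where unsigned-to-signed narrowing wraps, which is true of the mainstream ones but implementation-defined in the C standard.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `bits` bits of x to 64 bits with two shifts. */
int64_t sext(uint64_t x, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    unsigned shift = 64 - bits;
    /* Left shift on the unsigned value, then arithmetic right shift. */
    return (int64_t)(x << shift) >> shift;
}

/* Zero-extend the low `bits` bits of x to 64 bits with two shifts. */
uint64_t zext(uint64_t x, unsigned bits)
{
    assert(bits >= 1 && bits <= 64);
    unsigned shift = 64 - bits;
    return (x << shift) >> shift;
}

/* Sign-extend from bit 31 the way a 32-bit add of zero would:
 * compute in 32 bits, then widen the signed result. */
int64_t sext32_via_add(uint64_t x)
{
    return (int64_t)(int32_t)((uint32_t)x + 0u);
}
```

For example, sext(0xFF, 8) yields -1 and zext(UINT64_MAX, 16) yields 0xFFFF; a dedicated instruction would do either in one operation instead of two.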
