
Subject  Author
* Multi-word addition with a strict Ra = Rb op Rc ISA  Thomas Koenig
+* Re: Multi-word addition with a strict Ra = Rb op Rc ISA  robf...@gmail.com
|`- Re: Multi-word addition with a strict Ra = Rb op Rc ISA  BGB
`* Re: Multi-word addition with a strict Ra = Rb op Rc ISA  Anton Ertl
 +* Re: Multi-word addition with a strict Ra = Rb op Rc ISA  MitchAlsup
 |`* Re: Multi-word addition with a strict Ra = Rb op Rc ISA  Marcus
 | `- Re: Multi-word addition with a strict Ra = Rb op Rc ISA  MitchAlsup
 `* Re: Multi-word addition with a strict Ra = Rb op Rc ISA  Thomas Koenig
  `* Re: Multi-word addition with a strict Ra = Rb op Rc ISA  Anton Ertl
   `* Re: Multi-word addition with a strict Ra = Rb op Rc ISA  Thomas Koenig
    `* Re: Multi-word addition with a strict Ra = Rb op Rc ISA  MitchAlsup
     `- Re: Multi-word addition with a strict Ra = Rb op Rc ISA  Thomas Koenig

Multi-word addition with a strict Ra = Rb op Rc ISA

<sqcer1$8hj$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=22483&group=comp.arch#22483

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Multi-word addition with a strict Ra = Rb op Rc ISA
Date: Mon, 27 Dec 2021 13:24:49 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sqcer1$8hj$1@newsreader4.netcologne.de>
Injection-Date: Mon, 27 Dec 2021 13:24:49 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:eb03:0:7285:c2ff:fe6c:992d";
logging-data="8755"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Mon, 27 Dec 2021 13:24 UTC

Triggered by this post

https://gmplib.org/list-archives/gmp-devel/2021-September/006013.html

(where somebody discusses, in rather stringent terms, the performance
of RISC-V for multiword addition in GMP), I wonder:

RISC-V has (from that post) the instruction sequence

add t0, a4, a6 // add low words
sltu t6, t0, a4 // compute carry-out from low add

so one would need to wait for the result of the add to compare
to generate the carry.

Would it not be better to have an "add and generate the carry only"
instruction instead, one could write, hypothetically,

add t0, a4, a6
addc t6, a4, a6

so there would be no interdependency of the instructions (and it
would be an obvious candidate for instruction fusion)?

Or am I missing something obvious here?
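
For readers who want to check the arithmetic, here is a minimal C sketch of the two carry computations discussed in this post. It is an illustration only: the function names are made up, and carry_from_inputs merely models the value the hypothetical "addc" would return. It says nothing about how hardware would produce it.

```c
#include <assert.h>
#include <stdint.h>

/* add + sltu: the carry is derived from the sum, so it has to wait for the add. */
uint64_t carry_from_sum(uint64_t a, uint64_t b)
{
    uint64_t sum = a + b;   /* add  t0, a4, a6 (wraps modulo 2^64) */
    return sum < a;         /* sltu t6, t0, a4 */
}

/* Hypothetical "addc": the carry computed from the inputs alone.
   a + b overflows exactly when a > UINT64_MAX - b, and UINT64_MAX - b == ~b. */
uint64_t carry_from_inputs(uint64_t a, uint64_t b)
{
    return a > (uint64_t)~b;
}

/* Quick self-check that the two formulations agree on a few edge cases. */
void check_carry_models(void)
{
    const uint64_t v[] = { 0, 1, UINT64_MAX / 2, UINT64_MAX - 1, UINT64_MAX };
    const int n = (int)(sizeof v / sizeof v[0]);
    for (int i = 0; i < n; i++)
        for (int j = 0; j < n; j++)
            assert(carry_from_sum(v[i], v[j]) == carry_from_inputs(v[i], v[j]));
}
```

The second function has no data dependency on the sum, which is the point of the proposal: both results can be computed from a4 and a6 at the same time.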

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<0d72426f-cdeb-4ace-aaef-acb4460238edn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22484&group=comp.arch#22484

X-Received: by 2002:a05:6214:5197:: with SMTP id kl23mr15082941qvb.48.1640614321737;
Mon, 27 Dec 2021 06:12:01 -0800 (PST)
X-Received: by 2002:a4a:be90:: with SMTP id o16mr9320782oop.28.1640614321402;
Mon, 27 Dec 2021 06:12:01 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 27 Dec 2021 06:12:01 -0800 (PST)
In-Reply-To: <sqcer1$8hj$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2607:fea8:1de1:fb00:489a:7943:ad66:5554;
posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 2607:fea8:1de1:fb00:489a:7943:ad66:5554
References: <sqcer1$8hj$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <0d72426f-cdeb-4ace-aaef-acb4460238edn@googlegroups.com>
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Mon, 27 Dec 2021 14:12:01 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 36
 by: robf...@gmail.com - Mon, 27 Dec 2021 14:12 UTC

On Monday, December 27, 2021 at 8:24:52 AM UTC-5, Thomas Koenig wrote:
> Triggered by this post
>
> https://gmplib.org/list-archives/gmp-devel/2021-September/006013.html
>
> (where somebody discusses, in rather stringent terms, the performance
> of RISC-V for multiword addition in GMP), I wonder:
>
> RISC-V has (from that post) the instruction sequence
>
>
> add t0, a4, a6 // add low words
> sltu t6, t0, a4 // compute carry-out from low add
>
> so one would need to wait for the result of the add to compare
> to generate the carry.
>
> Would it not be better to have an "add and generate the carry only"
> instruction instead, one could write, hypothetically,
>
> add t0, a4, a6
> addc t6, a4, a6
>
> so there would be no interdependency of the instructions (and it
> would be an obvious candidate for instruction fusion)?
>
> Or am I missing something obvious here?

Not missing anything, except that SLTU is an existing instruction and
requires no additional hardware or opcode, which may be important when
trying to keep the implementation small. Personally, I think RISC-V tries too
hard to be small, leaving out some common features of other architectures.
But RISC-V can always be extended using custom instructions.
I would argue for adding ADDC as a custom instruction. It is reminiscent of
the MULH instruction, which returns the high-order bits. So maybe ADDH
rather than ADDC. When I see ADDC I think of it as an add with carry. One
could also use a SUBH instruction.

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<2021Dec27.190209@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=22493&group=comp.arch#22493

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
Date: Mon, 27 Dec 2021 18:02:09 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 51
Distribution: world
Message-ID: <2021Dec27.190209@mips.complang.tuwien.ac.at>
References: <sqcer1$8hj$1@newsreader4.netcologne.de>
Injection-Info: reader02.eternal-september.org; posting-host="49bb987570c8ec7ca42a64081be61d96";
logging-data="32555"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+ReuoU/yyiseTzt0N9Hby8"
Cancel-Lock: sha1:8Lh3y0efAUmDck3v9NCOZaI78BE=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Mon, 27 Dec 2021 18:02 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:
>Triggered by this post
>
>https://gmplib.org/list-archives/gmp-devel/2021-September/006013.html
>
>(where somebody discusses, in rather stringent terms, the performance
>of RISC-V for multiword addition in GMP), I wonder:
>
>RISC-V has (from that post) the instruction sequence
>
>
> add t0, a4, a6 // add low words
> sltu t6, t0, a4 // compute carry-out from low add

That's only if you don't have a carry-in (i.e., a half-adder). For a
full-adder, it's AFAIK a four-instruction idiom (hmm, the link you
give uses five instructions for the full addition that follows the
half addition at the start).
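
To make the carry-in case concrete, here is a C sketch of one straightforward way to add a limb with carry-in using only add, unsigned compare, and or. It happens to take five operations. It is not claimed to be the exact sequence from the linked GMP post, and the names add_limb and add_n are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One limb of a multi-word add with carry-in. Each statement corresponds
   to one add / sltu / or style instruction. */
uint64_t add_limb(uint64_t a, uint64_t b, uint64_t carry_in, uint64_t *carry_out)
{
    assert(carry_in <= 1);
    uint64_t s  = a + b;          /* add                                 */
    uint64_t c1 = s < a;          /* sltu: carry from a + b              */
    uint64_t r  = s + carry_in;   /* add                                 */
    uint64_t c2 = r < s;          /* sltu: carry from adding carry_in    */
    *carry_out  = c1 | c2;        /* or: at most one of c1, c2 can be 1  */
    return r;
}

/* n-limb addition, least significant limb first; returns the final carry. */
uint64_t add_n(uint64_t *r, const uint64_t *a, const uint64_t *b, size_t n)
{
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++)
        r[i] = add_limb(a[i], b[i], carry, &carry);
    return carry;
}
```

The two compares cannot both fire: if a + b wrapped, the wrapped sum is at most 2^64 - 2, so adding a carry-in of 1 cannot wrap again. That is why a plain or suffices to combine them.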

The original MIPS and Alpha also needed these sequences; AFAIK
recently MIPS has added a carry flag to make this stuff more
efficient.

>so one would need to wait for the result of the add to compare
>to generate the carry.
>
>Would it not be better to have an "add and generate the carry only"
>instruction instead, one could write, hypothetically,
>
> add t0, a4, a6
> addc t6, a4, a6

That helps a little bit. But I think the RISC-V answer is that
instead of adding that, they fuse the add and the sltu to produce the
two results in one cycle, without needing to add a new instruction.

The full-adder overhead is more worrisome; one could also do that with
instruction fusion, but are they going to do it, and how well will it
work?

In the thread starting at
<2021Mar10.110220@mips.complang.tuwien.ac.at> we have discussed adding
an extra bit to each register to allow an efficient add-with-carry
(and other things), without adding a special-purpose flag register to
the architecture (which is against MIPS-style architectural
principles).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<b3a08d16-c310-448b-b8ad-18dc5e2acdc1n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22497&group=comp.arch#22497

X-Received: by 2002:ad4:5cef:: with SMTP id iv15mr16159663qvb.82.1640632165310;
Mon, 27 Dec 2021 11:09:25 -0800 (PST)
X-Received: by 2002:a05:6808:1141:: with SMTP id u1mr13893396oiu.30.1640632165184;
Mon, 27 Dec 2021 11:09:25 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 27 Dec 2021 11:09:24 -0800 (PST)
In-Reply-To: <2021Dec27.190209@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:7907:330c:656:c2fa;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:7907:330c:656:c2fa
References: <sqcer1$8hj$1@newsreader4.netcologne.de> <2021Dec27.190209@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b3a08d16-c310-448b-b8ad-18dc5e2acdc1n@googlegroups.com>
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 27 Dec 2021 19:09:25 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 56
 by: MitchAlsup - Mon, 27 Dec 2021 19:09 UTC

On Monday, December 27, 2021 at 12:51:20 PM UTC-6, Anton Ertl wrote:
> Thomas Koenig <tko...@netcologne.de> writes:
> >Triggered by this post
> >
> >https://gmplib.org/list-archives/gmp-devel/2021-September/006013.html
> >
> >(where somebody discusses, in rather stringent terms, the performance
> >of RISC-V for multiword addition in GMP), I wonder:
> >
> >RISC-V has (from that post) the instruction sequence
> >
> >
> > add t0, a4, a6 // add low words
> > sltu t6, t0, a4 // compute carry-out from low add
> That's only if you don't have a carry-in (i.e., a half-adder). For a
> full-adder, it's AFAIK a four-instruction idiom (hmm, the link you
> give uses five instructions for the full addition that follows the
> half addition at the start).
>
> The original MIPS and Alpha also needed these sequences; AFAIK
> recently MIPS has added a carry flag to make this stuff more
> efficient.
> >so one would need to wait for the result of the add to compare
> >to generate the carry.
> >
> >Would it not be better to have an "add and generate the carry only"
> >instruction instead, one could write, hypothetically,
> >
> > add t0, a4, a6
> > addc t6, a4, a6
> That helps a little bit. But I think the RISC-V answer is that
> instead of adding that, they fuse the add and the sltu to produce the
> two results in one cycle, without needing to add a new instruction.
<
Still not as efficient as My 66000:
<
CARRY R16,{{I}{IO}{IO}{O}}
ADD R12,R4,R8 // carry Out only
ADD R13,R5,R9 // Carry In and Out
ADD R14,R6,R10 // Carry In and Out
ADD R15,R7,R11 // Carry In only
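
As a reading aid, here is a C sketch of what the four ADDs above compute, going only by their comments: a 256-bit add in which the carry is handed from word to word and the carry out of the top word is dropped. It does not model the CARRY instruction's encoding or the actual My 66000 semantics. The function name add256 and the local variable carry (standing in for the role the comments give R16) are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* 256-bit add, least significant word first. Temporaries are used so the
   result may overlap an input array. */
void add256(uint64_t r[4], const uint64_t a[4], const uint64_t b[4])
{
    assert(r && a && b);

    uint64_t s = a[0] + b[0];          /* first ADD: carry out only */
    uint64_t carry = s < a[0];
    r[0] = s;

    for (int i = 1; i < 3; i++) {      /* middle ADDs: carry in and out */
        s = a[i] + b[i];
        uint64_t c1 = s < a[i];
        uint64_t t = s + carry;
        carry = c1 | (t < s);
        r[i] = t;
    }

    r[3] = a[3] + b[3] + carry;        /* last ADD: carry in only */
}
```

Written with plain add and compare, each middle word costs five operations. The instruction sequence above expresses the same chain as four ADDs under one CARRY prefix.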
>
> The full-adder overhead is more worrisome; one could also do that with
> instruction fusion, but are they going to do it, and how well will it
> work?
>
> In the thread starting at
> <2021Mar1...@mips.complang.tuwien.ac.at> we have discussed adding
> an extra bit to each register to allow an efficient add-with-carry
> (and other things), without adding a special-purpose flag register to
> the architecture (which is against MIPS-style architectural
> principles).
>
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<sqdfnv$dt$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=22512&group=comp.arch#22512

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
Date: Mon, 27 Dec 2021 22:46:23 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sqdfnv$dt$1@newsreader4.netcologne.de>
References: <sqcer1$8hj$1@newsreader4.netcologne.de>
<2021Dec27.190209@mips.complang.tuwien.ac.at>
Injection-Date: Mon, 27 Dec 2021 22:46:23 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:eb03:0:7285:c2ff:fe6c:992d";
logging-data="445"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Mon, 27 Dec 2021 22:46 UTC

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:

> In the thread starting at
><2021Mar10.110220@mips.complang.tuwien.ac.at> we have discussed adding
> an extra bit to each register to allow an efficient add-with-carry
> (and other things), without adding a special-purpose flag register to
> the architecture (which is against MIPS-style architectural
> principles).

I remember that discussion, and that is probably the best way.
So, the add with carry would be something like

add t0, r1, r2
add t1, r3, r4
addcf t1, t1, t0

where "addcf" would add the carry flag from t0 to t1.
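
A small C model may help to pin down what such a sequence would compute. Everything here is hypothetical: the reg65 type, the function names, and in particular the rule for the extra bit of the addcf result, which is an assumption made so that the limb's carry survives for a following addcf.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical register: 64 data bits plus one extra carry bit. */
typedef struct {
    uint64_t value;
    bool     carry;
} reg65;

/* add rd, ra, rb: the sum, with its carry-out stored in the extra bit. */
reg65 reg_add(uint64_t a, uint64_t b)
{
    reg65 rd = { a + b, false };
    rd.carry = rd.value < a;
    return rd;
}

/* addcf rd, ra, rb: ra.value plus the carry bit of rb.
   ASSUMPTION: the result's extra bit is ra's old carry OR the new carry-out. */
reg65 reg_addcf(reg65 ra, reg65 rb)
{
    reg65 rd;
    rd.value = ra.value + (uint64_t)rb.carry;
    rd.carry = ra.carry | (rd.value < ra.value);
    return rd;
}

/* The three-instruction sequence from the post, as a two-word add. */
void add128(uint64_t r[2], const uint64_t a[2], const uint64_t b[2])
{
    assert(r && a && b);
    reg65 t0 = reg_add(a[0], b[0]);   /* add   t0, r1, r2 */
    reg65 t1 = reg_add(a[1], b[1]);   /* add   t1, r3, r4 */
    t1 = reg_addcf(t1, t0);           /* addcf t1, t1, t0 */
    r[0] = t0.value;
    r[1] = t1.value;
}
```

Note that the two plain adds do not depend on each other. Only the addcf sits on the carry chain.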

From a compiler perspective, it would probably be best to
model this as a separate condition register with one bit.

(There is historical precedent, the accumulator on the 704
had something like that).

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<sqdto6$jbr$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=22520&group=comp.arch#22520

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
Date: Mon, 27 Dec 2021 20:45:25 -0600
Organization: A noiseless patient Spider
Lines: 89
Message-ID: <sqdto6$jbr$1@dont-email.me>
References: <sqcer1$8hj$1@newsreader4.netcologne.de>
<0d72426f-cdeb-4ace-aaef-acb4460238edn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 28 Dec 2021 02:45:26 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="7a192c0c96e1f18167785330ec8f0ab7";
logging-data="19835"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19GBiip1mBSXQGa5NNgpOMc"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.4.1
Cancel-Lock: sha1:nDelyGDc/cv4IZ7AsKD+OmKf3V4=
In-Reply-To: <0d72426f-cdeb-4ace-aaef-acb4460238edn@googlegroups.com>
Content-Language: en-US
 by: BGB - Tue, 28 Dec 2021 02:45 UTC

On 12/27/2021 8:12 AM, robf...@gmail.com wrote:
> On Monday, December 27, 2021 at 8:24:52 AM UTC-5, Thomas Koenig wrote:
>> Triggered by this post
>>
>> https://gmplib.org/list-archives/gmp-devel/2021-September/006013.html
>>
>> (where somebody discusses, in rather stringent terms, the performance
>> of RISC-V for multiword addition in GMP), I wonder:
>>
>> RISC-V has (from that post) the instruction sequence
>>
>>
>> add t0, a4, a6 // add low words
>> sltu t6, t0, a4 // compute carry-out from low add
>>
>> so one would need to wait for the result of the add to compare
>> to generate the carry.
>>
>> Would it not be better to have an "add and generate the carry only"
>> instruction instead, one could write, hypothetically,
>>
>> add t0, a4, a6
>> addc t6, a4, a6
>>
>> so there would be no interdependency of the instructions (and it
>> would be an obvious candidate for instruction fusion)?
>>
>> Or am I missing something obvious here?
>
> Not missing anything, except that using SLTU uses an existing instruction and
> does not require additional hardware or opcode which may be important if
> trying to keep the implementation small. Personally, I think RISCV tries too
> hard to be small, leaving out some common features of other architectures.
> But RISCV can always be extended using custom instructions.
> I would argue for adding ADDC as a custom instruction. It is reminiscent of
> the MULH instruction which returns the high order bits. So maybe ADDH
> rather than ADDC. When I see ADDC I think of it as an add with carry. Could
> also use a SUBH instruction.

High on my list:
Scaled-Index Load/Store;
Encodings to allow for efficiently using large immediate values.

Say (large-immediate encodings):
zzzz_IIII_(xzzz-zztt-ttts-ssss_0yyy-dddd-d011-1111) (1)
IIII-IIII_(xzzz-zztt-ttts-ssss_1yyy-dddd-d011-1111) (2)
zzzz-IIII_IIII-IIII_(zzzz-zztt-ttts-ssss_0001-dddd-d111-1111) (3)
IIII-IIII_IIII-IIII_(zzzz-zztt-ttts-ssss_1001-dddd-d111-1111) (4)

1: Represents a 64-bit encoding space with 16/17-bit immediates.
2: Represents a 64-bit encoding space with 32/33-bit immediates.
3: Represents a 96-bit encoding space with 48-bit immediates.
4: Represents a 96-bit encoding space with 64-bit immediates.

Block 2 could have ALU operations and larger-displacement Load/Store.
Say, yyy:
000: Load
001: OP_Imm
010: Store
011: -
100: -
101: -
110: Long Branch
111: -

Say (Load/Store):
Lxx Rd, Rs, Rt*Sc, Disp33s //Rd=[Rs+Rt*Sc+Disp33s]
Sxx Rs, Rt*Sc, Disp33s, Rd //[Rs+Rt*Sc+Disp33s]=Rd
xmmm-cc
x: Sign Extension (Bit 31)
mmm: Ld/St Type (B/H/W/D/BU/HU/WU/X)
cc: Scale (1/2/4/8)
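
For illustration, the address arithmetic implied by the Load/Store form above can be sketched in C as follows. This only models Rs + Rt*Sc + Disp33s with a sign-extended 33-bit displacement. The function names are invented, and how the displacement bits are actually packed into the encoding is not modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 33 bits of x (bit 32 is the sign bit).
   The final cast relies on two's-complement conversion, which is what
   common compilers implement. */
int64_t sext33(uint64_t x)
{
    const uint64_t sign = 1ULL << 32;
    x &= (sign << 1) - 1;                  /* keep bits 0..32 */
    return (int64_t)((x ^ sign) - sign);   /* flip sign bit, subtract it back */
}

/* Address for "Lxx Rd, Rs, Rt*Sc, Disp33s": Rs + Rt*Sc + Disp33s. */
uint64_t effective_address(uint64_t rs, uint64_t rt, unsigned scale, uint64_t disp33)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);
    return rs + rt * scale + (uint64_t)sext33(disp33);
}
```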

For OP_Imm:
OP Rd, Rs, Imm32 //Rd=Rs OP Imm33s
With ~ 10 bits for Opcode.

Long branch would combine Bxx, JAL, and JALR, but with a 32 bit
displacement.

Block 4 could have some large-immediate ALU operations and similar.

Probably:
OR Rd, Zero, Imm64

Or similar, could be used as a 64-bit constant load.

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<2021Dec28.113610@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=22525&group=comp.arch#22525

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
Date: Tue, 28 Dec 2021 10:36:10 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 42
Distribution: world
Message-ID: <2021Dec28.113610@mips.complang.tuwien.ac.at>
References: <sqcer1$8hj$1@newsreader4.netcologne.de> <2021Dec27.190209@mips.complang.tuwien.ac.at> <sqdfnv$dt$1@newsreader4.netcologne.de>
Injection-Info: reader02.eternal-september.org; posting-host="601e837a5ee8f3e908f93612dbc27cca";
logging-data="9061"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19+ptoHJZFZAxvcYwJFb2ZH"
Cancel-Lock: sha1:4RA0zYpJcjRbIkQDWEQRarO+G9I=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 28 Dec 2021 10:36 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:
>Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
>
>> In the thread starting at
>><2021Mar10.110220@mips.complang.tuwien.ac.at> we have discussed adding
>> an extra bit to each register to allow an efficient add-with-carry
>> (and other things), without adding a special-purpose flag register to
>> the architecture (which is against MIPS-style architectural
>> principles).
>
>I remember that discussion, and that is probably the best way.
>So, the add with carry would be something like
>
> add t0, r1, r2
> add t1, r3, r4
> addcf t1, t1, t0
>
>where "addcf" would add the carry flag from t0 to t1.

Yes, that's a good way to do it if you want to stay with
two-source-register instructions. The add for the next pair of words
can be done in parallel with the addcf, so the overall latency is the
latency of the first add plus n times the latency of addcf.

>From a compiler perspective, it would probably be best to
>model this as a separate condition register with one bit.

Why do you think that is best? Compilers are horrible at modeling
single condition-code registers (which is IMO a big reason why
MIPS-style architectures shun condition codes). I think the best part
of the extra-bits idea is that they are part of GPRs, and in most
contexts compilers don't need to model them separately.

Compilers do need to know whether the extra bits are needed across
calls or spilling, because they are not preserved in callee-saved
registers, and spilling the extra bits costs extra. For typical
bigint code the extra bits are not needed across calls.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<sqg2bl$o6l$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=22538&group=comp.arch#22538

Path: i2pn2.org!i2pn.org!aioe.org!news.dns-netz.com!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
Date: Tue, 28 Dec 2021 22:16:21 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sqg2bl$o6l$1@newsreader4.netcologne.de>
References: <sqcer1$8hj$1@newsreader4.netcologne.de>
<2021Dec27.190209@mips.complang.tuwien.ac.at>
<sqdfnv$dt$1@newsreader4.netcologne.de>
<2021Dec28.113610@mips.complang.tuwien.ac.at>
Injection-Date: Tue, 28 Dec 2021 22:16:21 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:eb03:0:7285:c2ff:fe6c:992d";
logging-data="24789"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 28 Dec 2021 22:16 UTC

Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
> Thomas Koenig <tkoenig@netcologne.de> writes:
>>Anton Ertl <anton@mips.complang.tuwien.ac.at> schrieb:
>>
>>> In the thread starting at
>>><2021Mar10.110220@mips.complang.tuwien.ac.at> we have discussed adding
>>> an extra bit to each register to allow an efficient add-with-carry
>>> (and other things), without adding a special-purpose flag register to
>>> the architecture (which is against MIPS-style architectural
>>> principles).
>>
>>I remember that discussion, and that is probably the best way.
>>So, the add with carry would be something like
>>
>> add t0, r1, r2
>> add t1, r3, r4
>> addcf t1, t1, t0
>>
>>where "addcf" would add the carry flag from t0 to t1.
>
> Yes, that's a good way to do it if you want to stay with
> two-source-register instructions. The add for the next pair of words
> can be done in parallel with the addcf, so the overall latency is the
> latency of the first add plus n times the latency of addcf.

What else would there be to fix with RISC-V if one
wanted to maintain strict two-source-registers?

An indexed store actually accesses three registers, so
if one wanted to remain pure there, array accesses
could be done via a LEA instruction which does

Ra = Rb + Rc << scale

followed by a load or store, leading to two instructions for an
array access to anything other than a byte instead of three.
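
As a sketch of the LEA-then-access idea, assuming the intended meaning is Rb + (Rc << scale) with scale being log2 of the element size (read with C operator precedence, the expression above would shift the sum instead): the helper names lea and load_word are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Ra = Rb + (Rc << scale); scale is log2 of the element size. */
uint64_t lea(uint64_t rb, uint64_t rc, unsigned scale)
{
    assert(scale < 64);   /* shifting a 64-bit value by 64 or more is undefined in C */
    return rb + (rc << scale);
}

/* Load element `index` of an array of 64-bit words: one LEA, then one load. */
uint64_t load_word(const uint64_t *base, uint64_t index)
{
    uint64_t addr = lea((uint64_t)(uintptr_t)base, index, 3);   /* 8-byte elements */
    return *(const uint64_t *)(uintptr_t)addr;
}
```

Without the LEA, the same access needs a shift, an add, and then the load, which is the three-instruction version mentioned above.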

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<760bf064-764f-4b00-b7f9-bdf5448f6bc5n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22539&group=comp.arch#22539

X-Received: by 2002:a05:620a:9c3:: with SMTP id y3mr15247533qky.367.1640731690775;
Tue, 28 Dec 2021 14:48:10 -0800 (PST)
X-Received: by 2002:a9d:206a:: with SMTP id n97mr17845871ota.142.1640731690501;
Tue, 28 Dec 2021 14:48:10 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 28 Dec 2021 14:48:10 -0800 (PST)
In-Reply-To: <sqg2bl$o6l$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:8d47:d976:4476:a5f6;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:8d47:d976:4476:a5f6
References: <sqcer1$8hj$1@newsreader4.netcologne.de> <2021Dec27.190209@mips.complang.tuwien.ac.at>
<sqdfnv$dt$1@newsreader4.netcologne.de> <2021Dec28.113610@mips.complang.tuwien.ac.at>
<sqg2bl$o6l$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <760bf064-764f-4b00-b7f9-bdf5448f6bc5n@googlegroups.com>
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 28 Dec 2021 22:48:10 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 65
 by: MitchAlsup - Tue, 28 Dec 2021 22:48 UTC

On Tuesday, December 28, 2021 at 4:16:24 PM UTC-6, Thomas Koenig wrote:
> Anton Ertl <an...@mips.complang.tuwien.ac.at> schrieb:
> > Thomas Koenig <tko...@netcologne.de> writes:
> >>Anton Ertl <an...@mips.complang.tuwien.ac.at> schrieb:
> >>
> >>> In the thread starting at
> >>><2021Mar1...@mips.complang.tuwien.ac.at> we have discussed adding
> >>> an extra bit to each register to allow an efficient add-with-carry
> >>> (and other things), without adding a special-purpose flag register to
> >>> the architecture (which is against MIPS-style architectural
> >>> principles).
> >>
> >>I remember that discussion, and that is probably the best way.
> >>So, the add with carry would be something like
> >>
> >> add t0, r1, r2
> >> add t1, r3, r4
> >> addcf t1, t1, t0
> >>
> >>where "addcf" would add the carry flag from t0 to t1.
> >
> > Yes, that's a good way to do it if you want to stay with
> > two-source-register instructions. The add for the next pair of words
> > can be done in parallel with the addcf, so the overall latency is the
> > latency of the first add plus n times the latency of addcf.
> What else would there be to fix with RISC-V if one
> wanted to maintain strict two-source-registers?
<
Solve the FMAC problem using only 2 operand register specifiers!
>
> An indexed store actually accesses three registers, so
<
Err, not what you think:
<
An indexed store is allowed to access the register containing data to
be stored AFTER LD-Align (nominally the WRITE-Back stage). This
alleviates register read pressure, eliminates large numbers of
flip-flops in the pipeline, and simplifies pipeline design. I, personally,
have never had problems "reading the write slot", that is, reading the
register file in the clock period where one would normally be writing
the result register. Done this way, there is NO FORWARDING needed
for the read of the ST.data.
<
I, for one, do not consider reading the register data to be stored
"a problem" worthy of an ISA-level "solution". Let the HW guys tell
you what to do here. HP even got a patent on ST pipeline design
(circa 1986). Follow (or at least READ) this patent before you do ISA
design.
<
There are a lot of things I don't like about RISC-V, but STs and LDs
should be symmetrical {and OpCodes should be at the most significant
parts of the container.}
<
> if one wanted to remain pure there, array accesses
> could be done via a LEA instruction which does
>
> Ra = Rb + Rc << scale
<
My 66000 has:
<
Rd = Rb + Ri<<scale + Disp // +'s are unsigned here; Disp optional
<
as its LEA instruction.
>
> followed by a load or store, leading to two instructions for an
> array access to anything other than a byte instead of three.

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<sqh93u$f9u$2@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=22552&group=comp.arch#22552

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
Date: Wed, 29 Dec 2021 09:17:50 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sqh93u$f9u$2@newsreader4.netcologne.de>
References: <sqcer1$8hj$1@newsreader4.netcologne.de>
<2021Dec27.190209@mips.complang.tuwien.ac.at>
<sqdfnv$dt$1@newsreader4.netcologne.de>
<2021Dec28.113610@mips.complang.tuwien.ac.at>
<sqg2bl$o6l$1@newsreader4.netcologne.de>
<760bf064-764f-4b00-b7f9-bdf5448f6bc5n@googlegroups.com>
Injection-Date: Wed, 29 Dec 2021 09:17:50 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:eb03:0:7285:c2ff:fe6c:992d";
logging-data="15678"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Wed, 29 Dec 2021 09:17 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:
> On Tuesday, December 28, 2021 at 4:16:24 PM UTC-6, Thomas Koenig wrote:

>> An indexed store actually accesses three registers, so
><
> Err, not what you think:
><
> An indexed store is allowed to access the register containing data to
> be stored AFTER LD-Align (nominally the WRITE-Back stage). This
> alleviates register read pressure, eliminates large numbers of
> flip-flops in the pipeline, and simplifies pipeline design. I, personally,
> have never had problems "reading the write slot", that is, reading the
> register file in the clock period where one would normally be writing
> the result register. Done this way, there is NO FORWARDING needed
> for the read of the ST.data.

Thanks for the explanation.

> I, for one, do not consider reading the register data to be stored
> "a problem" worthy of an ISA-level "solution". Let the HW guys tell
> you what to do here. HP even got a patent on ST pipeline design
> (circa 1986). Follow (or at least READ) this patent before you do ISA
> design.

I would love to.

Do you have any more information that would make it easier to find,
for example a number, inventor's name, or... ?

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<sqk58i$kfn$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=22603&group=comp.arch#22603

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
Date: Thu, 30 Dec 2021 12:30:26 +0100
Organization: A noiseless patient Spider
Lines: 63
Message-ID: <sqk58i$kfn$1@dont-email.me>
References: <sqcer1$8hj$1@newsreader4.netcologne.de>
<2021Dec27.190209@mips.complang.tuwien.ac.at>
<b3a08d16-c310-448b-b8ad-18dc5e2acdc1n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 30 Dec 2021 11:30:26 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="6016acade3692638d873f399288fb852";
logging-data="20983"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/crsqeGJYVzUNZeIiQjkDy58QhqHEA7aw="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Cancel-Lock: sha1:KxpBIVjZD0dpTrEwRHQrIpK0Q5M=
In-Reply-To: <b3a08d16-c310-448b-b8ad-18dc5e2acdc1n@googlegroups.com>
Content-Language: en-US
 by: Marcus - Thu, 30 Dec 2021 11:30 UTC

On 2021-12-27, MitchAlsup wrote:
> On Monday, December 27, 2021 at 12:51:20 PM UTC-6, Anton Ertl wrote:
>> Thomas Koenig <tko...@netcologne.de> writes:
>>> Triggered by this post
>>>
>>> https://gmplib.org/list-archives/gmp-devel/2021-September/006013.html
>>>
>>> (where somebody discusses, in rather stringent terms, the performance
>>> of RISC-V for multiword addition in GMP), I wonder:
>>>
>>> RISC-V has (from that post) the instruction sequence
>>>
>>>
>>> add t0, a4, a6 // add low words
>>> sltu t6, t0, a4 // compute carry-out from low add
>> That's only if you don't have a carry-in (i.e., a half-adder). For a
>> full-adder, it's AFAIK a four-instruction idiom (hmm, the link you
>> give uses five instructions for the full addition that follows the
>> half addition at the start).
>>
>> The original MIPS and Alpha also needed these sequences; AFAIK
>> recently MIPS has added a carry flag to make this stuff more
>> efficient.
>>> so one would need to wait for the result of the add to compare
>>> to generate the carry.
>>>
>>> Would it not be better to have an "add and generate the carry only"
>>> instruction instead, one could write, hypothetically,
>>>
>>> add t0, a4, a6
>>> addc t6, a4, a6
>> That helps a little bit. But I think the RISC-V answer is that
>> instead of adding that, they fuse the add and the sltu to produce the
>> two results in one cycle, without needing to add a new instruction.
> <
> Still not as efficient as My 66000:
> <
> CARRY R16,{{I}{IO}{IO}{O}}
> ADD R12,R4,R8 // carry Out only
> ADD R13,R5,R9 // Carry In and Out
> ADD R14,R6,R10 // Carry In and Out
> ADD R15,R7,R11 // Carry In only

What are the interrupt semantics here? Is the entire carry chain treated
as an uninterruptible sequence?

>>
>> The full-adder overhead is more worrisome; one could also do that with
>> instruction fusion, but are they going to do it, and how well will it
>> work?
>>
>> In the thread starting at
>> <2021Mar1...@mips.complang.tuwien.ac.at> we have discussed adding
>> an extra bit to each register to allow an efficient add-with-carry
>> (and other things), without adding a special-purpose flag register to
>> the architecture (which is against MIPS-style architectural
>> principles).
>>
>> - anton
>> --
>> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
>> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
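
For readers following along, the add/sltu idiom quoted above can be written out in C. This is only a sketch under assumptions: 64-bit limbs stored least significant first, and the names add_half, add_full and add_limbs are made up for illustration (they are not GMP's API). The half adder is one add plus one unsigned compare; the full adder shown here takes five operations (add, compare, add, compare, or), which is one way to arrive at the five-instruction count mentioned for the linked post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half adder: one add plus one unsigned compare (add + sltu on RISC-V). */
static uint64_t add_half(uint64_t a, uint64_t b, unsigned *carry_out)
{
    uint64_t sum = a + b;   /* unsigned add wraps modulo 2^64 */
    *carry_out = sum < a;   /* it wrapped iff the sum is below an operand */
    return sum;
}

/* Full adder: add, compare, add, compare, or. */
static uint64_t add_full(uint64_t a, uint64_t b, unsigned carry_in,
                         unsigned *carry_out)
{
    assert(carry_in <= 1);
    uint64_t t = a + b;
    unsigned c1 = t < a;            /* carry from a + b */
    uint64_t sum = t + carry_in;
    unsigned c2 = sum < t;          /* carry from adding the carry-in */
    *carry_out = c1 | c2;           /* at most one of c1, c2 can be set */
    return sum;
}

/* r = x + y over n limbs, least significant limb first.
   Returns the carry out of the most significant limb. */
static unsigned add_limbs(uint64_t *r, const uint64_t *x,
                          const uint64_t *y, size_t n)
{
    unsigned carry = 0;
    if (n == 0)
        return 0;
    r[0] = add_half(x[0], y[0], &carry);
    for (size_t i = 1; i < n; i++)
        r[i] = add_full(x[i], y[i], carry, &carry);
    return carry;
}
```

Each compare needs the sum produced just before it, which is the dependency Thomas Koenig's original question is about.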

Re: Multi-word addition with a strict Ra = Rb op Rc ISA

<e7b268cb-6b0c-4b02-9ce7-6abdcaf307cdn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22615&group=comp.arch#22615

X-Received: by 2002:a05:620a:b0e:: with SMTP id t14mr22448455qkg.146.1640887996008;
Thu, 30 Dec 2021 10:13:16 -0800 (PST)
X-Received: by 2002:a05:6808:1914:: with SMTP id bf20mr24939094oib.7.1640887995791;
Thu, 30 Dec 2021 10:13:15 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 30 Dec 2021 10:13:15 -0800 (PST)
In-Reply-To: <sqk58i$kfn$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:7c95:1043:36c7:2208;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:7c95:1043:36c7:2208
References: <sqcer1$8hj$1@newsreader4.netcologne.de> <2021Dec27.190209@mips.complang.tuwien.ac.at>
<b3a08d16-c310-448b-b8ad-18dc5e2acdc1n@googlegroups.com> <sqk58i$kfn$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e7b268cb-6b0c-4b02-9ce7-6abdcaf307cdn@googlegroups.com>
Subject: Re: Multi-word addition with a strict Ra = Rb op Rc ISA
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Thu, 30 Dec 2021 18:13:15 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 67
 by: MitchAlsup - Thu, 30 Dec 2021 18:13 UTC

On Thursday, December 30, 2021 at 5:30:28 AM UTC-6, Marcus wrote:
> On 2021-12-27, MitchAlsup wrote:
> > On Monday, December 27, 2021 at 12:51:20 PM UTC-6, Anton Ertl wrote:
> >> Thomas Koenig <tko...@netcologne.de> writes:
> >>> Triggered by this post
> >>>
> >>> https://gmplib.org/list-archives/gmp-devel/2021-September/006013.html
> >>>
> >>> (where somebody discusses, in rather stringent terms, the performance
> >>> of RISC-V for multiword addition in GMP), I wonder:
> >>>
> >>> RISC-V has (from that post) the instruction sequence
> >>>
> >>>
> >>> add t0, a4, a6 // add low words
> >>> sltu t6, t0, a4 // compute carry-out from low add
> >> That's only if you don't have a carry-in (i.e., a half-adder). For a
> >> full-adder, it's AFAIK a four-instruction idiom (hmm, the link you
> >> give uses five instructions for the full addition that follows the
> >> half addition at the start).
> >>
> >> The original MIPS and Alpha also needed these sequences; AFAIK
> >> recently MIPS has added a carry flag to make this stuff more
> >> efficient.
> >>> so one would need to wait for the result of the add to compare
> >>> to generate the carry.
> >>>
> >>> Would it not be better to have an "add and generate the carry only"
> >>> instruction instead, one could write, hypothetically,
> >>>
> >>> add t0, a4, a6
> >>> addc t6, a4, a6
> >> That helps a little bit. But I think the RISC-V answer is that
> >> instead of adding that, they fuse the add and the sltu to produce the
> >> two results in one cycle, without needing to add a new instruction.
> > <
> > Still not as efficient as My 66000:
> > <
> > CARRY R16,{{I}{IO}{IO}{O}}
> > ADD R12,R4,R8 // carry Out only
> > ADD R13,R5,R9 // Carry In and Out
> > ADD R14,R6,R10 // Carry In and Out
> > ADD R15,R7,R11 // Carry In only
> What are the interrupt semantics here?
<
Interrupts are precise.
<
> Is the entire carry chain treated
> as an uninterruptible sequence?
<
No.
<
> >>
> >> The full-adder overhead is more worrisome; one could also do that with
> >> instruction fusion, but are they going to do it, and how well will it
> >> work?
> >>
> >> In the thread starting at
> >> <2021Mar1...@mips.complang.tuwien.ac.at> we have discussed adding
> >> an extra bit to each register to allow an efficient add-with-carry
> >> (and other things), without adding a special-purpose flag register to
> >> the architecture (which is against MIPS-style architectural
> >> principles).
> >>
> >> - anton
> >> --
> >> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> >> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
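
As a side note on the hypothetical "addc" quoted above (add and generate the carry only): the carry out of a two-operand add can be computed from the operands alone, without waiting for the sum. A small C sketch, assuming 64-bit operands; carry_only and carry_via_sum are illustrative names, not instructions or functions from any ISA or library discussed in this thread.

```c
#include <stdint.h>

/* Carry-out of a + b from the operands alone. The add wraps exactly when
   b exceeds the headroom above a, and ~a equals UINT64_MAX - a. */
static unsigned carry_only(uint64_t a, uint64_t b)
{
    return b > ~a;
}

/* The sltu-style formulation for comparison: it has to wait for the sum. */
static unsigned carry_via_sum(uint64_t a, uint64_t b)
{
    uint64_t sum = a + b;   /* wraps modulo 2^64 */
    return sum < a;
}
```

Both functions return the same value for every pair of inputs. The first has no data dependency on the add, so the sum and the carry could be produced in parallel; this covers only the no-carry-in case, not the full adder.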
