devel / comp.arch / Re: Extended double precision decimal floating point

Re: IBM features, Extended double precision decimal floating point

<595cf191-fa52-45bd-9659-8b9994ef6ea5n@googlegroups.com>

 by: MitchAlsup - Mon, 25 Apr 2022 20:56 UTC

On Monday, April 25, 2022 at 3:28:13 PM UTC-5, robf...@gmail.com wrote:
> On Monday, April 25, 2022 at 3:59:50 PM UTC-4, Thomas Koenig wrote:
> > John Levine <jo...@taugh.com> schrieb:
> > > According to Anton Ertl <an...@mips.complang.tuwien.ac.at>:
> > >>Quadibloc <jsa...@ecn.ab.ca> writes:
> > >>>However, I assume applications for decimal floating-point do exist,
> > >>>even if I don't know of them, or IBM would never have come up with
> > >>>it.
> > >>
> > >>IBM's application is to have a USP for their expensive machines that
> > >>their salesmen and buyers (however they were convinced to buy these
> > >>machines) can point to as a justification for the decision to buy
> > >>these machines.
> > >
> > > New models of the z series have had some oddly specific additions,
> > > like the DEFLATE instruction that does the inner part of gzip
> > This is something they had previously as "zEnterprise Data
> > Compression". Apparently, you can use this from Java (among
> > others).
> > > and
> > > the digital signature instruction that does elliptic curve signing
> > > and verifying. I realize those are both reasonably common in web
> > > servers but it'd surprise me if they were enough of a bottleneck
> > > to merit putting them in microcode.
> > Seems like it is possible to create a compressed data set on
> > MVS^H^H^HzOs. If so, it makes sense to have the compression/
> > decompression as fast as possible, not to lose (or even gain)
> > speed.
> >
> > As for elliptic cryptography... if you want high-security
> > data transfer, that's what you need.
> > >Ditto vector packed decimal
> > > and the heapsort instructions we argued about a while ago.
> > zSystem has to fight to stay relevant, they are probably
> > going for each advantage they can throw hardware at.
> > > Beyond the question of putting them in the instruction set,
> > > what application code uses them? I can see that it wouldn't
> > > be too hard to get a web browser to use DEFLATE and elliptic
> > > crypto, but what would use vector decimal? I don't think
> > > parallel COBOL is a thing.
> > Possibly to speed up SAP by half a percent?
<
> I am getting the impression that my time would be better spent working
> on fixed point BCD arithmetic primitives. It would probably be better to
> have those in the ISA than DFP. For a BCD format 128 bits could be used
> for 36 significant digits packed into 120 bits. Then the topmost bit would
> be a sign bit. That leaves six bits which could be used to record the
> decimal location, plus one extra bit. Any use for the extra bit?
> 1 – bit sign
> 1 – bit extra
> 6 – bits decimal point location
> 120 – bits DPD 36 digits
<
You can ALSO consider that, unlike binary integers, decimal data tends to be
used once rather than several times. {Binary integers are used 1.2 times on
average, whereas decimal fixed point values tend to be used closer to 1.03
times on average.} This practically eliminates the need for a register <size>
large enough to contain one of them.
<
Then, since these are essentially use-once, you might just as well leave them
as a string in memory of known length {no DPD reformatting is actually
required, but you can do it in your architecture if you so choose}. Left
native, conversion to and from ASCII is easy.
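
A minimal Python sketch of what arithmetic on such in-memory digit strings could look like, assuming unsigned ASCII digit strings of known length. The function name add_decimal_strings is hypothetical and only illustrates the idea; it is not taken from any ISA discussed here.

```python
def add_decimal_strings(a: str, b: str) -> str:
    """Add two unsigned decimal numbers stored as ASCII digit strings."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)   # pad the shorter operand with zeros
    carry = 0
    out = []
    # Walk from the least significant digit, as a memory-to-memory
    # decimal add would.
    for da, db in zip(reversed(a), reversed(b)):
        s = (ord(da) - ord("0")) + (ord(db) - ord("0")) + carry
        carry, digit = divmod(s, 10)
        out.append(chr(ord("0") + digit))
    if carry:
        out.append("1")
    return "".join(reversed(out))


print(add_decimal_strings("1234567", "89"))
```

No binary conversion happens at any point; each digit is touched once, which fits the use-once pattern described above.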
<
But I want to comment on something BGB wrote above::
<
When defining various flavors of fixed point data, I prefer the first number to
specify the total number of digits and the second number to specify the
number of digits to the right of the decimal point. {My notation also prepends
an S (signed) or U (unsigned) marker at the front.}
<
S9.2 has 9 total digits, 2 to the right of the decimal point {±nnnnnnn.nn}
S9.12 has 9 total digits, the last of them 12 places to the right of the
decimal point, representing the values {±0.000,nnn,nnn,nnn}
S9.-3 has values in the range {±nnn,nnn,nnn,000}
<
Except for special values, all the Sx.y need 1 more bit than Ux.y
<
This notation makes it easier to define multiplication::
<
S7.5 × S8.2 -> S(7+8).(5+2) = S15.7
<
It also eases addition::
<
S7.5 + S8.2 -> S(max(7-5,8-2)+max(5,2)+1).(max(5,2)) = S12.5 // overflow free
or
S7.5 + S8.2 -> S(max(7-5,8-2)+max(5,2)).(max(5,2)) = S11.5 // can overflow
<
But does nothing for division.
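
A small Python sketch of how a compiler might track these formats, reading the first number as the total digit count as defined above. Under that reading, an overflow-free sum needs the larger integer-digit count, plus the larger fraction-digit count, plus one carry digit. The names Fmt, mul_fmt and add_fmt are hypothetical.

```python
from typing import NamedTuple


class Fmt(NamedTuple):
    signed: bool
    digits: int  # total number of decimal digits
    frac: int    # digits to the right of the decimal point (may be negative)

    def __str__(self) -> str:
        return f"{'S' if self.signed else 'U'}{self.digits}.{self.frac}"


def mul_fmt(a: Fmt, b: Fmt) -> Fmt:
    """Exact product: digit counts and fraction counts both add."""
    return Fmt(a.signed or b.signed, a.digits + b.digits, a.frac + b.frac)


def add_fmt(a: Fmt, b: Fmt) -> Fmt:
    """Overflow-free sum: align the decimal points, keep one carry digit."""
    frac = max(a.frac, b.frac)
    int_digits = max(a.digits - a.frac, b.digits - b.frac)
    return Fmt(a.signed or b.signed, int_digits + frac + 1, frac)


s75 = Fmt(True, 7, 5)
s82 = Fmt(True, 8, 2)
print(mul_fmt(s75, s82), add_fmt(s75, s82))
```

Because the decimal point lives only in the format descriptor, none of this bookkeeping has to exist at run time.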

Re: IBM features, Extended double precision decimal floating point

<t4754j$i4f$1@dont-email.me>

 by: BGB - Mon, 25 Apr 2022 21:51 UTC

On 4/25/2022 3:28 PM, robf...@gmail.com wrote:
> I am getting the impression that my time would be better spent working
> on fixed point BCD arithmetic primitives. It would probably be better to
> have those in the ISA than DFP. For a BCD format 128 bits could be used
> for 36 significant digits packed into 120 bits. Then the topmost bit would
> be a sign bit. That leaves six bits which could be used to record the
> decimal location, plus one extra bit. Any use for the extra bit?
> 1 – bit sign
> 1 – bit extra
> 6 – bits decimal point location
> 120 – bits DPD 36 digits

This seems like a possibility, though for decimal fixed-point, one
doesn't really need to store the decimal-point per-se, as it can be
managed entirely by the compiler (so doesn't really need to exist at
runtime).

I have realized that there is a potentially viable algorithm for
converting Binary to BCD in hardware:
Do a BCD ADD, adding the register to itself, for N cycles;
At each cycle, use the next input bit (MSB first) as a Carry-In bit for the ADD.

The output would be initialized as 0, but would be "filled in" as the
value is drip-fed via the carry bit.

In hardware, it could conceivably be done in ~ 66 clock cycles for Int64
to BCD64, or ~ 130 cycles for Int128 (but this would require effectively
doing units on 2 lanes and then ganging them).

This could also be done in software as well.

In software (assuming I add a ROTCLS instruction, and/or move BCDADC
back to SR.T), it would be ~ 192 cycles for Int64/BCD64, or 640 cycles
for Int128/BCD128 (assuming it were fully unrolled).

Eg, new op, per bit:
CLRS
ROTCLS R4 //Shift R4 left 1 bit, copying MSB into SR.S
BCDADC R2, R2

Or, 128-bit:
CLRS
ROTCLS R4 //Shift R4 left 1 bit, copying MSB into SR.S
ROTCLS R5 //Shift R5 left 1 bit, copying MSB into SR.S
BCDADC R2, R2
BCDADC R3, R3

Or, if I move BCDADC back to SR.T, per bit:
CLRT
ROTCL R4 //Shift R4 left 1 bit, copying MSB into SR.T
BCDADC R2, R2

Or, 128-bit:
CLRT
ROTCL R4 //Shift R4 left 1 bit, copying MSB into SR.T
ROTCL R5 //Shift R5 left 1 bit, copying MSB into SR.T
BCDADC R2, R2
BCDADC R3, R3

It is likely that it could be unrolled for 8 or so iterations (then just
repeat the loop 8 or 16 times).

Unclear is how performance would compare with the other approaches...

It does, however, avoid any need to perform large multiplies to break
apart the digits.
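
The doubling-with-carry conversion described above can be modelled in a few lines of Python. This is only a behavioural sketch: bcd_adc stands in for a BCDADC-style instruction, binary_to_bcd is a hypothetical name, and the 20-digit default width is an assumption chosen so that any 64-bit input fits.

```python
def bcd_adc(a: int, b: int, carry: int, ndigits: int):
    """Add two packed-BCD values digit by digit with carry-in.

    Returns (sum, carry_out). Each 4-bit nibble holds one decimal digit.
    """
    result = 0
    for i in range(ndigits):
        da = (a >> (4 * i)) & 0xF
        db = (b >> (4 * i)) & 0xF
        s = da + db + carry
        carry = 1 if s >= 10 else 0
        if carry:
            s -= 10
        result |= s << (4 * i)
    return result, carry


def binary_to_bcd(value: int, nbits: int = 64, ndigits: int = 20) -> int:
    """Convert binary to packed BCD by repeated BCD doubling.

    Each step doubles the BCD accumulator (acc + acc) and feeds the next
    input bit, MSB first, in as the carry.
    """
    bcd = 0
    for i in range(nbits - 1, -1, -1):
        bit = (value >> i) & 1
        bcd, _ = bcd_adc(bcd, bcd, bit, ndigits)
    return bcd


print(hex(binary_to_bcd(1234)))
```

One BCD add per input bit is all that is needed; there are no multiplies or divides anywhere in the loop.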

BCD -> Binary conversion could be done by multiplying the destination by
10 and adding each digit.

This is a case where conceivably LEA could be useful (can do a multiply
by 5 in 1 clock cycle), except that it only works on the low 48 bits
(N/A for 64 or 128 bit values).

Shift+ADD also works, just is slightly less efficient. However,
Shift+ADD is still faster than using generic multiply instructions.
SHLDX R4, 3, R2 //R3:R2 = R5:R4 * 8
SHLDX R4, 1, R16 //R17:R16 = R5:R4 * 2
ADDX R2, R16, R2 //8+2=10

Or, as more relevant for BCD conversion:
SHLDX R2, 3, R16 //R17:R16 = R3:R2 * 8
SHLDX R4, -124, R20 //Get high digit
SHLDX R2, 1, R18 //R19:R18 = R3:R2 * 2
ADDX R16, R20, R16 //Add high digit
SHLDX R4, 4, R4 //Shift input left 1 digit
ADDX R18, R16, R2 //8+2=10

This being the sequence for BCD128 -> Int128 conversion.
Would execute the above sequence 32 times (once per BCD digit).

Best case, ~ 192 clock cycles. Note that the instructions were ordered
to minimize interlock penalties (these instructions have a 2 cycle latency).
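
For reference, the same multiply-by-10-and-add loop in Python, with the multiply expressed as the shift-and-add pair (x*8 + x*2) used in the listing above. The name bcd_to_binary and the explicit 128-bit mask are assumptions made for the sketch.

```python
def bcd_to_binary(bcd: int, ndigits: int = 32) -> int:
    """Convert packed BCD (ndigits nibbles) to a binary integer."""
    width_mask = (1 << (4 * ndigits)) - 1
    result_mask = (1 << 128) - 1          # model a 128-bit register pair
    acc = 0
    for _ in range(ndigits):
        digit = (bcd >> (4 * (ndigits - 1))) & 0xF    # get high digit
        acc = ((acc << 3) + (acc << 1) + digit) & result_mask  # acc*10 + digit
        bcd = (bcd << 4) & width_mask                 # shift input left 1 digit
    return acc


print(bcd_to_binary(0x1234))
```

The loop body runs once per BCD digit, matching the 32 iterations mentioned above for a 128-bit input.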

....

Though, in any case, these particular BCD ops aren't going to be exactly
"high performance"...

It will be something that will be OK for ADD/SUB, but format-conversion
(casting to/from binary integer types), MUL/DIV, ... are still "gonna suck".

Re: Extended double precision decimal floating point

<02551c9b-af63-42ab-bf79-7c52801dfc66n@googlegroups.com>

 by: Quadibloc - Mon, 25 Apr 2022 22:20 UTC

On Monday, April 25, 2022 at 10:30:10 AM UTC-6, Ivan Godard wrote:

> I think you mistook my post. OP asserted that the *only* use for DFP was
> spreadsheets; I offered counterexamples. My point was that there are
> other uses for DFP, not that billing and payroll are done on spreadsheets.

However, given that people did billing and payroll back before DFP was
invented, and fixed-point decimal arithmetic suited those applications just
fine, *I* am still in doubt about the validity of your point.

John Savard

Re: IBM features, Extended double precision decimal floating point

<53ec612a-8fc1-4507-9fe5-080a0e71f2d1n@googlegroups.com>

 by: Quadibloc - Mon, 25 Apr 2022 22:22 UTC

On Monday, April 25, 2022 at 1:25:57 PM UTC-6, John Dallman wrote:
> In article <t46mg6$27fl$1...@gal.iecc.com>, jo...@taugh.com (John Levine)
> wrote:

> > but what would use vector decimal? I don't think parallel COBOL
> > is a thing.

> Bulk database updates? DB/2 can use decimal formats, as can many
> databases.

In which case, perhaps vector COBOL _should_ be a thing, so that they
don't have to keep writing DB/2 in assembler language.

John Savard

Re: IBM features, Extended double precision decimal floating point

<t47icp$2lno$1@gal.iecc.com>

 by: John Levine - Tue, 26 Apr 2022 01:37 UTC

According to Thomas Koenig <tkoenig@netcologne.de>:
>> New models of the z series have had some oddly specific additions,
>> like the DEFLATE instruction that does the inner part of gzip
>
>This is something they had previously as "zEnterprise Data
>Compression". Apparently, you can use this from Java (among
>others).

There aren't very many deflate/gzip libraries so I would hope they'd
have versions of the libraries that use the instruction, but I haven't
been sufficiently interested to go look.

>> and the digital signature instruction that does elliptic curve signing
>> and verifying. I realize those are both reasonably common in web
>> servers but it'd surprise me if they were enough of a bottleneck
>> to merit putting them in microcode.
>
>Seems like it is possible to create a compressed data set on
>MVS^H^H^HzOs.

Possibly, but HTTP potentially does deflate encoding for every request.
If you're using your computer as a web server, you're going to do
a whole lot of deflating.

>As for elliptic cryptography... if you want high-security
>data transfer, that's what you need.

This does signing and verification. Encryption is a different feature.
I'm pretty sure this is for TLS.

>> Beyond the question of putting them in the instruction set,
>> what application code uses them? I can see that it wouldn't
>> be too hard to get a web browser to use DEFLATE and elliptic
>> crypto, but what would use vector decimal? I don't think
>> parallel COBOL is a thing.
>
>Possibly to speed up SAP by half a percent?

I suppose, but as always, what's the tradeoff between this special
feature and something general like a bigger cache?

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: IBM features, Extended double precision decimal floating point

<t47ifm$2lno$2@gal.iecc.com>

 by: John Levine - Tue, 26 Apr 2022 01:39 UTC

According to MitchAlsup <MitchAlsup@aol.com>:
>You can ALSO consider: that unlike binary integers, the uses of decimal
>data tends a lot more to singular than multiple. {Whereas binary integers
>are used 1.2 times on average--decimal fixed point tend to be used closer
>to 1.03 times on average). This practically eliminates a need for there to
>be a register <size> available to contain one of them.

That's not surprising. The rule of thumb used to be that CVB and CVD were a
win if you were going to do two arithmetic operations.


Re: IBM features, Extended double precision decimal floating point

<t47vtk$ub6$1@newsreader4.netcologne.de>

 by: Thomas Koenig - Tue, 26 Apr 2022 05:28 UTC

robf...@gmail.com <robfi680@gmail.com> schrieb:

> I am getting the impression that my time would be better spent working
> on fixed point BCD arithmetic primitives. It would probably be better to
> have those in the ISA than DFP. For a BCD format 128 bits could be used
> for 36 significant digits packed into 120 bits. Then the topmost bit would
> be a sign bit. That leaves six bits which could be used to record the
> decimal location, plus one extra bit. Any use for the extra bit?
> 1 – bit sign
> 1 – bit extra
> 6 – bits decimal point location
> 120 – bits DPD 36 digits

You described a decimal floating point with more digits and
less exponent range than the ones currently in use.
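
The arithmetic behind the proposed layout can be checked with a small sketch: 36 digits at three digits per 10-bit DPD declet is 12 declets, i.e. 120 bits. The code below only packs and unpacks the four fields and treats the 120-bit coefficient as opaque; the sign in the topmost bit is from the proposal, while the order of the extra bit and the point field is an assumption made here for illustration.

```python
COEFF_BITS = 120
COEFF_MASK = (1 << COEFF_BITS) - 1
DECLETS = 36 // 3                      # three decimal digits per declet


def pack(sign: int, extra: int, point: int, coeff: int) -> int:
    """Build the 128-bit word: 1 sign, 1 extra, 6 point, 120 coefficient."""
    assert sign in (0, 1) and extra in (0, 1)
    assert 0 <= point < 64 and 0 <= coeff <= COEFF_MASK
    return (sign << 127) | (extra << 126) | (point << 120) | coeff


def unpack(word: int):
    """Split a 128-bit word back into (sign, extra, point, coeff)."""
    return ((word >> 127) & 1, (word >> 126) & 1,
            (word >> 120) & 0x3F, word & COEFF_MASK)
```

With a 6-bit point field the decimal point can sit at any of 64 positions, which is the "less exponent range" part of the observation.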

Re: IBM features, Extended double precision decimal floating point

<t483n3$hf$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=24916&group=comp.arch#24916

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd4-f179-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: IBM features, Extended double precision decimal floating point
Date: Tue, 26 Apr 2022 06:33:07 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <t483n3$hf$1@newsreader4.netcologne.de>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<2022Apr25.083641@mips.complang.tuwien.ac.at> <t46mg6$27fl$1@gal.iecc.com>
<t46ujj$7n3$1@newsreader4.netcologne.de> <t47icp$2lno$1@gal.iecc.com>
Injection-Date: Tue, 26 Apr 2022 06:33:07 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd4-f179-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd4:f179:0:7285:c2ff:fe6c:992d";
logging-data="559"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 26 Apr 2022 06:33 UTC

John Levine <johnl@taugh.com> schrieb:
> According to Thomas Koenig <tkoenig@netcologne.de>:
>>> New models of the z series have had some oddly specific additions,
>>> like the DEFLATE instruction that does the inner part of gzip
>>
>>This is something they had previously as "zEnterprise Data
>>Compression". Apparently, you can use this from Java (among
>>others).
>
> There aren't very many deflate/gzip libraries so I would hope they'd
> have versions of the libraries that use the instruction, but I haven't
> been sufficiently interested to go look.

zlib is also using this.

>>> and the digital signature instruction that does elliptic curve signing
>>> and verifying. I realize those are both reasonably common in web
>>> servers but it'd surprise me if they were enough of a bottleneck
>>> to merit putting them in microcode.
>>
>>Seems like it is possible to create a compressed data set on
>>MVS^H^H^HzOs.
>
> Possibly, but http potentially does deflate encoding for every request.
> If you're using your computer as a web server, you're going to do
> a whole lot of deflating.

Sure, for a given computer, it could be useful both for file
compression and for http decoding.

Given the huge disparity in price between a computer based on an
Intel or AMD CPU and a zSystem, I simply doubt that zSystems are
much used as web servers, where their high reliability (usually)
does not bring advantages to offset the higher cost.

Going through large amounts of data, though, seems like
something that is right up a mainframe's alley.

POWER systems running AIX can also have hardware support for
gzip, but they have it on an accelerator, not as
part of the CPU instructions.

>>As for elliptic cryptography... if you want high-security
>>data transfer, that's what you need.
>
> This does signing and verification. Encryption is a different feature.
> I'm pretty sure this is for TLS.
>
>>> Beyond the question of putting them in the instruction set,
>>> what application code uses them? I can see that it wouldn't
>>> be too hard to get a web browser to use DEFLATE and elliptic
>>> crypto, but what would use vector decimal? I don't think
>>> parallel COBOL is a thing.
>>
>>Possibly to speed up SAP by half a percent?
>
> I suppose, but as always, what's the tradeoff between this special
> feature and something general like a bigger cache?

Obviously something that IBM felt worthwhile doing for their
customers.

Re: IBM features, Extended double precision decimal floating point

<t48d3n$fe7$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24917&group=comp.arch#24917

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: IBM features, Extended double precision decimal floating point
Date: Tue, 26 Apr 2022 04:13:22 -0500
Organization: A noiseless patient Spider
Lines: 161
Message-ID: <t48d3n$fe7$1@dont-email.me>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<2022Apr25.083641@mips.complang.tuwien.ac.at> <t46mg6$27fl$1@gal.iecc.com>
<t46ujj$7n3$1@newsreader4.netcologne.de> <t47icp$2lno$1@gal.iecc.com>
<t483n3$hf$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 26 Apr 2022 09:13:27 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="2b207abcb286b10df31b545ad24e9b45";
logging-data="15815"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+zHNz4ysTFV9moOZbWZ0gt"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.8.0
Cancel-Lock: sha1:KCAIs+eqAtxqII3X3PGjKVcR2no=
In-Reply-To: <t483n3$hf$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: BGB - Tue, 26 Apr 2022 09:13 UTC

On 4/26/2022 1:33 AM, Thomas Koenig wrote:
> John Levine <johnl@taugh.com> schrieb:
>> According to Thomas Koenig <tkoenig@netcologne.de>:
>>>> New models of the z series have had some oddly specific additions,
>>>> like the DEFLATE instruction that does the inner part of gzip
>>>
>>> This is something they had previously as "zEnterprise Data
>>> Compression". Apparently, you can use this from Java (among
>>> others).
>>
>> There aren't very many deflate/gzip libraries so I would hope they'd
>> have versions of the libraries that use the instruction, but I haven't
>> been sufficiently interested to go look.
>
> zlib is also using this.
>
>>>> and the digital signature instruction that does elliptic curve signing
>>>> and verifying. I realize those are both reasonably common in web
>>>> servers but it'd surprise me if they were enough of a bottleneck
>>>> to merit putting them in microcode.
>>>
>>> Seems like it is possible to create a compressed data set on
>>> MVS^H^H^HzOs.
>>
>> Possibly, but http potentially does deflate encoding for every request.
>> If you're using your computer as a web server, you're going to do
>> a whole lot of deflating.
>
> Sure, for a given computer, it could be useful both for file
> compression and for http decoding.
>
> Given the huge disparity in price between a computer based on an
> Intel or AMD CPU and a zSystem, I simply doubt that zSystems are
> much used as web servers, where their high reliability (usually)
> does not bring advantages to offset the higher cost.
>
> Going through large amounts of data, though, seems like
> something that is right up a mainframe's alley.
>
> POWER systems running AIX can also have hardware support for
> gzip, but they have it on an accelerator, not as
> part of the CPU instructions.
>

Does raise the question if there would be a good set of ISA extensions
to help with codec tasks.

Huffman would be an obvious place to look, but generally this is more
bottlenecked by L1 misses and similar than by the ISA itself.

Most likely option would be some sort of specialized associative cache
and/or special addressing mode intended specifically for dealing
with Huffman coded bitstreams.

For example, say, for decoding:
   Cache uses an (Rb+((Ri>>Rj)&32767)*2) addressing mode;
   Cache likely uses extra small cache lines (32 or 64 bit);
   Cache is read-only and set-associative;
   Probably addressed in terms of 16-bit words.
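
As a rough software picture of what that addressing mode would be doing: a table-driven Huffman decoder indexes a table of 16-bit entries with the next 15 bits of the stream. The sketch below assumes an LSB-first bitstream and an entry layout of 4 bits of code length plus 12 bits of symbol; both are choices made for this illustration, not something fixed by the list above.

```python
MAX_BITS = 15                          # matches the "& 32767" mask
TABLE_SIZE = 1 << MAX_BITS


def build_table(codes):
    """codes: symbol -> (code bits, LSB-first, length). Returns 16-bit entries."""
    table = [0] * TABLE_SIZE
    for sym, (bits, length) in codes.items():
        entry = (length << 12) | sym   # 4-bit length, 12-bit symbol
        # every index whose low `length` bits equal the code maps to this symbol
        for hi in range(1 << (MAX_BITS - length)):
            table[(hi << length) | bits] = entry
    return table


def decode(table, stream, nsyms):
    """stream: the bitstream held in one integer, first bit in bit 0."""
    out = []
    pos = 0
    for _ in range(nsyms):
        entry = table[(stream >> pos) & 0x7FFF]   # the (Ri>>Rj)&32767 step
        out.append(entry & 0xFFF)
        pos += entry >> 12                        # advance by the code length
    return out
```

Each decoded symbol costs one dependent load from a 64 KB table, which is where the L1 misses mentioned earlier come from.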

Maybe also a bit-hash instruction, eg:
   Hashes 24 bits down to 8, or similar;
   More relevant for encoders.
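
In software, such a hash is typically a multiply and a shift. A minimal sketch, assuming a multiplicative hash with a golden-ratio constant (the constant and the function name are illustrative choices, not a proposal for what the instruction would compute):

```python
def hash24to8(value: int) -> int:
    """Hash the low 24 bits of `value` down to an 8-bit table index."""
    value &= 0xFFFFFF
    # 32-bit multiply, then keep the top 8 bits of the product
    return ((value * 0x9E3779B1) & 0xFFFFFFFF) >> 24
```

An LZ encoder would use the result to index a small table holding the last position at which each 3-byte sequence was seen.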

....

OTOH, for a hardware-accelerated compressor, it seems almost like a
bitwise range-coder could make sense. Some of the stuff which makes
doing a bitwise range-coder suck in software is less relevant to a
hardware implementation.

But, this would be less like hardware-accelerated Deflate and more like
hardware accelerated LZMA or similar.

Though, a lot depends on the relative speed of the codec and the thing
it is connected to. Usually, if one can keep their codec, say, 5x-10x or
more faster than whatever IO device they are connected to, this is good
enough.
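
A back-of-the-envelope model of why the margin matters, assuming the read and the decode happen one after the other rather than overlapped (the function and its parameters are invented here for illustration):

```python
def effective_rate(io_rate: float, codec_rate: float, ratio: float) -> float:
    """Uncompressed bytes/s delivered when compressed data is read, then decoded.

    io_rate    -- device speed in compressed bytes/s
    codec_rate -- decoder speed in uncompressed bytes/s
    ratio      -- uncompressed size divided by compressed size
    """
    return 1.0 / (1.0 / (io_rate * ratio) + 1.0 / codec_rate)
```

With a 2:1 ratio and a codec 10x faster than the device, effective_rate(10, 100, 2) is about 16.7, a clear win over the raw 10. With a codec only as fast as the device, effective_rate(10, 10, 2) is about 6.7, slower than not compressing at all.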

Entropy coded formats make this hard though...

As can be noted, my current stack was mostly like:
   LZ4: Reasonably OK
   RP2: Does slightly better than LZ4 on average
     For both decode speed and ratio.
   Both LZ4 and RP2 being byte-oriented formats.
   FeLZ32: Faster than LZ4 and RP2, worse compression.
     FeLZ32 was DWORD oriented (both data and stream).
     (Originally, this format assumed aligned-only access)
   FeLZ64: Faster than FeLZ32, worse-still compression.
     FeLZ64 was QWORD oriented for data, WORD for stream.
     (This leverages having fast misaligned QWORD Load/Store)
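
To make "byte-oriented" concrete, here is a minimal sketch of a decoder for the LZ4 block format: a token byte holds the literal length in its high nibble and the match length minus 4 in its low nibble, followed by the literals and a 2-byte little-endian offset. It does no input validation and does not enforce LZ4's end-of-block restrictions; RP2 and the FeLZ formats are not shown, since their layouts are not described here.

```python
def lz4_block_decode(src: bytes) -> bytes:
    """Decode one raw LZ4 block (no frame header, no checks)."""
    out = bytearray()
    i = 0
    while i < len(src):
        token = src[i]
        i += 1
        # literal run length: high nibble, extended by following bytes if 15
        lit_len = token >> 4
        if lit_len == 15:
            while True:
                b = src[i]
                i += 1
                lit_len += b
                if b != 255:
                    break
        out += src[i:i + lit_len]
        i += lit_len
        if i >= len(src):
            break                      # the last sequence carries literals only
        offset = src[i] | (src[i + 1] << 8)
        i += 2
        # match length: low nibble plus 4, extended the same way
        match_len = (token & 0xF) + 4
        if (token & 0xF) == 15:
            while True:
                b = src[i]
                i += 1
                match_len += b
                if b != 255:
                    break
        # byte-at-a-time copy so overlapping matches (offset < length) work
        start = len(out) - offset
        for k in range(match_len):
            out.append(out[start + k])
    return bytes(out)
```

Everything is handled in whole bytes, so there is no bit buffer to maintain, which is a large part of why such formats decode quickly.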

My LZ4 and RP2 encoders are slower, though, than the LZ4 command-line tool
at "-1" or similar; I am not sure how the tool is as fast as it is. My
FeLZ encoders were faster than the lz4 command-line tool, though (on my
PC, the "lz4 -1" tool seemed to max out at around 600-700 MB/s for
encode; my "fast" encoders had a hard time getting much over 400
MB/s; decode speeds ~ 1800..3000 MB/s).

The FeLZ64 encoder was generally pulling off ~ 1200 MB/s.
Decode ~ 3800 MB/s for larger buffers, ~ 8000 MB/s for a small buffer.

FeLZ32 is sorta intermediate here.

Likewise, both FeLZ32 and FeLZ64 were faster than SDcard IO speeds on BJX2.
For loading from an SDcard, both RP2 and LZ4 do well, but my encoder was
too slow for pagefile compression (seems like ideally need to stay under
a limit of around "5 clock-cycles per byte" or so to get a useful speedup).

Not really using Deflate on BJX2, mostly because it is pretty
heavyweight compared with the others. Not a lot of cases where "slightly
better compression" justified "significantly slower".

I guess it does bring up a question of the relative performance range of
such an accelerator.

>>> As for elliptic cryptography... if you want high-security
>>> data transfer, that's what you need.
>>
>> This does signing and verification. Encryption is a different feature.
>> I'm pretty sure this is for TLS.
>>
>>>> Beyond the question of putting them in the instruction set,
>>>> what application code uses them? I can see that it wouldn't
>>>> be too hard to get a web browser to use DEFLATE and elliptic
>>>> crypto, but what would use vector decimal? I don't think
>>>> parallel COBOL is a thing.
>>>
>>> Possibly to speed up SAP by half a percent?
>>
>> I suppose, but as always, what's the tradeoff between this special
>> feature and something general like a bigger cache?
>
> Obviously something that IBM felt worthwhile doing for their
> customers.

Maybe they were at a point where bigger cache didn't buy much, so it was
better to spend it on something else...

In other random news, I have realized that my BCDADC instruction can be
used to, among other things, make "printf()" and friends a little faster
(mostly by making things like "%d" formatting conversions faster).

Mostly, because:
   It can be used to implement an integer to BCD conversion (in a way that
   does not depend on integer divide);
   Converting to BCD and then printing the BCD is faster than extracting
   digits one-at-a-time via integer division.
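
A sketch of the divide-free conversion, for illustration: feed the binary value in MSB first and, for each bit, compute bcd = bcd + bcd + bit using a BCD add-with-carry. The bcd_adc() below is a software model that assumes the instruction adds two packed-BCD operands plus a carry-in; the actual BCDADC operand width and flag handling are not given above, so the 16-digit width here is an assumption.

```python
def bcd_adc(a: int, b: int, carry: int, digits: int = 16):
    """Model of a packed-BCD add with carry-in; returns (sum, carry_out)."""
    result = 0
    for i in range(digits):
        d = ((a >> (4 * i)) & 0xF) + ((b >> (4 * i)) & 0xF) + carry
        carry = 1 if d > 9 else 0
        if carry:
            d -= 10                    # decimal adjust this digit
        result |= d << (4 * i)
    return result, carry


def int_to_bcd(value: int, bits: int = 32) -> int:
    """Binary -> packed BCD by repeated decimal doubling, no division."""
    bcd = 0
    for i in reversed(range(bits)):
        bit = (value >> i) & 1
        bcd, _ = bcd_adc(bcd, bcd, bit)   # bcd = 2*bcd + bit, in decimal
    return bcd
```

Printing is then a matter of emitting nibbles: format(int_to_bcd(12345), "x") gives "12345".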

Re: Extended double precision decimal floating point

<jwvee1kkoni.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=24918&group=comp.arch#24918

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Extended double precision decimal floating point
Date: Tue, 26 Apr 2022 08:36:31 -0400
Organization: A noiseless patient Spider
Lines: 9
Message-ID: <jwvee1kkoni.fsf-monnier+comp.arch@gnu.org>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<8f909c18-cfe1-4748-8478-543bac50fdbbn@googlegroups.com>
<980ae858-467d-4b0a-8ef8-4a3fe990dd69n@googlegroups.com>
<t432di$5l6$1@dont-email.me>
<5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com>
<t45m4t$1057$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="3dedf1fae152a824c4de45992a2604c0";
logging-data="8601"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18KMjnj3clnTgKSAXSBTFSX"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Cancel-Lock: sha1:tlM3xcbcelF3VFa/tBrD6YBKfyU=
sha1:SKVMopGwnGuCREbUTpCrxuyy3kk=
 by: Stefan Monnier - Tue, 26 Apr 2022 12:36 UTC

> This is the crux right here: Fixed-point decimal, using unsigned binary
> variables and external scale, will almost always be faster than both binary
> FP and decimal FP, but you need to hand-code (or get your compiler to do it
> for you?) every individual operation.

Why *unsigned* binary?

Stefan

Re: Extended double precision decimal floating point

<t48qqk$1bhh$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=24919&group=comp.arch#24919

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!T3F9KNSTSM9ffyC31YXeHw.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Extended double precision decimal floating point
Date: Tue, 26 Apr 2022 15:07:35 +0200
Organization: Aioe.org NNTP Server
Message-ID: <t48qqk$1bhh$1@gioia.aioe.org>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<8f909c18-cfe1-4748-8478-543bac50fdbbn@googlegroups.com>
<980ae858-467d-4b0a-8ef8-4a3fe990dd69n@googlegroups.com>
<t432di$5l6$1@dont-email.me>
<5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com>
<t45m4t$1057$1@gioia.aioe.org> <jwvee1kkoni.fsf-monnier+comp.arch@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="44593"; posting-host="T3F9KNSTSM9ffyC31YXeHw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.11.1
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Tue, 26 Apr 2022 13:07 UTC

Stefan Monnier wrote:
>> This is the crux right here: Fixed-point decimal, using unsigned binary
>> variables and external scale, will almost always be faster than both binary
>> FP and decimal FP, but you need to hand-code (or get your compiler to do it
>> for you?) every individual operation.
>
> Why *unsigned* binary?

Because you have to work with arrays of words, and I really don't want
those words to be signed!

There will of course be a sign bit somewhere, but it does not take any
direct part in the core operations, which happen on arrays of unsigned.
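
A minimal sketch of such a core operation, assuming 64-bit limbs stored least significant first (the names and the limb width are illustrative): the addition only ever sees unsigned words and a carry, and the sign is handled by whatever wraps around it.

```python
LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1


def add_magnitudes(a, b):
    """Add two little-endian lists of unsigned limbs, propagating the carry."""
    out = []
    carry = 0
    for i in range(max(len(a), len(b))):
        s = (a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0) + carry
        out.append(s & LIMB_MASK)      # low 64 bits stay in this limb
        carry = s >> LIMB_BITS         # 0 or 1 moves on to the next limb
    if carry:
        out.append(carry)
    return out
```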

OK?

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Extended double precision decimal floating point

<jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=24920&group=comp.arch#24920

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Extended double precision decimal floating point
Date: Tue, 26 Apr 2022 11:21:19 -0400
Organization: A noiseless patient Spider
Lines: 16
Message-ID: <jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<8f909c18-cfe1-4748-8478-543bac50fdbbn@googlegroups.com>
<980ae858-467d-4b0a-8ef8-4a3fe990dd69n@googlegroups.com>
<t432di$5l6$1@dont-email.me>
<5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com>
<t45m4t$1057$1@gioia.aioe.org>
<jwvee1kkoni.fsf-monnier+comp.arch@gnu.org>
<t48qqk$1bhh$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="6e8c3574a8c7125ca34cd2c8a6957841";
logging-data="24067"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18QPHMjilyKboQN+wXvStmF"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Cancel-Lock: sha1:jL78qodgbee83kznTSp3YqXQcyo=
sha1:a3eB3YqgXwpvMjDEvd6+BsU6+/g=
 by: Stefan Monnier - Tue, 26 Apr 2022 15:21 UTC

Terje Mathisen [2022-04-26 15:07:35] wrote:
> Stefan Monnier wrote:
>>> This is the crux right here: Fixed-point decimal, using unsigned binary
>>> variables and external scale, will almost always be faster than both binary
>>> FP and decimal FP, but you need to hand-code (or get your compiler to do it
>>> for you?) every individual operation.
>> Why *unsigned* binary?
> Because you have to work with arrays of words, and I really don't want those
> words to be signed!

Ah, sorry. You were talking about the unsigned fixed size integers
inside the bignums (what GMP calls the "limbs" IIRC), where I was
thinking about the bignums themselves.

Stefan

Re: Extended double precision decimal floating point

<bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24921&group=comp.arch#24921

Newsgroups: comp.arch
X-Received: by 2002:a0c:fd8d:0:b0:456:3481:603c with SMTP id p13-20020a0cfd8d000000b004563481603cmr8858411qvr.69.1650990327674;
Tue, 26 Apr 2022 09:25:27 -0700 (PDT)
X-Received: by 2002:aca:e155:0:b0:325:6d76:da4b with SMTP id
y82-20020acae155000000b003256d76da4bmr829972oig.125.1650990327438; Tue, 26
Apr 2022 09:25:27 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.mixmin.net!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 26 Apr 2022 09:25:27 -0700 (PDT)
In-Reply-To: <jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:7c7a:6620:ce7b:8179;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:7c7a:6620:ce7b:8179
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<8f909c18-cfe1-4748-8478-543bac50fdbbn@googlegroups.com> <980ae858-467d-4b0a-8ef8-4a3fe990dd69n@googlegroups.com>
<t432di$5l6$1@dont-email.me> <5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <t45m4t$1057$1@gioia.aioe.org>
<jwvee1kkoni.fsf-monnier+comp.arch@gnu.org> <t48qqk$1bhh$1@gioia.aioe.org> <jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com>
Subject: Re: Extended double precision decimal floating point
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 26 Apr 2022 16:25:27 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 21
 by: MitchAlsup - Tue, 26 Apr 2022 16:25 UTC

On Tuesday, April 26, 2022 at 10:21:22 AM UTC-5, Stefan Monnier wrote:
> Terje Mathisen [2022-04-26 15:07:35] wrote:
> > Stefan Monnier wrote:
> >>> This is the crux right here: Fixed-point decimal, using unsigned binary
> >>> variables and external scale, will almost always be faster than both binary
> >>> FP and decimal FP, but you need to hand-code (or get your compiler to do it
> >>> for you?) every individual operation.
> >> Why *unsigned* binary?
> > Because you have to work with arrays of words, and I really don't want those
> > words to be signed!
> Ah, sorry. You were talking about the unsigned fixed size integers
> inside the bignums (what GMP calls the "limbs" IIRC), where I was
> thinking about the bignums themselves.
<
Yes, indeed.
<
Given a vector of integers considered as a single big-num, at most 1 of these is signed
while the n-1 are unsigned. That is one reason unsigned numbers are more important
than signed.
>
>
> Stefan
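
The "at most 1 of these is signed" view can be illustrated with a small sketch that reads a two's-complement big-num out of a limb vector: the lower n-1 limbs contribute as unsigned values and only the top limb is sign-extended. The 64-bit limb width and little-endian limb order are assumptions made for the example.

```python
LIMB_BITS = 64
LIMB_MASK = (1 << LIMB_BITS) - 1


def limbs_to_int(limbs):
    """Little-endian limbs; the low n-1 are unsigned, the top one is signed."""
    value = 0
    for i, limb in enumerate(limbs[:-1]):
        value |= (limb & LIMB_MASK) << (LIMB_BITS * i)
    top = limbs[-1] & LIMB_MASK
    if top >> (LIMB_BITS - 1):         # sign bit of the top limb is set
        top -= 1 << LIMB_BITS
    return value + (top << (LIMB_BITS * (len(limbs) - 1)))
```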

Re: Extended double precision decimal floating point

<98806614-aa2d-4c7d-b6d4-081dd892c1cen@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24922&group=comp.arch#24922

Newsgroups: comp.arch
X-Received: by 2002:a05:620a:3189:b0:69f:421e:ba00 with SMTP id bi9-20020a05620a318900b0069f421eba00mr8745047qkb.485.1650995038981;
Tue, 26 Apr 2022 10:43:58 -0700 (PDT)
X-Received: by 2002:a05:6808:14d1:b0:322:aee6:f25 with SMTP id
f17-20020a05680814d100b00322aee60f25mr16399073oiw.269.1650995038365; Tue, 26
Apr 2022 10:43:58 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 26 Apr 2022 10:43:58 -0700 (PDT)
In-Reply-To: <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<8f909c18-cfe1-4748-8478-543bac50fdbbn@googlegroups.com> <980ae858-467d-4b0a-8ef8-4a3fe990dd69n@googlegroups.com>
<t432di$5l6$1@dont-email.me> <5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <t45m4t$1057$1@gioia.aioe.org>
<jwvee1kkoni.fsf-monnier+comp.arch@gnu.org> <t48qqk$1bhh$1@gioia.aioe.org>
<jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org> <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <98806614-aa2d-4c7d-b6d4-081dd892c1cen@googlegroups.com>
Subject: Re: Extended double precision decimal floating point
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 26 Apr 2022 17:43:58 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 24
 by: Michael S - Tue, 26 Apr 2022 17:43 UTC

On Tuesday, April 26, 2022 at 7:25:29 PM UTC+3, MitchAlsup wrote:
> On Tuesday, April 26, 2022 at 10:21:22 AM UTC-5, Stefan Monnier wrote:
> > Terje Mathisen [2022-04-26 15:07:35] wrote:
> > > Stefan Monnier wrote:
> > >>> This is the crux right here: Fixed-point decimal, using unsigned binary
> > >>> variables and external scale, will almost always be faster than both binary
> > >>> FP and decimal FP, but you need to hand-code (or get your compiler to do it
> > >>> for you?) every individual operation.
> > >> Why *unsigned* binary?
> > > Because you have to work with arrays of words, and I really don't want those
> > > words to be signed!
> > Ah, sorry. You were talking about the unsigned fixed size integers
> > inside the bignums (what GMP calls the "limbs" IIRC), where I was
> > thinking about the bignums themselves.
> <
> Yes, indeed.
> <
> Given a vector of integers considered as a single big-num, at most 1 of these is signed
> while the n-1 are unsigned. That is one reason unsigned numbers are more important
> than signed.

Another reason to prefer sign-magnitude over two's-complement as the internal format for
a Big Integer library is the weak support for mixed signed-unsigned arithmetic in many
instruction sets, and the even worse support in all popular HLLs that are likely candidates
for the implementation language of Big Integer libraries.

Re: IBM features, Extended double precision decimal floating point

<t49kbt$2n4m$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=24923&group=comp.arch#24923

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: IBM features, Extended double precision decimal floating point
Date: Tue, 26 Apr 2022 20:23:25 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <t49kbt$2n4m$1@gal.iecc.com>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com> <t46ujj$7n3$1@newsreader4.netcologne.de> <t47icp$2lno$1@gal.iecc.com> <t483n3$hf$1@newsreader4.netcologne.de>
Injection-Date: Tue, 26 Apr 2022 20:23:25 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="89238"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com> <t46ujj$7n3$1@newsreader4.netcologne.de> <t47icp$2lno$1@gal.iecc.com> <t483n3$hf$1@newsreader4.netcologne.de>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Tue, 26 Apr 2022 20:23 UTC

According to Thomas Koenig <tkoenig@netcologne.de>:
>> Possibly, but http potentially does deflate encoding for every request.
>> If you're using your computer as a web server, you're going to do
>> a whole lot of deflating.
>
>Sure, for a given computer, it could be useful both for file
>compression and for http decoding.
>
>Given the huge disparity in price between a computer based on an
>Intel or AMD CPU and a zSystem, I simply doubt that zSystems are
>much used as web servers, where their high reliability (usually)
>does not bring advantages to offset the higher cost.

IBM sure seems to promote Linux on Z as a high performance web server:

https://www.ibm.com/downloads/cas/POB59BLE

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Extended double precision decimal floating point

<2022Apr26.223557@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=24924&group=comp.arch#24924

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Extended double precision decimal floating point
Date: Tue, 26 Apr 2022 20:35:57 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 18
Message-ID: <2022Apr26.223557@mips.complang.tuwien.ac.at>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com> <t432di$5l6$1@dont-email.me> <5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com> <26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <t45m4t$1057$1@gioia.aioe.org> <jwvee1kkoni.fsf-monnier+comp.arch@gnu.org> <t48qqk$1bhh$1@gioia.aioe.org> <jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org> <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com> <98806614-aa2d-4c7d-b6d4-081dd892c1cen@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="3db2ed2557800fb90797950879335bb2";
logging-data="15715"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18h3dhkYXYXmM4GtKR4opun"
Cancel-Lock: sha1:XhZtdRQzxDCLNLwtoAHZHdVbs7c=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 26 Apr 2022 20:35 UTC

Michael S <already5chosen@yahoo.com> writes:
[reformatted for conventional Usenet line length]
>Another reason to prefer sign-magnitude over two's-complement as the
>internal format for a Big Integer library is the weak support for mixed
>signed-unsigned arithmetic in many instruction sets, and the even worse
>support in all popular HLLs that are likely candidates for the
>implementation language of Big Integer libraries.

I don't see why you would need such instructions for implementing
twos-complement big-integers.

I don't see a single reason for sign-magnitude big-integer
representation.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: IBM features, Extended double precision decimal floating point

<2022Apr26.224101@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=24925&group=comp.arch#24925

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: IBM features, Extended double precision decimal floating point
Date: Tue, 26 Apr 2022 20:41:01 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 28
Message-ID: <2022Apr26.224101@mips.complang.tuwien.ac.at>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com> <26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <c64714c1-ec97-4ac4-871b-63578dd8cf1dn@googlegroups.com> <2022Apr25.083641@mips.complang.tuwien.ac.at> <t46mg6$27fl$1@gal.iecc.com>
Injection-Info: reader02.eternal-september.org; posting-host="3db2ed2557800fb90797950879335bb2";
logging-data="15715"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18F3zEQCl7gWWdGQboURaf3"
Cancel-Lock: sha1:SFi7LG+VyU4eFRhFPCnNZNRzGjE=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 26 Apr 2022 20:41 UTC

John Levine <johnl@taugh.com> writes:
>New models of the z series have had some oddly specific additions,
>like the DEFLATE instruction that does the inner part of gzip and
>the digital signature instruction that does elliptic curve signing
>and verifying. I realize those are both reasonably common in web
>servers but it'd surprise me if they were enough of a bottleneck
>to merit putting them in microcode.

The question is what benefit microcode buys over architectural code.
If you do just the same things, microcode is not any faster on current
machines. I remember one case (don't remember if it was an ARM
instruction or an Intel instruction, but IIRC it is a crypto
instruction) where the microcode could use more registers than the
architected registers, and could therefore avoid the loads and stores
that architectural code would have to make.

>what would use vector decimal? I don't think
>parallel COBOL is a thing.

Languages typically don't use vector extensions, but hope for
auto-vectorization ("rely on" would be a misnomer for such an
unreliable compiler feature). So maybe the idea is that the compiler
would find opportunities to use these instructions.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Extended double precision decimal floating point

<89119ea8-992f-41a1-9e6d-51f6df7712abn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24926&group=comp.arch#24926

X-Received: by 2002:a37:b442:0:b0:69a:fc75:ca52 with SMTP id d63-20020a37b442000000b0069afc75ca52mr14320917qkf.730.1651007568852;
Tue, 26 Apr 2022 14:12:48 -0700 (PDT)
X-Received: by 2002:a05:6808:1926:b0:323:3c4:947d with SMTP id
bf38-20020a056808192600b0032303c4947dmr13106568oib.103.1651007568504; Tue, 26
Apr 2022 14:12:48 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 26 Apr 2022 14:12:48 -0700 (PDT)
In-Reply-To: <2022Apr26.223557@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:4866:39ee:13d9:1f18;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:4866:39ee:13d9:1f18
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<t432di$5l6$1@dont-email.me> <5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <t45m4t$1057$1@gioia.aioe.org>
<jwvee1kkoni.fsf-monnier+comp.arch@gnu.org> <t48qqk$1bhh$1@gioia.aioe.org>
<jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org> <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com>
<98806614-aa2d-4c7d-b6d4-081dd892c1cen@googlegroups.com> <2022Apr26.223557@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <89119ea8-992f-41a1-9e6d-51f6df7712abn@googlegroups.com>
Subject: Re: Extended double precision decimal floating point
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 26 Apr 2022 21:12:48 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 24
 by: Michael S - Tue, 26 Apr 2022 21:12 UTC

On Tuesday, April 26, 2022 at 11:40:55 PM UTC+3, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> [reformatted for conventional Usenet line length]
> >Another reason to prefer sign-magnitude over two-complement as
> >internal format for Big Integer library is weak support for mixed
> >signed-unsigned arithmetic in many instruction sets and even worse
> >support in all popular HLLs that are likely candidates for
> >implementation language of Big Integer libraries.
> I don't see why you would need such instructions for implementing
> twos-complement big-integers.
>
> I don't see a single reason for sign-magnitude big-integer
> representation.
>
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

So, what would your code look like for, say, multiplication, when numbers are stored in
two's complement?
Can you sketch it here? Leave out bookkeeping, memory allocation, prescaling,
postscaling, etc.
Just show the two inner loops of the convolution.

Re: IBM features, Extended double precision decimal floating point

<24374d50-bd90-41ac-88e3-8b9fa850d6d3n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24927&group=comp.arch#24927

X-Received: by 2002:a05:622a:1750:b0:2f3:6453:b382 with SMTP id l16-20020a05622a175000b002f36453b382mr10359016qtk.396.1651009911910;
Tue, 26 Apr 2022 14:51:51 -0700 (PDT)
X-Received: by 2002:a05:6870:d1cd:b0:e1:e7ee:faa0 with SMTP id
b13-20020a056870d1cd00b000e1e7eefaa0mr14188980oac.5.1651009911637; Tue, 26
Apr 2022 14:51:51 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 26 Apr 2022 14:51:51 -0700 (PDT)
In-Reply-To: <2022Apr26.224101@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:e4ce:ac1d:c314:dd91;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:e4ce:ac1d:c314:dd91
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <c64714c1-ec97-4ac4-871b-63578dd8cf1dn@googlegroups.com>
<2022Apr25.083641@mips.complang.tuwien.ac.at> <t46mg6$27fl$1@gal.iecc.com> <2022Apr26.224101@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <24374d50-bd90-41ac-88e3-8b9fa850d6d3n@googlegroups.com>
Subject: Re: IBM features, Extended double precision decimal floating point
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 26 Apr 2022 21:51:51 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 56
 by: MitchAlsup - Tue, 26 Apr 2022 21:51 UTC

On Tuesday, April 26, 2022 at 3:50:38 PM UTC-5, Anton Ertl wrote:
> John Levine <jo...@taugh.com> writes:
> >New models of the z series have had some oddly specific additions,
> >like the DEFLATE instruction that does the inner part of gzip and
> >the digital signature instruction that does elliptic curve signing
> >and verifying. I realize those are both reasonably common in web
> >servers but it'd surprise me if they were enough of a bottleneck
> >to merit putting them in microcode.
<
> The question is what benefit microcode buys over architectural code.
> If you do just the same things, microcode is not any faster on current
> machines.
<
You can microcode the execution pipeline
OR
You can microcode a function unit
>
When you microcode a function unit, every other function unit is free to run
other code while one function unit crunches on its current instruction.
<
Microcoding the execution pipeline has (deservedly) fallen out of fashion.
{AND:: One can call microcoding of a function unit simply sequencing--
where it gets fuzzy is when you sequence a pair of function units to operate
on one instruction using the joint set of resources.}
<
> I remember one case (don't remember if it was an ARM
> instruction or an Intel instruction, but IIRC it is a crypto
> instruction) where the microcode could use more registers than the
> architected registers, and could therefore avoid the loads and stores
> that architectural code would have to make.
<
All the AMD machines had "more" registers available in microcode than
in the instruction code. This made it easy for microcoded sequences
not to need to save/restore registers needed to carry out the calculation.
I suspect Intel would do similarly.
<
> >what would use vector decimal? I don't think
> >parallel COBOL is a thing.
<
> Languages typically don't use vector extensions, but hope for
> auto-vectorization ("rely on" would be a misnomer for such an
> unreliable compiler feature). So maybe the idea is that the compiler
> would find opportunities to use these instructions.
<
x(*,*) = x(*,*) + Y(*,*) × Z(*,*)
>
Looks pretty vectorish to me !! (modern FORTRAN)
<
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Extended double precision decimal floating point

<e8ae5a46-0fbe-4eae-8987-8c0b1676fef4n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24928&group=comp.arch#24928

X-Received: by 2002:ac8:4e46:0:b0:2e1:b933:ec06 with SMTP id e6-20020ac84e46000000b002e1b933ec06mr16875548qtw.684.1651010162290;
Tue, 26 Apr 2022 14:56:02 -0700 (PDT)
X-Received: by 2002:a05:6830:1d93:b0:605:42d1:d911 with SMTP id
y19-20020a0568301d9300b0060542d1d911mr8922738oti.158.1651010162063; Tue, 26
Apr 2022 14:56:02 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 26 Apr 2022 14:56:01 -0700 (PDT)
In-Reply-To: <89119ea8-992f-41a1-9e6d-51f6df7712abn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:e4ce:ac1d:c314:dd91;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:e4ce:ac1d:c314:dd91
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<t432di$5l6$1@dont-email.me> <5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <t45m4t$1057$1@gioia.aioe.org>
<jwvee1kkoni.fsf-monnier+comp.arch@gnu.org> <t48qqk$1bhh$1@gioia.aioe.org>
<jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org> <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com>
<98806614-aa2d-4c7d-b6d4-081dd892c1cen@googlegroups.com> <2022Apr26.223557@mips.complang.tuwien.ac.at>
<89119ea8-992f-41a1-9e6d-51f6df7712abn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e8ae5a46-0fbe-4eae-8987-8c0b1676fef4n@googlegroups.com>
Subject: Re: Extended double precision decimal floating point
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 26 Apr 2022 21:56:02 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 80
 by: MitchAlsup - Tue, 26 Apr 2022 21:56 UTC

On Tuesday, April 26, 2022 at 4:12:50 PM UTC-5, Michael S wrote:
> On Tuesday, April 26, 2022 at 11:40:55 PM UTC+3, Anton Ertl wrote:
> > Michael S <already...@yahoo.com> writes:
> > [reformatted for conventional Usenet line length]
> > >Another reason to prefer sign-magnitude over two-complement as
> > >internal format for Big Integer library is weak support for mixed
> > >signed-unsigned arithmetic in many instruction sets and even worse
> > >support in all popular HLLs that are likely candidates for
> > >implementation language of Big Integer libraries.
> > I don't see why you would need such instructions for implementing
> > twos-complement big-integers.
> >
> > I don't see a single reason for sign-magnitude big-integer
> > representation.
> >
> > - anton
> > --
> > 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> > Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
> So, what would be your code for, say, multiplication, when numbers are stored as
> two-complements?
> Can you sketch it here? Leave away management, memory allocation, prescaling,
> postscaling, etc...
> Just show two inner loops of convolution.

void Long_multiplication( uint64_t multiplicand[],
multiplier[],
sum[],
ilength, jlength )
{ for( uint64_t i = 0;
i < (ilength + jlength);
i++ )
sum[i] = 0;

for( uint64_t acarry = j = 0; j < jlength; j++ )
{
for( uint64_t mcarry = i = 0; i < ilength; i++ )
{
{mcarry, product} = multiplicand[i]*multiplier[j]
+ mcarry;
{acarry,sum[i+j]} = {sum[i+j],acarry} + product;
}
}
}

Or in assembly::

ENTRY long_multiplication
GLOBAL long_multiplication
long_multiplication:
MOV Ri,#0
ADD Rij,Rilength,Rjlength
VEC Rdummy,{Ri}
s_loop:
ST #0,[Rsmp+Ri<<3]
LOOPLT Ri,#1,Rij

MOV Rca,#0
MOV Rj,#0
j_loop:
MOV Rcm,#0
MOV Ri,#0
LDD Rmp,[Rmpp+Rj<<3] // multiplier[j]
VEC Rdummy,{Ri}
i_loop:
ADD Rij,Ri,Rj
LDD Rmc,[Rmpc+Ri<<3] // multiplicand[i]
LDD Rsm,[Rsmp+Rij<<3] // sum[i+j]
CARRY Rcm,{IO}
MUL Rpr,Rmp,Rmc // MAC {Rcm,Rpr},{Rmp,Rcm},Rmc
CARRY Rca,{IO}
ADD Rsm,Rsm,Rpr // ADC {Rca,Rsm},{Rsm,Rca},Rpr
STD Rsm,[Rsmp+Rij<<3] // sum[i+j]
LOOPLT Ri,#1,Rilength

ADD Rj,Rj,#1 // j loop iterator
CMP Rt,Rj,Rjlength
BLT Rt,j_loop

RET
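
For readers who want something they can compile, here is a minimal portable C sketch of the same schoolbook loop. It is not the code posted above: it assumes 32-bit limbs stored least-significant first so that a 64-bit intermediate can stand in for the {carry, product} pairs, and the name long_multiply is made up for illustration. Like the pseudocode, it handles unsigned magnitudes only, so it does not by itself answer the two's-complement question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Schoolbook multiplication of two unsigned magnitudes.
   Limbs are 32 bits wide and stored least-significant first (an assumption
   of this sketch). sum must have room for alen + blen limbs and must not
   overlap a or b. */
void long_multiply(const uint32_t *a, size_t alen,
                   const uint32_t *b, size_t blen,
                   uint32_t *sum)
{
    assert(a != NULL && b != NULL && sum != NULL);

    for (size_t i = 0; i < alen + blen; i++)
        sum[i] = 0;

    for (size_t j = 0; j < blen; j++) {
        uint64_t carry = 0;
        for (size_t i = 0; i < alen; i++) {
            /* A 32x32 product plus two 32-bit addends cannot overflow
               64 bits: (2^32-1)^2 + 2*(2^32-1) = 2^64 - 1. */
            uint64_t t = (uint64_t)a[i] * b[j] + sum[i + j] + carry;
            sum[i + j] = (uint32_t)t;   /* low half stays in this column */
            carry = t >> 32;            /* high half moves to the next  */
        }
        /* No earlier row has written this limb yet, so plain assignment
           is enough for the final carry of the row. */
        sum[j + alen] = (uint32_t)carry;
    }
}
```

Folding the accumulate step into the same 64-bit temporary means one carry per row is enough; a machine with a widening multiply-accumulate could keep the two carry chains separate as in the assembly.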

Re: Extended double precision decimal floating point

<2479aaa4-1054-4aaf-bfe4-f96c2238db74n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24929&group=comp.arch#24929

X-Received: by 2002:a05:620a:48b:b0:69f:789b:c478 with SMTP id 11-20020a05620a048b00b0069f789bc478mr3140928qkr.111.1651010950656;
Tue, 26 Apr 2022 15:09:10 -0700 (PDT)
X-Received: by 2002:a05:6830:907:b0:605:81d5:9eaa with SMTP id
v7-20020a056830090700b0060581d59eaamr8935340ott.58.1651010950281; Tue, 26 Apr
2022 15:09:10 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 26 Apr 2022 15:09:10 -0700 (PDT)
In-Reply-To: <e8ae5a46-0fbe-4eae-8987-8c0b1676fef4n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:4866:39ee:13d9:1f18;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:4866:39ee:13d9:1f18
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com>
<t432di$5l6$1@dont-email.me> <5188795c-42a7-45d4-8243-f4bbaa3a96a9n@googlegroups.com>
<26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <t45m4t$1057$1@gioia.aioe.org>
<jwvee1kkoni.fsf-monnier+comp.arch@gnu.org> <t48qqk$1bhh$1@gioia.aioe.org>
<jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org> <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com>
<98806614-aa2d-4c7d-b6d4-081dd892c1cen@googlegroups.com> <2022Apr26.223557@mips.complang.tuwien.ac.at>
<89119ea8-992f-41a1-9e6d-51f6df7712abn@googlegroups.com> <e8ae5a46-0fbe-4eae-8987-8c0b1676fef4n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2479aaa4-1054-4aaf-bfe4-f96c2238db74n@googlegroups.com>
Subject: Re: Extended double precision decimal floating point
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 26 Apr 2022 22:09:10 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 83
 by: Michael S - Tue, 26 Apr 2022 22:09 UTC

On Wednesday, April 27, 2022 at 12:56:03 AM UTC+3, MitchAlsup wrote:
> On Tuesday, April 26, 2022 at 4:12:50 PM UTC-5, Michael S wrote:
> > On Tuesday, April 26, 2022 at 11:40:55 PM UTC+3, Anton Ertl wrote:
> > > Michael S <already...@yahoo.com> writes:
> > > [reformatted for conventional Usenet line length]
> > > >Another reason to prefer sign-magnitude over two-complement as
> > > >internal format for Big Integer library is weak support for mixed
> > > >signed-unsigned arithmetic in many instruction sets and even worse
> > > >support in all popular HLLs that are likely candidates for
> > > >implementation language of Big Integer libraries.
> > > I don't see why you would need such instructions for implementing
> > > twos-complement big-integers.
> > >
> > > I don't see a single reason for sign-magnitude big-integer
> > > representation.
> > >
> > > - anton
> > > --
> > > 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> > > Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
> > So, what would be your code for, say, multiplication, when numbers are stored as
> > two-complements?
> > Can you sketch it here? Leave away management, memory allocation, prescaling,
> > postscaling, etc...
> > Just show two inner loops of convolution.
> void Long_multiplication( uint64_t multiplicand[],
> multiplier[],
> sum[],
> ilength, jlength )
> {
> for( uint64_t i = 0;
> i < (ilength + jlength);
> i++ )
> sum[i] = 0;
>
> for( uint64_t acarry = j = 0; j < jlength; j++ )
> {
> for( uint64_t mcarry = i = 0; i < ilength; i++ )
> {
> {mcarry, product} = multiplicand[i]*multiplier[j]
> + mcarry;
> {acarry,sum[i+j]} = {sum[i+j],acarry} + product;
> }
> }
> }
>
> Or in assembly::
>
> ENTRY long_multiplication
> GLOBAL long_multiplication
> long_multiplication:
> MOV Ri,#0
> ADD Rij,Rilength,Rjlength
> VEC Rdummy,{Ri}
> s_loop:
> ST #0,[Rsmp+Ri<<3]
> LOOPLT Ri,#1,Rij
>
> MOV Rca,#0
> MOV Rj,#0
> j_loop:
> MOV Rcm,#0
> MOV Ri,#0
> LDD Rmp,[Rmpp+Rj<<3] // multiplier[j]
> VEC Rdummy,{Ri}
> i_loop:
> ADD Rij,Ri,Rj
> LDD Rmc,[Rmpc+Ri<<3] // multiplicand[i]
> LDD Rsm,[Rsmp+Rij<<3] // sum[i+j]
> CARRY Rcm,{IO}
> MUL Rpr,Rmp,Rmc // MAC {Rcm,Rpr},{Rmp,Rcm},Rmc
> CARRY Rca,{IO}
> ADD Rsm,Rsm,Rpr // ADC {Rca,Rsm},{Rsm,Rca},Rpr
> STD Rsm,[Rsmp+Rij<<3] // sum[i+j]
> LOOPLT Ri,#1,Rilength
>
> ADD Rj,Rj,#1 // j loop iterator
> CMP Rt,Rj,Rjlength
> BLT Rt,j_loop
>
> RET

Did you read my question?

Re: IBM features, Extended double precision decimal floating point

<2022Apr27.131221@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=24931&group=comp.arch#24931

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: IBM features, Extended double precision decimal floating point
Date: Wed, 27 Apr 2022 11:12:21 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 41
Message-ID: <2022Apr27.131221@mips.complang.tuwien.ac.at>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com> <26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <c64714c1-ec97-4ac4-871b-63578dd8cf1dn@googlegroups.com> <2022Apr25.083641@mips.complang.tuwien.ac.at> <t46mg6$27fl$1@gal.iecc.com> <2022Apr26.224101@mips.complang.tuwien.ac.at> <24374d50-bd90-41ac-88e3-8b9fa850d6d3n@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="f90fc482fbc16649be01c1f23dcc5982";
logging-data="19973"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+ANgNVMntyvUTSI7Vae44M"
Cancel-Lock: sha1:/GMqobdx8AQ6omBjG4Wz4a+oBFA=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Wed, 27 Apr 2022 11:12 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>On Tuesday, April 26, 2022 at 3:50:38 PM UTC-5, Anton Ertl wrote:
>> Languages typically don't use vector extensions, but hope for
>> auto-vectorization ("rely on" would be a misnomer for such an
>> unreliable compiler feature). So maybe the idea is that the compiler
>> would find opportunities to use these instructions.
><
> x(*,*) = x(*,*) + Y(*,*) × Z(*,*)
>>
>Looks pretty vectorish to me !! (modern FORTRAN)

Fortran accepts the operator "×"?

Fortran's array sub-language goes mostly in the right direction, but
is not typical of programming languages.

Also, Thomas Koenig tells us that gcc converts the array notation into
loops of scalar operations, and then, if you are lucky, the
auto-vectorizer produces SIMD code from that.
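
As a rough illustration of that lowering, the whole-array statement x = x + y*z corresponds to a scalar loop of the following shape, written here in C. This is a sketch of the loop structure only, not actual gfortran output; the function name and the use of restrict (to state that the arrays do not overlap) are assumptions of the example.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise x = x + y*z over n elements, the scalar loop that an
   auto-vectorizer would then have to recognize. restrict promises the
   compiler that the three arrays do not overlap, which is one of the
   things a vectorizer needs to know before it can emit SIMD code. */
void array_update(double *restrict x,
                  const double *restrict y,
                  const double *restrict z,
                  size_t n)
{
    assert(n == 0 || (x != NULL && y != NULL && z != NULL));
    for (size_t i = 0; i < n; i++)
        x[i] = x[i] + y[i] * z[i];
}
```

Whether this actually becomes SIMD code depends on the compiler, the optimization flags and the target, which is the unreliability referred to above.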

As for the syntax, the whole-array notation is

x = x+y*z;

but you can also use array slice notation (where the slices constitute
the whole array):

x(1:n) = x(1:n)+y(1:n)*z(1:n);

When I tried:

x(*) = x(*)+y(*)*z(*);

the f95 compiler reported:

Error: Expected array subscript at (1)

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: IBM features, Extended double precision decimal floating point

<AvcaK.379162$f2a5.77570@fx48.iad>

https://www.novabbs.com/devel/article-flat.php?id=24932&group=comp.arch#24932

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!peer01.ams4!peer.am4.highwinds-media.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx48.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: IBM features, Extended double precision decimal floating point
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com> <26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <c64714c1-ec97-4ac4-871b-63578dd8cf1dn@googlegroups.com> <2022Apr25.083641@mips.complang.tuwien.ac.at> <t46mg6$27fl$1@gal.iecc.com> <2022Apr26.224101@mips.complang.tuwien.ac.at> <24374d50-bd90-41ac-88e3-8b9fa850d6d3n@googlegroups.com>
In-Reply-To: <24374d50-bd90-41ac-88e3-8b9fa850d6d3n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 58
Message-ID: <AvcaK.379162$f2a5.77570@fx48.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 27 Apr 2022 14:34:08 UTC
Date: Wed, 27 Apr 2022 10:33:53 -0400
X-Received-Bytes: 3832
 by: EricP - Wed, 27 Apr 2022 14:33 UTC

MitchAlsup wrote:
> On Tuesday, April 26, 2022 at 3:50:38 PM UTC-5, Anton Ertl wrote:
>> John Levine <jo...@taugh.com> writes:
>>> New models of the z series have had some oddly specific additions,
>>> like the DEFLATE instruction that does the inner part of gzip and
>>> the digital signature instruction that does elliptic curve signing
>>> and verifying. I realize those are both reasonably common in web
>>> servers but it'd surprise me if they were enough of a bottleneck
>>> to merit putting them in microcode.
> <
>> The question is what benefit microcode buys over architectural code.
>> If you do just the same things, microcode is not any faster on current
>> machines.
> <
> You can microcode the execution pipeline
> OR
> You can microcode a function unit
> When you microcode a function unit, every other function unit is free to run
> other code while one function unit crunches on its current instruction.
> <
> Microcoding the execution pipeline has (deservedly) fallen out of fashion.
> {AND:: One can call microcoding of a function unit to simply be sequencing--
> where it gets fuzzy is when you sequence a pair of function units to operate
> on one instruction using the joint set of resources.}

In this case IBM refers to both Deflate and Crypto features as
"accelerators" which I suspect is IBM-speak for "even though the hardware
is already present in your machine, we charge extra to enable this".

While their operations may be triggered by a single ISA instruction,
given the complex nature of each function it seems possible that they are
internally implemented to operate on multiple macro-instructions
at once from separate threads, and that within each there could be multiple
pipeline states, with logic to accelerate each state.

Certainly DEFLATE operates on a block of memory.
While Crypto might operate on a large register blob,
it seems most likely to be a memory block too.

In which case the amount of concurrency available would be dictated
by the local memory coherency rules for individual instructions
similar to block string move instructions.

How many of these fancy memory block operations can be going
at once would be decided by how much LSQ disambiguation can be done,
against normal loads and stores, and against other memory block operations.

For example, if each memory block instruction stuffs an
"I'm working on this 4kB block" entry into the LSQ,
then it could be disambiguated similar to a 64B cache line,
and operations on separate blocks allowed to proceed concurrently
without violating any of the local memory consistency rules.
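
To make the idea concrete, here is a small software model of that check in C. It is purely illustrative: the entry layout, the names block_entry and conflicts, and the linear search are assumptions of the sketch, and nothing here describes how IBM's hardware actually does it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SHIFT 12u   /* 4 kB blocks, as in the example above */

/* Hypothetical LSQ entry for one in-flight memory-block instruction. */
typedef struct {
    uint64_t block;   /* block number, i.e. address >> BLOCK_SHIFT */
    bool     valid;   /* entry is currently in use */
} block_entry;

/* Returns true if an access to addr falls into a block that some in-flight
   block instruction has claimed, so the access would have to wait. */
bool conflicts(const block_entry *lsq, size_t n, uint64_t addr)
{
    assert(lsq != NULL || n == 0);
    uint64_t block = addr >> BLOCK_SHIFT;
    for (size_t i = 0; i < n; i++)
        if (lsq[i].valid && lsq[i].block == block)
            return true;
    return false;
}
```

Accesses whose block numbers differ never conflict in this model, which is what lets operations on separate blocks proceed concurrently.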

That in turn decides how many concurrent macro-instructions
the accelerators can potentially handle.

Re: Extended double precision decimal floating point

<2022Apr27.154724@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=24933&group=comp.arch#24933

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Extended double precision decimal floating point
Date: Wed, 27 Apr 2022 13:47:24 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 83
Message-ID: <2022Apr27.154724@mips.complang.tuwien.ac.at>
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com> <26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <t45m4t$1057$1@gioia.aioe.org> <jwvee1kkoni.fsf-monnier+comp.arch@gnu.org> <t48qqk$1bhh$1@gioia.aioe.org> <jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org> <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com> <98806614-aa2d-4c7d-b6d4-081dd892c1cen@googlegroups.com> <2022Apr26.223557@mips.complang.tuwien.ac.at> <89119ea8-992f-41a1-9e6d-51f6df7712abn@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="f90fc482fbc16649be01c1f23dcc5982";
logging-data="13166"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19WwocWNYEJdwJjWWftfn5K"
Cancel-Lock: sha1:z7Z7N+KV21lpYfikOaWs9k5ReVQ=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Wed, 27 Apr 2022 13:47 UTC

Michael S <already5chosen@yahoo.com> writes:
>On Tuesday, April 26, 2022 at 11:40:55 PM UTC+3, Anton Ertl wrote:
>> Michael S <already...@yahoo.com> writes:
>> [reformatted for conventional Usenet line length]
>> >Another reason to prefer sign-magnitude over two-complement as
>> >internal format for Big Integer library is weak support for mixed
>> >signed-unsigned arithmetic in many instruction sets and even worse
>> >support in all popular HLLs that are likely candidates for
>> >implementation language of Big Integer libraries.
>> I don't see why you would need such instructions for implementing
>> twos-complement big-integers.
>>
>> I don't see a single reason for sign-magnitude big-integer
>> representation.
....
[Again, reformatted]
>So, what would be your code for, say, multiplication, when numbers
>are stored as two-complements? Can you sketch it here? Leave away
>management, memory allocation, prescaling, postscaling, etc... Just
>show two inner loops of convolution.

Why would big-integer arithmetic have any scaling?

For multiplication, multiplying a two-word integer a with a two-word
integer b (giving a four-word integer c) should display all the
interesting cases, so I'll show that.

c1,c2,c3,c4 are two-word values
al and ah are the low and high words of a; likewise for bl and bh
umul is an unsigned multiplication of two words, with a two-word result

c1 = umul(al,bl);
c2 = umul(ah,bl);
c3 = umul(al,bh);
c4 = umul(ah,bh);
if (signbit(ah))
c4 -= b;
if (signbit(bh))
c4 -= a;
c = (c4<<(2*wordwidth)) + ((c3+c2)<<wordwidth) + c1;

I leave it to you to map the last line efficiently onto the
architecture at hand. And of course you want to eliminate the top
word of c if it is just a sign extension of the next word.
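
A compilable sketch of the two's-complement variant, under the assumption of 16-bit words so that a two-word operand is an int32_t and the four-word result fits in 64 bits. The function name is made up, and the result is returned as a raw 64-bit pattern to stay clear of implementation-defined conversions.

```c
#include <assert.h>
#include <stdint.h>

#define WORDWIDTH 16
#define SIGNBIT   0x8000u

/* Unsigned word x word -> two-word product. */
static uint32_t umul(uint16_t x, uint16_t y)
{
    return (uint32_t)x * (uint32_t)y;
}

/* Signed two-word multiply built from unsigned word multiplies.
   Returns the 64-bit two's-complement bit pattern of a*b. */
uint64_t mul_twos_complement(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;   /* raw bit patterns */
    uint16_t al = (uint16_t)(ua & 0xFFFFu), ah = (uint16_t)(ua >> WORDWIDTH);
    uint16_t bl = (uint16_t)(ub & 0xFFFFu), bh = (uint16_t)(ub >> WORDWIDTH);

    uint32_t c1 = umul(al, bl);
    uint32_t c2 = umul(ah, bl);
    uint32_t c3 = umul(al, bh);
    uint32_t c4 = umul(ah, bh);

    /* Reading a negative operand as unsigned adds 2^32 to it, so the
       unsigned product is too large by 2^32 times the other operand.
       Subtract that from the high half, modulo 2^32. */
    if (ah & SIGNBIT)
        c4 -= ub;
    if (bh & SIGNBIT)
        c4 -= ua;

    return ((uint64_t)c4 << (2 * WORDWIDTH))
         + (((uint64_t)c2 + c3) << WORDWIDTH)
         + c1;
}
```

For a = b = -1 every partial product is 0xFFFE0001, the two corrections turn c4 into 0xFFFE0003, and the final sum wraps around to 1.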

For comparison, here is the sign-magnitude counterpart:

ahu = ah & ~SIGNBIT;
bhu = bh & ~SIGNBIT;
c1 = umul(al,bl);
c2 = umul(ahu,bl);
c3 = umul(al,bhu);
c4 = umul(ahu,bhu);
c = (c4<<(2*wordwidth)) + ((c3+c2)<<wordwidth) + c1;
c = c | (((ah^bh)&SIGNBIT)<<(3*wordwidth));

Admittedly the overhead of sign-magnitude is independent of the width
of a and b, while the subtractions get more expensive with longer a
and longer b (but the number of multiplies grows with
length(a)*length(b), while the overhead of the subtractions grows only
with length(a)+length(b)). OTOH two's-complement addition is cheaper
than sign-magnitude addition. Plus, I think that for most signed
bigint operations the involved lengths are 1; one could special-case
that by using smul and leaving out the subtractions.

I am too lazy to think through how a mixed signed-unsigned would help.
Would it be as simple as:

c1 = uumul(al,bl);
c2 = sumul(ah,bl);
c3 = usmul(al,bh);
c4 = ssmul(ah,bh);
c = (c4<<(2*wordwidth)) + ((c3+c2)<<wordwidth) + c1;

?
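
A quick way to check that guess is to model the four mixed multiplies in C with 16-bit words, treating the high word of each operand as signed and the low word as unsigned. This is a sketch under those assumptions; the helper split and the function name mul_mixed are invented for the example, and ordinary 64-bit arithmetic stands in for the uumul/sumul/usmul/ssmul instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Split a two-word value into a signed high word and an unsigned low
   word, so that v == hi * 65536 + lo. */
static void split(int32_t v, int16_t *hi, uint16_t *lo)
{
    *lo = (uint16_t)((uint32_t)v & 0xFFFFu);
    *hi = (int16_t)(((int64_t)v - *lo) / 65536);   /* exact division */
    assert((int64_t)*hi * 65536 + *lo == v);
}

/* Two-word signed multiply from four mixed-signedness word multiplies. */
int64_t mul_mixed(int32_t a, int32_t b)
{
    int16_t  ah, bh;
    uint16_t al, bl;
    split(a, &ah, &al);
    split(b, &bh, &bl);

    int64_t c1 = (int64_t)al * bl;   /* uumul: unsigned x unsigned */
    int64_t c2 = (int64_t)ah * bl;   /* sumul: signed   x unsigned */
    int64_t c3 = (int64_t)al * bh;   /* usmul: unsigned x signed   */
    int64_t c4 = (int64_t)ah * bh;   /* ssmul: signed   x signed   */

    /* Multiply instead of shifting, because left-shifting a negative
       value is undefined behaviour in C. */
    return c4 * INT64_C(4294967296) + (c2 + c3) * INT64_C(65536) + c1;
}
```

For the two-word case no correction step is needed, because the sign information is carried entirely by the high words.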

In that case, the absence of such an instruction is probably due to
the shortness of typical signed big integers.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Extended double precision decimal floating point

<bbcbad97-c906-42e6-9d11-ee3f6bb69d60n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24934&group=comp.arch#24934

X-Received: by 2002:a0c:fb4d:0:b0:456:3a15:30d7 with SMTP id b13-20020a0cfb4d000000b004563a1530d7mr10502543qvq.93.1651074672502; Wed, 27 Apr 2022 08:51:12 -0700 (PDT)
X-Received: by 2002:a05:6808:11ca:b0:2d9:a01a:488b with SMTP id p10-20020a05680811ca00b002d9a01a488bmr17943822oiv.214.1651074672295; Wed, 27 Apr 2022 08:51:12 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!tr2.eu1.usenetexpress.com!feeder.usenetexpress.com!tr1.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 27 Apr 2022 08:51:12 -0700 (PDT)
In-Reply-To: <2022Apr27.154724@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <d1790e73-11b1-497d-abd0-a349fedf750cn@googlegroups.com> <26231a57-6317-4c08-8c37-4508ac3e0a6en@googlegroups.com> <t45m4t$1057$1@gioia.aioe.org> <jwvee1kkoni.fsf-monnier+comp.arch@gnu.org> <t48qqk$1bhh$1@gioia.aioe.org> <jwv1qxjyir3.fsf-monnier+comp.arch@gnu.org> <bc0883ee-9ea3-404a-8a69-556988acad6cn@googlegroups.com> <98806614-aa2d-4c7d-b6d4-081dd892c1cen@googlegroups.com> <2022Apr26.223557@mips.complang.tuwien.ac.at> <89119ea8-992f-41a1-9e6d-51f6df7712abn@googlegroups.com> <2022Apr27.154724@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <bbcbad97-c906-42e6-9d11-ee3f6bb69d60n@googlegroups.com>
Subject: Re: Extended double precision decimal floating point
From: already5...@yahoo.com (Michael S)
Injection-Date: Wed, 27 Apr 2022 15:51:12 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 130
 by: Michael S - Wed, 27 Apr 2022 15:51 UTC

On Wednesday, April 27, 2022 at 5:36:59 PM UTC+3, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >On Tuesday, April 26, 2022 at 11:40:55 PM UTC+3, Anton Ertl wrote:
> >> Michael S <already...@yahoo.com> writes:
> >> [reformatted for conventional Usenet line length]
> >> >Another reason to prefer sign-magnitude over two-complement as
> >> >internal format for Big Integer library is weak support for mixed
> >> >signed-unsigned arithmetic in many instruction sets and even worse
> >> >support in all popular HLLs that are likely candidates for
> >> >implementation language of Big Integer libraries.
> >> I don't see why you would need such instructions for implementing
> >> twos-complement big-integers.
> >>
> >> I don't see a single reason for sign-magnitude big-integer
> >> representation.
> ...
> [Again, reformatted]
> >So, what would be your code for, say, multiplication, when numbers
> >are stored as two-complements? Can you sketch it here? Leave away
> >management, memory allocation, prescaling, postscaling, etc... Just
> >show two inner loops of convolution.
> Why would big-integer arithmetic have any scaling?

Pre-scaling is likely not needed, at least for multiplication.
Post-scaling is needed because you don't know the exact number of words in the
result up front. If the number is stored LS word first, post-scaling only needs
to update metadata. But for data stored MS word first it can require
a word shift.
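
A small C sketch of the difference, assuming unsigned 32-bit limbs (as in a sign-magnitude layout) and made-up function names. For two's complement the test would be for redundant sign-extension words instead of zero words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* LS word first: dropping leading zero words only changes the length. */
size_t normalize_ls_first(const uint32_t *limb, size_t len)
{
    assert(limb != NULL && len > 0);
    while (len > 1 && limb[len - 1] == 0)
        len--;
    return len;
}

/* MS word first: the surviving words have to be moved to the front. */
size_t normalize_ms_first(uint32_t *limb, size_t len)
{
    assert(limb != NULL && len > 0);
    size_t lead = 0;
    while (lead + 1 < len && limb[lead] == 0)
        lead++;
    memmove(limb, limb + lead, (len - lead) * sizeof limb[0]);
    return len - lead;
}
```

Both functions keep at least one word, so the value zero is represented by a single zero limb.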

>
> For multiplication, multiplying a two-word integer a with a two-word
> integer b (giving a four-word integer c) should display all the
> interesting cases, so I'll show that.
>
> c1,c2,c3,c4 are two-word values
> al and ah are the low and high words of a; likewise for bl and bh
> umul is an unsigned multiplication of two words, with a two-word result

The complication of two's complement is more obvious when the lengths of inputs and outputs are variable.
But as an example, 2x2 will do.

>
> c1 = umul(al,bl);
> c2 = umul(ah,bl);
> c3 = umul(al,bh);
> c4 = umul(ah,bh);
> if (signbit(ah))
> c4 -= b;
> if (signbit(bh))
> c4 -= a;
> c = (c4<<(2*wordwidth)) + ((c3+c2)<<wordwidth) + c1;
>
> I leave it to you to map the last line efficiently onto the
> architecture at hand. And of course you want to eliminate the top
> word of c if it is just a sign extension of the next word.

Yes, that's one possible trick. Relatively simple, but not free in runtime cost.
Especially not free when at least one of the operands is short.

>
> For comparison, here is the sign-magnitude counterpart:
>
> ahu = ah & ~SIGNBIT;
> bhu = bh & ~SIGNBIT;

I'd expect the sign to be stored separately rather than with the MS word.
It shouldn't be hard to find "free" room for a single bit in the header/metadata
area.
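
Sketched in C for the 2x2 case with 16-bit words, that layout could look as follows. The struct and function names are hypothetical; the point is only that the magnitude path is a plain unsigned multiply and the sign is a single XOR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sign-magnitude values with the sign kept in metadata:
   two 16-bit words of magnitude in, four words of magnitude out. */
typedef struct { bool negative; uint32_t mag; } sm2;
typedef struct { bool negative; uint64_t mag; } sm4;

sm4 sm_mul(sm2 a, sm2 b)
{
    uint16_t al = (uint16_t)(a.mag & 0xFFFFu), ah = (uint16_t)(a.mag >> 16);
    uint16_t bl = (uint16_t)(b.mag & 0xFFFFu), bh = (uint16_t)(b.mag >> 16);

    /* Four unsigned word multiplies, no masking and no correction step. */
    uint64_t c1 = (uint64_t)al * bl;
    uint64_t c2 = (uint64_t)ah * bl;
    uint64_t c3 = (uint64_t)al * bh;
    uint64_t c4 = (uint64_t)ah * bh;

    sm4 r;
    r.mag = (c4 << 32) + ((c2 + c3) << 16) + c1;
    /* Sign of the product is the XOR of the signs; avoid a negative zero. */
    r.negative = (a.negative != b.negative) && r.mag != 0;
    return r;
}
```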

> c1 = umul(al,bl);
> c2 = umul(ahu,bl);
> c3 = umul(al,bhu);
> c4 = umul(ahu,bhu);
> c = (c4<<(2*wordwidth)) + ((c3+c2)<<wordwidth) + c1;
> c = c | (((ah^bh)&SIGNBIT)<<(3*wordwidth));
>
> Admittedly the overhead of sign-magnitude is independent of the width
> of a and b,

Or no overhead at all, when the sign is not stored with a limb.

> while the subtractions get more expensive with longer a
> and longer b (but the number of multiplies grows with
> length(a)*length(b), while the overhead of the subtractions grows only
> with length(a)+length(b)).

I am more concerned with the complication than with the speed impact.
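
To show what I mean by less complication, here is a rough sketch of a variable-length sign-magnitude multiply with the sign kept in the header. The bigint struct and bigint_mul are hypothetical names for illustration; 32-bit limbs stored LS word first are assumed, and memory management is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sign-magnitude big integer: the sign lives in the
   header, the limbs are plain unsigned words, LS word first. */
typedef struct {
    int       negative;  /* 0 or 1 */
    size_t    len;       /* number of limbs in use */
    uint32_t *limb;
} bigint;

/* r->limb must have room for a->len + b->len limbs and must not
   alias the limbs of a or b. */
static void bigint_mul(bigint *r, const bigint *a, const bigint *b)
{
    size_t n = a->len, m = b->len;
    assert(r->limb != a->limb && r->limb != b->limb);

    for (size_t k = 0; k < n + m; k++)
        r->limb[k] = 0;

    /* The two inner loops of the convolution: unsigned only. */
    for (size_t i = 0; i < n; i++) {
        uint64_t carry = 0;
        for (size_t j = 0; j < m; j++) {
            uint64_t t = (uint64_t)a->limb[i] * b->limb[j]
                       + r->limb[i + j] + carry;  /* fits in 64 bits */
            r->limb[i + j] = (uint32_t)t;
            carry = t >> 32;
        }
        r->limb[i + m] = (uint32_t)carry;
    }

    /* Post-scaling: with LS-first storage only the length changes. */
    r->len = n + m;
    while (r->len > 0 && r->limb[r->len - 1] == 0)
        r->len--;

    /* The sign is a single XOR; zero is kept non-negative. */
    r->negative = (r->len != 0) && (a->negative ^ b->negative);
}
```

The loops never look at a sign bit, and nothing in them depends on whether an operand is short or long.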

> OTOH two's-complement addition is cheaper
> than sign-magnitude addition. Plus, I think that for most signed
> bigint operations the involved lengths are 1;

That does not sound obvious.

> one could special-case
> that by using smul and leaving the subtractions away.
>
> I am too lazy to think through how a mixed signed-unsigned would help.
> Would it be as simple as:
>
> c1 = uumul(al,bl);
> c2 = sumul(ah,bl);
> c3 = usmul(al,bh);
> c4 = ssmul(ah,bh);
> c = (c4<<(2*wordwidth)) + ((c3+c2)<<wordwidth) + c1;
>
> ?

Yes, you got the idea.
For arbitrary length, using mixed multiplication would be rather
complicated, with the main complication being the handling of carry/borrow,
but done right it can spare you the O(n+m) overhead.
However, why bother? Just use the sign-magnitude format!
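
For the 2x2 case only, the mixed scheme can be sketched as below. The assumptions are mine: 16-bit words, with plain C functions standing in for the uumul/sumul/usmul/ssmul instructions; mul_mixed_2x2 is a made-up name. Multiplications by powers of two replace the shifts to avoid left-shifting negative values in C.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the four word-multiply flavours (16x16 -> 32). */
static uint32_t uumul(uint16_t x, uint16_t y) { return (uint32_t)x * y; }
static int32_t  sumul(int16_t x, uint16_t y)  { return (int32_t)x * (int32_t)y; }
static int32_t  usmul(uint16_t x, int16_t y)  { return (int32_t)x * (int32_t)y; }
static int32_t  ssmul(int16_t x, int16_t y)   { return (int32_t)x * (int32_t)y; }

/* Split a into an unsigned low word and a signed high word, so that
   a == hi * 65536 + lo. The division is exact. */
static void split(int32_t a, int16_t *hi, uint16_t *lo)
{
    *lo = (uint16_t)((uint32_t)a & 0xFFFFu);
    int32_t h = (a - (int32_t)*lo) / 65536;
    assert(h >= INT16_MIN && h <= INT16_MAX);
    *hi = (int16_t)h;
}

/* Two-word signed multiply with no sign-correction step. */
static int64_t mul_mixed_2x2(int32_t a, int32_t b)
{
    int16_t  ah, bh;
    uint16_t al, bl;
    split(a, &ah, &al);
    split(b, &bh, &bl);

    int64_t c1 = (int64_t)uumul(al, bl);
    int64_t c2 = sumul(ah, bl);
    int64_t c3 = usmul(al, bh);
    int64_t c4 = ssmul(ah, bh);

    /* c4 * 2^32 + (c2 + c3) * 2^16 + c1; cannot overflow int64_t. */
    return c4 * INT64_C(4294967296) + (c2 + c3) * 65536 + c1;
}
```

Here the partial products c2, c3 and c4 are signed, so in a longer loop the running carry would have to be signed as well. That is the carry/borrow complication I mean.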

>
> In that case, the absence of such an instruction is probably due to
> the shortness of typical signed big integers.

IMO, at least for architectures defined more recently (in the last 30-35 years),
the difficulty of mapping it into an HLL was a more important factor.
Another factor is the absence of obvious applications. Big Integer does not count
as an application, for more than a single reason.
Maybe there are cases where it helps elliptic curve cryptography?
A few years ago I played rather intensely with ECDSA verification, but I have
already forgotten the details.

> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
