
Useful floating point instructions (was: Approximate reciprocals)

<t2f7ms$q18$1@newsreader4.netcologne.de>


https://www.novabbs.com/devel/article-flat.php?id=24575&group=comp.arch#24575

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-ec41-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Useful floating point instructions (was: Approximate reciprocals)
Date: Mon, 4 Apr 2022 16:51:40 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <t2f7ms$q18$1@newsreader4.netcologne.de>
References: <t1c154$j5t$1@dont-email.me>
<t1o3sm$977$1@newsreader4.netcologne.de>
<b48d1398-004f-4c6e-8c27-94c635d23c52n@googlegroups.com>
<cc1f77e5-2cfe-4d7d-b65b-4a2627818b43n@googlegroups.com>
<6e7449c4-7134-44ea-a550-7c130d88b975n@googlegroups.com>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com>
<t1vnme$dhs$1@newsreader4.netcologne.de> <t20t87$1k64$1@gioia.aioe.org>
<t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com>
<t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
Injection-Date: Mon, 4 Apr 2022 16:51:40 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-ec41-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:ec41:0:7285:c2ff:fe6c:992d";
logging-data="26664"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Mon, 4 Apr 2022 16:51 UTC

robf...@gmail.com <robfi680@gmail.com> schrieb:
>
> Is there any use for approximately equals?

Absolutely. In numerical code,

if (abs(a-b) < eps) call hooray("Heureka!")

is ubiquitous.
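
As a rough illustration of the test above, here is a minimal Python sketch. The helper name approx_equal and the tolerance values are assumptions chosen for the example, not something taken from this thread; math.isclose is the standard-library relative/absolute variant.

```python
import math

def approx_equal(a, b, eps):
    """Absolute-tolerance test, the direct analogue of abs(a-b) < eps."""
    return abs(a - b) < eps

# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
x = 0.1 + 0.2
print(x == 0.3)                      # exact comparison fails
print(approx_equal(x, 0.3, 1e-12))   # absolute tolerance succeeds

# An absolute eps does not scale with the operands; for large values
# a relative test is usually what is wanted.
big_a, big_b = 1.0e20, 1.0e20 * (1 + 1e-15)
print(approx_equal(big_a, big_b, 1e-12))          # False: the gap is about 1e5
print(math.isclose(big_a, big_b, rel_tol=1e-12))  # True: relative gap is about 1e-15
```

Which of the two tests is appropriate depends on the application, as the rest of the post explains.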

> And how to go about determining
> approximate equality.

That is rather difficult and depends a lot on your application.
If you are looking for a solution to f(x) = 0 that is
well-behaved (i.e. has a large derivative around the root), then
a tolerance of a few times machine precision would serve.

If you are looking for a minimum which has the "usual" behavior
like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
precise - hoping for better accuracy than sqrt(epsilon), where
1+epsilon is the smallest number greater than 1, will
not be fulfilled.
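
The sqrt(epsilon) limit can be seen in a small Python sketch. The values a = 1, x0 = 1, b = 1 and the two step sizes are assumptions picked for illustration only.

```python
import math
import sys

def f(x, a=1.0, x0=1.0, b=1.0):
    """Quadratic with its minimum value b at x = x0."""
    return a * (x - x0) ** 2 + b

eps = sys.float_info.epsilon      # 2**-52 for IEEE double
limit = math.sqrt(eps)            # about 1.5e-8

# A step well below sqrt(eps) away from x0 changes f by less than eps/2
# relative to b, so the result rounds back to f(x0).
small_step = 1.0e-9
print(f(1.0 + small_step) == f(1.0))   # True: indistinguishable from the minimum

# A step well above sqrt(eps) is visible in f.
large_step = 1.0e-7
print(f(1.0 + large_step) > f(1.0))    # True
```

So a minimizer that only looks at function values cannot locate x0 much better than to within sqrt(epsilon) in relative terms.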

> I was thinking of an equality operator that takes in a
> number of significant bits that must be equal. So, if there was a double
> precision comparison, it could say equal to within 23 significant bits.

Even an abdiff instruction like r = abs(a-b) could save an
instruction.

Or, depending on how many bits there are for specifying
a register and if there are implicit flags, an instruction
that sets a flag depending on whether abs(a-b) < eps
could be helpful.

Also quite useful could be a hypot function, which would calculate
sqrt(a**2 + b**2) to machine accuracy. People are still writing
papers about how to do it in the absence of such an instruction
(Article 9 in ACM Transactions on Mathematical Software 47, 1
(2021)).
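
To show why hypot is harder than it looks, here is a minimal Python sketch comparing the textbook formula with a simple scaled variant. The names naive_hypot and scaled_hypot are hypothetical, and the scaled version only avoids spurious overflow; it is not the correctly rounded algorithm from the cited paper.

```python
import math

def naive_hypot(a, b):
    """Textbook formula; the squares can overflow or underflow."""
    return math.sqrt(a * a + b * b)

def scaled_hypot(a, b):
    """Scale by the larger magnitude so the squares stay in range.

    This avoids spurious overflow, but it is not correctly rounded.
    """
    a, b = abs(a), abs(b)
    big, small = max(a, b), min(a, b)
    if big == 0.0:
        return 0.0
    ratio = small / big
    return big * math.sqrt(1.0 + ratio * ratio)

a, b = 3.0e200, 4.0e200
print(naive_hypot(a, b))    # inf: a*a overflows
print(scaled_hypot(a, b))   # close to 5e200
print(math.hypot(a, b))     # library routine, also close to 5e200
```

The extra division and the rounding errors it introduces are part of what a dedicated instruction could avoid.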

Re: Useful floating point instructions

<t2f9b6$1160$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=24576&group=comp.arch#24576

Path: i2pn2.org!i2pn.org!aioe.org!EhtdJS5E9ITDZpJm3Uerlg.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Useful floating point instructions
Date: Mon, 4 Apr 2022 19:19:40 +0200
Organization: Aioe.org NNTP Server
Message-ID: <t2f9b6$1160$1@gioia.aioe.org>
References: <t1c154$j5t$1@dont-email.me>
<t1o3sm$977$1@newsreader4.netcologne.de>
<b48d1398-004f-4c6e-8c27-94c635d23c52n@googlegroups.com>
<cc1f77e5-2cfe-4d7d-b65b-4a2627818b43n@googlegroups.com>
<6e7449c4-7134-44ea-a550-7c130d88b975n@googlegroups.com>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com>
<t1vnme$dhs$1@newsreader4.netcologne.de> <t20t87$1k64$1@gioia.aioe.org>
<t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com>
<t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="33984"; posting-host="EhtdJS5E9ITDZpJm3Uerlg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.11.1
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Mon, 4 Apr 2022 17:19 UTC

Thomas Koenig wrote:
> robf...@gmail.com <robfi680@gmail.com> schrieb:
>>
>> Is there any use for approximately equals?
>
> Absolutely. In numerical code,
>
> if (abs(a-b) < eps) call hooray("Heureka!")
>
> is ubiquitous.
>
>> And how to go about determining
>> approximate equality.
>
> That is rather difficult and depends a lot on your application.
> Assume you are looking for a solution to f(x) = 0, which is
> well-behaved (i.e. has a large derivative around the root), then
> a few times machine precision would serve.
>
> If you are looking for a minimum which has the "usual" behavior
> like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
> precise - hoping for better accuracy than sqrt(epsilon), where
> 1+epsilon is the smallest number which does not equal 1, will
> not be fulfilled.
>
>> I was thinking of an equality operator that takes in a
>> number of significant bits that must be equal. So, if there was a double
>> precision comparison, it could say equal to within 23 significant bits.

Actually, having a compare operation which was capable of returning the
number of identical bits, even when near a flipping boundary (0.99999..
vs 1.0000...) would be quite useful, and probably faster than the
traditional relative offset measure:

rel_err = (estimate-exact)/exact.
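
A small Python sketch of the boundary problem follows. Both helper names are hypothetical; the first compares raw IEEE 754 encodings, the second derives a bit count from the relative error formula above and caps it at the 53-bit double significand.

```python
import math
import struct

def leading_identical_bits(a, b):
    """Count identical leading bits of the raw IEEE 754 double encodings."""
    ia = struct.unpack("<Q", struct.pack("<d", a))[0]
    ib = struct.unpack("<Q", struct.pack("<d", b))[0]
    return 64 - (ia ^ ib).bit_length()

def bits_of_agreement(estimate, exact):
    """Bits of agreement derived from the relative error, capped at 53."""
    rel_err = (estimate - exact) / exact
    if rel_err == 0.0:
        return 53.0
    return min(53.0, -math.log2(abs(rel_err)))

estimate, exact = 0.99999999, 1.0
print(leading_identical_bits(estimate, exact))  # 11: sign plus 10 exponent bits
print(bits_of_agreement(estimate, exact))       # roughly 26.6
```

The raw comparison sees only the sign and part of the exponent as equal, although the values agree to about 26 bits; a useful compare instruction would have to report the second number without doing the division.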

>
> Even an abdiff instruction like r = abs(a-b) could save an
> instruction.
>
> Or, depending on how many bits there are for specifying
> a register and if there are implicit flags, an instruction
> that sets a flag depending on the sign of abs(a-b) < eps
> could be helpful.
>
> Also quite useful could be a hypot function, which would calculate
> sqrt(a**2 + b**2) to machine accuracy. People are still writing
> papers about how to do it in the absence of such an instruction
> (Article 9 in ACM Transactions on Mathematical Software 47, 1
> (2021))

The new augmented addition/multiplication FP operations would make this
a lot easier: everything inside the sqrt() call would be exact, so you
would just need a not-quite-quad sqrt function to get exact rounding.
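
For readers unfamiliar with the idea, the following Python sketch shows the classic software analogue of an augmented addition, Knuth's TwoSum, which returns the rounded sum together with its exact rounding error. This is an illustration under that assumption, not the IEEE operation itself, which would deliver both parts in a single instruction.

```python
from fractions import Fraction

def two_sum(a, b):
    """Knuth's TwoSum: return (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    e = (a - a_virtual) + (b - b_virtual)
    return s, e

s, e = two_sum(1.0, 1.0e-17)
print(s, e)   # the small addend is lost from s but recovered in e

# Check exactness with rational arithmetic.
print(Fraction(s) + Fraction(e) == Fraction(1.0) + Fraction(1.0e-17))  # True
```

With a matching error-free product for a*a and b*b, the sum of squares can be carried as an unevaluated pair of doubles, which is what the sqrt step would then consume.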

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Useful floating point instructions

<t2ff7u$rc0$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=24577&group=comp.arch#24577

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Useful floating point instructions
Date: Mon, 4 Apr 2022 12:00:12 -0700
Organization: A noiseless patient Spider
Lines: 46
Message-ID: <t2ff7u$rc0$1@dont-email.me>
References: <t1c154$j5t$1@dont-email.me>
<b48d1398-004f-4c6e-8c27-94c635d23c52n@googlegroups.com>
<cc1f77e5-2cfe-4d7d-b65b-4a2627818b43n@googlegroups.com>
<6e7449c4-7134-44ea-a550-7c130d88b975n@googlegroups.com>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com>
<t1vnme$dhs$1@newsreader4.netcologne.de> <t20t87$1k64$1@gioia.aioe.org>
<t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com>
<t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de> <t2f9b6$1160$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 4 Apr 2022 19:00:15 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="c900ef89991ebb806250fde801a92854";
logging-data="28032"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19UsCGZGCBOIIt8Ck57atRRvwc18aNZoUk="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.7.0
Cancel-Lock: sha1:TZm8ZKHLYrROiuEzvLTY1G7ff9U=
In-Reply-To: <t2f9b6$1160$1@gioia.aioe.org>
Content-Language: en-US
 by: Stephen Fuld - Mon, 4 Apr 2022 19:00 UTC

On 4/4/2022 10:19 AM, Terje Mathisen wrote:
> Thomas Koenig wrote:
>> robf...@gmail.com <robfi680@gmail.com> schrieb:
>>>
>>> Is there any use for approximately equals?
>>
>> Absolutely.  In numerical code,
>>
>>    if (abs(a-b) < eps) call hooray("Heureka!")
>>
>> is ubiquitous.
>>
>>> And how to go about determining
>>> approximate equality.
>>
>> That is rather difficult and depends a lot on your application.
>> Assume you are looking for a solution to f(x) = 0, which is
>> well-behaved (i.e. has a large derivative around the root), then
>> a few times machine precision would serve.
>>
>> If you are looking for a minimum which has the "usual" behavior
>> like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
>> precise - hoping for better accuracy than sqrt(epsilon), where
>> 1+epsilon is the smallest number which does not equal 1, will
>> not be fulfilled.
>>
>>> I was thinking of an equality operator that takes in a
>>> number of significant bits that must be equal. So, if there was a double
>>> precision comparison, it could say equal to within 23 significant bits.
>
> Actually, having a compare operation which was capable of returning the
> number of identical bits, even when near a flipping boundary (0.99999..
> vs 1.0000...) would be quite useful, and probably faster than the
> traditional relative offset measure:
>
>  rel_err = (estimate-exact)/exact.

I noticed that there is space for that in the results register of the
FCMP instruction in Mitch's My 66000. Since it seems pretty simple to do
in hardware, and it can be done in parallel with determining the other
bits in the result, perhaps you can convince Mitch to add it.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Approximate reciprocals

<c8c6ba2b-1314-48b7-8732-c7df882f0f3en@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=24578&group=comp.arch#24578

X-Received: by 2002:a05:6214:1c83:b0:443:6749:51f8 with SMTP id ib3-20020a0562141c8300b00443674951f8mr226602qvb.74.1649111138106;
Mon, 04 Apr 2022 15:25:38 -0700 (PDT)
X-Received: by 2002:a05:6870:9590:b0:de:27ca:c60c with SMTP id
k16-20020a056870959000b000de27cac60cmr237987oao.108.1649111137797; Mon, 04
Apr 2022 15:25:37 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 4 Apr 2022 15:25:37 -0700 (PDT)
In-Reply-To: <t22705$2jl$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d1d9:d50a:f13c:423c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d1d9:d50a:f13c:423c
References: <t1c154$j5t$1@dont-email.me> <t1qf0u$oko$1@dont-email.me>
<t1qkql$ui0$1@newsreader4.netcologne.de> <394168eb-53ed-49c2-a349-4035c3177361n@googlegroups.com>
<t1rm34$pg9$1@gioia.aioe.org> <7029a173-963d-402b-a184-642120b5e1b8n@googlegroups.com>
<4bdfaba8-898f-4c1e-8ca1-234bf4d3ffc8n@googlegroups.com> <t1vd17$5bj$1@newsreader4.netcologne.de>
<1d99080f-3c84-4a44-b2cf-271c2f3f7e90n@googlegroups.com> <t1vkm4$ar2$1@newsreader4.netcologne.de>
<dc571956-dddd-469a-8b8e-30017e37d5bbn@googlegroups.com> <t20qrb$4lp$1@newsreader4.netcologne.de>
<1b5bd111-40f0-41e7-9025-787e49f0fd02n@googlegroups.com> <t22705$2jl$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <c8c6ba2b-1314-48b7-8732-c7df882f0f3en@googlegroups.com>
Subject: Re: Approximate reciprocals
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 04 Apr 2022 22:25:38 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 83
 by: MitchAlsup - Mon, 4 Apr 2022 22:25 UTC

On Wednesday, March 30, 2022 at 1:19:52 PM UTC-5, Thomas Koenig wrote:
> MitchAlsup <Mitch...@aol.com> schrieb:
> > On Wednesday, March 30, 2022 at 12:46:22 AM UTC-5, Thomas Koenig wrote:
> >> MitchAlsup <Mitch...@aol.com> schrieb:
> >> > On Tuesday, March 29, 2022 at 1:55:03 PM UTC-5, Thomas Koenig wrote:
> >> >> MitchAlsup <Mitch...@aol.com> schrieb:
> >> >> > On Tuesday, March 29, 2022 at 11:45:08 AM UTC-5, Thomas Koenig wrote:
> >> >>
> >> >> >> I've looked at the optimum Remez polynomial for 1/x in the range
> >> >> >> of 1..2. It is 70/33 - 16/11 * x + 32/99 * x**2, with a weight of x
> >> >> >> (so optimizing for the relative error).
> >> >> >>
> >> >> >> The maximum relative error is 1/99, reached at four points in the
> >> >> >> interval [1..2], so around 6.62 bits of minimum accuracy.
> >> >> ><
> >> >> > I stated way above::---------------------------------------------------------------------------------------
> >> >> > Equivalent to::
> >> >> ><
> >> >> > .33333×x^2 - 1.5×x + 2.1666
> >> >> ><
> >> >> > The Chebychev 2nd order Coefficients on the interval [1..2) are:
> >> >> ><
> >> >> > 0.32323232×x^2 -0.48484848×x + 0.66666667
> This is the formula we are discussing.
> >> >> There is something wrong with your formula. f(1) would be
> >> >> 0.32323232 - 0.48484848 + 0.66666667 = 0.50505051, which does
> >> >> not even closely approximate 1/1 = 1. Is there some rescaling
> >> >> somewhere?
> >> >> ><
> >> >> > with 6.63 bits of accuracy.
> >> >> ><-------------------------------------------------------------------------------------------------------------------
> >> >> > Why does your Remez get less precision than Chebychev ?
> >> >> Absolute or relative precision? As I wrote above, I used relative
> >> ><
> >> > Mine was relative (too)
> >> Easy enough to check. What was the actual formula you
> >> arrived at? Not the one you wrote above, obviously.
> ><
> >=IF(AA29=0,-99,LOG(ABS(AA29),2))
> ... and this is a formula for calculating an error.
>
> We seem to have a disconnect here somewhere.
>
> We were discussing a second-degree polynomial approximation to 1/x.
> The formula you gave doesn't have anything close to 6 bits of
> accuracy, it has _zero_ bits of accuracy (as I showed for the case
> of x=1 above).
>
> So I'm not sure what you are in fact approximating. I would, however,
> like to compare your Chebyshev formula for 1/x with the Remez formula
> I gave above.
>
> Could you give that Chebyshev formula for 1/x?
<
p(x) = 0.32323232×x^2 -0.48484848×x + 0.66666667
<
on the interval 1.0..2.0
<
Which is compared to 1000 randomized points against 1/x
<
The highest error encountered has 6.63 bits of precision.
<---------------------------------
But the x I put into the polynomial was the distance from the mid-point of the interval = 1.5
<---------------------------------
Making the formula::
<
p(x) = 0.32323232×(x-1.5)^2 -0.48484848×(x-1.5) + 0.66666667
<
x in the interval {1.0..2.0}, polynomial argument in the range {-0.5..+0.5}
<
<
<
It is so easy to get lost in eXcel spreadsheet equations.
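
The shifted formula can be checked with a short Python sketch. The evenly spaced grid and the helper names are assumptions for illustration; the coefficients are the ones printed above.

```python
import math

def p(x):
    """The quadratic above, evaluated in t = x - 1.5."""
    t = x - 1.5
    return 0.32323232 * t * t - 0.48484848 * t + 0.66666667

def worst_case_bits(samples=10001):
    """Smallest number of accurate bits over an evenly spaced grid on [1, 2]."""
    worst = 0.0
    for i in range(samples):
        x = 1.0 + i / (samples - 1)
        rel_err = abs(x * p(x) - 1.0)   # relative error of p(x) against 1/x
        worst = max(worst, rel_err)
    return -math.log2(worst)

print(worst_case_bits())   # about 6.63, i.e. a relative error near 1/99

# Expanding in x recovers the Remez coefficients 32/99, -16/11, 70/33.
c2 = 0.32323232
c1 = -0.48484848 - 3.0 * c2
c0 = 0.66666667 + 1.5 * 0.48484848 + 2.25 * c2
print(c2, c1, c0)
```

With the argument shifted by 1.5, the polynomial agrees with the Remez result quoted earlier in the thread to the printed precision.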

Re: Approximate reciprocals

<75195d59-15dd-4e20-ad9e-28b7854c5cddn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=24579&group=comp.arch#24579

X-Received: by 2002:a05:622a:178a:b0:2e1:e7b8:e52e with SMTP id s10-20020a05622a178a00b002e1e7b8e52emr539653qtk.464.1649111232269;
Mon, 04 Apr 2022 15:27:12 -0700 (PDT)
X-Received: by 2002:a05:6870:1607:b0:de:984:496d with SMTP id
b7-20020a056870160700b000de0984496dmr217971oae.253.1649111231543; Mon, 04 Apr
2022 15:27:11 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!nntp.club.cc.cmu.edu!45.76.7.193.MISMATCH!3.us.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 4 Apr 2022 15:27:11 -0700 (PDT)
In-Reply-To: <fabfec7e-27ab-45be-a728-3879b82da3a7n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d1d9:d50a:f13c:423c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d1d9:d50a:f13c:423c
References: <t1c154$j5t$1@dont-email.me> <t1qf0u$oko$1@dont-email.me>
<t1qkql$ui0$1@newsreader4.netcologne.de> <394168eb-53ed-49c2-a349-4035c3177361n@googlegroups.com>
<t1rm34$pg9$1@gioia.aioe.org> <7029a173-963d-402b-a184-642120b5e1b8n@googlegroups.com>
<4bdfaba8-898f-4c1e-8ca1-234bf4d3ffc8n@googlegroups.com> <t1vd17$5bj$1@newsreader4.netcologne.de>
<1d99080f-3c84-4a44-b2cf-271c2f3f7e90n@googlegroups.com> <t1vkm4$ar2$1@newsreader4.netcologne.de>
<dc571956-dddd-469a-8b8e-30017e37d5bbn@googlegroups.com> <t20qrb$4lp$1@newsreader4.netcologne.de>
<1b5bd111-40f0-41e7-9025-787e49f0fd02n@googlegroups.com> <t22705$2jl$1@newsreader4.netcologne.de>
<fabfec7e-27ab-45be-a728-3879b82da3a7n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <75195d59-15dd-4e20-ad9e-28b7854c5cddn@googlegroups.com>
Subject: Re: Approximate reciprocals
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 04 Apr 2022 22:27:12 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 75
 by: MitchAlsup - Mon, 4 Apr 2022 22:27 UTC

On Wednesday, March 30, 2022 at 2:48:44 PM UTC-5, Michael S wrote:
> On Wednesday, March 30, 2022 at 9:19:52 PM UTC+3, Thomas Koenig wrote:
> > MitchAlsup <Mitch...@aol.com> schrieb:
> > > On Wednesday, March 30, 2022 at 12:46:22 AM UTC-5, Thomas Koenig wrote:
> > >> MitchAlsup <Mitch...@aol.com> schrieb:
> > >> > On Tuesday, March 29, 2022 at 1:55:03 PM UTC-5, Thomas Koenig wrote:
> > >> >> MitchAlsup <Mitch...@aol.com> schrieb:
> > >> >> > On Tuesday, March 29, 2022 at 11:45:08 AM UTC-5, Thomas Koenig wrote:
> > >> >>
> > >> >> >> I've looked at the optimum Remez polynomial for 1/x in the range
> > >> >> >> of 1..2. It is 70/33 - 16/11 * x + 32/99 * x**2, with a weight of x
> > >> >> >> (so optimizing for the relative error).
> > >> >> >>
> > >> >> >> The maximum relative error is 1/99, reached at four points in the
> > >> >> >> interval [1..2], so around 6.62 bits of minimum accuracy.
> > >> >> ><
> > >> >> > I stated way above::---------------------------------------------------------------------------------------
> > >> >> > Equivalent to::
> > >> >> ><
> > >> >> > .33333×x^2 - 1.5×x + 2.1666
> > >> >> ><
> > >> >> > The Chebychev 2nd order Coefficients on the interval [1..2) are:
> > >> >> ><
> > >> >> > 0.32323232×x^2 -0.48484848×x + 0.66666667
> > This is the formula we are discussing.
> > >> >> There is something wrong with your formula. f(1) would be
> > >> >> 0.32323232 - 0.48484848 + 0.66666667 = 0.50505051, which does
> > >> >> not even closely approximate 1/1 = 1. Is there some rescaling
> > >> >> somewhere?
> > >> >> ><
> > >> >> > with 6.63 bits of accuracy.
> > >> >> ><-------------------------------------------------------------------------------------------------------------------
> > >> >> > Why does your Remez get less precision than Chebychev ?
> > >> >> Absolute or relative precision? As I wrote above, I used relative
> > >> ><
> > >> > Mine was relative (too)
> > >> Easy enough to check. What was the actual formula you
> > >> arrived at? Not the one you wrote above, obviously.
> > ><
> > >=IF(AA29=0,-99,LOG(ABS(AA29),2))
> > ... and this is a formula for calculating an error.
> >
> > We seem to have a disconnect here somewhere.
> >
> > We were discussing a second-degree polynomial approximation to 1/x.
> > The formula you gave doesn't have anything close to 6 bits of
> > accuracy, it has _zero_ bits of accuracy (as I showed for the case
> > of x=1 above).
> >
> He gave poly on interval [2:1] instead of [1:2].
> After affine transformation (in Matlab/Octave polyaffine(pp, [3 -1]) ) it's
> the same poly as yours, except that Mitch rounded coefficients to 8 decimal places.
<
I chose only to "print" the first 8 decimal digits; eXcel chose to round.
<
> > So I'm not sure what you are in fact approximating. I would, however,
> > like to compare your Chebyshev formula for 1/x with the Remez formula
> > I gave above.
> >
> > Could you give that Chebyshev formula for 1/x?

Re: Useful floating point instructions

<a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=24580&group=comp.arch#24580

X-Received: by 2002:a05:620a:3184:b0:67d:cce9:bab4 with SMTP id bi4-20020a05620a318400b0067dcce9bab4mr345241qkb.685.1649111552398;
Mon, 04 Apr 2022 15:32:32 -0700 (PDT)
X-Received: by 2002:a4a:d28b:0:b0:324:7eb6:c6f5 with SMTP id
h11-20020a4ad28b000000b003247eb6c6f5mr141751oos.35.1649111552054; Mon, 04 Apr
2022 15:32:32 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!3.eu.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 4 Apr 2022 15:32:31 -0700 (PDT)
In-Reply-To: <t2ff7u$rc0$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d1d9:d50a:f13c:423c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d1d9:d50a:f13c:423c
References: <t1c154$j5t$1@dont-email.me> <b48d1398-004f-4c6e-8c27-94c635d23c52n@googlegroups.com>
<cc1f77e5-2cfe-4d7d-b65b-4a2627818b43n@googlegroups.com> <6e7449c4-7134-44ea-a550-7c130d88b975n@googlegroups.com>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com> <t1vnme$dhs$1@newsreader4.netcologne.de>
<t20t87$1k64$1@gioia.aioe.org> <t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com> <t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com> <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de> <t2f9b6$1160$1@gioia.aioe.org> <t2ff7u$rc0$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
Subject: Re: Useful floating point instructions
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 04 Apr 2022 22:32:32 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 49
 by: MitchAlsup - Mon, 4 Apr 2022 22:32 UTC

On Monday, April 4, 2022 at 2:00:18 PM UTC-5, Stephen Fuld wrote:
> On 4/4/2022 10:19 AM, Terje Mathisen wrote:
> > Thomas Koenig wrote:
> >> robf...@gmail.com <robf...@gmail.com> schrieb:
> >>>
> >>> Is there any use for approximately equals?
> >>
> >> Absolutely. In numerical code,
> >>
> >> if (abs(a-b) < eps) call hooray("Heureka!")
> >>
> >> is ubiquitous.
> >>
> >>> And how to go about determining
> >>> approximate equality.
> >>
> >> That is rather difficult and depends a lot on your application.
> >> Assume you are looking for a solution to f(x) = 0, which is
> >> well-behaved (i.e. has a large derivative around the root), then
> >> a few times machine precision would serve.
> >>
> >> If you are looking for a minimum which has the "usual" behavior
> >> like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
> >> precise - hoping for better accurady than sqrt(epsilon), where
> >> 1+epsilon is the smallest number which does not equal 1, will
> >> not be fulfilled.
> >>
> >>> I was thinking of an equality operator that takes in a
> >>> number of significant bits that must be equal. So, if there was a double
> >>> precision comparison, it could say equal to within 23 significant bits.
> >
> > Actually, having a compare operation which was capable of returning the
> > number of identical bits, even when near a flipping boundary (0.99999..
> > vs 1.0000...) would be quite useful, and probably faster than the
> > traditional relative offset measure:
> >
> > rel_err = (estimate-exact)/exact.
> I noticed that there is space for that in the results register of the
> FMCP instruction in Mitch's My 66000. Since it seem pretty simple to do
> in hardware, and it can be done in parallel with determining the other
> bits in the result, perhaps you can convince Mitch to add it.
>
Yes, space is there; in fact only 20-ish of the 64 bits are defined.
<
But I am not sure at all how to do a divide in the FCMP instruction.
That is, I don't see what you want put there.
>
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Useful floating point instructions

<t2fs64$59t$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24581&group=comp.arch#24581

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Useful floating point instructions
Date: Mon, 4 Apr 2022 15:41:06 -0700
Organization: A noiseless patient Spider
Lines: 57
Message-ID: <t2fs64$59t$1@dont-email.me>
References: <t1c154$j5t$1@dont-email.me>
<6e7449c4-7134-44ea-a550-7c130d88b975n@googlegroups.com>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com>
<t1vnme$dhs$1@newsreader4.netcologne.de> <t20t87$1k64$1@gioia.aioe.org>
<t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com>
<t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de> <t2f9b6$1160$1@gioia.aioe.org>
<t2ff7u$rc0$1@dont-email.me>
<a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 4 Apr 2022 22:41:08 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="29e74ab80cb63573a47c444114e7f580";
logging-data="5437"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/4EhxtJdQRDUekus+3m3A71MTAYtX95vg="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.7.0
Cancel-Lock: sha1:cjZfYtDltpXqmrLVULHrUii8asE=
In-Reply-To: <a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
Content-Language: en-US
 by: Stephen Fuld - Mon, 4 Apr 2022 22:41 UTC

On 4/4/2022 3:32 PM, MitchAlsup wrote:
> On Monday, April 4, 2022 at 2:00:18 PM UTC-5, Stephen Fuld wrote:
>> On 4/4/2022 10:19 AM, Terje Mathisen wrote:
>>> Thomas Koenig wrote:
>>>> robf...@gmail.com <robf...@gmail.com> schrieb:
>>>>>
>>>>> Is there any use for approximately equals?
>>>>
>>>> Absolutely. In numerical code,
>>>>
>>>> if (abs(a-b) < eps) call hooray("Heureka!")
>>>>
>>>> is ubiquitous.
>>>>
>>>>> And how to go about determining
>>>>> approximate equality.
>>>>
>>>> That is rather difficult and depends a lot on your application.
>>>> Assume you are looking for a solution to f(x) = 0, which is
>>>> well-behaved (i.e. has a large derivative around the root), then
>>>> a few times machine precision would serve.
>>>>
>>>> If you are looking for a minimum which has the "usual" behavior
>>>> like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
>>>> precise - hoping for better accurady than sqrt(epsilon), where
>>>> 1+epsilon is the smallest number which does not equal 1, will
>>>> not be fulfilled.
>>>>
>>>>> I was thinking of an equality operator that takes in a
>>>>> number of significant bits that must be equal. So, if there was a double
>>>>> precision comparison, it could say equal to within 23 significant bits.
>>>
>>> Actually, having a compare operation which was capable of returning the
>>> number of identical bits, even when near a flipping boundary (0.99999..
>>> vs 1.0000...) would be quite useful, and probably faster than the
>>> traditional relative offset measure:
>>>
>>> rel_err = (estimate-exact)/exact.
>> I noticed that there is space for that in the results register of the
>> FMCP instruction in Mitch's My 66000. Since it seem pretty simple to do
>> in hardware, and it can be done in parallel with determining the other
>> bits in the result, perhaps you can convince Mitch to add it.
>>
> Yes, space if there, in fact only 20-ish of the 64-bits are defined.
> <
> But I am not sure at all how to do a divide in the FCMP instruction--
> That is I don't see what you want put there.

I may have misunderstood Terje. I thought he wanted the number of
mantissa bits that are identical between the two operands. This would
just be an integer value up to 52.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Useful floating point instructions

<0d58fa9f-3777-4c56-a222-f2a47e1e0967n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24585&group=comp.arch#24585

Newsgroups: comp.arch
X-Received: by 2002:a05:6214:21a3:b0:441:35fd:920e with SMTP id t3-20020a05621421a300b0044135fd920emr357047qvc.41.1649113866056;
Mon, 04 Apr 2022 16:11:06 -0700 (PDT)
X-Received: by 2002:a05:6808:218a:b0:2f9:65d4:898a with SMTP id
be10-20020a056808218a00b002f965d4898amr269656oib.27.1649113865779; Mon, 04
Apr 2022 16:11:05 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 4 Apr 2022 16:11:05 -0700 (PDT)
In-Reply-To: <t2fs64$59t$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d1d9:d50a:f13c:423c;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d1d9:d50a:f13c:423c
References: <t1c154$j5t$1@dont-email.me> <6e7449c4-7134-44ea-a550-7c130d88b975n@googlegroups.com>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com> <t1vnme$dhs$1@newsreader4.netcologne.de>
<t20t87$1k64$1@gioia.aioe.org> <t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com> <t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com> <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de> <t2f9b6$1160$1@gioia.aioe.org>
<t2ff7u$rc0$1@dont-email.me> <a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
<t2fs64$59t$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <0d58fa9f-3777-4c56-a222-f2a47e1e0967n@googlegroups.com>
Subject: Re: Useful floating point instructions
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 04 Apr 2022 23:11:06 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Mon, 4 Apr 2022 23:11 UTC

On Monday, April 4, 2022 at 5:41:12 PM UTC-5, Stephen Fuld wrote:
> On 4/4/2022 3:32 PM, MitchAlsup wrote:
> > On Monday, April 4, 2022 at 2:00:18 PM UTC-5, Stephen Fuld wrote:
> >> On 4/4/2022 10:19 AM, Terje Mathisen wrote:
> >>> Thomas Koenig wrote:
> >>>> robf...@gmail.com <robf...@gmail.com> schrieb:
> >>>>>
> >>>>> Is there any use for approximately equals?
> >>>>
> >>>> Absolutely. In numerical code,
> >>>>
> >>>> if (abs(a-b) < eps) call hooray("Heureka!")
> >>>>
> >>>> is ubiquitous.
> >>>>
> >>>>> And how to go about determining
> >>>>> approximate equality.
> >>>>
> >>>> That is rather difficult and depends a lot on your application.
> >>>> Assume you are looking for a solution to f(x) = 0, which is
> >>>> well-behaved (i.e. has a large derivative around the root), then
> >>>> a few times machine precision would serve.
> >>>>
> >>>> If you are looking for a minimum which has the "usual" behavior
> >>>> like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
> >>>> precise - hoping for better accurady than sqrt(epsilon), where
> >>>> 1+epsilon is the smallest number which does not equal 1, will
> >>>> not be fulfilled.
> >>>>
> >>>>> I was thinking of an equality operator that takes in a
> >>>>> number of significant bits that must be equal. So, if there was a double
> >>>>> precision comparison, it could say equal to within 23 significant bits.
> >>>
> >>> Actually, having a compare operation which was capable of returning the
> >>> number of identical bits, even when near a flipping boundary (0.99999..
> >>> vs 1.0000...) would be quite useful, and probably faster than the
> >>> traditional relative offset measure:
> >>>
> >>> rel_err = (estimate-exact)/exact.
> >> I noticed that there is space for that in the results register of the
> >> FMCP instruction in Mitch's My 66000. Since it seem pretty simple to do
> >> in hardware, and it can be done in parallel with determining the other
> >> bits in the result, perhaps you can convince Mitch to add it.
> >>
> > Yes, space if there, in fact only 20-ish of the 64-bits are defined.
> > <
> > But I am not sure at all how to do a divide in the FCMP instruction--
> > That is I don't see what you want put there.
> I may have misunderstood Terje. I thought he wanted the number of
> mantissa bits that are identical between the two operands. This would
> just be an integer value up to 52.
<
64 XOR gates: 1 gate delay
64-bit FF1 (find first one): 4 gate delays
64-bit unary to 6-bit binary converter: 4 gate delays
So, it seems to fit.
<
But some lower-end models depend on the comparison taking no more
than 8 gate delays, because CMP-BB is CoIssued.
<
Still, it's close to fitting.
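<
For anyone who wants to play with it in software, here is a rough C sketch of that XOR / find-first-one / encode chain. The names bits_of and leading_equal_bits are made up for illustration; this is not a My 66000 definition, and it assumes a 64-bit IEEE 754 double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the 64 bits of a double as an unsigned integer. */
static uint64_t bits_of(double x)
{
    uint64_t u;
    assert(sizeof u == sizeof x);      /* assumes a 64-bit IEEE 754 double */
    memcpy(&u, &x, sizeof u);
    return u;
}

/* Count the leading bit positions (sign first, then exponent, then
   fraction) in which a and b agree; 64 means identical bit patterns. */
static int leading_equal_bits(double a, double b)
{
    uint64_t diff = bits_of(a) ^ bits_of(b);      /* the 64 XOR gates */
    int n = 0;

    if (diff == 0)
        return 64;
    while ((diff & ((uint64_t)1 << 63)) == 0) {   /* find first one */
        diff <<= 1;
        n++;
    }
    return n;                                     /* 0..63 */
}
```

With this sketch, leading_equal_bits(1.0, 1.5) gives 12 (the sign bit plus the 11 exponent bits), while leading_equal_bits(1.0, 2.0) gives only 1, because the exponent fields already differ near the top.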
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Useful floating point instructions

<26802501-fdf8-41f7-9a74-9d55169b4703n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24587&group=comp.arch#24587

Newsgroups: comp.arch
X-Received: by 2002:ac8:5a95:0:b0:2e2:e4f:63c with SMTP id c21-20020ac85a95000000b002e20e4f063cmr671688qtc.537.1649114546988;
Mon, 04 Apr 2022 16:22:26 -0700 (PDT)
X-Received: by 2002:a05:6870:c595:b0:da:4ea1:991f with SMTP id
ba21-20020a056870c59500b000da4ea1991fmr328565oab.147.1649114546650; Mon, 04
Apr 2022 16:22:26 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 4 Apr 2022 16:22:26 -0700 (PDT)
In-Reply-To: <t2fs64$59t$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:1fd:71eb:4ffc:4e9e;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:1fd:71eb:4ffc:4e9e
References: <t1c154$j5t$1@dont-email.me> <6e7449c4-7134-44ea-a550-7c130d88b975n@googlegroups.com>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com> <t1vnme$dhs$1@newsreader4.netcologne.de>
<t20t87$1k64$1@gioia.aioe.org> <t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com> <t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com> <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de> <t2f9b6$1160$1@gioia.aioe.org>
<t2ff7u$rc0$1@dont-email.me> <a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
<t2fs64$59t$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <26802501-fdf8-41f7-9a74-9d55169b4703n@googlegroups.com>
Subject: Re: Useful floating point instructions
From: already5...@yahoo.com (Michael S)
Injection-Date: Mon, 04 Apr 2022 23:22:26 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 66
 by: Michael S - Mon, 4 Apr 2022 23:22 UTC

On Tuesday, April 5, 2022 at 1:41:12 AM UTC+3, Stephen Fuld wrote:
> On 4/4/2022 3:32 PM, MitchAlsup wrote:
> > On Monday, April 4, 2022 at 2:00:18 PM UTC-5, Stephen Fuld wrote:
> >> On 4/4/2022 10:19 AM, Terje Mathisen wrote:
> >>> Thomas Koenig wrote:
> >>>> robf...@gmail.com <robf...@gmail.com> schrieb:
> >>>>>
> >>>>> Is there any use for approximately equals?
> >>>>
> >>>> Absolutely. In numerical code,
> >>>>
> >>>> if (abs(a-b) < eps) call hooray("Heureka!")
> >>>>
> >>>> is ubiquitous.
> >>>>
> >>>>> And how to go about determining
> >>>>> approximate equality.
> >>>>
> >>>> That is rather difficult and depends a lot on your application.
> >>>> Assume you are looking for a solution to f(x) = 0, which is
> >>>> well-behaved (i.e. has a large derivative around the root), then
> >>>> a few times machine precision would serve.
> >>>>
> >>>> If you are looking for a minimum which has the "usual" behavior
> >>>> like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
> >>>> precise - hoping for better accurady than sqrt(epsilon), where
> >>>> 1+epsilon is the smallest number which does not equal 1, will
> >>>> not be fulfilled.
> >>>>
> >>>>> I was thinking of an equality operator that takes in a
> >>>>> number of significant bits that must be equal. So, if there was a double
> >>>>> precision comparison, it could say equal to within 23 significant bits.
> >>>
> >>> Actually, having a compare operation which was capable of returning the
> >>> number of identical bits, even when near a flipping boundary (0.99999..
> >>> vs 1.0000...) would be quite useful, and probably faster than the
> >>> traditional relative offset measure:
> >>>
> >>> rel_err = (estimate-exact)/exact.
> >> I noticed that there is space for that in the results register of the
> >> FMCP instruction in Mitch's My 66000. Since it seem pretty simple to do
> >> in hardware, and it can be done in parallel with determining the other
> >> bits in the result, perhaps you can convince Mitch to add it.
> >>
> > Yes, space if there, in fact only 20-ish of the 64-bits are defined.
> > <
> > But I am not sure at all how to do a divide in the FCMP instruction--
> > That is I don't see what you want put there.
> I may have misunderstood Terje. I thought he wanted the number of
> mantissa bits that are identical between the two operands. This would
> just be an integer value up to 52.
> --

I.e. __clz(bits_of(a) xor bits_of(b))?
I'm not sure how useful that is. Such a criterion misses the case of very close FP numbers that belong to different octaves.

__clz(abs(bits_of(a) - bits_of(b))) is more useful, but even here we have sore cases: a tiny positive vs. a tiny negative
is going to report a big difference when in reality the difference is small.

So, what we really want is a bit more complicated. During subtraction we want to treat the bits of the FP numbers as integers in sign-magnitude
format. Then abs() and finally CLZ.

Thinking about it, nearly all the required logic should be present anyway, as part of the FP adder, but it may be located in such a way that
if we reuse it then our proximity estimator will have an "FP-like" latency of 2-3 clock cycles, instead of the desirable latency of 1.
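
A minimal C sketch of the sign-magnitude variant, to make the idea concrete. The helper names ordered_bits and proximity_clz are hypothetical, a 64-bit IEEE 754 double is assumed, and NaNs are simply rejected; it is meant as an illustration of the subtraction, not as a description of any hardware.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Read the bit pattern as a sign-magnitude integer and convert it to
   two's complement, so that integer order follows FP order and
   +0.0 and -0.0 both map to 0. */
static int64_t ordered_bits(double x)
{
    uint64_t u;
    memcpy(&u, &x, sizeof u);
    int64_t mag = (int64_t)(u & 0x7FFFFFFFFFFFFFFFull);  /* 63-bit magnitude */
    return (u >> 63) ? -mag : mag;
}

/* Leading zeros of the distance between a and b, counted in steps of
   representable doubles. 64 means the two values are the same. */
static int proximity_clz(double a, double b)
{
    assert(!isnan(a) && !isnan(b));
    int64_t ia = ordered_bits(a), ib = ordered_bits(b);
    /* Subtract in unsigned arithmetic: the true distance always fits
       in 64 bits, while a signed subtraction could overflow. */
    uint64_t d = (ia >= ib) ? (uint64_t)ia - (uint64_t)ib
                            : (uint64_t)ib - (uint64_t)ia;
    int n = 0;

    if (d == 0)
        return 64;
    while ((d >> 63) == 0) {          /* CLZ, written as a plain loop */
        d <<= 1;
        n++;
    }
    return n;
}
```

Here 1.0 and its nearest neighbour below it (a different octave) are one step apart and give 63, and the smallest positive and smallest negative subnormals are two steps apart and give 62.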

> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Useful floating point instructions

<t2gn84$gn$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24590&group=comp.arch#24590

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Useful floating point instructions
Date: Mon, 4 Apr 2022 23:22:58 -0700
Organization: A noiseless patient Spider
Lines: 77
Message-ID: <t2gn84$gn$1@dont-email.me>
References: <t1c154$j5t$1@dont-email.me>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com>
<t1vnme$dhs$1@newsreader4.netcologne.de> <t20t87$1k64$1@gioia.aioe.org>
<t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com>
<t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de> <t2f9b6$1160$1@gioia.aioe.org>
<t2ff7u$rc0$1@dont-email.me>
<a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
<t2fs64$59t$1@dont-email.me>
<26802501-fdf8-41f7-9a74-9d55169b4703n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 5 Apr 2022 06:23:01 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="29e74ab80cb63573a47c444114e7f580";
logging-data="535"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1//2p9sKtjiH9DxB5f5cQadBKP+3+OyUyA="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.7.0
Cancel-Lock: sha1:F9jvjyQsltqkOb0RkKf1GfiEhNA=
In-Reply-To: <26802501-fdf8-41f7-9a74-9d55169b4703n@googlegroups.com>
Content-Language: en-US
 by: Stephen Fuld - Tue, 5 Apr 2022 06:22 UTC

On 4/4/2022 4:22 PM, Michael S wrote:
> On Tuesday, April 5, 2022 at 1:41:12 AM UTC+3, Stephen Fuld wrote:
>> On 4/4/2022 3:32 PM, MitchAlsup wrote:
>>> On Monday, April 4, 2022 at 2:00:18 PM UTC-5, Stephen Fuld wrote:
>>>> On 4/4/2022 10:19 AM, Terje Mathisen wrote:
>>>>> Thomas Koenig wrote:
>>>>>> robf...@gmail.com <robf...@gmail.com> schrieb:
>>>>>>>
>>>>>>> Is there any use for approximately equals?
>>>>>>
>>>>>> Absolutely. In numerical code,
>>>>>>
>>>>>> if (abs(a-b) < eps) call hooray("Heureka!")
>>>>>>
>>>>>> is ubiquitous.
>>>>>>
>>>>>>> And how to go about determining
>>>>>>> approximate equality.
>>>>>>
>>>>>> That is rather difficult and depends a lot on your application.
>>>>>> Assume you are looking for a solution to f(x) = 0, which is
>>>>>> well-behaved (i.e. has a large derivative around the root), then
>>>>>> a few times machine precision would serve.
>>>>>>
>>>>>> If you are looking for a minimum which has the "usual" behavior
>>>>>> like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
>>>>>> precise - hoping for better accurady than sqrt(epsilon), where
>>>>>> 1+epsilon is the smallest number which does not equal 1, will
>>>>>> not be fulfilled.
>>>>>>
>>>>>>> I was thinking of an equality operator that takes in a
>>>>>>> number of significant bits that must be equal. So, if there was a double
>>>>>>> precision comparison, it could say equal to within 23 significant bits.
>>>>>
>>>>> Actually, having a compare operation which was capable of returning the
>>>>> number of identical bits, even when near a flipping boundary (0.99999..
>>>>> vs 1.0000...) would be quite useful, and probably faster than the
>>>>> traditional relative offset measure:
>>>>>
>>>>> rel_err = (estimate-exact)/exact.
>>>> I noticed that there is space for that in the results register of the
>>>> FMCP instruction in Mitch's My 66000. Since it seem pretty simple to do
>>>> in hardware, and it can be done in parallel with determining the other
>>>> bits in the result, perhaps you can convince Mitch to add it.
>>>>
>>> Yes, space if there, in fact only 20-ish of the 64-bits are defined.
>>> <
>>> But I am not sure at all how to do a divide in the FCMP instruction--
>>> That is I don't see what you want put there.
>> I may have misunderstood Terje. I thought he wanted the number of
>> mantissa bits that are identical between the two operands. This would
>> just be an integer value up to 52.
>> --
>
> I.e. __clz(bits_of(a) xor bits_of(b)) ?
> I'm not sure how useful it is. Such criterion misses case of very close FP numbers that belong to different octaves.
>
> __clz(abs(bits_of(a) - bits_of(b))) is more useful but even here we have sore cases of tiny positives vs tiny negatives that
> are going to report big difference when in reality the difference is small.
>
> So, what we really want is a bit more complicated. During subtraction we want to treat bits of FP numbers as integers in sign-magnitude
> format. Then abs() and finally CLZ.
>
> Thinking about it, nearly all required logic should be preset anyway, as part of FP adder, but it can be located in such way that
> if we reuse it then our proximity estimator will have a "FP-like" latency of 2-3 clock cycles, instead of desirable latency of 1.

I am not qualified to talk about the numerical analysis parts of this.
I was only reacting to Terje's post, and I assumed the utility of what
he asked for. Whether it is the right solution or not, I can only say
"Let's you and him fight!". :-)

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Useful floating point instructions

<t2gpfn$1ma6$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=24591&group=comp.arch#24591

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!EhtdJS5E9ITDZpJm3Uerlg.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Useful floating point instructions
Date: Tue, 5 Apr 2022 09:01:14 +0200
Organization: Aioe.org NNTP Server
Message-ID: <t2gpfn$1ma6$1@gioia.aioe.org>
References: <t1c154$j5t$1@dont-email.me>
<6e7449c4-7134-44ea-a550-7c130d88b975n@googlegroups.com>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com>
<t1vnme$dhs$1@newsreader4.netcologne.de> <t20t87$1k64$1@gioia.aioe.org>
<t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com>
<t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de> <t2f9b6$1160$1@gioia.aioe.org>
<t2ff7u$rc0$1@dont-email.me>
<a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
<t2fs64$59t$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="55622"; posting-host="EhtdJS5E9ITDZpJm3Uerlg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.11.1
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Tue, 5 Apr 2022 07:01 UTC

Stephen Fuld wrote:
> On 4/4/2022 3:32 PM, MitchAlsup wrote:
>> On Monday, April 4, 2022 at 2:00:18 PM UTC-5, Stephen Fuld wrote:
>>> On 4/4/2022 10:19 AM, Terje Mathisen wrote:
>>>> Thomas Koenig wrote:
>>>>> robf...@gmail.com <robf...@gmail.com> schrieb:
>>>>>>
>>>>>> Is there any use for approximately equals?
>>>>>
>>>>> Absolutely.  In numerical code,
>>>>>
>>>>>     if (abs(a-b) < eps) call hooray("Heureka!")
>>>>>
>>>>> is ubiquitous.
>>>>>
>>>>>> And how to go about determining
>>>>>> approximate equality.
>>>>>
>>>>> That is rather difficult and depends a lot on your application.
>>>>> Assume you are looking for a solution to f(x) = 0, which is
>>>>> well-behaved (i.e. has a large derivative around the root), then
>>>>> a few times machine precision would serve.
>>>>>
>>>>> If you are looking for a minimum which has the "usual" behavior
>>>>> like f(x) = a*(x-x0)**2 + b, then the optimum will be far less
>>>>> precise - hoping for better accurady than sqrt(epsilon), where
>>>>> 1+epsilon is the smallest number which does not equal 1, will
>>>>> not be fulfilled.
>>>>>
>>>>>> I was thinking of an equality operator that takes in a
>>>>>> number of significant bits that must be equal. So, if there was a
>>>>>> double
>>>>>> precision comparison, it could say equal to within 23 significant
>>>>>> bits.
>>>>
>>>> Actually, having a compare operation which was capable of returning the
>>>> number of identical bits, even when near a flipping boundary (0.99999..
>>>> vs 1.0000...) would be quite useful, and probably faster than the
>>>> traditional relative offset measure:
>>>>
>>>>   rel_err = (estimate-exact)/exact.
>>> I noticed that there is space for that in the results register of the
>>> FMCP instruction in Mitch's My 66000. Since it seem pretty simple to do
>>> in hardware, and it can be done in parallel with determining the other
>>> bits in the result, perhaps you can convince Mitch to add it.
>>>
>> Yes, space if there, in fact only 20-ish of the 64-bits are defined.
>> <
>> But I am not sure at all how to do a divide in the FCMP instruction--
>> That is I don't see what you want put there.
>
> I may have misunderstood Terje.  I thought he wanted the number of
> mantissa bits that are identical between the two operands. This would
> just be an integer value up to 52.

A bit harder: I want an approximate binary log of the difference between
two fp numbers, scaled by the first. I.e. for comparing 1.000 vs 0.999 I
want something around 10 (exact value=9.965784285) indicating the number
of significant bits.

So
#include <math.h>

double similarity(double a, double b)
{ // Return approximately log2(a/(a-b)) = log2(a)-log2(|a-b|)
    return log2(a) - log2(fabs(a - b));
}

This function would return 0 when b is twice as large as a, i.e. zero
equal mantissa bits, then negative numbers for being further away in
magnitude, with similarity(1,100) returning -6.629, or close to the
negative of similarity(1,1.01) which is 6.644. This indicates that the
exponents differ by 6 or 7 bits which is correct.
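
A cheap integer stand-in for the function above, as a sketch only: do the FP subtraction, then subtract the exponent fields instead of taking two logs. The names exponent_field and approx_similarity are made up for illustration, a 64-bit IEEE 754 double is assumed, and the result can be off by about one bit from the exact log2 value.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Biased 11-bit exponent field of a double. */
static int exponent_field(double x)
{
    uint64_t u;
    assert(sizeof u == sizeof x);      /* assumes a 64-bit IEEE 754 double */
    memcpy(&u, &x, sizeof u);
    return (int)((u >> 52) & 0x7FF);
}

/* For normal numbers this is floor(log2|a|) - floor(log2|a-b|).
   When a == b the difference is zero, its exponent field is 0, and the
   result is just a large positive number. Subnormals are only rough. */
static int approx_similarity(double a, double b)
{
    return exponent_field(a) - exponent_field(a - b);
}
```

For the examples above this gives approx_similarity(1.0, 0.999) = 10, approx_similarity(1.0, 100.0) = -6, approx_similarity(1.0, 1.01) = 7 and approx_similarity(1.0, 2.0) = 0.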

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Useful floating point instructions

<t2gqai$2gm$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=24593&group=comp.arch#24593

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!EhtdJS5E9ITDZpJm3Uerlg.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Useful floating point instructions
Date: Tue, 5 Apr 2022 09:15:34 +0200
Organization: Aioe.org NNTP Server
Message-ID: <t2gqai$2gm$1@gioia.aioe.org>
References: <t1c154$j5t$1@dont-email.me>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com>
<t1vnme$dhs$1@newsreader4.netcologne.de> <t20t87$1k64$1@gioia.aioe.org>
<t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com>
<t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de> <t2f9b6$1160$1@gioia.aioe.org>
<t2ff7u$rc0$1@dont-email.me>
<a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
<t2fs64$59t$1@dont-email.me>
<26802501-fdf8-41f7-9a74-9d55169b4703n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="2582"; posting-host="EhtdJS5E9ITDZpJm3Uerlg.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.11.1
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Tue, 5 Apr 2022 07:15 UTC

Michael S wrote:
> I.e. __clz(bits_of(a) xor bits_of(b)) ? I'm not sure how useful it
> is. Such criterion misses case of very close FP numbers that belong
> to different octaves.
>
> __clz(abs(bits_of(a) - bits_of(b))) is more useful but even here we
> have sore cases of tiny positives vs tiny negatives that are going to
> report big difference when in reality the difference is small.
>
> So, what we really want is a bit more complicated. During subtraction
> we want to treat bits of FP numbers as integers in sign-magnitude
> format. Then abs() and finally CLZ.
>
> Thinking about it, nearly all required logic should be preset anyway,
> as part of FP adder, but it can be located in such way that if we
> reuse it then our proximity estimator will have a "FP-like" latency
> of 2-3 clock cycles, instead of desirable latency of 1.

This is exactly right, and will at least be far faster than the real
thing which requires first a real subtraction, then one or two binary
logs. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Approximate reciprocals

<vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=24595&group=comp.arch#24595

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: gneun...@comcast.net (George Neuner)
Newsgroups: comp.arch
Subject: Re: Approximate reciprocals
Date: Tue, 05 Apr 2022 09:26:19 -0400
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>
References: <t22c73$5b4$1@newsreader4.netcologne.de> <b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com> <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com> <t2cbdf$srr$1@newsreader4.netcologne.de> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com> <051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de> <1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Info: reader02.eternal-september.org; posting-host="b72cee86e1de1cafc3f9fd90494fe450";
logging-data="21980"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/WWSzkTUM0SSpdnZXNBbm00d4VllrbWbU="
User-Agent: ForteAgent/8.00.32.1272
Cancel-Lock: sha1:NM3PMrdKj+VYZHPuud+TtthP7Gg=
 by: George Neuner - Tue, 5 Apr 2022 13:26 UTC

On Sun, 3 Apr 2022 15:22:34 -0700 (PDT), Michael S
<already5chosen@yahoo.com> wrote:

>I'm starting to understand the problem.
>It is related to implementation of long double, more specifically, to
>settings of x87 control word. Somehow, under WSL, x87 control word
>is set to 53-bit precision (default Windows settings). Of course,
>quadmath, being Linux-originated, expects x87 control word to be set
>to 64-bit precision. Who is at fault, kernel (Microsoft) or userland
>(Fedora/SUSE) ? I'd guess, one is going to blame another and vice
>versa.

Blame should fall on the library - if code needs control flags set in
some particular way, it should make sure they are set correctly.
Relying on system defaults because they happen to line up with
expectations is just lazy.

Also the Windows default x87 setting IS 64-bit (full width) precision.
There are differences in how Windows and Linux handle save/restore of
x87 state, but if it's all working correctly, I don't see how that
would cause any problem.

I know nothing of quadmath, so this may be a stupid question ... but
documentation seems to indicate the library dates from ~2010. x86
compilers were using SSE (SIMD) for floating point long before that -
at least since Pentium 4. So why does quadmath use the x87?

George

Re: Useful floating point instructions

<jwv8rsjvf4x.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=24596&group=comp.arch#24596

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Useful floating point instructions
Date: Tue, 05 Apr 2022 09:28:44 -0400
Organization: A noiseless patient Spider
Lines: 28
Message-ID: <jwv8rsjvf4x.fsf-monnier+comp.arch@gnu.org>
References: <t1c154$j5t$1@dont-email.me>
<cfd303cb-9107-4887-a71f-dd593d7698ebn@googlegroups.com>
<t1vnme$dhs$1@newsreader4.netcologne.de>
<t20t87$1k64$1@gioia.aioe.org>
<t215f3$9o7$1@newsreader4.netcologne.de>
<a86c92ab-149c-44f2-90e1-36496b35e9c4n@googlegroups.com>
<t22c73$5b4$1@newsreader4.netcologne.de>
<b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<633e483f-007e-4d32-9b51-4e4727dfe495n@googlegroups.com>
<t2f7ms$q18$1@newsreader4.netcologne.de>
<t2f9b6$1160$1@gioia.aioe.org> <t2ff7u$rc0$1@dont-email.me>
<a390f3f8-3e4e-4804-9208-cdbad57ead38n@googlegroups.com>
<t2fs64$59t$1@dont-email.me> <t2gpfn$1ma6$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="98974c52c69ac640aaa1a920a67d8b96";
logging-data="12542"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+wgxFRFA8NpC8I4jLvZbTn"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/29.0.50 (gnu/linux)
Cancel-Lock: sha1:qikhalqOdz565otMHeJfWb8mq1k=
sha1:89NAOQh9XOF8MtSvm5HdJV7gC3M=
 by: Stefan Monnier - Tue, 5 Apr 2022 13:28 UTC

> So
> double similarity(double a, double b)
> { // Return approximately log2(a/(a-b)) = log2(a)-log2(a-b)
> return log2(a)-log2(fabs(a-b));
> }

Matches my intuition, indeed.

> This function would return 0 when b is twice as large as a, i.e. zero equal
> mantissa bits, then negative numbers for being further away in magnitude,
> with similarity(1,100) returning -6.629, or close to the negative of
> similarity(1,1.01) which is 6.644. This indicates that the exponents differ
> by 6 or 7 bits which is correct.

I suspect it would be good enough to focus on positive values (and
return 0 rather than a negative value). But admittedly, negative values
are probably a small matter of subtracting the two exponents, which
should be easy to do in a single cycle.

Regarding stuffing it into My66k's generic FMCP instruction, I'm not
sure it's such a great idea since you then have to extract the
corresponding bits to make use of them. AFAIK the current FMCP returns
a vector of booleans (and these can be extracted "for free" as part of
the PRED instruction, for example) so this "small int" would feel out
of place.

Stefan

Re: Approximate reciprocals

<2022Apr5.181651@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=24601&group=comp.arch#24601

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Approximate reciprocals
Date: Tue, 05 Apr 2022 16:16:51 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 38
Message-ID: <2022Apr5.181651@mips.complang.tuwien.ac.at>
References: <t22c73$5b4$1@newsreader4.netcologne.de> <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com> <t2cbdf$srr$1@newsreader4.netcologne.de> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com> <051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de> <1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com> <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>
Injection-Info: reader02.eternal-september.org; posting-host="99feda7513dc74fce6b974d616718049";
logging-data="11455"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/L1bTtUd4mgkg2YYPwuWkK"
Cancel-Lock: sha1:mIgb3RLgnO7c7g2/eLbbYhTylCw=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 5 Apr 2022 16:16 UTC

George Neuner <gneuner2@comcast.net> writes:
>On Sun, 3 Apr 2022 15:22:34 -0700 (PDT), Michael S
><already5chosen@yahoo.com> wrote:
>
>>I'm starting to understand the problem.
>>It is related to implementation of long double, more specifically, to
>>settings of x87 control word. Somehow, under WSL, x87 control word
>>is set to 53-bit precision (default Windows settings). Of course,
>>quadmath, being Linux-originated, expects x87 control word to be set
>>to 64-bit precision. Who is at fault, kernel (Microsoft) or userland
>>(Fedora/SUSE) ? I'd guess, one is going to blame another and vice
>>versa.
>
>Blame should fall on the library - if code needs control flags set in
>some particular way, it should make sure they are set correctly.
>Relying on system defaults because they happen to line up with
>expectations is just lazy.

The ABI
<https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf>
clearly specifies in Table 3.3 that at process initialization the PC
bits in the x87 Floating-Point control word are set to 11 ("Double
extended precision", i.e., 64-bit rounding of the mantissa).
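
To make the encoding concrete: the precision-control (PC) field occupies bits 8 and 9 of the x87 control word. A small decoder, with the hypothetical name x87_precision_bits, might look like this; it only interprets a control-word value and does not touch the FPU.

```c
#include <stdint.h>

/* Decode the PC field (bits 8-9) of an x87 control word into the
   significand width used for rounding. Returns -1 for the reserved
   encoding 01. */
int x87_precision_bits(uint16_t control_word)
{
    switch ((control_word >> 8) & 0x3u) {
    case 0x0: return 24;  /* single precision */
    case 0x2: return 53;  /* double precision */
    case 0x3: return 64;  /* double extended precision */
    default:  return -1;  /* reserved */
    }
}
```

Under this decoding, a control word of 0x037F has PC = 11 and reports 64, while 0x027F has PC = 10 and reports 53.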

If WSL does not satisfy this requirement, it's obviously WSL that's to
blame. If a user-level program relies on the specification instead of
unnecessarily setting the bits itself, one may consider it lazy, but I
consider it smart not to do unnecessary busywork.

Apparently Microsoft was not willing to do the grunt work of
implementing every bit of the Linux kernel/user interface, that's why
they gave up on the WSL approach and switched to the VM approach of
WSL2. Are Microsoft lazy, or are they smart?

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Approximate reciprocals

<86r16ba23j.fsf@linuxsc.com>

https://www.novabbs.com/devel/article-flat.php?id=24602&group=comp.arch#24602

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: tr.17...@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: Approximate reciprocals
Date: Tue, 05 Apr 2022 10:09:52 -0700
Organization: A noiseless patient Spider
Lines: 28
Message-ID: <86r16ba23j.fsf@linuxsc.com>
References: <t1c154$j5t$1@dont-email.me> <t1qf0u$oko$1@dont-email.me> <t1qkql$ui0$1@newsreader4.netcologne.de> <394168eb-53ed-49c2-a349-4035c3177361n@googlegroups.com> <t1rm34$pg9$1@gioia.aioe.org> <7029a173-963d-402b-a184-642120b5e1b8n@googlegroups.com> <4bdfaba8-898f-4c1e-8ca1-234bf4d3ffc8n@googlegroups.com> <t1vd17$5bj$1@newsreader4.netcologne.de> <1d99080f-3c84-4a44-b2cf-271c2f3f7e90n@googlegroups.com> <t1vkm4$ar2$1@newsreader4.netcologne.de> <dc571956-dddd-469a-8b8e-30017e37d5bbn@googlegroups.com> <t20qrb$4lp$1@newsreader4.netcologne.de> <1b5bd111-40f0-41e7-9025-787e49f0fd02n@googlegroups.com> <t22705$2jl$1@newsreader4.netcologne.de> <c8c6ba2b-1314-48b7-8732-c7df882f0f3en@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: reader02.eternal-september.org; posting-host="956f008b87aca58e2fb87c0a820a02e3";
logging-data="12527"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/sjG0txrqZj19PRESvbd+GrEiBiFSOm/E="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:zAKwJvX+i5qZ42/L1HTTNyTBRoY=
sha1:9HUKENWBqwhOEtGpbq3r8IOg7vo=
 by: Tim Rentsch - Tue, 5 Apr 2022 17:09 UTC

MitchAlsup <MitchAlsup@aol.com> writes:

> On Wednesday, March 30, 2022 at 1:19:52 PM UTC-5, Thomas Koenig wrote:

[...]

>> Could you give that Chebyshev formula for 1/x?
>
> p(x) = 0.32323232 x^2 -0.48484848 x + 0.66666667
> on the interval 1.0..2.0
> Which is compared to 1000 randomized points against 1/x
> The highest error encountered has 6.63 bits of precision.

If p(x) =
0.32323232 * (x-1.5) * (x-1.5)
- 0.48484848 * (x-1.5)
+ 0.66666667

observe that, at x = 1, so 1/x = 1, the expression

- log( (1 - p(1)) / 1 ) / log( 2 )

is less than 6.6294, and that,
at x = 2, so 1/x = 0.5, the expression

- log( (p(2) - 0.5) / 0.5) / log( 2 )

also is less than 6.6294

Re: Approximate reciprocals

<31aaa9ec-0a1f-4414-a1c3-a91281561a33n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24603&group=comp.arch#24603

X-Received: by 2002:ac8:5889:0:b0:2e1:afa2:65a9 with SMTP id t9-20020ac85889000000b002e1afa265a9mr3980011qta.268.1649181371053;
Tue, 05 Apr 2022 10:56:11 -0700 (PDT)
X-Received: by 2002:a05:6808:2185:b0:2d9:ebf0:fb66 with SMTP id
be5-20020a056808218500b002d9ebf0fb66mr1920127oib.69.1649181370812; Tue, 05
Apr 2022 10:56:10 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 5 Apr 2022 10:56:10 -0700 (PDT)
In-Reply-To: <2022Apr5.181651@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <t22c73$5b4$1@newsreader4.netcologne.de> <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<t2cbdf$srr$1@newsreader4.netcologne.de> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com>
<051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de>
<1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com>
<vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com> <2022Apr5.181651@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <31aaa9ec-0a1f-4414-a1c3-a91281561a33n@googlegroups.com>
Subject: Re: Approximate reciprocals
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 05 Apr 2022 17:56:11 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 50
 by: Michael S - Tue, 5 Apr 2022 17:56 UTC

On Tuesday, April 5, 2022 at 7:32:10 PM UTC+3, Anton Ertl wrote:
> George Neuner <gneu...@comcast.net> writes:
> >On Sun, 3 Apr 2022 15:22:34 -0700 (PDT), Michael S
> ><already...@yahoo.com> wrote:
> >
> >>I'm starting to understand the problem.
> >>It is related to implementation of long double, more specifically, to
> >>settings of x87 control word. Somehow, under WSL, x87 control word
> >>is set to 53-bit precision (default Windows settings). Of course,
> >>quadmath, being Linux-originated, expects x87 control word to be set
> >>to 64-bit precision. Who is at fault, kernel (Microsoft) or userland
> >>(Fedora/SUSE) ? I'd guess, one is going to blame another and vice
> >>versa.
> >
> >Blame should fall on the library - if code needs control flags set in
> >some particular way, it should make sure they are set correctly.
> >Relying on system defaults because they happen to line up with
> >expectations is just lazy.
> The ABI
> <https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf>
> clearly specifies in Table 3.3 that at process initialization the PC
> bits in the x87 Floating-Point control word are set to 11 ("Double
> extended precision", i.e., 64-bit rounding of the mantissa).
>
> If WSL does not satisfy this requirement, it's obviously WSL that's to
> blame. If a user-level program relies on the specification instead of
> unnecessarily setting the bits itself, one may consider it lazy, but I
> consider it smart not to do unnecessary busywork.
>

Except that WSL didn't promise ABI-level compatibility with iAMD64 Linux.
All it promised is a decent source-level compatibility, say, to the same or slightly
better level as between iAMD64 Linux and aarch64 Linux.
And, in this case, they didn't deliver on the promise.
The correct and the simplest thing to do for them would be, probably, to force
"long double==double" at the level of compiler and standard headers.
Then, at least for this library, everything will work correctly, same way it is
working correctly on aarch64 or POWER-LE.

> Apparently Microsoft was not willing to do the grunt work of
> implementing every bit of the Linux kernel/user interface, that's why
> they gave up on the WSL approach and switched to the VM approach of
> WSL2. Are Microsoft lazy, or are they smart?

Both, IMHO, but more the former.

>
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Approximate reciprocals

<3e016940-f867-4a45-bb31-ac23a2da4f22n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24606&group=comp.arch#24606

X-Received: by 2002:ad4:5bc1:0:b0:42c:3700:a6df with SMTP id t1-20020ad45bc1000000b0042c3700a6dfmr4643900qvt.94.1649189932811;
Tue, 05 Apr 2022 13:18:52 -0700 (PDT)
X-Received: by 2002:a05:6870:c595:b0:da:4ea1:991f with SMTP id
ba21-20020a056870c59500b000da4ea1991fmr2533146oab.147.1649189932620; Tue, 05
Apr 2022 13:18:52 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 5 Apr 2022 13:18:52 -0700 (PDT)
In-Reply-To: <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:4f2:5b71:6689:cd1e;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:4f2:5b71:6689:cd1e
References: <t22c73$5b4$1@newsreader4.netcologne.de> <b9a7bb45-110a-4652-8f99-d46f32691958n@googlegroups.com>
<10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com> <t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com> <051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de> <1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com> <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <3e016940-f867-4a45-bb31-ac23a2da4f22n@googlegroups.com>
Subject: Re: Approximate reciprocals
From: already5...@yahoo.com (Michael S)
Injection-Date: Tue, 05 Apr 2022 20:18:52 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 51
 by: Michael S - Tue, 5 Apr 2022 20:18 UTC

On Tuesday, April 5, 2022 at 4:26:24 PM UTC+3, George Neuner wrote:
> On Sun, 3 Apr 2022 15:22:34 -0700 (PDT), Michael S
> <already...@yahoo.com> wrote:
>
> >I'm starting to understand the problem.
> >It is related to implementation of long double, more specifically, to
> >settings of x87 control word. Somehow, under WSL, x87 control word
> >is set to 53-bit precision (default Windows settings). Of course,
> >quadmath, being Linux-originated, expects x87 control word to be set
> >to 64-bit precision. Who is at fault, kernel (Microsoft) or userland
> >(Fedora/SUSE) ? I'd guess, one is going to blame another and vice
> >versa.
>
> Blame should fall on the library - if code needs control flags set in
> some particular way, it should make sure they are set correctly.
> Relying on system defaults because they happen to line up with
> expectations is just lazy.

The author of the library relied on autoconf infrastructure that
is supposed to provide HAVE_SQRTL.
I was not able to find out what HAVE_SQRTL should mean, but
obviously it does not mean what the author thought it means.
But even if the author was more pedantic and did things in a Standard C
way, i.e. doing something like '#if LDBL_MANT_DIG >= 64' it would
still produce a wrong result, because LDBL_MANT_DIG is defined as 64!
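
The point is that LDBL_MANT_DIG is a compile-time constant, while the precision the hardware delivers depends on run-time state. A hedged sketch of a run-time probe follows; the function names are hypothetical, and the loop assumes a conventional binary format (it would not give a meaningful answer for double-double long double).

```c
#include <float.h>

/* Count the significand bits that long double addition actually
   delivers at run time. The volatile qualifiers force every
   intermediate to be rounded and stored rather than folded away. */
int effective_ldbl_mant_bits(void)
{
    volatile long double one = 1.0L;
    volatile long double eps = 1.0L;
    int bits = 0;
    for (;;) {
        volatile long double sum = one + eps;
        if (sum == one)
            break;          /* eps no longer survives the rounding */
        eps = eps / 2;
        bits++;
    }
    return bits;
}

/* Nonzero when run-time behaviour agrees with what <float.h> claims. */
int ldbl_precision_matches_header(void)
{
    return effective_ldbl_mant_bits() == LDBL_MANT_DIG;
}
```

If the x87 precision-control field were set to 53-bit rounding, a probe of this kind would be expected to report 53 even though the header still says 64, which is the mismatch described above.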

>
> Also the Windows default x87 setting IS 64-bit (full width) precision.
> There are differences in how Windows and Linux handle save/restore of
> x87 state, but if it's all working correctly, I don't see how that
> would cause any problem.
>
>
> I know nothing of quadmath, so this may be a stupid question ... but
> documentation seems to indicate the library dates from ~2010.

But very likely is built on earlier work.

> x86 compilers were using SSE (SIMD) for floating point long before that -
> at least since Pentium 4. So why does quadmath use the x87?
>

That may be an interesting and important question, but it is
related to performance rather than correctness, i.e. it is of lower importance.

Also, it's not fair to say that quadmath *uses x87*.
It tries to use whatever the compiler provides for the 'long double' type on a given platform.
Which, on x86-64 Linux and Windows/MSYS2, happens to be x87, and on "native
x86-64 Windows" happens to be SSE2.
But on WSL it happened to be 'castrated x87'.

> George

Re: Approximate reciprocals

<2022Apr6.085405@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=24608&group=comp.arch#24608

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Approximate reciprocals
Date: Wed, 06 Apr 2022 06:54:05 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 41
Message-ID: <2022Apr6.085405@mips.complang.tuwien.ac.at>
References: <t22c73$5b4$1@newsreader4.netcologne.de> <t2cbdf$srr$1@newsreader4.netcologne.de> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com> <051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de> <1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com> <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com> <2022Apr5.181651@mips.complang.tuwien.ac.at> <31aaa9ec-0a1f-4414-a1c3-a91281561a33n@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="f60c0aab4b32b175d1c86946b0f03722";
logging-data="440"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+qiE7ox1/hK/2dI7qe6xGZ"
Cancel-Lock: sha1:kYuw87XJ4mlCnWKCsQNKZOUh1UQ=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Wed, 6 Apr 2022 06:54 UTC

Michael S <already5chosen@yahoo.com> writes:
>On Tuesday, April 5, 2022 at 7:32:10 PM UTC+3, Anton Ertl wrote:
>> The ABI
>> <https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf>
>> clearly specifies in Table 3.3 that at process initialization the PC
>> bits in the x87 Floating-Point control word are set to 11 ("Double
>> extended precision", i.e., 64-bit rounding of the mantissa).
>>
>> If WSL does not satisfy this requirement, it's obviously WSL that's to
>> blame. If a user-level program relies on the specification instead of
>> unnecessarily setting the bits itself, one may consider it lazy, but I
>> consider it smart not to do unnecessary busywork.
>>
>
>Except that WSL didn't promise ABI-level compatibility with iAMD64 Linux.

They didn't? If that is the case, then anyone who runs a Linux binary
on WSL is to blame.

However, I certainly had the impression that binary compatibility was
the point and the goal of WSL. Admittedly, my interest in Windows 10
features is close to zero, so I could easily have misunderstood
something.

>All it promised is a decent source-level compatibility, say, to the same or slightly
>better level as between iAMD64 Linux and aarch64 Linux.

[Citation needed] Assuming this claim is true, they made a whole
toolchain for WSL, and every user of WSL would have to rebuild every
piece of code with that toolchain? I never heard of that. It also
looks to me like something that very few would use; those who wanted
to build Unix programs for Windows already had Cygwin, and that
produces applications that run across many more Windows versions, and
without the user having to install WSL. I don't see a reason why
Microsoft would go for source-level compatibility, especially given
that they provided little or no support to Cygwin AFAIK.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Approximate reciprocals

<224a9b93-1fe9-4ba6-88c6-4aaae826403bn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24609&group=comp.arch#24609

X-Received: by 2002:a05:622a:c3:b0:2e3:4bd0:16c2 with SMTP id p3-20020a05622a00c300b002e34bd016c2mr6488382qtw.575.1649239219179;
Wed, 06 Apr 2022 03:00:19 -0700 (PDT)
X-Received: by 2002:a05:6870:204c:b0:da:b3f:2b86 with SMTP id
l12-20020a056870204c00b000da0b3f2b86mr3484677oad.293.1649239218873; Wed, 06
Apr 2022 03:00:18 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!1.us.feeder.erje.net!3.us.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 6 Apr 2022 03:00:18 -0700 (PDT)
In-Reply-To: <2022Apr6.085405@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=199.203.251.52; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 199.203.251.52
References: <t22c73$5b4$1@newsreader4.netcologne.de> <t2cbdf$srr$1@newsreader4.netcologne.de>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com> <051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de> <1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com> <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>
<2022Apr5.181651@mips.complang.tuwien.ac.at> <31aaa9ec-0a1f-4414-a1c3-a91281561a33n@googlegroups.com>
<2022Apr6.085405@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <224a9b93-1fe9-4ba6-88c6-4aaae826403bn@googlegroups.com>
Subject: Re: Approximate reciprocals
From: already5...@yahoo.com (Michael S)
Injection-Date: Wed, 06 Apr 2022 10:00:19 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 121
 by: Michael S - Wed, 6 Apr 2022 10:00 UTC

On Wednesday, April 6, 2022 at 10:15:04 AM UTC+3, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >On Tuesday, April 5, 2022 at 7:32:10 PM UTC+3, Anton Ertl wrote:
> >> The ABI
> >> <https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf>
> >> clearly specifies in Table 3.3 that at process initialization the PC
> >> bits in the x87 Floating-Point control word are set to 11 ("Double
> >> extended precision", i.e., 64-bit rounding of the mantissa).
> >>
> >> If WSL does not satisfy this requirement, it's obviously WSL that's to
> >> blame. If a user-level program relies on the specification instead of
> >> unnecessarily setting the bits itself, one may consider it lazy, but I
> >> consider it smart not to do unnecessary busywork.
> >>
> >
> >Except that WSL didn't promise ABI-level compatibility with iAMD64 Linux.
> They didn't? If that is the case, then anyone who runs a Linux binary
> on WSL is to blame.
>
> However, I certainly had the impression that binary compatibility was
> the point and the goal of WSL. Admittedly, my interest in Windows 10
> features is close to zero, so I could easily have misunderstood
> something.
> >All it promised is a decent source-level compatibility, say, to the same or slightly
> >better level as between iAMD64 Linux and aarch64 Linux.
> [Citation needed]

It's hard to find any official statements about anything WSL.
Even docs are often wrong. It looks like the whole enterprise was
never thought out to the ultimate end.
I wouldn't be surprised if, provided WSL2 works well, Microsoft tries
in a year or two to pretend that WSL never existed.

On one hand, they did promise to run "Linux binaries".
On the other hand, they admitted that syscall-level compatibility is not 100%.
Which means that you can't reliably run Linux binaries, doesn't it?

> Assuming this claim is true, they made a whole
> toolchain for WSL, and every user of WSL would have to rebuild every
> piece of code with that toolchain?

It seems, you're right.
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/10/lto-wrapper
OFFLOAD_TARGET_NAMES=nvptx-none:amdgcn-amdhsa:hsa
OFFLOAD_TARGET_DEFAULT=1
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 10.2.1-6' --with-bugurl=file:///usr/share/doc/gcc-10/README.Bugs --enable-languages=c,ada,c++,go,brig,d,fortran,objc,obj-c++,m2 --prefix=/usr --with-gcc-major-version-only --program-suffix=-10 --program-prefix=x86_64-linux-gnu- --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --enable-bootstrap --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-plugin --enable-default-pie --with-system-zlib --enable-libphobos-checking=release --with-target-system-zlib=auto --enable-objc-gc=auto --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-offload-targets=nvptx-none=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-nvptx/usr,amdgcn-amdhsa=/build/gcc-10-Km9U7s/gcc-10-10.2.1/debian/tmp-gcn/usr,hsa --without-cuda-driver --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu --with-build-config=bootstrap-lto-lean --enable-link-mutex
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 10.2.1 20210110 (Debian 10.2.1-6)

So, the tool is a regular x86_64 Linux tool.

> I never heard of that. It also
> looks to me like something that very few would use; those who wanted
> to build Unix programs for Windows already had Cygwin, and that
> produces applications that run across many more Windows versions, and
> without the user having to install WSL.

Cygwin is compatible, but very slow.
Its user interface (cut&paste, drag&drop etc) feels old.
And its package manager sucks.

MSYS2 is quite fast and its user interface is not bad.
But it is very significantly incompatible.
And its package manager sucks.
Or maybe by now, after many breakages and fixes, the package manager
by itself does not suck, but relative to popular distros too few
packages are available in pre-built form.

WSL was supposed to be as fast as MSYS2, at least as long as files are accessed
on its native filesystem and even access to files on Windows side was supposed
to be not much slower than MSYS2.
Its user interface is about the same as MSYS2.
And WSL was supposed to be at least as compatible at source level as Cygwin.
But most importantly, it was supposed to give you package management of major distros.

> I don't see a reason why
> Microsoft would go for source-level compatibility, especially given
> that they provided little or no support to Cygwin AFAIK.

It looks like you are correct with regard to Microsoft's intentions.
At least since ~2017.

> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Approximate reciprocals

<kojr4h96ootmmqrm6hdkbijgce4dfuu36s@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=24616&group=comp.arch#24616

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: gneun...@comcast.net (George Neuner)
Newsgroups: comp.arch
Subject: Re: Approximate reciprocals
Date: Wed, 06 Apr 2022 13:45:17 -0400
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <kojr4h96ootmmqrm6hdkbijgce4dfuu36s@4ax.com>
References: <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com> <t2cbdf$srr$1@newsreader4.netcologne.de> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com> <051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de> <1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com> <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com> <2022Apr5.181651@mips.complang.tuwien.ac.at>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Info: reader02.eternal-september.org; posting-host="1eb87397a49a1f3d4c9ef24ce2ca7646";
logging-data="6487"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18UuuCE/v6P8cLG4A0DnNFg8ghtWaBMHO0="
User-Agent: ForteAgent/8.00.32.1272
Cancel-Lock: sha1:ARvH+512SdQUmxXFARN5t0z0Se0=
 by: George Neuner - Wed, 6 Apr 2022 17:45 UTC

On Tue, 05 Apr 2022 16:16:51 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:

>George Neuner <gneuner2@comcast.net> writes:
>
>>Blame should fall on the library - if code needs control flags set in
>>some particular way, it should make sure they are set correctly.
>>Relying on system defaults because they happen to line up with
>>expectations is just lazy.
>
>The ABI
><https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf>
>clearly specifies in Table 3.3 that at process initialization the PC
>bits in the x87 Floating-Point control word are set to 11 ("Double
>extended precision", i.e., 64-bit rounding of the mantissa).
>
>If WSL does not satisfy this requirement, it's obviously WSL that's to
>blame. If a user-level program relies on the specification instead of
>unnecessarily setting the bits itself, one may consider it lazy, but I
>consider it smart not to do unnecessary busywork.

It's hardly "busywork". We're not talking about a program - we're
talking about a /library/, which can't know whether the user program
has changed from the default.

A library MUST explicitly set the environment it needs, and it must
save/restore the program environment in case that is different.
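
A minimal sketch of that save/set/restore discipline, assuming GCC or Clang on x86 (for the `fnstcw`/`fldcw` inline assembly). The names `x87_get_cw`, `x87_set_cw`, `lib_call_extended` and `observed_pc` are hypothetical and not taken from any real library; on other targets the sketch only models the control word in a variable.

```c
#include <stdint.h>

/* Precision-control (PC) field of the x87 control word: bits 8-9. */
#define X87_PC_MASK     0x0300u
#define X87_PC_DOUBLE   0x0200u  /* 53-bit significand */
#define X87_PC_EXTENDED 0x0300u  /* 64-bit significand */

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
static inline uint16_t x87_get_cw(void)
{
    uint16_t cw;
    __asm__ __volatile__("fnstcw %0" : "=m"(cw));
    return cw;
}

static inline void x87_set_cw(uint16_t cw)
{
    __asm__ __volatile__("fldcw %0" : : "m"(cw));
}
#else
/* Assumption: no x87 here, so the control word is only modelled. */
static uint16_t x87_model_cw = 0x037Fu;
static inline uint16_t x87_get_cw(void) { return x87_model_cw; }
static inline void x87_set_cw(uint16_t cw) { x87_model_cw = cw; }
#endif

/* Hypothetical library entry point: runs `body` with PC = extended,
 * then puts the caller's control word back exactly as it was. */
long double lib_call_extended(long double (*body)(long double),
                              long double x)
{
    uint16_t saved = x87_get_cw();
    x87_set_cw((uint16_t)((saved & ~X87_PC_MASK) | X87_PC_EXTENDED));
    long double r = body(x);
    x87_set_cw(saved);
    return r;
}

/* Example body: reports which PC setting it observed while running. */
long double observed_pc(long double unused)
{
    (void)unused;
    return (long double)(x87_get_cw() & X87_PC_MASK);
}
```

A caller that has switched to 53-bit precision should find its own setting intact after `lib_call_extended(observed_pc, 0.0L)` returns, while the body itself sees the extended setting.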

George

Re: Approximate reciprocals

<memo.20220406220122.22520M@jgd.cix.co.uk>


https://www.novabbs.com/devel/article-flat.php?id=24621&group=comp.arch#24621

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: jgd...@cix.co.uk (John Dallman)
Newsgroups: comp.arch
Subject: Re: Approximate reciprocals
Date: Wed, 6 Apr 2022 22:01 +0100 (BST)
Organization: A noiseless patient Spider
Lines: 73
Message-ID: <memo.20220406220122.22520M@jgd.cix.co.uk>
References: <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>
Reply-To: jgd@cix.co.uk
Injection-Info: reader02.eternal-september.org; posting-host="535ba0cd9f6f8f78d2bd367b6606425b";
logging-data="18975"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/+1FUOOy5uszs82/N7331jIcoRRUxnwsE="
Cancel-Lock: sha1:JmNJ9kUD52ZSecWwrqeln1kyzTI=
 by: John Dallman - Wed, 6 Apr 2022 21:01 UTC

In article <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>,
gneuner2@comcast.net (George Neuner) wrote:
> On Sun, 3 Apr 2022 15:22:34 -0700 (PDT), Michael S
> <already5chosen@yahoo.com> wrote:
> >I'm starting to understand the problem.
> >It is related to implementation of long double, more specifically,
> >to settings of x87 control word. Somehow, under WSL, x87 control
> >word is set to 53-bit precision (default Windows settings). Of
> >course, quadmath, being Linux-originated, expects x87 control word
> >to be set to 64-bit precision. Who is at fault, kernel (Microsoft)
> >or userland (Fedora/SUSE)? I'd guess, one is going to blame
> >another and vice versa.
>
> Blame should fall on the library - if code needs control flags set
> in some particular way, it should make sure they are set correctly.
> Relying on system defaults because they happen to line up with
> expectations is just lazy.

Changing those flags takes time - on the microsecond scale, because it
often requires the pipeline to empty before the change - so you don't
want to be doing it at every entry to a quad-precision library, where
you'd hope that some operations would be timed on the nanosecond scale.
It's probably better to document what's needed, so that the application
can control things.

> Also the Windows default x87 setting IS 64-bit (full width)
> precision.

Default in what circumstances? In the Microsoft C/C++ run-time
environment, the default has been ordinary double precision, with a
53-bit mantissa, since 1996 to my certain knowledge. The hardware default
for the x87 registers is long double, with a 64-bit mantissa.
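
To see which setting a given control word selects, the PC field (bits 8-9) can be decoded as in the sketch below. The function name `x87_precision_bits` is made up for illustration. 0x037F is the value the x87 holds after FINIT/FNINIT; 0x027F is the value usually reported for the Microsoft run-time default, which I am stating as an assumption rather than from a specification.

```c
#include <stdint.h>

/* Returns the significand width (in bits) selected by the PC field
 * (bits 8-9) of an x87 control word, or 0 for the reserved encoding. */
int x87_precision_bits(uint16_t cw)
{
    switch ((cw >> 8) & 0x3u) {
    case 0x0: return 24;  /* single precision */
    case 0x2: return 53;  /* double precision */
    case 0x3: return 64;  /* double extended precision */
    default:  return 0;   /* 01b is reserved */
    }
}
```

So `x87_precision_bits(0x037F)` yields 64 and `x87_precision_bits(0x027F)` yields 53.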

I've had painful experiences with the precision being set wrong in both
directions:

A customer for the modeller I work on had bought in a Visual Basic
interpreter from a third party supplier, which they used as a macro
language in their application. (This was before Microsoft offered an
embeddable interpreter.) This third-party interpreter changed the default
precision to long double "because it was more accurate" although they
didn't seem to have any idea about its effects. Because this was new to
us, it took us a week or so to figure out what the problem was, during
which the customer was chasing us twice a day with a lot of strong
language. When we could prove what the problem was, they rolled all that
anger back up, called the head of the Basic company, and let it all off
at him. The interpreter was changed, swiftly, and gave no further trouble.

A couple of years later, I was updating the x86 Linux build standard.
Linux had shown lots of fiddling small differences in test results, which
were a nuisance. The C/C++ on Linux was inheriting the hardware default
of long double evaluation, but variables stored to memory were saved as
doubles. This adds enough jitter to be noticeable: setting it to use
double evaluation made it far more consistent with other platforms. We
did that in the test harness and documented it for customers.
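
The jitter comes from rounding twice: once to the 64-bit significand, then again to 53 bits on the store. A small sketch of that effect, with inputs I picked for illustration (they are not from the test suite described above), and with hypothetical function names:

```c
#include <float.h>

/* Sum rounded once, straight to double (when FLT_EVAL_METHOD == 0). */
double sum_direct(double a, double b)
{
    return a + b;
}

/* Sum rounded first to long double, then again to double. */
double sum_via_long_double(double a, double b)
{
    long double wide = (long double)a + (long double)b;  /* first rounding */
    return (double)wide;                                 /* second rounding */
}
```

Take a = 1.0 and b = 0x1.001p-53 (that is, 2^-53 + 2^-65). The exact sum lies just above the midpoint between 1.0 and 1.0 + 2^-52, so a single rounding to double goes up. Rounded to a 64-bit significand first, the 2^-65 part is lost, the intermediate lands exactly on the midpoint, and the second rounding (ties to even) goes down to 1.0. This only shows up where long double is the 80-bit x87 format (LDBL_MANT_DIG == 64) and the precision control is left at extended.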

A few years after that, someone was writing example code for Microsoft's
DirectX viewing library. But everything you viewed in it was distorted,
and rapidly got smushed into really weird shapes. I grabbed the processor
manual, broke into the debugger, checked the control flags, found they
were set to single-precision evaluation, and got accused of witchcraft.
DirectX on 32-bit x86 sets the precision to single on entry, and by
default doesn't change it back. This naturally messes up any other
libraries that were expecting double. Microsoft could not explain why
they were doing this, but I suspect there was some assembler in there,
written by someone who had left. They did point out an option that would
make DirectX save and restore the floating-point controls, but warned us
that it would make things slow. They hadn't understood the timescales:
humans interact with DirectX on timescales of tenths of seconds, so a
couple of microseconds is irrelevant.
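
For anyone who wants to see what single-precision evaluation does to x87 arithmetic, here is a rough sketch, assuming GCC or Clang on x86 with the 80-bit long double. The function `add_with_pc` is hypothetical; it forces the PC field, does one addition on the x87, and restores the control word afterwards.

```c
#include <float.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__)) \
    && LDBL_MANT_DIG == 64
#define HAVE_X87_LONG_DOUBLE 1
#else
#define HAVE_X87_LONG_DOUBLE 0
#endif

/* Adds a and b with the x87 PC field forced to `pc` (0x0000 = single,
 * 0x0200 = double, 0x0300 = extended), then restores the control word.
 * Where long double is not the x87 format, this is a plain addition. */
long double add_with_pc(uint16_t pc, long double a, long double b)
{
#if HAVE_X87_LONG_DOUBLE
    volatile long double va = a, vb = b, sum;  /* force a run-time FADD */
    uint16_t saved, forced;

    __asm__ __volatile__("fnstcw %0" : "=m"(saved) : : "memory");
    forced = (uint16_t)((saved & ~0x0300u) | (pc & 0x0300u));
    __asm__ __volatile__("fldcw %0" : : "m"(forced) : "memory");
    sum = va + vb;
    __asm__ __volatile__("fldcw %0" : : "m"(saved) : "memory");
    return sum;
#else
    (void)pc;
    return a + b;
#endif
}
```

With PC set to single, `add_with_pc(0x0000, 1.0L, 0x1p-30L)` loses the small addend because only 24 significand bits survive; with PC set to extended the same call keeps it. Anything downstream that assumed double or better sees its low-order bits vanish in the same way.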

John

Re: Approximate reciprocals

<8bac8eb6-1282-49c4-9a30-09dc44fe3c14n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=24624&group=comp.arch#24624

X-Received: by 2002:ac8:5889:0:b0:2e1:afa2:65a9 with SMTP id t9-20020ac85889000000b002e1afa265a9mr9179508qta.268.1649283829135;
Wed, 06 Apr 2022 15:23:49 -0700 (PDT)
X-Received: by 2002:a4a:b343:0:b0:324:512e:e340 with SMTP id
n3-20020a4ab343000000b00324512ee340mr3537727ooo.59.1649283828899; Wed, 06 Apr
2022 15:23:48 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 6 Apr 2022 15:23:48 -0700 (PDT)
In-Reply-To: <memo.20220406220122.22520M@jgd.cix.co.uk>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:1c5a:131a:c7bf:75c6;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:1c5a:131a:c7bf:75c6
References: <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com> <memo.20220406220122.22520M@jgd.cix.co.uk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8bac8eb6-1282-49c4-9a30-09dc44fe3c14n@googlegroups.com>
Subject: Re: Approximate reciprocals
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Wed, 06 Apr 2022 22:23:49 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 88
 by: MitchAlsup - Wed, 6 Apr 2022 22:23 UTC

On Wednesday, April 6, 2022 at 4:01:27 PM UTC-5, John Dallman wrote:
> In article <vofo4hh9npgd0vaef...@4ax.com>,
> gneu...@comcast.net (George Neuner) wrote:
> > On Sun, 3 Apr 2022 15:22:34 -0700 (PDT), Michael S
> > <already...@yahoo.com> wrote:
> > >I'm starting to understand the problem.
> > >It is related to implementation of long double, more specifically,
> > >to settings of x87 control word. Somehow, under WSL, x87 control
> > >word is set to 53-bit precision (default Windows settings). Of
> > >course, quadmath, being Linux-originated, expects x87 control word
> > >to be set to 64-bit precision. Who is at fault, kernel (Microsoft)
> > >or userland (Fedora/SUSE)? I'd guess, one is going to blame
> > >another and vice versa.
> >
> > Blame should fall on the library - if code needs control flags set
> > in some particular way, it should make sure they are set correctly.
> > Relying on system defaults because they happen to line up with
> > expectations is just lazy.
<
> Changing those flags takes time - on the microsecond scale, because it
> often requires the pipeline to empty before the change - so you don't
> want to be doing it at every entry to a quad-precision library, where
> you'd hope that some operations would be timed on the nanosecond scale.
> It's probably better to document what's needed, so that the application
> can control things.
<
Yes, the x87 flag design is a true debacle..........
<
> > Also the Windows default x87 setting IS 64-bit (full width)
> > precision.
> Default in what circumstances? In the Microsoft C/C++ run-time
> environment, the default has been ordinary double precision, with a
> 53-bit mantissa, since 1996 to my certain knowledge. The hardware default
> for the x87 registers is long double, with a 64-bit mantissa.
>
> I've had painful experiences with the precision being set wrong in both
> directions:
<
>--------------------------------------Computer architecture Lesson-------------------------------------
>
> A customer for the modeller I work on had bought in a Visual Basic
> interpreter from a third party supplier, which they used as a macro
> language in their application. (This was before Microsoft offered an
> embeddable interpreter.) This third-party interpreter changed the default
> precision to long double "because it was more accurate" although they
> didn't seem to have any idea about its effects. Because this was new to
> us, it took us a week or so to figure out what the problem was, during
> which the customer was chasing us twice a day with a lot of strong
> language. When we could prove what the problem was, they rolled all that
> anger back up, called the head of the Basic company, and let it all off
> at him. The interpreter was changed, swiftly, and gave no further trouble.
>
>
> A couple of years later, I was updating the x86 Linux build standard.
> Linux had shown lots of fiddling small differences in test results, which
> were a nuisance. The C/C++ on Linux was inheriting the hardware default
> of long double evaluation, but variables stored to memory were saved as
> doubles. This adds enough jitter to be noticeable: setting it to use
> double evaluation made it far more consistent with other platforms. We
> did that in the test harness and documented it for customers.
>
> A few years after that, someone was writing example code for Microsoft's
> DirectX viewing library. But everything you viewed in it was distorted,
> and rapidly got smushed into really weird shapes. I grabbed the processor
> manual, broke into the debugger, checked the control flags, found they
> were set to single-precision evaluation, and got accused of witchcraft.
> DirectX on 32-bit x86 sets the precision to single on entry, and by
> default doesn't change it back. This naturally messes up any other
> libraries that were expecting double. Microsoft could not explain why
> they were doing this, but I suspect there was some assembler in there,
> written by someone who had left. They did point out an option that would
> make DirectX save and restore the floating-point controls, but warned us
> that it would make things slow. They hadn't understood the timescales:
> humans interact with DirectX on timescales of tenths of seconds, so a
> couple of microseconds is irrelevant.
<
>-------------------------------------------End of Lesson---------------------------------------------------------
<
3 (more) anecdotal reasons one does not want to change FP widths with
mode bits. You want to change FP widths by using a different OpCode !!
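
A tiny illustration of width-by-opcode as seen from C, assuming a target where each type is evaluated at its own width (FLT_EVAL_METHOD == 0, e.g. SSE2 scalar code, where these typically become ADDSS and ADDSD). The function names are made up.

```c
#include <float.h>

/* The operand type picks the instruction, so neither function
 * consults a precision mode bit to decide how wide the result is. */
float add_f32(float a, float b)
{
    return a + b;
}

double add_f64(double a, double b)
{
    return a + b;
}
```

Here `add_f32(1.0f, 0x1p-30f)` rounds to 1.0f because float carries 24 significand bits, while `add_f64(1.0, 0x1p-30)` keeps the addend, and no other library can change either outcome by touching a control word behind your back.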
<
>--------------------------------------------------------------------------------------------------------------------------
<
But notice that the FPU can accept a new set of mode bits every cycle !!
This is the only way for HyperThreading to work (reasonably) !!!
<
But nobody ever got around to fixing how slow updating the mode bits is.
>
> John

Re: Approximate reciprocals

<2022Apr7.104701@mips.complang.tuwien.ac.at>


https://www.novabbs.com/devel/article-flat.php?id=24630&group=comp.arch#24630

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Approximate reciprocals
Date: Thu, 07 Apr 2022 08:47:01 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 39
Message-ID: <2022Apr7.104701@mips.complang.tuwien.ac.at>
References: <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com> <e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com> <051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com> <t2cp1n$6ji$1@newsreader4.netcologne.de> <1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com> <c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com> <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com> <2022Apr5.181651@mips.complang.tuwien.ac.at> <kojr4h96ootmmqrm6hdkbijgce4dfuu36s@4ax.com>
Injection-Info: reader02.eternal-september.org; posting-host="f2d9ab6ec6063496157e55acdd76b14a";
logging-data="11344"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18sZbPSk4FtotqdxpvMGIDq"
Cancel-Lock: sha1:9euxMxGzBb7X2KssbHjhEjlovQ4=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Thu, 7 Apr 2022 08:47 UTC

George Neuner <gneuner2@comcast.net> writes:
>On Tue, 05 Apr 2022 16:16:51 GMT, anton@mips.complang.tuwien.ac.at
>(Anton Ertl) wrote:
>>The ABI
>><https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf>
>>clearly specifies in Table 3.3 that at process initialization the PC
>>bits in the x87 Floating-Point control word are set to 11 ("Double
>>extended precision", i.e., 64-bit rounding of the mantissa).
>>
>>If WSL does not satisfy this requirement, it's obviously WSL that's to
>>blame. If a user-level program relies on the specification instead of
>>unnecessarily setting the bits itself, one may consider it lazy, but I
>>consider it smart not to do unnecessary busywork.
>
>It's hardly "busywork". We're not talking about a program - we're
>talking about a /library/, which can't know whether the user program
>has changed from the default.

We were actually talking about the program that worked nicely on Linux,
and failed on WSL.

But yes, the library may not be doing what it should do in a program
that actively sets the precision to some value other than long
double.

I can imagine some measures that would deal with the problem in the
usual case: E.g., on first invocation the function for setting the
precision control changes the vector of the quad-math functions to
versions that set the precision control for their own needs (and save
and restore the x87 control word). However, that would not work for
programs that don't use these functions, but set the control word
directly (in assembly language), so unless the library is documented
as requiring the function calls for changing the precision control,
the library would still be deficient.
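
A rough sketch of that vector-switching idea. Everything here is hypothetical: `model_cw` stands in for the real x87 control word so the sketch runs anywhere, `q_half` stands in for a quad-math entry point, and `set_precision_control` is an imagined runtime function, not an existing API.

```c
#include <stdint.h>

/* Stand-in for the real x87 control word (PC = 11b, extended). */
static uint16_t model_cw = 0x037Fu;

typedef long double (*quad_fn)(long double);

/* Fast version: trusts that PC is still at the ABI default. */
static long double q_half_fast(long double x)
{
    return x / 2;
}

/* Guarded version: save, force extended, compute, restore. */
static long double q_half_guarded(long double x)
{
    uint16_t saved = model_cw;
    model_cw = (uint16_t)(saved | 0x0300u);
    long double r = x / 2;
    model_cw = saved;
    return r;
}

/* The vector through which callers reach the library. */
quad_fn q_half = q_half_fast;

/* Imagined precision setter: the first call reroutes the vector to
 * the guarded versions, so programs that never call it pay nothing. */
void set_precision_control(uint16_t pc)
{
    model_cw = (uint16_t)((model_cw & ~0x0300u) | (pc & 0x0300u));
    q_half = q_half_guarded;
}
```

Callers always go through `q_half(x)`. A program that writes the control word directly with `fldcw` never triggers the switch, which is exactly the hole described above.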

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Approximate reciprocals

<1398a4bd-bd48-4e60-ab45-383e1bcc0750n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=24635&group=comp.arch#24635

X-Received: by 2002:a05:6214:e66:b0:441:7695:8eb7 with SMTP id jz6-20020a0562140e6600b0044176958eb7mr12224236qvb.127.1649348975469;
Thu, 07 Apr 2022 09:29:35 -0700 (PDT)
X-Received: by 2002:a05:6870:42c5:b0:db:ec20:9879 with SMTP id
z5-20020a05687042c500b000dbec209879mr6725089oah.136.1649348975209; Thu, 07
Apr 2022 09:29:35 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!nntp.club.cc.cmu.edu!45.76.7.193.MISMATCH!3.us.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 7 Apr 2022 09:29:34 -0700 (PDT)
In-Reply-To: <2022Apr7.104701@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:11f5:aa60:264d:1be9;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:11f5:aa60:264d:1be9
References: <10f7aa7f-00db-4ade-9e2e-e71602654f49n@googlegroups.com>
<e3cd8ed7-de1d-40ee-a21d-798cd2d3a3b6n@googlegroups.com> <051bdc59-4b63-4a31-b898-fe9b700dbfc5n@googlegroups.com>
<t2cp1n$6ji$1@newsreader4.netcologne.de> <1196d0e2-98bd-4fb0-a98f-4c1662e75f0en@googlegroups.com>
<c65c0f4b-e939-43ea-ab44-c09af20ee4fbn@googlegroups.com> <vofo4hh9npgd0vaefo51khtndt80g440if@4ax.com>
<2022Apr5.181651@mips.complang.tuwien.ac.at> <kojr4h96ootmmqrm6hdkbijgce4dfuu36s@4ax.com>
<2022Apr7.104701@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1398a4bd-bd48-4e60-ab45-383e1bcc0750n@googlegroups.com>
Subject: Re: Approximate reciprocals
From: already5...@yahoo.com (Michael S)
Injection-Date: Thu, 07 Apr 2022 16:29:35 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 51
 by: Michael S - Thu, 7 Apr 2022 16:29 UTC

On Thursday, April 7, 2022 at 12:03:32 PM UTC+3, Anton Ertl wrote:
> George Neuner <gneu...@comcast.net> writes:
> >On Tue, 05 Apr 2022 16:16:51 GMT, an...@mips.complang.tuwien.ac.at
> >(Anton Ertl) wrote:
> >>The ABI
> >><https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf>
> >>clearly specifies in Table 3.3 that at process initialization the PC
> >>bits in the x87 Floating-Point control word are set to 11 ("Double
> >>extended precision", i.e., 64-bit rounding of the mantissa).
> >>
> >>If WSL does not satisfy this requirement, it's obviously WSL that's to
> >>blame. If a user-level program relies on the specification instead of
> >>unnecessarily setting the bits itself, one may consider it lazy, but I
> >>consider it smart not to do unnecessary busywork.
> >
> >It's hardly "busywork". We're not talking about a program - we're
> >talking about a /library/, which can't know whether the user program
> >has changed from the default.
> We were actually talking about the program that worked nicely on Linux,
> and failed on WSL.
>

The more I think about it, the less justification I see for the WSL control word defaults.

> But yes, the library may not be doing what it should do in a program
> that actively sets the precision to some value other than long
> double.
>
> I can imagine some measures that would deal with the problem in the
> usual case: E.g., on first invocation the function for setting the
> precision control changes the vector of the quad-math functions to
> versions that set the precision control for their own needs (and save
> and restore the x87 control word). However, that would not work for
> programs that don't use these functions, but set the control word
> directly (in assembly language), so unless the library is documented
> as requiring the function calls for changing the precision control,
> the library would still be deficient.
>

In this particular case, the only winning strategy is to refuse to play.
I.e. the quadmath library should, and could, be coded without any use of 80-bit
FP and with very minimalist use of 64-bit FP. On modern 64-bit x86 (and,
I suppose, on modern ARM and POWER) it's not only the most robust way,
precision-wise, but also the fastest.
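
As an illustration of the integer-only route, here is one building block such a library would probably need: a full 64x64 -> 128-bit multiply written with nothing but 64-bit integer operations. This is my own sketch, not code from libquadmath; the type `u128` and the function `mul_64x64` are hypothetical names. A 113-bit significand product would be assembled from several of these.

```c
#include <stdint.h>

typedef struct { uint64_t hi, lo; } u128;

/* Full 64x64 -> 128-bit product using only 64-bit integer operations. */
u128 mul_64x64(uint64_t a, uint64_t b)
{
    uint64_t a_lo = a & 0xFFFFFFFFu, a_hi = a >> 32;
    uint64_t b_lo = b & 0xFFFFFFFFu, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* weight 2^0  */
    uint64_t p1 = a_lo * b_hi;   /* weight 2^32 */
    uint64_t p2 = a_hi * b_lo;   /* weight 2^32 */
    uint64_t p3 = a_hi * b_hi;   /* weight 2^64 */

    /* Middle column: at most 3 * (2^32 - 1), so it fits in 64 bits. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    u128 r;
    r.lo = (mid << 32) | (p0 & 0xFFFFFFFFu);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

No floating-point control word is involved anywhere, so the result cannot depend on what the caller or the OS did to the x87 state.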

But I can imagine other cases where judicious use of 80-bit FP is really
beneficial.

> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
