
Signed division by 2^n

<s7dn5p$78r$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=16610&group=comp.arch#16610
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-6262-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Signed division by 2^n
Date: Tue, 11 May 2021 10:44:09 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <s7dn5p$78r$1@newsreader4.netcologne.de>
Injection-Date: Tue, 11 May 2021 10:44:09 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-6262-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:6262:0:7285:c2ff:fe6c:992d";
logging-data="7451"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 11 May 2021 10:44 UTC

Everybody should know that a signed division by 2^n cannot be
done with a single right shift :-) Still, having this as a single
instruction (instead of four without branches, three with a branch,
or two if you happen to own a POWER) would make sense, especially
since a conditional add of 2^n - 1 should be easier to do in hardware
than in software.

Does any ISA actually implement this?
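
For reference, the branchless four-operation sequence mentioned above might look roughly like the following in C. This is only a sketch under stated assumptions: the name div_pow2 is made up for illustration, the operand is 32 bits wide, n is between 1 and 31, and >> on a negative signed value is an arithmetic shift (implementation-defined in C, though common compilers behave that way).

```c
#include <stdint.h>

/*
 * Signed division by 2^n, truncating toward zero, without branches.
 * Assumptions (illustrative, not from the post): int32_t operands,
 * 1 <= n <= 31, and >> on negative values is an arithmetic shift.
 */
int32_t div_pow2(int32_t x, unsigned n)
{
    int32_t sign = x >> 31;                               /* 0 or -1 (all ones)   */
    int32_t bias = (int32_t)((uint32_t)sign >> (32 - n)); /* 0 or 2^n - 1         */
    return (x + bias) >> n;                               /* biased shift = x/2^n */
}
```

The bias is only nonzero for negative x, which is the "conditional add of 2^n - 1" expressed as shift, shift, add, shift.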

Re: Signed division by 2^n

<s7ebtr$uvh$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16616&group=comp.arch#16616
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Tue, 11 May 2021 18:38:19 +0200
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <s7ebtr$uvh$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 11 May 2021 16:38:19 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="c8679b4e2bd34864a43c7e801490a814";
logging-data="31729"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX195fZxaH5ywnhmZazbjFNXDMehdHkSLBcg="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.8.1
Cancel-Lock: sha1:LyTmz6/nCHPTY1skLf/1qptQlCk=
In-Reply-To: <s7dn5p$78r$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: Marcus - Tue, 11 May 2021 16:38 UTC

On 2021-05-11, Thomas Koenig wrote:
> Everybody should know that a signed division by 2^n cannot be
> done with a single right shift :-) but having this as a single
> instruction instead of four without branches, three with a branch
> or two if you happen to own a POWER would make sense, especially
> a conditional add of 2**n - 1 should be easier to do in hardware
> than in software.
>
> Does ISA actually implement this?
>

Good point. I think that many programmers mistakenly do x >> n
thinking that they implement x / 2^n, even when x is a signed
integer. Likewise, many programmers probably think that x / 2^n
will automatically be converted to a plain shift, which is only
true for unsigned integers.

In other words - there may be a use for such an instruction.

Different ISAs perform differently, of course. E.g. see what
happens on godbolt.org (one of my favorite online services):

https://godbolt.org/z/KGKnqzasz

For the C code:

    int div16(int x) {
        return x / 16;
    }

...different compilers and ISAs seem to produce between 3 and
6 instructions (excluding the return).
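
To make the signed/unsigned difference concrete, here is a small sketch. The function names are invented for illustration, n is assumed to be between 0 and 30, and the signed shift is assumed to be arithmetic (implementation-defined in C).

```c
#include <stdint.h>

/* Arithmetic right shift: rounds toward negative infinity. */
int32_t shift_right(int32_t x, unsigned n) { return x >> n; }

/* C division: truncates toward zero (C99 and later). */
int32_t divide(int32_t x, unsigned n) { return x / (int32_t)(1u << n); }

/* For unsigned operands the two operations always agree. */
uint32_t ushift_right(uint32_t x, unsigned n) { return x >> n; }
uint32_t udivide(uint32_t x, unsigned n) { return x / (1u << n); }
```

With these, shift_right(-9, 2) gives -3 while divide(-9, 2) gives -2. The two only agree for signed x when x is non-negative or an exact multiple of 2^n.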

/Marcus

Re: Signed division by 2^n

<s7ece8$989$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16617&group=comp.arch#16617
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Tue, 11 May 2021 09:47:03 -0700
Organization: A noiseless patient Spider
Lines: 18
Message-ID: <s7ece8$989$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 11 May 2021 16:47:05 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="d291f65b192254a3f48b4557d18e5032";
logging-data="9481"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/9XEwAhLoLGV1fXAc7sRY84Ypq9LTr13I="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.1
Cancel-Lock: sha1:hzatEZxX0k/qur+/rzoaDUu0K1o=
In-Reply-To: <s7dn5p$78r$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: Stephen Fuld - Tue, 11 May 2021 16:47 UTC

On 5/11/2021 3:44 AM, Thomas Koenig wrote:
> Everybody should know that a signed division by 2^n cannot be
> done with a single right shift :-) but having this as a single
> instruction instead of four without branches, three with a branch
> or two if you happen to own a POWER would make sense, especially
> a conditional add of 2**n - 1 should be easier to do in hardware
> than in software.
>
> Does ISA actually implement this?

If you are talking about a "sign preserving" right shift (i.e. algebraic
versus logical shift), then yes, there are ISAs that implement that
instruction.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Signed division by 2^n

<f6b84e77-1005-42d9-add2-7770e48984d1n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16619&group=comp.arch#16619
X-Received: by 2002:a0c:8d0b:: with SMTP id r11mr14979989qvb.22.1620753149990;
Tue, 11 May 2021 10:12:29 -0700 (PDT)
X-Received: by 2002:a54:4794:: with SMTP id o20mr15730432oic.99.1620753149795;
Tue, 11 May 2021 10:12:29 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 11 May 2021 10:12:29 -0700 (PDT)
In-Reply-To: <s7ebtr$uvh$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <s7ebtr$uvh$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f6b84e77-1005-42d9-add2-7770e48984d1n@googlegroups.com>
Subject: Re: Signed division by 2^n
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 11 May 2021 17:12:29 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Tue, 11 May 2021 17:12 UTC

On Tuesday, May 11, 2021 at 11:38:22 AM UTC-5, Marcus wrote:
> On 2021-05-11, Thomas Koenig wrote:
> > Everybody should know that a signed division by 2^n cannot be
> > done with a single right shift :-) but having this as a single
> > instruction instead of four without branches, three with a branch
> > or two if you happen to own a POWER would make sense, especially
> > a conditional add of 2**n - 1 should be easier to do in hardware
> > than in software.
> >
> > Does ISA actually implement this?
> >
> Good point. I think that many programmers mistakenly do x >> n
> thinking that they implement x / 2^n, even when x is a signed
> integer. Likewise, many programmers probably think that x / 2^n
> will automatically be converted to a plain shift, which is only
> true for unsigned integers.
<
I taught a FORTRAN compiler to do the 4 instruction version
way back in 1981.
>
> In other words - there may be a use for such an instruction.
<
Also note: this is one compelling reason for 1s-complement
arithmetic.
>
> Different ISA:s perform differently, of course. E.g. see what
> happens on godbolt.org (one of my favorite online services):
>
> https://godbolt.org/z/KGKnqzasz
>
> For the C code:
>
> int div16(int x) {
> return x / 16;
> }
>
> ...different compilers and ISA:s seem to produce between 3 and
> 6 instructions (excluding the return).
>
> /Marcus

Re: Signed division by 2^n

<2021May11.193250@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=16621&group=comp.arch#16621
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Tue, 11 May 2021 17:32:50 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 30
Distribution: world
Message-ID: <2021May11.193250@mips.complang.tuwien.ac.at>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
Injection-Info: reader02.eternal-september.org; posting-host="02e96990a4a520aa202191ff30918326";
logging-data="11867"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX181URJRCwYxYpqEuZDL6yqV"
Cancel-Lock: sha1:i+hPf+4H8JVoFWh2pyYgHRCUEz4=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Tue, 11 May 2021 17:32 UTC

Thomas Koenig <tkoenig@netcologne.de> writes:
>Everybody should know that a signed division by 2^n cannot be
>done with a single right shift :-)

Depends on the language and sometimes on its implementation. E.g., on
Gforth:

-9 4 / . -3 ok
-9 2 arshift . -3 ok
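
The Gforth results above correspond to floored division, where the quotient rounds toward negative infinity. As a rough sketch of the same semantics built on C's truncating operators (floor_div is a hypothetical helper name; b must be nonzero and the INT_MIN / -1 case is not handled):

```c
/*
 * Floored division: quotient rounds toward negative infinity.
 * Built on C's / and %, which truncate toward zero (C99 and later).
 * Assumes b != 0 and excludes the overflowing case INT_MIN / -1.
 */
int floor_div(int a, int b)
{
    int q = a / b;
    int r = a % b;
    /* A nonzero remainder whose sign differs from the divisor's means
       truncation rounded up, so step the quotient down by one. */
    if (r != 0 && ((r < 0) != (b < 0)))
        q -= 1;
    return q;
}
```

With this definition floor_div(-9, 4) is -3, matching both Gforth lines, and for a divisor of 2^n it coincides with a plain arithmetic right shift.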

>but having this as a single
>instruction instead of four without branches, three with a branch
>or two if you happen to own a POWER would make sense, especially
>a conditional add of 2**n - 1 should be easier to do in hardware
>than in software.
>
>Does ISA actually implement this?

Even Aarch64, which supports pretty exotic stuff in some cases, needs
4 instructions for a signed symmetric division (or at least that's
what I get with gcc).

Is this frequent enough to merit a special instruction? Or do people
use unsigned numbers or explicit shift if they are
performance-conscious and want to divide by 2^n?

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Signed division by 2^n

<888390de-f439-4653-a57c-c0febaa51c8fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16624&group=comp.arch#16624
X-Received: by 2002:a05:620a:4a:: with SMTP id t10mr29990868qkt.249.1620762059150;
Tue, 11 May 2021 12:40:59 -0700 (PDT)
X-Received: by 2002:aca:30cf:: with SMTP id w198mr4809390oiw.175.1620762058959;
Tue, 11 May 2021 12:40:58 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!news-out.netnews.com!news.alt.net!fdc3.netnews.com!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 11 May 2021 12:40:58 -0700 (PDT)
In-Reply-To: <2021May11.193250@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <888390de-f439-4653-a57c-c0febaa51c8fn@googlegroups.com>
Subject: Re: Signed division by 2^n
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 11 May 2021 19:40:59 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1240
 by: MitchAlsup - Tue, 11 May 2021 19:40 UTC

What percentage of integer {signed and unsigned} arithmetic
uses saturating semantics?

Re: Signed division by 2^n

<s7en70$n5v$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=16627&group=comp.arch#16627
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-6262-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Tue, 11 May 2021 19:50:56 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <s7en70$n5v$1@newsreader4.netcologne.de>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at>
<888390de-f439-4653-a57c-c0febaa51c8fn@googlegroups.com>
Injection-Date: Tue, 11 May 2021 19:50:56 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-6262-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:6262:0:7285:c2ff:fe6c:992d";
logging-data="23743"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 11 May 2021 19:50 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:

> What percentage of integer {signed and unsigned} arithmetic
> uses saturated semantics ?

What programming language supports or demands it? I am
only aware of the "unsigned wraps around, signed integer
overflow is undefined" variety.

Re: Signed division by 2^n

<s7epaa$6ao$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16630&group=comp.arch#16630
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Tue, 11 May 2021 13:26:50 -0700
Organization: A noiseless patient Spider
Lines: 16
Message-ID: <s7epaa$6ao$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 11 May 2021 20:26:51 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8d7b9135d963dcff083c5b1cfb1a474a";
logging-data="6488"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18PqqZWb/3c+eNhewUBXhaT"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:FR0V5OGaFCkiOVpFK6k8kkv/JAI=
In-Reply-To: <s7dn5p$78r$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: Ivan Godard - Tue, 11 May 2021 20:26 UTC

On 5/11/2021 3:44 AM, Thomas Koenig wrote:
> Everybody should know that a signed division by 2^n cannot be
> done with a single right shift :-) but having this as a single
> instruction instead of four without branches, three with a branch
> or two if you happen to own a POWER would make sense, especially
> a conditional add of 2**n - 1 should be easier to do in hardware
> than in software.
>
> Does ISA actually implement this?
>

Yes. Mill right shift carries a rounding mode attribute, like FP, which
gives this behavior among other interesting things. The tool chain
replaces signed DIV with the corresponding single shift instruction as a
peephole. And left shift carries an overflow behavior attribute too
(modulo/saturate/throw/double width result).
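
As a software model of what a right shift with a rounding attribute could compute, here is a sketch. It is not the Mill's actual encoding or mode list: the enum, its names, and shr_round are invented for illustration. It assumes 0 <= n <= 31 and an arithmetic >> on signed values.

```c
#include <stdint.h>

/* Hypothetical rounding modes; the names are illustrative only. */
typedef enum { ROUND_FLOOR, ROUND_TOWARD_ZERO, ROUND_CEIL } round_mode;

/* Right shift by n with a selectable rounding direction. */
int32_t shr_round(int32_t x, unsigned n, round_mode mode)
{
    int32_t q = x >> n;                                   /* floor(x / 2^n) */
    uint32_t mask = (n == 0) ? 0u : (UINT32_MAX >> (32 - n));
    int inexact = ((uint32_t)x & mask) != 0;              /* bits shifted out? */

    switch (mode) {
    case ROUND_FLOOR:       return q;
    case ROUND_CEIL:        return q + inexact;
    case ROUND_TOWARD_ZERO: return q + (inexact && x < 0); /* C's x / 2^n */
    }
    return q;
}
```

In this model ROUND_TOWARD_ZERO reproduces signed division by 2^n, which is the case a peephole could map a signed DIV onto.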

Re: saturating arithmetic, not Signed division by 2^n

<s7er6f$2qlr$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=16632&group=comp.arch#16632
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.cmpublishers.com!adore2!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: saturating arithmetic, not Signed division by 2^n
Date: Tue, 11 May 2021 20:58:55 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <s7er6f$2qlr$1@gal.iecc.com>
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at> <888390de-f439-4653-a57c-c0febaa51c8fn@googlegroups.com> <s7en70$n5v$1@newsreader4.netcologne.de>
Injection-Date: Tue, 11 May 2021 20:58:55 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="92859"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at> <888390de-f439-4653-a57c-c0febaa51c8fn@googlegroups.com> <s7en70$n5v$1@newsreader4.netcologne.de>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Tue, 11 May 2021 20:58 UTC

It appears that Thomas Koenig <tkoenig@netcologne.de> said:
>MitchAlsup <MitchAlsup@aol.com> schrieb:
>
>> What percentage of integer {signed and unsigned} arithmetic
>> uses saturated semantics ?
>
>What programming language supports or demands it? I am
>only aware of the "unsigned wraps around, integer overflow
>is undefined" variety.

Given that many popular chips have dedicated hardware that does saturated
arithmetic, there must be something that uses it.

R's,
John
--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: saturating arithmetic, not Signed division by 2^n

<2651b477-2868-4505-b282-f22a71ca59adn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16633&group=comp.arch#16633
X-Received: by 2002:a05:622a:40f:: with SMTP id n15mr12170294qtx.10.1620769617763;
Tue, 11 May 2021 14:46:57 -0700 (PDT)
X-Received: by 2002:aca:c488:: with SMTP id u130mr23866652oif.0.1620769617513;
Tue, 11 May 2021 14:46:57 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 11 May 2021 14:46:57 -0700 (PDT)
In-Reply-To: <s7er6f$2qlr$1@gal.iecc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at>
<888390de-f439-4653-a57c-c0febaa51c8fn@googlegroups.com> <s7en70$n5v$1@newsreader4.netcologne.de>
<s7er6f$2qlr$1@gal.iecc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2651b477-2868-4505-b282-f22a71ca59adn@googlegroups.com>
Subject: Re: saturating arithmetic, not Signed division by 2^n
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 11 May 2021 21:46:57 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Tue, 11 May 2021 21:46 UTC

On Tuesday, May 11, 2021 at 3:58:58 PM UTC-5, John Levine wrote:
> It appears that Thomas Koenig <tko...@netcologne.de> said:
> >MitchAlsup <Mitch...@aol.com> schrieb:
> >
> >> What percentage of integer {signed and unsigned} arithmetic
> >> uses saturated semantics ?
> >
> >What programming language supports or demands it? I am
> >only aware of the "unsigned wraps around, integer overflow
> >is undefined" variety.
<
> Given that many popular chips have dedicated hardware that does saturated
> arithmetic, there must be something that uses it.
<
Graphics languages use it so that RGB blending does not wrap.
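
A minimal illustration of why blending wants saturation, using 8-bit colour channels. The function names are made up for this sketch and do not come from any particular graphics API.

```c
#include <stdint.h>

/* Wrapping add: the result is reduced modulo 256. */
uint8_t wrap_add_u8(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + b);
}

/* Saturating add: the result is clamped to the channel maximum. */
uint8_t sat_add_u8(uint8_t a, uint8_t b)
{
    unsigned s = (unsigned)a + (unsigned)b;
    return (uint8_t)(s > 255u ? 255u : s);
}
```

Adding 200 and 100 wraps to 44 (a bright channel turns dark), whereas the saturating version clamps to 255.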
>
> R's,
> John
> --
> Regards,
> John Levine, jo...@taugh.com, Primary Perpetrator of "The Internet for Dummies",
> Please consider the environment before reading this e-mail. https://jl.ly

Re: saturating arithmetic, not Signed division by 2^n

<s7evjj$v0$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16635&group=comp.arch#16635
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: bage...@gmail.com (Brian G. Lucas)
Newsgroups: comp.arch
Subject: Re: saturating arithmetic, not Signed division by 2^n
Date: Tue, 11 May 2021 17:14:10 -0500
Organization: A noiseless patient Spider
Lines: 24
Message-ID: <s7evjj$v0$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at>
<888390de-f439-4653-a57c-c0febaa51c8fn@googlegroups.com>
<s7en70$n5v$1@newsreader4.netcologne.de> <s7er6f$2qlr$1@gal.iecc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 11 May 2021 22:14:11 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="42d34ed3b0924a5ef07fcd9430ea1270";
logging-data="992"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+W/5pw5pY7+xnWabrL7ncE"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.8.1
Cancel-Lock: sha1:7i3PkzgFeHeR/4UUreDWy71q8ZU=
In-Reply-To: <s7er6f$2qlr$1@gal.iecc.com>
Content-Language: en-US
 by: Brian G. Lucas - Tue, 11 May 2021 22:14 UTC

On 5/11/21 3:58 PM, John Levine wrote:
> It appears that Thomas Koenig <tkoenig@netcologne.de> said:
>> MitchAlsup <MitchAlsup@aol.com> schrieb:
>>
>>> What percentage of integer {signed and unsigned} arithmetic
>>> uses saturated semantics ?
>>
>> What programming language supports or demands it? I am
>> only aware of the "unsigned wraps around, integer overflow
>> is undefined" variety.
>
> Given that many popular chips have dedicated hardware that does saturated
> arithmetic, there must be something that uses it.
>
Signal processing code is full of saturating arithmetic. But I don't know of
any programming language that supports it directly. LLVM supports saturated
add/sub in the intermediate language. Not sure if anything uses it yet.
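
Without language support, signal processing code typically spells the saturation out by hand. A sketch of signed 32-bit saturating add and subtract, with illustrative names and a 64-bit intermediate to avoid overflow:

```c
#include <stdint.h>

/* Clamp a 64-bit intermediate result into the int32_t range. */
static int32_t clamp_i32(int64_t v)
{
    if (v > INT32_MAX) return INT32_MAX;
    if (v < INT32_MIN) return INT32_MIN;
    return (int32_t)v;
}

/* Saturating signed add: never wraps, never overflows. */
int32_t sat_add_i32(int32_t a, int32_t b)
{
    return clamp_i32((int64_t)a + (int64_t)b);
}

/* Saturating signed subtract. */
int32_t sat_sub_i32(int32_t a, int32_t b)
{
    return clamp_i32((int64_t)a - (int64_t)b);
}
```

Whether a given compiler turns this pattern into a hardware saturating instruction is not guaranteed and would have to be checked per target.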

brian

> R's,
> John
>

Re: Signed division by 2^n

<s7l775$sq5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16734&group=comp.arch#16734
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Fri, 14 May 2021 01:59:41 -0500
Organization: A noiseless patient Spider
Lines: 48
Message-ID: <s7l775$sq5$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 14 May 2021 07:00:53 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8402408f5e5cb57f2dd58c4bb7f0a0e2";
logging-data="29509"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/iyj+YPwexbJnagnnGt0KB"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.1
Cancel-Lock: sha1:E2XaGyDtB6ve2rTzp5yRsy3gwRY=
In-Reply-To: <2021May11.193250@mips.complang.tuwien.ac.at>
Content-Language: en-US
 by: BGB - Fri, 14 May 2021 06:59 UTC

On 5/11/2021 12:32 PM, Anton Ertl wrote:
> Thomas Koenig <tkoenig@netcologne.de> writes:
>> Everybody should know that a signed division by 2^n cannot be
>> done with a single right shift :-)
>
> Depends on the language and sometimes on its implementation. E.g., on
> Gforth:
>
> -9 4 / . -3 ok
> -9 2 arshift . -3 ok
>
>> but having this as a single
>> instruction instead of four without branches, three with a branch
>> or two if you happen to own a POWER would make sense, especially
>> a conditional add of 2**n - 1 should be easier to do in hardware
>> than in software.
>>
>> Does ISA actually implement this?
>
> Even Aarch64, which supports pretty exotic stuff in some cases, needs
> 4 instructions for a signed symmetric division (or at least that's
> what I get with gcc).
>
> Is this frequent enough to merit a special instruction? Or do people
> use unsigned numbers or explicit shift if they are
> performance-conscious and want to divide by 2^n?
>

It seems like it could be supported...

However the ways I can think of it would likely add enough cost and
latency to the shift unit to make it "not likely worthwhile".

Option 1, if input is negative:
  Negate Input;
  Do Shift;
  Negate Output.

Option 2, if input is negative:
  Detect if ((Input&((1<<n)-1))!=0);
  If So, Add 1 to output.

If one had an absolute-value instruction which set a status flag based
on whether or not the input was negative, this could be combined with a
conditional negate to reduce it to 3 instructions.
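
For reference, a minimal Python sketch of the two options above. The function names are made up for illustration, and Python's `>>` on integers (an arithmetic, flooring shift) stands in for the hardware shifter; this says nothing about the cost in gates or latency.

```python
def div_pow2_negate(x: int, n: int) -> int:
    """Option 1: negate a negative input, shift, negate the result."""
    if x < 0:
        return -((-x) >> n)
    return x >> n


def div_pow2_fixup(x: int, n: int) -> int:
    """Option 2: arithmetic shift, then add 1 if the input was
    negative and any 1-bits were shifted out."""
    q = x >> n  # floors, i.e. rounds toward negative infinity
    if x < 0 and (x & ((1 << n) - 1)) != 0:
        q += 1
    return q


print(-9 >> 2)                 # -3 (plain arithmetic shift floors)
print(div_pow2_negate(-9, 2))  # -2 (truncating division by 4)
print(div_pow2_fixup(-9, 2))   # -2
```

Both functions agree with a truncating division for every input; they only differ in where the extra work sits relative to the shift.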

...

Re: Signed division by 2^n

<s7l7os$75r$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16735&group=comp.arch#16735

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Fri, 14 May 2021 00:10:20 -0700
Organization: A noiseless patient Spider
Lines: 52
Message-ID: <s7l7os$75r$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 14 May 2021 07:10:21 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="965d25763e67ded1db4e08a96dc48409";
logging-data="7355"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19gvSovddDVLKhXr0Jkeqwy"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:ML/pwDBeEttgZ/a4MVODW/9kNHI=
In-Reply-To: <s7l775$sq5$1@dont-email.me>
Content-Language: en-US
 by: Ivan Godard - Fri, 14 May 2021 07:10 UTC

On 5/13/2021 11:59 PM, BGB wrote:
> On 5/11/2021 12:32 PM, Anton Ertl wrote:
>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>> Everybody should know that a signed division by 2^n cannot be
>>> done with a single right shift :-)
>>
>> Depends on the language and sometimes on its implementation.  E.g., on
>> Gforth:
>>
>> -9 4 / . -3  ok
>> -9 2 arshift . -3  ok
>>
>>> but having this as a single
>>> instruction instead of four without branches, three with a branch
>>> or two if you happen to own a POWER would make sense, especially
>>> a conditional add of 2**n - 1 should be easier to do in hardware
>>> than in software.
>>>
>>> Does ISA actually implement this?
>>
>> Even Aarch64, which supports pretty exotic stuff in some cases, needs
>> 4 instructions for a signed symmetric division (or at least that's
>> what I get with gcc).
>>
>> Is this frequent enough to merit a special instruction?  Or do people
>> use unsigned numbers or explicit shift if they are
>> performance-conscious and want to divide by 2^n?
>>
>
> It seems like it could be supported...
>
> However the ways I can think of it would likely add enough cost and
> latency to the shift unit to make it "not likely worthwhile".
>
> Option 1, if input is negative:
>  Negate Input;
>  Do Shift;
>  Negate Output.
>
> Option 2, if input is negative:
>  Detect if ((Input&((1<<n)-1))!=0);
>  If So, Add 1 to output.
>
>
> If one had an absolute-value instruction which set a status flag based
> on whether or not the input was negative, this could be combined with a
> conditional negate to reduce it to 3 instructions.
>
> ...

Or just integrate the integer shifter with the FP normalization shifter
and apply the right rounding mode.

Re: Signed division by 2^n

<2021May14.100911@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=16736&group=comp.arch#16736

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Fri, 14 May 2021 08:09:11 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 17
Message-ID: <2021May14.100911@mips.complang.tuwien.ac.at>
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me> <s7l7os$75r$1@dont-email.me>
Injection-Info: reader02.eternal-september.org; posting-host="5d5f7c2fd86717a462d647ead1d1489d";
logging-data="17062"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX195X2lnkrCi7B4N3lQ+fEGI"
Cancel-Lock: sha1:t81t5aPjR3hmzTjTWfD37Kb3QVs=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Fri, 14 May 2021 08:09 UTC

Ivan Godard <ivan@millcomputing.com> writes:
>Or just integrate the integer shifter with the FP normalization shifter
>and apply the right rounding mode.

It's funny: signed integers are 2s-complement, and shift right
naturally floors (rounds towards negative infinity). FP numbers are
sign/magnitude, and shift right naturally rounds towards 0. But FP
also has rounding modes that have to take the sign bit into account,
and you can use that rounding correction to implement round-towards-0
for integers. I think the right rounding mode is flooring (it
increases the negative magnitude of an FP number), which is
particularly ironic.
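
A small Python sketch of that observation, under the assumption that the rounding correction is "add 1 to the shifted value if the operand is negative and any 1-bits were shifted out". The helper name is hypothetical.

```python
def shift_then_fixup(bits: int, negative: bool, n: int) -> int:
    """Shift right by n; if the operand is negative and any 1-bits
    were shifted out, add 1 to the shifted value."""
    q = bits >> n
    inexact = (bits & ((1 << n) - 1)) != 0
    return q + 1 if (negative and inexact) else q


# Sign/magnitude, as in FP: the value is -9, stored as magnitude 9.
mag = shift_then_fixup(9, True, 2)
print(-mag)   # -3: the magnitude grew from 2 to 3, so -9/4 was floored

# Two's complement: the same correction applied to the integer -9.
print(shift_then_fixup(-9, True, 2))   # -2: -3 + 1, so -9/4 was truncated
```

The same increment floors a sign/magnitude value and truncates a two's-complement one, because the plain shift already rounds in opposite directions in the two representations.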

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Signed division by 2^n

<s7lf4s$c6n$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=16739&group=comp.arch#16739

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-2862-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Fri, 14 May 2021 09:16:12 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <s7lf4s$c6n$1@newsreader4.netcologne.de>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me>
Injection-Date: Fri, 14 May 2021 09:16:12 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-2862-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:2862:0:7285:c2ff:fe6c:992d";
logging-data="12503"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Fri, 14 May 2021 09:16 UTC

BGB <cr88192@gmail.com> schrieb:

> Option 1, if input is negative:
> Negate Input;
> Do Shift;
> Negate Output.
>
> Option 2, if input is negative:
> Detect if ((Input&((1<<n)-1))!=0);
> If So, Add 1 to output.
>
>
> If one had an absolute-value instruction which set a status flag based
> on whether or not the input was negative, this could be combined with a
> conditional negate to reduce it to 3 instructions.

Hacker's Delight gives a two-instruction sequence for POWER using
its shrsi instruction. This shifts right and sets the carry bit
if the number being shifted is negative and one or more 1-bits
are being shifted out. Combine this with an instruction to add
the carry bit to a register, and you're at two instructions.

One example of such an instruction is sradi.
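
As a rough illustration, here is a Python model of 64-bit registers comparing the four-step branch-free sequence from the start of the thread (conditionally add 2**n - 1, then shift) with the two-step shift-plus-carry form described above. All names are invented for the sketch; it models the described behaviour for 1 <= n <= 63 and is not POWER assembler.

```python
MASK64 = (1 << 64) - 1


def to_signed(u: int) -> int:
    """Interpret a 64-bit pattern as a two's-complement integer."""
    return u - (1 << 64) if u & (1 << 63) else u


def sra(u: int, n: int) -> int:
    """Arithmetic right shift of a 64-bit pattern."""
    return (to_signed(u) >> n) & MASK64


def div_pow2_bias(u: int, n: int) -> int:
    """Four steps: sign mask, logical shift, add, arithmetic shift."""
    sign = sra(u, 63)            # all ones if negative, else 0
    bias = sign >> (64 - n)      # logical shift: 2**n - 1 or 0
    return sra((u + bias) & MASK64, n)


def div_pow2_carry(u: int, n: int) -> int:
    """Two steps: a shift that also yields a carry, then add the carry."""
    # Carry is set when the input is negative and a 1-bit is shifted out.
    carry = 1 if (u >> 63) and (u & ((1 << n) - 1)) else 0
    return (sra(u, n) + carry) & MASK64


x = -9 & MASK64
print(to_signed(div_pow2_bias(x, 2)), to_signed(div_pow2_carry(x, 2)))  # -2 -2
```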

Re: Signed division by 2^n

<s7m1uq$ep$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16750&group=comp.arch#16750

Path: i2pn2.org!i2pn.org!aioe.org!+9JlleTFc3MOERf2LU/SVA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Fri, 14 May 2021 16:37:15 +0200
Organization: Aioe.org NNTP Server
Lines: 64
Message-ID: <s7m1uq$ep$1@gioia.aioe.org>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me>
<s7l7os$75r$1@dont-email.me>
NNTP-Posting-Host: +9JlleTFc3MOERf2LU/SVA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Fri, 14 May 2021 14:37 UTC

Ivan Godard wrote:
> On 5/13/2021 11:59 PM, BGB wrote:
>> On 5/11/2021 12:32 PM, Anton Ertl wrote:
>>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>>> Everybody should know that a signed division by 2^n cannot be
>>>> done with a single right shift :-)
>>>
>>> Depends on the language and sometimes on its implementation.  E.g., on
>>> Gforth:
>>>
>>> -9 4 / . -3  ok
>>> -9 2 arshift . -3  ok
>>>
>>>> but having this as a single
>>>> instruction instead of four without branches, three with a branch
>>>> or two if you happen to own a POWER would make sense, especially
>>>> a conditional add of 2**n - 1 should be easier to do in hardware
>>>> than in software.
>>>>
>>>> Does ISA actually implement this?
>>>
>>> Even Aarch64, which supports pretty exotic stuff in some cases, needs
>>> 4 instructions for a signed symmetric division (or at least that's
>>> what I get with gcc).
>>>
>>> Is this frequent enough to merit a special instruction?  Or do people
>>> use unsigned numbers or explicit shift if they are
>>> performance-conscious and want to divide by 2^n?
>>>
>>
>> It seems like it could be supported...
>>
>> However the ways I can think of it would likely add enough cost and
>> latency to the shift unit to make it "not likely worthwhile".
>>
>> Option 1, if input is negative:
>>   Negate Input;
>>   Do Shift;
>>   Negate Output.
>>
>> Option 2, if input is negative:
>>   Detect if ((Input&((1<<n)-1))!=0);
>>   If So, Add 1 to output.
>>
>>
>> If one had an absolute-value instruction which set a status flag based
>> on whether or not the input was negative, this could be combined with
>> a conditional negate to reduce it to 3 instructions.
>>
>> ...
>
> Or just integrate the integer shifter with the FP normalization shifter
> and apply the right rounding mode.

Exactly!

If you can do trunc/floor/ceil/nearest_or_even then it is easy to handle
negative values in whatever way the language requires.
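
A sketch of what such a shift with a selectable rounding mode might compute, written in Python. The function name and the mode strings are assumptions made for illustration, not any existing instruction set.

```python
MODES = ("floor", "ceil", "trunc", "nearest_even")


def shift_right_rounded(x: int, n: int, mode: str) -> int:
    """Compute x / 2**n rounded according to mode."""
    if mode not in MODES:
        raise ValueError(f"unknown rounding mode: {mode}")
    q = x >> n                    # the plain arithmetic shift floors
    rem = x & ((1 << n) - 1)      # the shifted-out bits, 0 <= rem < 2**n
    if rem == 0 or mode == "floor":
        return q                  # exact, or flooring was requested
    if mode == "ceil":
        return q + 1
    if mode == "trunc":
        return q + 1 if x < 0 else q
    # nearest_even: round up above the halfway point, or on a tie
    # when the floored quotient is odd.
    half = 1 << (n - 1)
    if rem > half or (rem == half and (q & 1)):
        return q + 1
    return q


for mode in MODES:
    print(mode, shift_right_rounded(-9, 2, mode))
```

In every mode the result is the floored quotient plus either 0 or 1, so the mode only has to pick a single carry-in bit.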

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Signed division by 2^n

<1de3be21-7e21-4ae1-8a0e-65c52e266c6cn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16753&group=comp.arch#16753

X-Received: by 2002:aed:2010:: with SMTP id 16mr43968873qta.256.1621005495269;
Fri, 14 May 2021 08:18:15 -0700 (PDT)
X-Received: by 2002:a4a:8311:: with SMTP id f17mr33847250oog.83.1621005495017;
Fri, 14 May 2021 08:18:15 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 14 May 2021 08:18:14 -0700 (PDT)
In-Reply-To: <s7l7os$75r$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at>
<s7l775$sq5$1@dont-email.me> <s7l7os$75r$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1de3be21-7e21-4ae1-8a0e-65c52e266c6cn@googlegroups.com>
Subject: Re: Signed division by 2^n
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 14 May 2021 15:18:15 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Fri, 14 May 2021 15:18 UTC

On Friday, May 14, 2021 at 2:10:23 AM UTC-5, Ivan Godard wrote:
> On 5/13/2021 11:59 PM, BGB wrote:
> > On 5/11/2021 12:32 PM, Anton Ertl wrote:
> >> Thomas Koenig <tko...@netcologne.de> writes:
> >>> Everybody should know that a signed division by 2^n cannot be
> >>> done with a single right shift :-)
> >>
> >> Depends on the language and sometimes on its implementation. E.g., on
> >> Gforth:
> >>
> >> -9 4 / . -3 ok
> >> -9 2 arshift . -3 ok
> >>
> >>> but having this as a single
> >>> instruction instead of four without branches, three with a branch
> >>> or two if you happen to own a POWER would make sense, especially
> >>> a conditional add of 2**n - 1 should be easier to do in hardware
> >>> than in software.
> >>>
> >>> Does ISA actually implement this?
> >>
> >> Even Aarch64, which supports pretty exotic stuff in some cases, needs
> >> 4 instructions for a signed symmetric division (or at least that's
> >> what I get with gcc).
> >>
> >> Is this frequent enough to merit a special instruction? Or do people
> >> use unsigned numbers or explicit shift if they are
> >> performance-conscious and want to divide by 2^n?
> >>
> >
> > It seems like it could be supported...
> >
> > However the ways I can think of it would likely add enough cost and
> > latency to the shift unit to make it "not likely worthwhile".
> >
> > Option 1, if input is negative:
> > Negate Input;
> > Do Shift;
> > Negate Output.
> >
> > Option 2, if input is negative:
> > Detect if ((Input&((1<<n)-1))!=0);
> > If So, Add 1 to output.
> >
> >
> > If one had an absolute-value instruction which set a status flag based
> > on whether or not the input was negative, this could be combined with a
> > conditional negate to reduce it to 3 instructions.
> >
> > ...
> Or just integrate the integer shifter with the FP normalization shifter
> and apply the right rounding mode.
<
Here you go again, putting Integer Instructions through the FP units......
<
In before the separate Register file crowd.

Re: Signed division by 2^n

<s7m6ri$vta$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16756&group=comp.arch#16756

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Fri, 14 May 2021 10:59:24 -0500
Organization: A noiseless patient Spider
Lines: 76
Message-ID: <s7m6ri$vta$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me>
<s7l7os$75r$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 14 May 2021 16:00:50 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8402408f5e5cb57f2dd58c4bb7f0a0e2";
logging-data="32682"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+9puqz3zu0uq90Gq+HXJET"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.1
Cancel-Lock: sha1:KF293lOqA7fhSsqDuQeVaavRsPI=
In-Reply-To: <s7l7os$75r$1@dont-email.me>
Content-Language: en-US
 by: BGB - Fri, 14 May 2021 15:59 UTC

On 5/14/2021 2:10 AM, Ivan Godard wrote:
> On 5/13/2021 11:59 PM, BGB wrote:
>> On 5/11/2021 12:32 PM, Anton Ertl wrote:
>>> Thomas Koenig <tkoenig@netcologne.de> writes:
>>>> Everybody should know that a signed division by 2^n cannot be
>>>> done with a single right shift :-)
>>>
>>> Depends on the language and sometimes on its implementation.  E.g., on
>>> Gforth:
>>>
>>> -9 4 / . -3  ok
>>> -9 2 arshift . -3  ok
>>>
>>>> but having this as a single
>>>> instruction instead of four without branches, three with a branch
>>>> or two if you happen to own a POWER would make sense, especially
>>>> a conditional add of 2**n - 1 should be easier to do in hardware
>>>> than in software.
>>>>
>>>> Does ISA actually implement this?
>>>
>>> Even Aarch64, which supports pretty exotic stuff in some cases, needs
>>> 4 instructions for a signed symmetric division (or at least that's
>>> what I get with gcc).
>>>
>>> Is this frequent enough to merit a special instruction?  Or do people
>>> use unsigned numbers or explicit shift if they are
>>> performance-conscious and want to divide by 2^n?
>>>
>>
>> It seems like it could be supported...
>>
>> However the ways I can think of it would likely add enough cost and
>> latency to the shift unit to make it "not likely worthwhile".
>>
>> Option 1, if input is negative:
>>   Negate Input;
>>   Do Shift;
>>   Negate Output.
>>
>> Option 2, if input is negative:
>>   Detect if ((Input&((1<<n)-1))!=0);
>>   If So, Add 1 to output.
>>
>>
>> If one had an absolute-value instruction which set a status flag based
>> on whether or not the input was negative, this could be combined with
>> a conditional negate to reduce it to 3 instructions.
>>
>> ...
>
> Or just integrate the integer shifter with the FP normalization shifter
> and apply the right rounding mode.

It is possible.

I was going to assert originally that there is a problem if the
renormalization shifter:
Only does a left-shift;
Isn't quite wide enough to be used for integer shifts;
...

But, then realized, this probably meant to use the input-side right
shifter, followed by the main adder (say, internally it uses a 64-bit
adder with a carry-in flag), and already implements some similar logic
for other reasons, ...

It actually seems possible, as the logic is basically analogous to doing
the Int->Float and Float->Int conversion logic at the same time, and
fudging one of the exponents to cause it to implement a right shift.

So, it should be technically possible to pull off something like this
via slight tweaks to an FADD unit...
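
A software analogy of that idea, as a Python sketch: convert the integer to a double, subtract n from the exponent, and convert back with the desired rounding. This only mirrors the arithmetic, not the FADD datapath, and it assumes the integer fits in the 53-bit significand and that n is small (say 0 to 64). The function name is hypothetical.

```python
import math


def shift_via_float(x: int, n: int, rounding=math.trunc) -> int:
    """Int->float, subtract n from the exponent, float->int."""
    if abs(x) >= 1 << 53:
        raise ValueError("x does not fit exactly in a double's significand")
    scaled = math.ldexp(float(x), -n)   # only the exponent changes
    return rounding(scaled)


print(shift_via_float(-9, 2))              # -2 (truncate, as C division does)
print(shift_via_float(-9, 2, math.floor))  # -3 (what a plain arithmetic shift gives)
```

Passing `math.ceil` or the built-in `round` as `rounding` selects the other modes.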

...

Re: Signed division by 2^n

<c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16760&group=comp.arch#16760

X-Received: by 2002:ac8:6914:: with SMTP id e20mr44394832qtr.268.1621015447582;
Fri, 14 May 2021 11:04:07 -0700 (PDT)
X-Received: by 2002:a05:6808:3a3:: with SMTP id n3mr31534554oie.157.1621015447324;
Fri, 14 May 2021 11:04:07 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 14 May 2021 11:04:07 -0700 (PDT)
In-Reply-To: <s7m6ri$vta$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at>
<s7l775$sq5$1@dont-email.me> <s7l7os$75r$1@dont-email.me> <s7m6ri$vta$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com>
Subject: Re: Signed division by 2^n
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 14 May 2021 18:04:07 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Fri, 14 May 2021 18:04 UTC

On Friday, May 14, 2021 at 11:00:53 AM UTC-5, BGB wrote:
> On 5/14/2021 2:10 AM, Ivan Godard wrote:
> > On 5/13/2021 11:59 PM, BGB wrote:
> >> On 5/11/2021 12:32 PM, Anton Ertl wrote:
> >>> Thomas Koenig <tko...@netcologne.de> writes:
> >>>> Everybody should know that a signed division by 2^n cannot be
> >>>> done with a single right shift :-)
> >>>
> >>> Depends on the language and sometimes on its implementation. E.g., on
> >>> Gforth:
> >>>
> >>> -9 4 / . -3 ok
> >>> -9 2 arshift . -3 ok
> >>>
> >>>> but having this as a single
> >>>> instruction instead of four without branches, three with a branch
> >>>> or two if you happen to own a POWER would make sense, especially
> >>>> a conditional add of 2**n - 1 should be easier to do in hardware
> >>>> than in software.
> >>>>
> >>>> Does ISA actually implement this?
> >>>
> >>> Even Aarch64, which supports pretty exotic stuff in some cases, needs
> >>> 4 instructions for a signed symmetric division (or at least that's
> >>> what I get with gcc).
> >>>
> >>> Is this frequent enough to merit a special instruction? Or do people
> >>> use unsigned numbers or explicit shift if they are
> >>> performance-conscious and want to divide by 2^n?
> >>>
> >>
> >> It seems like it could be supported...
> >>
> >> However the ways I can think of it would likely add enough cost and
> >> latency to the shift unit to make it "not likely worthwhile".
> >>
> >> Option 1, if input is negative:
> >> Negate Input;
> >> Do Shift;
> >> Negate Output.
> >>
> >> Option 2, if input is negative:
> >> Detect if ((Input&((1<<n)-1))!=0);
> >> If So, Add 1 to output.
> >>
> >>
> >> If one had an absolute-value instruction which set a status flag based
> >> on whether or not the input was negative, this could be combined with
> >> a conditional negate to reduce it to 3 instructions.
> >>
> >> ...
> >
> > Or just integrate the integer shifter with the FP normalization shifter
> > and apply the right rounding mode.
> It is possible.
>
> I was going to assert originally that there is a problem if the
> renormalization shifter:
> Only does a left-shift;
> Isn't quite wide enough to be used for integer shifts;
<
The normalizer in a DP FMAC unit is at least 213 bits wide.
> ...
>
> But, then realized, this probably meant to use the input-side right
> shifter, followed by the main adder (say, internally it uses a 64-bit
> adder with a carry-in flag), and already implements some similar logic
> for other reasons, ...
>
> It actually seems possible, as the logic is basically analogous to doing
> the Int->Float and Float->Int conversion logic at the same time, and
> fudging one of the exponents to cause it to implement a right shift.
>
> So, it should be technically possible to pull off something like this
> via slight tweaks to an FADD unit...
<
Possibly, but the multiplier is dealing with 53+53 bit things minimum,
if said multiplier also does FDIV and SQRT then it is 57+57
if said multiplier also does Transcendentals then it is 58+58.....
>
> ...

Re: Signed division by 2^n

<s7mio3$qfs$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16765&group=comp.arch#16765

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Signed division by 2^n
Date: Fri, 14 May 2021 14:23:45 -0500
Organization: A noiseless patient Spider
Lines: 143
Message-ID: <s7mio3$qfs$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me>
<s7l7os$75r$1@dont-email.me> <s7m6ri$vta$1@dont-email.me>
<c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 14 May 2021 19:23:47 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8402408f5e5cb57f2dd58c4bb7f0a0e2";
logging-data="27132"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18vpyXrerxC3j/zjqqDYd1+"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.1
Cancel-Lock: sha1:z68fnDXk1QIOt+UZOl78gYy/DbM=
In-Reply-To: <c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com>
Content-Language: en-US
 by: BGB - Fri, 14 May 2021 19:23 UTC

On 5/14/2021 1:04 PM, MitchAlsup wrote:
> On Friday, May 14, 2021 at 11:00:53 AM UTC-5, BGB wrote:
>> On 5/14/2021 2:10 AM, Ivan Godard wrote:
>>> On 5/13/2021 11:59 PM, BGB wrote:
>>>> On 5/11/2021 12:32 PM, Anton Ertl wrote:
>>>>> Thomas Koenig <tko...@netcologne.de> writes:
>>>>>> Everybody should know that a signed division by 2^n cannot be
>>>>>> done with a single right shift :-)
>>>>>
>>>>> Depends on the language and sometimes on its implementation. E.g., on
>>>>> Gforth:
>>>>>
>>>>> -9 4 / . -3 ok
>>>>> -9 2 arshift . -3 ok
>>>>>
>>>>>> but having this as a single
>>>>>> instruction instead of four without branches, three with a branch
>>>>>> or two if you happen to own a POWER would make sense, especially
>>>>>> a conditional add of 2**n - 1 should be easier to do in hardware
>>>>>> than in software.
>>>>>>
>>>>>> Does ISA actually implement this?
>>>>>
>>>>> Even Aarch64, which supports pretty exotic stuff in some cases, needs
>>>>> 4 instructions for a signed symmetric division (or at least that's
>>>>> what I get with gcc).
>>>>>
>>>>> Is this frequent enough to merit a special instruction? Or do people
>>>>> use unsigned numbers or explicit shift if they are
>>>>> performance-conscious and want to divide by 2^n?
>>>>>
>>>>
>>>> It seems like it could be supported...
>>>>
>>>> However the ways I can think of it would likely add enough cost and
>>>> latency to the shift unit to make it "not likely worthwhile".
>>>>
>>>> Option 1, if input is negative:
>>>> Negate Input;
>>>> Do Shift;
>>>> Negate Output.
>>>>
>>>> Option 2, if input is negative:
>>>> Detect if ((Input&((1<<n)-1))!=0);
>>>> If So, Add 1 to output.
>>>>
>>>>
>>>> If one had an absolute-value instruction which set a status flag based
>>>> on whether or not the input was negative, this could be combined with
>>>> a conditional negate to reduce it to 3 instructions.
>>>>
>>>> ...
>>>
>>> Or just integrate the integer shifter with the FP normalization shifter
>>> and apply the right rounding mode.
>> It is possible.
>>
>> I was going to assert originally that there is a problem if the
>> renormalization shifter:
>> Only does a left-shift;
>> Isn't quite wide enough to be used for integer shifts;
> <
> The normalizer in a DP FMAC unit is at least 213 bits wide.

Yeah ... 64 -> 54 bits (narrowing shift) ...

With rounding carry propagation limited to 12-bits in this case.

There is a possibility for long strings of 1s, but the cases that lead
to them mostly disappear if one does a carry-in during the main adder.

One of the things that helped "kill" my Long-Double FPU effort was
trying to widen these parts to 90 bits.

The Long-Double FPU would have still been too narrow to ensure
correctly-rounded Double values though, but would have reduced their
probability somewhat.

LUT cost was an issue, and timing was pretty tight at 50MHz.
I had to invoke a lot of fiddling to "make it work" here.

More recently I have applied some of the tweaks to the DP FPU to get it
to pass timing at 75MHz (and make a change that ended up leading to FADD
and FMUL being 7-cycle operations).

I can only guess what sorts of damage a 213 bit mantissa would do...

Occasional rounding errors seem like a better trade-off IMO.

>> ...
>>
>> But, then realized, this probably meant to use the input-side right
>> shifter, followed by the main adder (say, internally it uses a 64-bit
>> adder with a carry-in flag), and already implements some similar logic
>> for other reasons, ...
>>
>> It actually seems possible, as the logic is basically analogous to doing
>> the Int->Float and Float->Int conversion logic at the same time, and
>> fudging one of the exponents to cause it to implement a right shift.
>>
>> So, it should be technically possible to pull off something like this
>> via slight tweaks to an FADD unit...
> <
> Possibly, but the multiplier is dealing with 53+53 bit things minimum,
> if said multiplier also does FDIV and SQRT then it is 57+57
> if said multiplier also does Transcendentals then it is 58+58.....

This is assuming one uses a "square" multiplier, rather than a
"triangular" multiplier.

As-is:
Square Multiplier: 54*54 -> 108
Triangular Multiplier: 54*54 -> 72
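
To make the distinction concrete, here is a Python sketch under two loudly stated assumptions: that "triangular" means the low-order partial products are simply left out, and that the operands are split into 18-bit limbs. Neither detail is confirmed above, and the names are invented.

```python
LIMB = 18                      # assumed limb width; 3 limbs cover 54 bits
LIMB_MASK = (1 << LIMB) - 1


def limbs(v: int):
    """Split a 54-bit value into three 18-bit limbs, least significant first."""
    return [(v >> (LIMB * i)) & LIMB_MASK for i in range(3)]


def mul_square(a: int, b: int) -> int:
    """All 9 limb products: the exact 108-bit result."""
    return sum((ai * bj) << (LIMB * (i + j))
               for i, ai in enumerate(limbs(a))
               for j, bj in enumerate(limbs(b)))


def mul_triangular(a: int, b: int) -> int:
    """Only the 6 limb products with i + j >= 2: roughly the top 72 bits."""
    total = sum((ai * bj) << (LIMB * (i + j))
                for i, ai in enumerate(limbs(a))
                for j, bj in enumerate(limbs(b))
                if i + j >= 2)
    return total >> (2 * LIMB)


a, b = (1 << 54) - 1, (1 << 54) - 3
print((mul_square(a, b) >> 36) - mul_triangular(a, b))  # small non-negative error
```

With these assumptions the dropped products sum to less than 2**56, so the 72-bit result is low by less than 2**20, well below the bits that survive rounding to a 53-bit mantissa.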

The LongDouble used a wider multiplier:
72*72->90 (initial)
85*85->90 (likely needed to avoid some issues, *)

*: Algorithms based on iterative convergence get stuck in an infinite
loop if FADDX and FMULX use different mantissa lengths.

This means that I would need to make them agree on a fixed 80-bit mantissa.

Or: S.E15.F80.P32 (where P=Zero Padding)

I did at one point start trying to implement a combined FMAC unit, but
then realized it was likely to have a fairly high latency.

An FPU with a separate FADD and FMUL unit could give lower latency for
FADD and FMUL, and could fake FMAC with only slightly higher latency
than the combined unit.

There are some operations though which could exist with an FMAC unit
which would not work correctly with an FMUL+FADD glued together, but I
am already pushing the limits of what seems viable on the XC7A100T.
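
One well-known case where the two differ, sketched in Python. It assumes the glued FMUL+FADD rounds the product to double before the add, which may not match the unit described above; exact rational arithmetic stands in for a fused unit, and the function names are made up.

```python
from fractions import Fraction


def fmac_fused(a: float, b: float, c: float) -> float:
    """a*b + c computed exactly, then rounded once to double."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def fmac_glued(a: float, b: float, c: float) -> float:
    """FMUL then FADD: the product is rounded before the add."""
    return a * b + c


a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
print(fmac_glued(a, b, -1.0))   # 0.0: the product rounds to exactly 1.0
print(fmac_fused(a, b, -1.0))   # -2**-54 survives the single rounding
```

The exact product is 1 - 2**-54, which lies halfway between two doubles and rounds to 1.0, so the glued version loses the whole result.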

Re: Signed division by 2^n

<00a4b04a-ef97-44fd-a3a9-aa777fcc71bbn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16768&group=comp.arch#16768

X-Received: by 2002:a37:9a16:: with SMTP id c22mr45595123qke.0.1621027751766; Fri, 14 May 2021 14:29:11 -0700 (PDT)
X-Received: by 2002:a05:6830:4093:: with SMTP id x19mr22092912ott.81.1621027751494; Fri, 14 May 2021 14:29:11 -0700 (PDT)
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsfeed.xs4all.nl!newsfeed8.news.xs4all.nl!tr1.eu1.usenetexpress.com!feeder.usenetexpress.com!tr2.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 14 May 2021 14:29:11 -0700 (PDT)
In-Reply-To: <s7mio3$qfs$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me> <s7l7os$75r$1@dont-email.me> <s7m6ri$vta$1@dont-email.me> <c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com> <s7mio3$qfs$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <00a4b04a-ef97-44fd-a3a9-aa777fcc71bbn@googlegroups.com>
Subject: Re: Signed division by 2^n
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 14 May 2021 21:29:11 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 193
 by: MitchAlsup - Fri, 14 May 2021 21:29 UTC

On Friday, May 14, 2021 at 2:23:49 PM UTC-5, BGB wrote:
> On 5/14/2021 1:04 PM, MitchAlsup wrote:
> > On Friday, May 14, 2021 at 11:00:53 AM UTC-5, BGB wrote:
> >> On 5/14/2021 2:10 AM, Ivan Godard wrote:
> >>> On 5/13/2021 11:59 PM, BGB wrote:
> >>>> On 5/11/2021 12:32 PM, Anton Ertl wrote:
> >>>>> Thomas Koenig <tko...@netcologne.de> writes:
> >>>>>> Everybody should know that a signed division by 2^n cannot be
> >>>>>> done with a single right shift :-)
> >>>>>
> >>>>> Depends on the language and sometimes on its implementation. E.g., on
> >>>>> Gforth:
> >>>>>
> >>>>> -9 4 / . -3 ok
> >>>>> -9 2 arshift . -3 ok
> >>>>>
> >>>>>> but having this as a single
> >>>>>> instruction instead of four without branches, three with a branch
> >>>>>> or two if you happen to own a POWER would make sense, especially
> >>>>>> a conditional add of 2**n - 1 should be easier to do in hardware
> >>>>>> than in software.
> >>>>>>
> >>>>>> Does ISA actually implement this?
> >>>>>
> >>>>> Even Aarch64, which supports pretty exotic stuff in some cases, needs
> >>>>> 4 instructions for a signed symmetric division (or at least that's
> >>>>> what I get with gcc).
> >>>>>
> >>>>> Is this frequent enough to merit a special instruction? Or do people
> >>>>> use unsigned numbers or explicit shift if they are
> >>>>> performance-conscious and want to divide by 2^n?
> >>>>>
> >>>>
> >>>> It seems like it could be supported...
> >>>>
> >>>> However the ways I can think of it would likely add enough cost and
> >>>> latency to the shift unit to make it "not likely worthwhile".
> >>>>
> >>>> Option 1, if input is negative:
> >>>> Negate Input;
> >>>> Do Shift;
> >>>> Negate Output.
> >>>>
> >>>> Option 2, if input is negative:
> >>>> Detect if ((Input&((1<<n)-1))!=0);
> >>>> If So, Add 1 to output.
> >>>>
> >>>>
> >>>> If one had an absolute-value instruction which set a status flag based
> >>>> on whether or not the input was negative, this could be combined with
> >>>> a conditional negate to reduce it to 3 instructions.
> >>>>
> >>>> ...
> >>>
> >>> Or just integrate the integer shifter with the FP normalization shifter
> >>> and apply the right rounding mode.
> >> It is possible.
> >>
> >> I was going to assert originally that there is a problem if the
> >> renormalization shifter:
> >> Only does a left-shift;
> >> Isn't quite wide enough to be used for integer shifts;
> > <
> > The normalizer in a DP FMAC unit is at least 213 bits wide.
<
> Yeah ... 64 -> 54 bits (narrowing shift) ...
>
> With rounding carry propagation limited to 12-bits in this case.
>
> There is a possibility for long strings of 1s, but the cases that lead
> to them mostly disappear if one does a carry-in during the main adder.
>
> One of the things that helped "kill" my Long-Double FPU effort was
> trying to widen these parts to 90 bits.
>
>
> The Long-Double FPU would have still been too narrow to ensure
> correctly-rounded Double values though, but would have reduced their
> probability somewhat.
>
> LUT cost was an issue, and timing was pretty tight at 50MHz.
> I had to invoke a lot of fiddly to "make it work" here.
>
> More recently I have applied some of the tweaks to the DP FPU to get it
> to pass timing at 75MHz (and make a change that ended up leading to FADD
> and FMUL being 7-cycle operations).
>
>
> I can only guess what sorts of damage a 213 bit mantissa would do...
>
> Occasional rounding errors seem like a better trade-off IMO.
<
Try having that argument with Kahan !!
> >> ...
> >>
> >> But, then realized, this probably meant to use the input-side right
> >> shifter, followed by the main adder (say, internally it uses a 64-bit
> >> adder with a carry-in flag), and already implements some similar logic
> >> for other reasons, ...
> >>
> >> It actually seems possible, as the logic is basically analogous to doing
> >> the Int->Float and Float->Int conversion logic at the same time, and
> >> fudging one of the exponents to cause it to implement a right shift.
> >>
> >> So, it should be technically possible to pull off something like this
> >> via slight tweaks to an FADD unit...
> > <
> > Possibly, but the multiplier is dealing with 53+53 bit things minimum,
> > if said multiplier also does FDIV and SQRT then it is 57+57,
> > if said multiplier also does Transcendentals then it is 58+58.....
<
> This is assuming one uses a "square" multiplier, rather than a
> "triangular" multiplier.
<
The proper word is parallelogram not square
>
> As-is:
> Square Multiplier: 54*54 -> 108
> Triangular Multiplier: 54*54 -> 72
<
I question your definition of triangular::
> Triangular Multiplier: 54×54 -> 54 !?!
>
> The LongDouble used a wider multiplier:
> 72*72->90 (initial)
> 85*85->90 (likely needed to avoid some issues, *)
>
> *: Algorithms based on iterative convergence get stuck in an infinite
> loop if FADDX and FMULX use different mantissa lengths.
>
> This means that I would need to make them agree on a fixed 80-bit mantissa.
>
> Or: S.E15.F80.P32 (where P=Zero Padding)
>
>
> I did at one point start trying to implement a combined FMAC unit, but
> then realized it was likely to have a fairly high latency.
<
Your implementation medium is harming your ability to pull off your design.
>
> An FPU with a separate FADD and FMUL unit could give lower latency for
> FADD and FMUL, and could fake FMAC with only slightly higher latency
> than the combined unit.
<
Maybe,
FADD: 2-cycles is darned hard, 3-cycles is pretty easy.
FMUL: 4-cycles is rather standard for 16-gates/cycle machines.
FMAC: 4-cycles is pretty hard, 5-cycles is a bit better
<
AMD Athlon and Opteron used FADD=4 and FMUL=4 to simplify the
pipelining and to prevent having both units deliver results in the same
cycle.
<
On the other hand, a single FMAC unit can do it all::
FADD: FMAC 1*Rs1+Rs2
FMUL: FMAC Rs1*Rs2+0
<
So if you find yourself in a position where you need FMAC (say to meet
IEEE 754-2008+) you can have the design team build the FMAC unit.
Later on, when building the next and wider machine, you can add an
FADD or FMUL or both based on statistics you have gathered from
generation 1. Given an FMAC, FADD is a degenerate subset which
a GOOD Verilog compiler can autogenerate if you feed it the above
fixed values {FMAC 1*Rs1+Rs2 and FMAC Rs1*Rs2+0}. This REALLY
reduces the designers' workload.
>
>
> There are some operations though which could exist with an FMAC unit
> which would not work correctly with an FMUL+FADD glued together, but I
> am already pushing the limits of what seems viable on the XC7A100T.
<
Yep, your implementation medium is getting in your way. So are some of
your tools.

More complex instructions to reduce cycle overhead

<jwv1ra92e0t.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=16771&group=comp.arch#16771

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: More complex instructions to reduce cycle overhead
Date: Fri, 14 May 2021 17:55:14 -0400
Organization: A noiseless patient Spider
Lines: 24
Message-ID: <jwv1ra92e0t.fsf-monnier+comp.arch@gnu.org>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at>
<s7l775$sq5$1@dont-email.me> <s7l7os$75r$1@dont-email.me>
<s7m6ri$vta$1@dont-email.me>
<c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com>
<s7mio3$qfs$1@dont-email.me>
<00a4b04a-ef97-44fd-a3a9-aa777fcc71bbn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="ea2d874832a9c5172debb835c6f6b45f";
logging-data="4458"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/jufQFeN5y6r+Xf2RknpP1"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:E0Vjn9wL+UkJBqN4TF8vDGCKx6s=
sha1:YakBkzzbzZiVRu1cyPY+yHW0VDo=
 by: Stefan Monnier - Fri, 14 May 2021 21:55 UTC

IIUC cycle time (for the EX stage) can be split into:
A- time to perform single-cycle operation
B- time to propagate the result through the forwarding network
C- time for the actual latch/flipflop

Arguably, B and C are overheads.
Have there been ISAs that aim to maximize the proportion of time spent in
A rather than B and C by having instructions that perform several
sequential operations?

I guess the "negate inputs" options in MY66000 (and the shifts in ARM3)
could be counted as such an example, tho a limited one.

I'm thinking more of an ISA where an instruction is expected to do
something like `(A op1 B) op2 C` in a single cycle (for various
combinations of `op1` and `op2` like additions, shifts, and whatnot).

I'm far from convinced it would work out well (there's a risk you'd end
up having to use a NOP for `op1` or `op2` in too many cases), but I'm
curious if someone has tried out something like that.
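
To make the shape of such an instruction concrete, here is a minimal C sketch of the semantics: three source operands, two chained operations, one result. The operation set and all names are hypothetical and do not come from any real ISA; `OP_NOP` stands for the pass-through slot that would be wasted when only one operation is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical operation set for the two chained ALU slots. */
typedef enum { OP_NOP, OP_ADD, OP_SUB, OP_SHL, OP_AND } alu_op;

/* One ALU step. OP_NOP passes the first operand through unchanged. */
static inline uint64_t alu(alu_op op, uint64_t x, uint64_t y)
{
    switch (op) {
    case OP_NOP: return x;
    case OP_ADD: return x + y;
    case OP_SUB: return x - y;
    case OP_SHL: return x << (y & 63);
    case OP_AND: return x & y;
    default:     assert(0 && "unknown op"); return 0;
    }
}

/* result = (a op1 b) op2 c */
static inline uint64_t chained(alu_op op1, alu_op op2,
                               uint64_t a, uint64_t b, uint64_t c)
{
    return alu(op2, alu(op1, a, b), c);
}
```

For example, `chained(OP_SHL, OP_ADD, index, 3, base)` computes a scaled-index address in one such instruction, while `chained(OP_ADD, OP_NOP, a, b, 0)` is the case where the second slot goes unused.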

Stefan

Re: More complex instructions to reduce cycle overhead

<s7n03h$jvm$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16772&group=comp.arch#16772

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: More complex instructions to reduce cycle overhead
Date: Fri, 14 May 2021 16:11:46 -0700
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <s7n03h$jvm$1@dont-email.me>
References: <s7dn5p$78r$1@newsreader4.netcologne.de>
<2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me>
<s7l7os$75r$1@dont-email.me> <s7m6ri$vta$1@dont-email.me>
<c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com>
<s7mio3$qfs$1@dont-email.me>
<00a4b04a-ef97-44fd-a3a9-aa777fcc71bbn@googlegroups.com>
<jwv1ra92e0t.fsf-monnier+comp.arch@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 14 May 2021 23:11:45 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="084bb58afc123c2913520798d93e8cc0";
logging-data="20470"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+9YNeg0N/LpTGZwVbpwt66"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.1
Cancel-Lock: sha1:GsRCv1lrnYneKvGmTJi6nOvIohA=
In-Reply-To: <jwv1ra92e0t.fsf-monnier+comp.arch@gnu.org>
Content-Language: en-US
 by: Ivan Godard - Fri, 14 May 2021 23:11 UTC

On 5/14/2021 2:55 PM, Stefan Monnier wrote:
>
> IIUC cycle time (for the EX stage) can be split into:
> A- time to perform single-cycle operation
> B- time to propagate the result through the forwarding network
> C- time for the actual latch/flipflop
>
> Arguably, B and C are overheads.
> Have there been ISAs that aim to maximize the proportion of time spent in
> A rather than B and C by having instructions that perform several
> sequential operations?
>
> I guess the "negate inputs" options in MY66000 (and the shifts in ARM3)
> could be counted as such an example, tho a limited one.
>
> I'm thinking more of an ISA where an instruction is expected to do
> something like `(A op1 B) op2 C` in a single cycle (for various
> combinations of `op1` and `op2` like additions, shifts, and whatnot).
>
> I'm far from convinced it would work out well (there's a risk you'd end
> up having to use a NOP for `op1` or `op2` in too many cases), but I'm
> curious if someone has tried out something like that.
>
>
> Stefan
>

Bill Wulf (CMU) did this, but I forget what they called it. Mitch was at
CMU, maybe he remembers. It did address arithmetic real well, MAC, and A
<comp> B <rel> C, others not so much. A bit tough fitting two opcodes
and four regs into an instruction IIRC. A Thumb-like subset gets rid of
the noops. Mill's ganging is this in effect.

Re: More complex instructions to reduce cycle overhead

<049b46dd-4544-4fe7-861b-85f97b3269c3n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16773&group=comp.arch#16773

X-Received: by 2002:a05:6214:b11:: with SMTP id u17mr48463877qvj.42.1621035238303; Fri, 14 May 2021 16:33:58 -0700 (PDT)
X-Received: by 2002:a9d:2de1:: with SMTP id g88mr43362240otb.5.1621035237863; Fri, 14 May 2021 16:33:57 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!tr1.eu1.usenetexpress.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 14 May 2021 16:33:57 -0700 (PDT)
In-Reply-To: <jwv1ra92e0t.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me> <s7l7os$75r$1@dont-email.me> <s7m6ri$vta$1@dont-email.me> <c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com> <s7mio3$qfs$1@dont-email.me> <00a4b04a-ef97-44fd-a3a9-aa777fcc71bbn@googlegroups.com> <jwv1ra92e0t.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <049b46dd-4544-4fe7-861b-85f97b3269c3n@googlegroups.com>
Subject: Re: More complex instructions to reduce cycle overhead
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 14 May 2021 23:33:58 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 65
 by: MitchAlsup - Fri, 14 May 2021 23:33 UTC

On Friday, May 14, 2021 at 4:55:16 PM UTC-5, Stefan Monnier wrote:
> IIUC cycle time (for the EX stage) can be split into:
> A- time to perform single-cycle operation
> B- time to propagate the result through the forwarding network
> C- time for the actual latch/flipflop
>
> Arguably, B and C are overheads.
> Have there been ISAs that aim to maximize the proportion of time spent in
> A rather than B and C by having instructions that perform several
> sequential operations?
<
For single cycle back-to-back, this is accurate. C, however, is not a delay
one can get rid of, unless one is not building a fully pipelined machine
(new operation starting every cycle in the same FU.)
>
> I guess the "negate inputs" options in MY66000 (and the shifts in ARM3)
> could be counted as such an example, tho a limited one.
<
For the My 66000 case: negate is an XOR gate where the select input to
the data-path of gates is seen 1.5 cycles before data passes through the
gate. Data gets XORed, and for integer operations, a carry bit is passed
through the network (65 bits); the carry bits are inserted into the 64-bit
adder and add no delay. For logical instructions the XOR is all that
happens; for FP only the sign bit gets XORed.
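
The integer case rests on the two's complement identity -x = (~x) + 1: the XOR does the inversion and the +1 arrives as a carry-in. A small C sketch of that arithmetic follows. The function name and the 0/1 flag encoding are assumptions made for illustration, and the code says nothing about how the hardware actually routes the carries, in particular when both inputs are negated and two carry-ins are needed.

```c
#include <assert.h>
#include <stdint.h>

/* Add with optional negation of either input. The XOR with an all-ones
   mask inverts the operand; the missing +1 is supplied as a carry-in
   to the addition instead of a separate increment. */
static inline uint64_t add_neg(uint64_t a, int neg_a, uint64_t b, int neg_b)
{
    assert((neg_a == 0 || neg_a == 1) && (neg_b == 0 || neg_b == 1));
    uint64_t xa = a ^ (0 - (uint64_t)neg_a);   /* mask is all ones if negated */
    uint64_t xb = b ^ (0 - (uint64_t)neg_b);
    uint64_t carry_in = (uint64_t)neg_a + (uint64_t)neg_b;  /* 0, 1 or 2 */
    return xa + xb + carry_in;
}
```

With `neg_b` set this is the familiar a - b = a + ~b + 1, so a subtract costs no more adder delay than an add.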
>
> I'm thinking more of an ISA where an instruction is expected to do
> something like `(A op1 B) op2 C` in a single cycle (for various
> combinations of `op1` and `op2` like additions, shifts, and whatnot).
>
> I'm far from convinced it would work out well (there's a risk you'd end
> up having to use a NOP for `op1` or `op2` in too many cases), but I'm
> curious if someone has tried out something like that.
<
64-bit integer add is 11-gates with a carry select adder. Take a 16-gate
machine, and this gives you 5-gates for result buffering to drive the
heavily loaded result bus, pass through the forwarding logic, wire delay,
and meet setup time for the flip-flop. Yes, this path is tight. The wider
machine the tighter the path.
<
In K9, we did not have the forwarding path in the "cycle time" and could
not do back-to-back instructions. Back-to-back forwarding is one of the
things that gets dropped as the gates per cycle drop, the machine gets
wider, or both. This also changes the time between tag and data in
reservation station design.
<
Also notice that the 12-gate-per-cycle design of the *Dozer family had a lot
of strange forwarding cases: the wide ones were forwarded, while any change
in data size was delayed a cycle because it was asking too much of the
data path (and the integer adder was circuit-designed down to 8 gates of
delay while remaining 11 actual gates in the path).
<
Finally note that the characteristic delay of the flip-flop is 5 gates when
you include clock jitter and clock skew, so a 16-gate machine actually
cycles at 21 gates of delay. These 5 gates of delay really hurt in an 8-gate
delay pipeline.
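
Plugging those numbers into a few lines of C makes the cost visible. This is a back-of-the-envelope sketch only: the 5-gate flip-flop figure is the one given above, the function names are invented here, and real designs will differ. With 16 gates of logic per stage the flip-flop takes 5 of 21 gate delays, about 24% of the cycle; with 8 gates of logic it takes 5 of 13, about 38%.

```c
#include <assert.h>

/* Flip-flop overhead in gate delays, including clock jitter and skew
   (the figure given in the post). */
enum { FF_OVERHEAD_GATES = 5 };

/* Total cycle time in gate delays for a given amount of logic per stage. */
static inline int cycle_gates(int logic_gates)
{
    assert(logic_gates > 0);
    return logic_gates + FF_OVERHEAD_GATES;
}

/* Share of each cycle spent in the flip-flop rather than in logic. */
static inline double overhead_fraction(int logic_gates)
{
    return (double)FF_OVERHEAD_GATES / cycle_gates(logic_gates);
}
```

Halving the logic per stage therefore shortens the cycle from 21 to 13 gate delays, a clock gain of roughly 1.6x rather than 2x.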
<
And there is Mitch's second law:: When you take the logic in a pipelined
machine and divide each stage by 2, you end up with 2.5× as many pipeline
stages !!
>
>
> Stefan

Re: More complex instructions to reduce cycle overhead

<e1e2de3c-657b-4c13-9648-828713ffce70n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16774&group=comp.arch#16774

X-Received: by 2002:ac8:5208:: with SMTP id r8mr30498532qtn.178.1621035583056; Fri, 14 May 2021 16:39:43 -0700 (PDT)
X-Received: by 2002:a4a:a5c2:: with SMTP id k2mr38478636oom.5.1621035582855; Fri, 14 May 2021 16:39:42 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.mixmin.net!news-out.netnews.com!newsin.alt.net!fdcspool2.netnews.com!news-out.netnews.com!news.alt.net!fdc2.netnews.com!feeder1.feed.usenet.farm!feed.usenet.farm!tr1.eu1.usenetexpress.com!feeder.usenetexpress.com!tr1.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 14 May 2021 16:39:42 -0700 (PDT)
In-Reply-To: <s7n03h$jvm$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <s7dn5p$78r$1@newsreader4.netcologne.de> <2021May11.193250@mips.complang.tuwien.ac.at> <s7l775$sq5$1@dont-email.me> <s7l7os$75r$1@dont-email.me> <s7m6ri$vta$1@dont-email.me> <c4fe5be0-030f-4ad1-8ff0-f89f08d1250en@googlegroups.com> <s7mio3$qfs$1@dont-email.me> <00a4b04a-ef97-44fd-a3a9-aa777fcc71bbn@googlegroups.com> <jwv1ra92e0t.fsf-monnier+comp.arch@gnu.org> <s7n03h$jvm$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e1e2de3c-657b-4c13-9648-828713ffce70n@googlegroups.com>
Subject: Re: More complex instructions to reduce cycle overhead
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 14 May 2021 23:39:43 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 44
 by: MitchAlsup - Fri, 14 May 2021 23:39 UTC

On Friday, May 14, 2021 at 6:11:47 PM UTC-5, Ivan Godard wrote:
> On 5/14/2021 2:55 PM, Stefan Monnier wrote:
> >
> > IIUC cycle time (for the EX stage) can be split into:
> > A- time to perform single-cycle operation
> > B- time to propagate the result through the forwarding network
> > C- time for the actual latch/flipflop
> >
> > Arguably, B and C are overheads.
> > Have there been ISAs that aim to maximize the proportion of time spent in
> > A rather than B and C by having instructions that perform several
> > sequential operations?
> >
> > I guess the "negate inputs" options in MY66000 (and the shifts in ARM3)
> > could be counted as such an example, tho a limited one.
> >
> > I'm thinking more of an ISA where an instruction is expected to do
> > something like `(A op1 B) op2 C` in a single cycle (for various
> > combinations of `op1` and `op2` like additions, shifts, and whatnot).
> >
> > I'm far from convinced it would work out well (there's a risk you'd end
> > up having to use a NOP for `op1` or `op2` in too many cases), but I'm
> > curious if someone has tried out something like that.
> >
> >
> > Stefan
> >
> Bill Wulf (CMU) did this, but I forget what they called it.
<
I don't remember--when I knew Bill, he was involved with PDP-11 stuff
and the BLISS compiler stuff.
<
> Mitch was at
> CMU, maybe he remembers. It did address arithmetic real well, MAC, and A
> <comp> B <rel> C, others not so much. A bit tough fitting two opcodes
> and four regs into an instruction IIRC.
<
You don't need 4 registers as the calculations are serially dependent.
More like 3 operands 1 result 2 calculations.
<
> A Thumb-like subset gets rid of
> the noops. Mill's ganging is this in effect.
<
In a packetized instruction fetch model, one can perform instruction fusing
to get the same effect. { CMP-BC, LD-OP, OP-ST, ADD-CMP-BB }
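
As a rough picture of what a fusing check might look at, here is a C sketch that tests whether two adjacent decoded instructions form one of the dependent pairs listed above (CMP-BC, LD-OP, OP-ST). Everything in it is hypothetical: the record layout, the names, and the rule that the second instruction must read the first one's result are assumptions for illustration, and the three-instruction ADD-CMP-BB case is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical decoded-instruction record; field names are illustrative. */
typedef enum { K_ALU, K_CMP, K_LOAD, K_STORE, K_BRANCH } kind;

typedef struct {
    kind k;
    int dst;         /* destination register, -1 if none */
    int src1, src2;  /* source registers, -1 if unused */
} insn;

/* True if `second` reads the register that `first` writes. */
static inline bool depends_on(const insn *first, const insn *second)
{
    return first->dst >= 0 &&
           (second->src1 == first->dst || second->src2 == first->dst);
}

/* True for a dependent CMP-BC, LD-OP or OP-ST pair. */
static inline bool can_fuse(const insn *first, const insn *second)
{
    assert(first && second);
    if (!depends_on(first, second))
        return false;
    return (first->k == K_CMP  && second->k == K_BRANCH) ||
           (first->k == K_LOAD && second->k == K_ALU)    ||
           (first->k == K_ALU  && second->k == K_STORE);
}
```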
