Rocksolid Light

Welcome to novaBBS (click a section below)

mail  files  register  newsreader  groups  login

Message-ID:  

24 Apr, 2024: Testing a new version of the Overboard here. If you have an issue post about it to rocksolid.nodes.help (I know. Everyone on Usenet has issues)


devel / comp.arch / Odd-length Data Once Again

SubjectAuthor
* Odd-length Data Once AgainQuadibloc
+* Re: Odd-length Data Once AgainThomas Koenig
|`* Re: Odd-length Data Once AgainQuadibloc
| `* Re: Odd-length Data Once AgainMitchAlsup
|  `* Re: Odd-length Data Once AgainQuadibloc
|   `- Re: Odd-length Data Once AgainQuadibloc
`* Re: Odd-length Data Once AgainQuadibloc
 `* Re: Odd-length Data Once AgainJimBrakefield
  `* Re: Odd-length Data Once AgainQuadibloc
   `* Re: Odd-length Data Once AgainMitchAlsup
    `* Re: Odd-length Data Once AgainQuadibloc
     `* Re: Odd-length Data Once AgainJimBrakefield
      +* Re: Odd-length Data Once AgainBGB
      |`* Re: Odd-length Data Once AgainQuadibloc
      | `* Re: Odd-length Data Once AgainBGB
      |  `* Re: Odd-length Data Once AgainThomas Koenig
      |   `* Re: Odd-length Data Once AgainBGB
      |    +* Re: Odd-length Data Once AgainThomas Koenig
      |    |`- Re: Odd-length Data Once AgainBGB
      |    `* Re: Odd-length Data Once AgainTerje Mathisen
      |     `- Re: Odd-length Data Once AgainBGB
      +- Division by constants: (was: Odd-length Data Once Again)Anton Ertl
      `* Re: Odd-length Data Once AgainTerje Mathisen
       `* Re: Odd-length Data Once AgainAnton Ertl
        +* Re: Odd-length Data Once AgainTerje Mathisen
        |`* Re: Odd-length Data Once AgainAnton Ertl
        | `- Re: Odd-length Data Once AgainTerje Mathisen
        `* Re: Odd-length Data Once AgainThomas Koenig
         `* Re: Odd-length Data Once AgainTerje Mathisen
          `- Re: Odd-length Data Once AgainThomas Koenig

Pages:12
Odd-length Data Once Again

<9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22130&group=comp.arch#22130

  copy link   Newsgroups: comp.arch
X-Received: by 2002:ad4:5762:: with SMTP id r2mr7929136qvx.31.1637684586374;
Tue, 23 Nov 2021 08:23:06 -0800 (PST)
X-Received: by 2002:a4a:5a43:: with SMTP id v64mr3954335ooa.26.1637684586044;
Tue, 23 Nov 2021 08:23:06 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 23 Nov 2021 08:23:05 -0800 (PST)
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa3a:e00:2dac:90b2:4ca9:4a28;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa3a:e00:2dac:90b2:4ca9:4a28
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
Subject: Odd-length Data Once Again
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Tue, 23 Nov 2021 16:23:06 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 42
 by: Quadibloc - Tue, 23 Nov 2021 16:23 UTC

How to design a computer that handles 36-bit single-precision floating-point
numbers, and 60-bit double-precision floating-point numbers, that is just
as fast and efficient as a conventional computer, where all the data is in
lengths that are a power of two?

I have toyed with a number of different schemes of approaching this goal.

But those schemes have all required compromises of one sort or another,
thus falling somewhat short of that goal. In some cases, not by much, and
it had seemed to me that perhaps the goal was impossible to achieve.

But now I think I have found a way to achieve it. Of course, there are still
costs and compromises, but now they no longer get in the way of
performance.

The scheme I came up with is outlined on:

http://www.quadibloc.com/arch/per12.htm

Basically, what it consists of is this:

Memory is divided into 180-bit memory lines. In this way, each memory
line can contain either five 36-bit single-precision floating-point numbers,
or three 60-bit double-precision floating-point numbers, which are aligned
in both cases, thus never crossing memory line boundaries.

To avoid extra overhead when addressing either type of floating-point
numbers, both the displacements in instructions, and the contents of
index registers, are in *mixed-radix* format. So mixed-radix arithmetic
is used for indexed addressing.

And, therefore, in addition to plain binary index registers, used for
addressing 45-bit memory words, there is an additional set of index
registers for addressing 36-bit floats, and another set of index
registers for addressing 60-bit floats.

So there are three (actually four in the more complete proposal for
this architecture) sets of index registers, and there is some inefficiency
in the format of displacements in instructions. Those compromises,
though, don't get in the way of memory accesses for these unusual-length
data items being fast and straightforward.

John Savard

Re: Odd-length Data Once Again

<snj51u$d74$1@newsreader4.netcologne.de>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22131&group=comp.arch#22131

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!news.niel.me!aioe.org!news.dns-netz.com!news.freedyn.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-19ac-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Tue, 23 Nov 2021 16:31:59 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <snj51u$d74$1@newsreader4.netcologne.de>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
Injection-Date: Tue, 23 Nov 2021 16:31:59 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-19ac-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:19ac:0:7285:c2ff:fe6c:992d";
logging-data="13540"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 23 Nov 2021 16:31 UTC

Quadibloc <jsavard@ecn.ab.ca> schrieb:
> How to design a computer that handles 36-bit single-precision floating-point
> numbers, and 60-bit double-precision floating-point numbers,

Is there a particular reason why you chose 60 and not 72 bits for the
double precision?

At least FORTRAN prescribed that a DOUBLE PRECISION number takes
up twice the amount of memory that a REAL variable had. You
could, of course, pad a common block, but it would be a waste.

Re: Odd-length Data Once Again

<b28ca639-7949-41b5-a232-9d091658d723n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22135&group=comp.arch#22135

  copy link   Newsgroups: comp.arch
X-Received: by 2002:ac8:4155:: with SMTP id e21mr8469452qtm.312.1637691331781;
Tue, 23 Nov 2021 10:15:31 -0800 (PST)
X-Received: by 2002:a9d:110:: with SMTP id 16mr6230144otu.94.1637691331463;
Tue, 23 Nov 2021 10:15:31 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 23 Nov 2021 10:15:31 -0800 (PST)
In-Reply-To: <snj51u$d74$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa3a:e00:2dac:90b2:4ca9:4a28;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa3a:e00:2dac:90b2:4ca9:4a28
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com> <snj51u$d74$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b28ca639-7949-41b5-a232-9d091658d723n@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Tue, 23 Nov 2021 18:15:31 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 14
 by: Quadibloc - Tue, 23 Nov 2021 18:15 UTC

On Tuesday, November 23, 2021 at 9:32:01 AM UTC-7, Thomas Koenig wrote:

> Is there a particular reason why you chose 60 and not 72 bits for the
> double precision?

Basically, on the premise that even 64 bits is wastefully long, and the Control
Data 6600 proves that 60 bit floats are long enough for the most demanding
scientific computations.

That may not be reasonable. The same principle as I outline here would also work
for a machine with 144-bit memory lines, and 36-bit single, 48-bit intermediate,
and 72-bit double; I prefer 48 bits to 45 bits as a length for intermediate floating
point. So there certainly are other options.

John Savard

Re: Odd-length Data Once Again

<eddb9640-f865-4bfd-b443-40381783f3e0n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22137&group=comp.arch#22137

  copy link   Newsgroups: comp.arch
X-Received: by 2002:ad4:5762:: with SMTP id r2mr9169260qvx.31.1637693471145;
Tue, 23 Nov 2021 10:51:11 -0800 (PST)
X-Received: by 2002:a9d:110:: with SMTP id 16mr6446618otu.94.1637693471017;
Tue, 23 Nov 2021 10:51:11 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 23 Nov 2021 10:51:10 -0800 (PST)
In-Reply-To: <b28ca639-7949-41b5-a232-9d091658d723n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<snj51u$d74$1@newsreader4.netcologne.de> <b28ca639-7949-41b5-a232-9d091658d723n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <eddb9640-f865-4bfd-b443-40381783f3e0n@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 23 Nov 2021 18:51:11 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 25
 by: MitchAlsup - Tue, 23 Nov 2021 18:51 UTC

On Tuesday, November 23, 2021 at 12:15:32 PM UTC-6, Quadibloc wrote:
> On Tuesday, November 23, 2021 at 9:32:01 AM UTC-7, Thomas Koenig wrote:
>
> > Is there a particular reason why you chose 60 and not 72 bits for the
> > double precision?
> Basically, on the premise that even 64 bits is wastefully long, and the Control
> Data 6600 proves that 60 bit floats are long enough for the most demanding
> scientific computations.
<
Misnomer:: CDC 6600 proved that 60-bits of irregularly rounded FP was
sufficient for FP applications of the 1960s.
<
This in no way proves that 60-bits is (or will be found to be) sufficient for
FP applications of the '2020s. The sizes of the data sets are 10^5 larger.
<
On the other hand, 32-bit posits are being found to be sufficient for a
lot of smaller applications (as we speak).
<
So, I am in basic agreement with Thomas:: why not unify at the larger boundary.
>
> That may not be reasonable. The same principle as I outline here would also work
> for a machine with 144-bit memory lines, and 36-bit single, 48-bit intermediate,
> and 72-bit double; I prefer 48 bits to 45 bits as a length for intermediate floating
> point. So there certainly are other options.
>
> John Savard

Re: Odd-length Data Once Again

<91dcdae4-2158-400a-b329-86437a4354b8n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22138&group=comp.arch#22138

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:6214:8c2:: with SMTP id da2mr9044852qvb.23.1637694541747;
Tue, 23 Nov 2021 11:09:01 -0800 (PST)
X-Received: by 2002:a05:6808:abc:: with SMTP id r28mr4684829oij.157.1637694541469;
Tue, 23 Nov 2021 11:09:01 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 23 Nov 2021 11:09:01 -0800 (PST)
In-Reply-To: <eddb9640-f865-4bfd-b443-40381783f3e0n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa3a:e00:2dac:90b2:4ca9:4a28;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa3a:e00:2dac:90b2:4ca9:4a28
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<snj51u$d74$1@newsreader4.netcologne.de> <b28ca639-7949-41b5-a232-9d091658d723n@googlegroups.com>
<eddb9640-f865-4bfd-b443-40381783f3e0n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <91dcdae4-2158-400a-b329-86437a4354b8n@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Tue, 23 Nov 2021 19:09:01 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 20
 by: Quadibloc - Tue, 23 Nov 2021 19:09 UTC

On Tuesday, November 23, 2021 at 11:51:12 AM UTC-7, MitchAlsup wrote:

> Misnomer:: CDC 6600 proved that 60-bits of irregularly rounded FP was
> sufficient for FP applications of the 1960s.

> This in no way proves that 60-bits is (or will be found to be) sufficient for
> FP applications of the '2020s. The sizes of the data sets are 10^5 larger.

Oh, yes, it might well be that the requirements may be different.

I'd be using a 60-bit format based on IEEE-754, so except possibly for
allowing slightly more than 0.5 ULP for division, I wouldn't be copying
the bad points of the 6600 or the Cray-I.

The idea is, of course, that whatever length of floating-point numbers you
may actually need... unless they just happen to be the power-of-two
lengths, the ability to use other lengths for one's data types instead could
well be useful. Even if those lengths aren't the same as those as I may
use in an example, or which inspired me to look into these possibilities.

John Savard

Re: Odd-length Data Once Again

<86086e3f-e823-410e-8bf2-4cae7a4fb14an@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22140&group=comp.arch#22140

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:622a:1828:: with SMTP id t40mr1005822qtc.0.1637705074683;
Tue, 23 Nov 2021 14:04:34 -0800 (PST)
X-Received: by 2002:a05:6808:211c:: with SMTP id r28mr685271oiw.155.1637705074444;
Tue, 23 Nov 2021 14:04:34 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 23 Nov 2021 14:04:34 -0800 (PST)
In-Reply-To: <91dcdae4-2158-400a-b329-86437a4354b8n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa3a:e00:2dac:90b2:4ca9:4a28;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa3a:e00:2dac:90b2:4ca9:4a28
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<snj51u$d74$1@newsreader4.netcologne.de> <b28ca639-7949-41b5-a232-9d091658d723n@googlegroups.com>
<eddb9640-f865-4bfd-b443-40381783f3e0n@googlegroups.com> <91dcdae4-2158-400a-b329-86437a4354b8n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <86086e3f-e823-410e-8bf2-4cae7a4fb14an@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Tue, 23 Nov 2021 22:04:34 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Quadibloc - Tue, 23 Nov 2021 22:04 UTC

On Tuesday, November 23, 2021 at 12:09:02 PM UTC-7, Quadibloc wrote:
> On Tuesday, November 23, 2021 at 11:51:12 AM UTC-7, MitchAlsup wrote:

> > Misnomer:: CDC 6600 proved that 60-bits of irregularly rounded FP was
> > sufficient for FP applications of the 1960s.

> > This in no way proves that 60-bits is (or will be found to be) sufficient for
> > FP applications of the '2020s. The sizes of the data sets are 10^5 larger.

> Oh, yes, it might well be that the requirements may be different.

I've added a little table to the page at

http://www.quadibloc.com/arch/per12.htm

which shows various other possibilities for use with this scheme.

What I've occasionally encountered in the literature are papers about using
80-bit or 128-bit floating-point numbers for some specialized applications
for which 64-bit double precision is inadequate, but not anything saying that
in general because programs are performing larger simulations these days
that run longer, we should change from using 64-bit floats to something
modestly larger, like 72 bits, to reflect that.

While I described the ordinary floating-point formats with a 180-bit memory
line, there would also be longer formats than 60-bit double-precision available.
The temporary real format, where the leading bit of the significand is no
longer hidden, would be 90 bits long, and there could even be a double format
that is 180 bits long.

The table I've added shows how the same scheme would work out, dividing
a memory line of various lengths into aligned floats that don't cross the boundaries
between memory lines (except possibly for the double format, which could
occupy two lines in some cases); the examples I've given are memory lines of
144 bits, 180 bits, 192 bits, and 256 bits.

Among those possibilities, people should be able to find something they would
like.

John Savard

Re: Odd-length Data Once Again

<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22147&group=comp.arch#22147

  copy link   Newsgroups: comp.arch
X-Received: by 2002:ac8:5bc8:: with SMTP id b8mr4948847qtb.247.1637743112764;
Wed, 24 Nov 2021 00:38:32 -0800 (PST)
X-Received: by 2002:a05:6808:16ac:: with SMTP id bb44mr4172612oib.122.1637743112466;
Wed, 24 Nov 2021 00:38:32 -0800 (PST)
Path: i2pn2.org!i2pn.org!news.swapon.de!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 24 Nov 2021 00:38:32 -0800 (PST)
In-Reply-To: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa3a:e00:d97a:f95d:af3d:2fde;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa3a:e00:d97a:f95d:af3d:2fde
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Wed, 24 Nov 2021 08:38:32 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Quadibloc - Wed, 24 Nov 2021 08:38 UTC

On Tuesday, November 23, 2021 at 9:23:07 AM UTC-7, Quadibloc wrote:

> To avoid extra overhead when addressing either type of floating-point
> numbers, both the displacements in instructions, and the contents of
> index registers, are in *mixed-radix* format. So mixed-radix arithmetic
> is used for indexed addressing.

Of course, now I can see why I hadn't thought of this before.

This means that an *array subscript* has to be a special type of number
depending on the type of the elements of the array. So code in ordinary
computer languages would be difficult to compile, given that the
advantages of the architecture would be lost if unnecessary conversions
from binary integers to mixed-radix displacements were performed.

This isn't an insuperable problem - one could define a computer language
that allowed the programmer to write efficient higher-level code for this
kind of architecture by specifying counter/index/pointer types explicitly, or a
conventional language could be used, since keeping track of this sort of
thing is not more complicated than things compilers already do.

John Savard

Re: Odd-length Data Once Again

<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22154&group=comp.arch#22154

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:620a:4446:: with SMTP id w6mr8097828qkp.631.1637853460650;
Thu, 25 Nov 2021 07:17:40 -0800 (PST)
X-Received: by 2002:a05:6808:16ac:: with SMTP id bb44mr16344361oib.122.1637853460342;
Thu, 25 Nov 2021 07:17:40 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 25 Nov 2021 07:17:40 -0800 (PST)
In-Reply-To: <ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.253.102; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.253.102
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com> <ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jim.brak...@ieee.org (JimBrakefield)
Injection-Date: Thu, 25 Nov 2021 15:17:40 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 30
 by: JimBrakefield - Thu, 25 Nov 2021 15:17 UTC

On Wednesday, November 24, 2021 at 2:38:33 AM UTC-6, Quadibloc wrote:
> On Tuesday, November 23, 2021 at 9:23:07 AM UTC-7, Quadibloc wrote:
>
> > To avoid extra overhead when addressing either type of floating-point
> > numbers, both the displacements in instructions, and the contents of
> > index registers, are in *mixed-radix* format. So mixed-radix arithmetic
> > is used for indexed addressing.
> Of course, now I can see why I hadn't thought of this before.
>
> This means that an *array subscript* has to be a special type of number
> depending on the type of the elements of the array. So code in ordinary
> computer languages would be difficult to compile, given that the
> advantages of the architecture would be lost if unnecessary conversions
> from binary integers to mixed-radix displacements were performed.
>
> This isn't an insuperable problem - one could define a computer language
> that allowed the programmer to write efficient higher-level code for this
> kind of architecture by specifying counter/index/pointer types explicitly, or a
> conventional language could be used, since keeping track of this sort of
> thing is not more complicated than things compilers already do.
>
> John Savard

Given a cache line of 256 bits, the number 252 is almost magical.
Divides evenly by:
6 & 12
7, 14 and 28
9, 18 & 36
21, 42 & 84
and of course 256 divides evenly by 4, 8, 16 ...
For split radix addresses the data size and instruction size choices are magnificent

Re: Odd-length Data Once Again

<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22156&group=comp.arch#22156

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:6214:20ee:: with SMTP id 14mr10296394qvk.94.1637886191304;
Thu, 25 Nov 2021 16:23:11 -0800 (PST)
X-Received: by 2002:a05:6830:2b0f:: with SMTP id l15mr25247016otv.333.1637886191046;
Thu, 25 Nov 2021 16:23:11 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 25 Nov 2021 16:23:10 -0800 (PST)
In-Reply-To: <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa3c:ec00:fcde:7675:5a62:241f;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa3c:ec00:fcde:7675:5a62:241f
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com> <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 26 Nov 2021 00:23:11 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 14
 by: Quadibloc - Fri, 26 Nov 2021 00:23 UTC

On Thursday, November 25, 2021 at 8:17:41 AM UTC-7, JimBrakefield wrote:

> Given a cache line of 256 bits, the number 252 is almost magical.

I know that in my original Concertina article, I took 256 bits, and for 36 bits
I used 252, and for 51 bits I used 255, to have single precision and intermediate
precision values that fitted into a unit of alignment.

But I am disappointed, though, that while I found a compromise that met my
performance goals, it's so different from the standpoint of programming for it
that it remains impractical. I think I finally have reached the limits of exploring
this subject, and there isn't an option that would provide a better solution than
those I've already examined.

John Savard

Re: Odd-length Data Once Again

<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22157&group=comp.arch#22157

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:622a:11c4:: with SMTP id n4mr28505004qtk.56.1637970181272;
Fri, 26 Nov 2021 15:43:01 -0800 (PST)
X-Received: by 2002:a9d:749a:: with SMTP id t26mr32284505otk.96.1637970181036;
Fri, 26 Nov 2021 15:43:01 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 26 Nov 2021 15:43:00 -0800 (PST)
In-Reply-To: <fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com> <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 26 Nov 2021 23:43:01 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 37
 by: MitchAlsup - Fri, 26 Nov 2021 23:43 UTC

On Thursday, November 25, 2021 at 6:23:12 PM UTC-6, Quadibloc wrote:
> On Thursday, November 25, 2021 at 8:17:41 AM UTC-7, JimBrakefield wrote:
>
> > Given a cache line of 256 bits, the number 252 is almost magical.
<
Yes, but::
The length of a cache line is a balancing act between memory performance
and the size of the tag-overhead and latency of the data beets moving the
data back and forth.
x86 went 512-bits in about 2003
256-bits is probably too small today
GPUs have been using 1024-bits since the-mid-teens.
<
Important to note: it remains a balancing act !
<
> I know that in my original Concertina article, I took 256 bits, and for 36 bits
> I used 252, and for 51 bits I used 255, to have single precision and intermediate
> precision values that fitted into a unit of alignment.
<
Pairs of these "things" in a 512-bit line or quads in a 1024-bit line are reasonable.
But the magicity of 256 (or 252) should not lead decisions of cache-line-size--
but be lead by decisions of cache-line-size.
>
> But I am disappointed, though, that while I found a compromise that met my
> performance goals, it's so different from the standpoint of programming for it
> that it remains impractical. I think I finally have reached the limits of exploring
> this subject, and there isn't an option that would provide a better solution than
> those I've already examined.
<
It will remain impractical (in my opinion) until you realize that performing AGEN
arithmetic for these odd-sized thingies is inherently a fixed-point calculation
(not integer). Ultimately, if you "buy" a small multiplier* in the address path (ala
Boroughs BSP) to properly round these fixed-point things--you will find that
there is a solution awaiting.
<
(*) or suitable adder configured to round 252->256 to make the AGENs work.
>
> John Savard

Re: Odd-length Data Once Again

<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22158&group=comp.arch#22158

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:622a:180c:: with SMTP id t12mr21314958qtc.507.1637982484342;
Fri, 26 Nov 2021 19:08:04 -0800 (PST)
X-Received: by 2002:a05:6808:144f:: with SMTP id x15mr16004930oiv.157.1637982484096;
Fri, 26 Nov 2021 19:08:04 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 26 Nov 2021 19:08:03 -0800 (PST)
In-Reply-To: <20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa3c:ec00:995c:4035:a8a6:7355;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa3c:ec00:995c:4035:a8a6:7355
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com> <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com> <20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 27 Nov 2021 03:08:04 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 14
 by: Quadibloc - Sat, 27 Nov 2021 03:08 UTC

On Friday, November 26, 2021 at 4:43:02 PM UTC-7, MitchAlsup wrote:
> Ultimately, if you "buy" a small multiplier* in the address path (ala
> Boroughs BSP) to properly round these fixed-point things--you will find that
> there is a solution awaiting.

Oh, yes, I noted that the BSP presumably multiplied by 15, and then "divided"
by 255 by means of dividing by its reciprocal - and the trick is to have an
end-around-carry on the eight bits after the binary point so that the result
is an integer plus a fraction of the form n/255 so that the result will always
be correct.

So I am very much aware of that solution, I just tried to do even better, and
that attempt, on further reflection, was a failure,.

John Savard

Re: Odd-length Data Once Again

<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22159&group=comp.arch#22159

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:622a:1aa8:: with SMTP id s40mr29430885qtc.381.1637985783646;
Fri, 26 Nov 2021 20:03:03 -0800 (PST)
X-Received: by 2002:a05:6808:1141:: with SMTP id u1mr27411072oiu.30.1637985783336;
Fri, 26 Nov 2021 20:03:03 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 26 Nov 2021 20:03:03 -0800 (PST)
In-Reply-To: <bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.253.102; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.253.102
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com> <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com> <20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jim.brak...@ieee.org (JimBrakefield)
Injection-Date: Sat, 27 Nov 2021 04:03:03 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 19
 by: JimBrakefield - Sat, 27 Nov 2021 04:03 UTC

On Friday, November 26, 2021 at 9:08:05 PM UTC-6, Quadibloc wrote:
> On Friday, November 26, 2021 at 4:43:02 PM UTC-7, MitchAlsup wrote:
> > Ultimately, if you "buy" a small multiplier* in the address path (ala
> > Boroughs BSP) to properly round these fixed-point things--you will find that
> > there is a solution awaiting.
> Oh, yes, I noted that the BSP presumably multiplied by 15, and then "divided"
> by 255 by means of dividing by its reciprocal - and the trick is to have an
> end-around-carry on the eight bits after the binary point so that the result
> is an integer plus a fraction of the form n/255 so that the result will always
> be correct.
>
> So I am very much aware of that solution, I just tried to do even better, and
> that attempt, on further reflection, was a failure,.
>
> John Savard

The Wikipedia page "division algorithm" covers "division by a constant" with authority?
The scaled reciprocal multiplies shown can be shortened by appropriate shift and add trees?

Re: Odd-length Data Once Again

<snsl3s$q31$1@dont-email.me>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22160&group=comp.arch#22160

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sat, 27 Nov 2021 01:01:07 -0600
Organization: A noiseless patient Spider
Lines: 62
Message-ID: <snsl3s$q31$1@dont-email.me>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 27 Nov 2021 07:01:16 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a3517d7e2a372ab6240ee272c4b5d154";
logging-data="26721"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19yGDarlheQmoRy8bfDu3FS"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.2
Cancel-Lock: sha1:Yjw5qQHZxbRIc1Z5qA865IFiPcE=
In-Reply-To: <095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
Content-Language: en-US
 by: BGB - Sat, 27 Nov 2021 07:01 UTC

On 11/26/2021 10:03 PM, JimBrakefield wrote:
> On Friday, November 26, 2021 at 9:08:05 PM UTC-6, Quadibloc wrote:
>> On Friday, November 26, 2021 at 4:43:02 PM UTC-7, MitchAlsup wrote:
>>> Ultimately, if you "buy" a small multiplier* in the address path (ala
>>> Boroughs BSP) to properly round these fixed-point things--you will find that
>>> there is a solution awaiting.
>> Oh, yes, I noted that the BSP presumably multiplied by 15, and then "divided"
>> by 255 by means of dividing by its reciprocal - and the trick is to have an
>> end-around-carry on the eight bits after the binary point so that the result
>> is an integer plus a fraction of the form n/255 so that the result will always
>> be correct.
>>
>> So I am very much aware of that solution, I just tried to do even better, and
>> that attempt, on further reflection, was a failure,.
>>
>> John Savard
>
> The Wikipedia page "division algorithm" covers "division by a constant" with authority?
> The scaled reciprocal multiplies shown can be shortened by appropriate shift and add trees?
>

Yeah.

Addressing by non-power-of-2 scales could work, but the harder thing
would be getting good throughput.

If one had specific divisors, like 15 or 255, iterative shift-and-add
could work.

But, FWIW:
y=x/n;
Can be transformed into, say:
y=(x*c)>>d;
Where, in my case, typically:
d=32;
c=(0x100000000ULL+(n-1))/n;

It is possible in many cases to use a smaller reciprocal, depends on the
ranges of the values.

Curiously, Wikipedia shows a different / more complicated algorithm,
implying that possibly my algorithm is not exact?...

I guess I may need to do some more extensive testing and to determine
whether or not my approach is accurate (of if I may need to tweak it).

It is possible to use a table of reciprocals, which works well for small N.

There is a trick where it is possible to extend the range of the table
(up to larger divisors) by using shifts (works well with a CLZ
instruction), but with the primary drawback that this approach is not
integer exact.

There are ways to fix up the exactness issue (and get exact results),
however, IME, these tend to add complexity and on average work out
slower than falling back to something like binary long division (if the
divisor is out of the range of the lookup table).

....

Re: Odd-length Data Once Again

<ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22161&group=comp.arch#22161

  copy link   Newsgroups: comp.arch
X-Received: by 2002:a05:6214:400e:: with SMTP id kd14mr17131535qvb.70.1638000262575;
Sat, 27 Nov 2021 00:04:22 -0800 (PST)
X-Received: by 2002:aca:2412:: with SMTP id n18mr28508378oic.119.1638000262379;
Sat, 27 Nov 2021 00:04:22 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 27 Nov 2021 00:04:22 -0800 (PST)
In-Reply-To: <snsl3s$q31$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fa3c:ec00:995c:4035:a8a6:7355;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fa3c:ec00:995c:4035:a8a6:7355
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com> <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com> <20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com> <095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<snsl3s$q31$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
Subject: Re: Odd-length Data Once Again
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 27 Nov 2021 08:04:22 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 21
 by: Quadibloc - Sat, 27 Nov 2021 08:04 UTC

On Saturday, November 27, 2021 at 12:01:19 AM UTC-7, BGB wrote:
> There are ways to fix up the exactness issue (and get exact results),
> however, IME, these tend to add complexity and on average work out
> slower than falling back to something like binary long division (if the
> divisor is out of the range of the lookup table).

Interesting. In my reply to Mitch, I noted that fixing up exactness was
simple, once I understood it.

The reciprocal of 255 is .000000010000000100000001... in binary.

So to multiply by that exactly, what you do is take the last eight
bits of the result, the eight bits after the binary point, and slap an
end-around-carry on them. Once you do that, instead of having a
binary expansion that goes on forever, you have an integer, plus
a fraction in the form n/255. An end-around-carry isn't a big deal,
after all, it was standard equipment in ALUs for computers that
used one's complement arithmetic, quite common in the old
days.

John Savard

Re: Odd-length Data Once Again

<snsvo7$nnd$1@dont-email.me>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22162&group=comp.arch#22162

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sat, 27 Nov 2021 04:02:39 -0600
Organization: A noiseless patient Spider
Lines: 57
Message-ID: <snsvo7$nnd$1@dont-email.me>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<snsl3s$q31$1@dont-email.me>
<ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 27 Nov 2021 10:02:47 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a3517d7e2a372ab6240ee272c4b5d154";
logging-data="24301"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/9XE3RnDpzVU0HyL5N8CKP"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.2
Cancel-Lock: sha1:9c61aQyqR+LemHiFxknXWw9+X80=
In-Reply-To: <ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
Content-Language: en-US
 by: BGB - Sat, 27 Nov 2021 10:02 UTC

On 11/27/2021 2:04 AM, Quadibloc wrote:
> On Saturday, November 27, 2021 at 12:01:19 AM UTC-7, BGB wrote:
>

I did more testing, and noted that for dividing sufficiently large
numbers, my approach:
r = ((1<<32)+(d-1))/d;
q = (n*r)>>32;

Was prone to occasional off-by-one errors.
These generally happened when the numerator was over ~ 500M or so.

Did fiddle with it, and came up with an intermediate option:
k = 32+floor(log2(d))
if(!(d&(d-1)))k--;
r = ((1LL<<k)+(d-1))/d;
q = (n*r)>>k;

This did seem to pass with a brute-force check at least over the parts
of the space I was able to verify within a reasonable timeframe.

>> There are ways to fix up the exactness issue (and get exact results),
>> however, IME, these tend to add complexity and on average work out
>> slower than falling back to something like binary long division (if the
>> divisor is out of the range of the lookup table).
>
> Interesting. In my reply to Mitch, I noted that fixing up exactness was
> simple, once I understood it.
>
> The reciprocal of 255 is .000000010000000100000001... in binary.
>
> So to multiply by that exactly, what you do is take the last eight
> bits of the result, the eight bits after the binary point, and slap an
> end-around-carry on them. Once you do that, instead of having a
> binary expansion that goes on forever, you have an integer, plus
> a fraction in the form n/255. An end-around-carry isn't a big deal,
> after all, it was standard equipment in ALUs for computers that
> used one's complement arithmetic, quite common in the old
> days.
>

Yeah. What I meant was exactness of trying to use shift-trickery to
extend the use of a small division lookup table up to arbitrarily large
divisors.

It is a nifty trick for 3D renderers, which don't generally care about
bit-exact division. Not so great for the C library or compiler, where:
c=a/b;
Is generally expected to be exact.

It is possible to fetch different pieces for different bit-ranges and
add them together to get a reciprocal, but this is more complicated and
still not exact.

One can use an iterative fixup approach, but this is typically slower
than using binary long-division in these cases.

Re: Odd-length Data Once Again

<snt18o$uvn$1@newsreader4.netcologne.de>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22163&group=comp.arch#22163

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-19ac-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sat, 27 Nov 2021 10:28:40 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <snt18o$uvn$1@newsreader4.netcologne.de>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<snsl3s$q31$1@dont-email.me>
<ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
<snsvo7$nnd$1@dont-email.me>
Injection-Date: Sat, 27 Nov 2021 10:28:40 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-19ac-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:19ac:0:7285:c2ff:fe6c:992d";
logging-data="31735"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sat, 27 Nov 2021 10:28 UTC

BGB <cr88192@gmail.com> schrieb:
> On 11/27/2021 2:04 AM, Quadibloc wrote:
>> On Saturday, November 27, 2021 at 12:01:19 AM UTC-7, BGB wrote:
>>
>
> I did more testing, and noted that for dividing sufficiently large
> numbers, my approach:
> r = ((1<<32)+(d-1))/d;
> q = (n*r)>>32;

Get "Hacker's Delight" and read the stuff about division by constants.
It's quite well written.

Division by constants: (was: Odd-length Data Once Again)

<2021Nov27.124554@mips.complang.tuwien.ac.at>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22164&group=comp.arch#22164

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Division by constants: (was: Odd-length Data Once Again)
Date: Sat, 27 Nov 2021 11:45:54 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 45
Message-ID: <2021Nov27.124554@mips.complang.tuwien.ac.at>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com> <ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com> <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com> <fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com> <20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com> <bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com> <095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="675467b237cc642ea87d8c5765d6c1ce";
logging-data="13867"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18q4AsZiutudNW5lcKijoEj"
Cancel-Lock: sha1:pShdv+Rmw2bhoc6dNPISi4N322M=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Sat, 27 Nov 2021 11:45 UTC

JimBrakefield <jim.brakefield@ieee.org> writes:
>The Wikipedia page "division algorithm" covers "division by a constant" with authority?

It's not bad, but it does not reference the (IMO) best paper on the
topic:

Arch D. Robison. N -bit unsigned division via N -bit multiply-add. In
Proceedings of the 17th IEEE Symposium on Computer Arithmetic
(ARITH-17). IEEE Computer Society Press, 2005.

You can also read my paper on the topic, which presents a solution
that has a better latency in some cases; it also has a relatively
up-to-date Section on "Related Work".

@InProceedings{ertl19kps,
author = {M. Anton Ertl},
title = {Integer Division by Multiplying with the
Double-Width Reciprocal},
crossref = {kps19},
pages = {75--84},
url = {http://www.complang.tuwien.ac.at/papers/ertl19kps.pdf},
url-slides = {http://www.complang.tuwien.ac.at/papers/ertl19kps-slides.pdf},
abstract = {Earlier work on integer division by multiplying with
the reciprocal has focused on multiplying with a
single-width reciprocal, combined with a correction
and followed by a shift. The present work explores
using a double-width reciprocal to allow getting rid
of the correction and shift.}
}

@Proceedings{kps19,
title = {20. Kolloquium Programmiersprachen und Grundlagen
der Programmierung (KPS)},
booktitle = {20. Kolloquium Programmiersprachen und Grundlagen
der Programmierung (KPS)},
year = {2019},
key = {kps19},
editor = {Martin Pl\"umicke and Fayez Abu Alia},
url = {https://www.hb.dhbw-stuttgart.de/kps2019/kps2019_Tagungsband.pdf}
}

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Odd-length Data Once Again

<sntqnn$e9f$1@dont-email.me>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22165&group=comp.arch#22165

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sat, 27 Nov 2021 11:43:17 -0600
Organization: A noiseless patient Spider
Lines: 45
Message-ID: <sntqnn$e9f$1@dont-email.me>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<snsl3s$q31$1@dont-email.me>
<ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
<snsvo7$nnd$1@dont-email.me> <snt18o$uvn$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 27 Nov 2021 17:43:19 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a3517d7e2a372ab6240ee272c4b5d154";
logging-data="14639"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18drw7uNkePMCwRjMwXKhof"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.2
Cancel-Lock: sha1:o5/joqp3vRdz8Yb7wanh1CvKpT4=
In-Reply-To: <snt18o$uvn$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: BGB - Sat, 27 Nov 2021 17:43 UTC

On 11/27/2021 4:28 AM, Thomas Koenig wrote:
> BGB <cr88192@gmail.com> schrieb:
>> On 11/27/2021 2:04 AM, Quadibloc wrote:
>>> On Saturday, November 27, 2021 at 12:01:19 AM UTC-7, BGB wrote:
>>>
>>
>> I did more testing, and noted that for dividing sufficiently large
>> numbers, my approach:
>> r = ((1<<32)+(d-1))/d;
>> q = (n*r)>>32;
>
> Get "Hacker's Delight" and read the stuff about division by constants.
> It's quite well written.
>

Yeah, I don't have this.

My initial algo was basically something I threw together experimentally,
and did basic tests, and it seemed to work so I used it.

However, I didn't really test it "exhaustively", as brute-forcing the
entire 32/32 space is a bit of a problem. Trying to brute-force up to
the 32-bit limit for the numerator did reveal some problems though
(occasional off-by-one errors for large values).

The modified algo seemed to work and avoided this issue. But, as noted,
can't test over the entire space as this seems like it would take an
implausibly long time.

Not so sure about whatever exactly is going on with the approach
described on Wikipedia, my own attempts to mimic it led to results which
"didn't work".

Did add the modified algo to my compiler (for the division by constant
case), and to the lookup-table divider, no visible change in the
behavior of Doom or similar.

This mostly applies to 'int' cases, as the 64-bit and 128-bit dividers
do not use lookup tables (these cases only use the binary long division
loop).

....

Re: Odd-length Data Once Again

<so0b8i$7d5$1@newsreader4.netcologne.de>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22166&group=comp.arch#22166

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-19ac-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sun, 28 Nov 2021 16:37:38 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <so0b8i$7d5$1@newsreader4.netcologne.de>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<snsl3s$q31$1@dont-email.me>
<ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
<snsvo7$nnd$1@dont-email.me> <snt18o$uvn$1@newsreader4.netcologne.de>
<sntqnn$e9f$1@dont-email.me>
Injection-Date: Sun, 28 Nov 2021 16:37:38 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-19ac-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:19ac:0:7285:c2ff:fe6c:992d";
logging-data="7589"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sun, 28 Nov 2021 16:37 UTC

BGB <cr88192@gmail.com> schrieb:
> On 11/27/2021 4:28 AM, Thomas Koenig wrote:
>> BGB <cr88192@gmail.com> schrieb:
>>> On 11/27/2021 2:04 AM, Quadibloc wrote:
>>>> On Saturday, November 27, 2021 at 12:01:19 AM UTC-7, BGB wrote:
>>>>
>>>
>>> I did more testing, and noted that for dividing sufficiently large
>>> numbers, my approach:
>>> r = ((1<<32)+(d-1))/d;
>>> q = (n*r)>>32;
>>
>> Get "Hacker's Delight" and read the stuff about division by constants.
>> It's quite well written.
>>
>
> Yeah, I don't have this.

It is addressed to compiler writers, so it is definitely something
that could benefit you. (As you roll your own architecture, you
could even implement a "signed division by 2" instruction, or at
least POWER's ADDZE so you only need two instructions for that).

As far as division by constands is concerned,
https://oeis.org/A346496 has a sequence for 32-bit unsigned
division.

Re: Odd-length Data Once Again

<so0fi9$1olt$1@gioia.aioe.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22167&group=comp.arch#22167

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!ppYixYMWAWh/woI8emJOIQ.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sun, 28 Nov 2021 18:51:04 +0100
Organization: Aioe.org NNTP Server
Message-ID: <so0fi9$1olt$1@gioia.aioe.org>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="58045"; posting-host="ppYixYMWAWh/woI8emJOIQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Sun, 28 Nov 2021 17:51 UTC

JimBrakefield wrote:
> On Friday, November 26, 2021 at 9:08:05 PM UTC-6, Quadibloc wrote:
>> On Friday, November 26, 2021 at 4:43:02 PM UTC-7, MitchAlsup wrote:
>>> Ultimately, if you "buy" a small multiplier* in the address path (ala
>>> Boroughs BSP) to properly round these fixed-point things--you will find that
>>> there is a solution awaiting.
>> Oh, yes, I noted that the BSP presumably multiplied by 15, and then "divided"
>> by 255 by means of dividing by its reciprocal - and the trick is to have an
>> end-around-carry on the eight bits after the binary point so that the result
>> is an integer plus a fraction of the form n/255 so that the result will always
>> be correct.
>>
>> So I am very much aware of that solution, I just tried to do even better, and
>> that attempt, on further reflection, was a failure,.
>>
>> John Savard
>
> The Wikipedia page "division algorithm" covers "division by a constant" with authority?
> The scaled reciprocal multiplies shown can be shortened by appropriate shift and add trees?

When I re-invented reciprocal muls for constant DIV, I was told by Agner
Fog that DEC had published something about the trick several years
earlier, but that's fine, I had already written code that did exhaustive
testing for all 32/32->32 divisions, just to verify my intuition that
having a correctly rounded up 33+ bit reciprocal would always give the
exact result.

Agnar & I came up with optimal x86 sequences for both divisors that are
OK with a 32-bit reciprocal and those that need the extra bit at the end
(i.e. when the 33rd bit after rounding up 34+ is a 1 bit).

Since then every compiler where I have checked the asm/machine code
output will use effectively the same approach, sometimes simplifying it
slightly for smaller divisors.

When you want to use this trick for BSP style prime modulus division,
then you have to look for and select a divisor where the reciprocal have
a small set of set bits.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Odd-length Data Once Again

<so0g0l$1vpr$1@gioia.aioe.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22168&group=comp.arch#22168

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!ppYixYMWAWh/woI8emJOIQ.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sun, 28 Nov 2021 18:58:44 +0100
Organization: Aioe.org NNTP Server
Message-ID: <so0g0l$1vpr$1@gioia.aioe.org>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<snsl3s$q31$1@dont-email.me>
<ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
<snsvo7$nnd$1@dont-email.me> <snt18o$uvn$1@newsreader4.netcologne.de>
<sntqnn$e9f$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="65339"; posting-host="ppYixYMWAWh/woI8emJOIQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Sun, 28 Nov 2021 17:58 UTC

BGB wrote:
> On 11/27/2021 4:28 AM, Thomas Koenig wrote:
>> BGB <cr88192@gmail.com> schrieb:
>>> On 11/27/2021 2:04 AM, Quadibloc wrote:
>>>> On Saturday, November 27, 2021 at 12:01:19 AM UTC-7, BGB wrote:
>>>
>>> I did more testing, and noted that for dividing sufficiently large
>>> numbers, my approach:
>>>     r = ((1<<32)+(d-1))/d;
>>>     q = (n*r)>>32;
>>
>> Get "Hacker's Delight" and read the stuff about division by constants.
>> It's quite well written.
>>
>
> Yeah, I don't have this.
>
> My initial algo was basically something I threw together experimentally,
> and did basic tests, and it seemed to work so I used it.
>
> However, I didn't really test it "exhaustively", as brute-forcing the
> entire 32/32 space is a bit of a problem. Trying to brute-force up to
> the 32-bit limit for the numerator did reveal some problems though
> (occasional off-by-one errors for large values).
>
>
> The modified algo seemed to work and avoided this issue. But, as noted,
> can't test over the entire space as this seems like it would take an
> implausibly long time.

Not really: You only need to test the boundary values, i.e. those N/M
where N is very close to a multiple of M, so you check (N-1)/M and N/M,
and you should start with the largest possible N that fits in 32 bits.

Doing that for all divisors is quite feasible for 32-bit values.
somewhat less so for 64-bit. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Odd-length Data Once Again

<so0i6b$fuj$1@dont-email.me>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22169&group=comp.arch#22169

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sun, 28 Nov 2021 12:35:54 -0600
Organization: A noiseless patient Spider
Lines: 90
Message-ID: <so0i6b$fuj$1@dont-email.me>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<snsl3s$q31$1@dont-email.me>
<ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
<snsvo7$nnd$1@dont-email.me> <snt18o$uvn$1@newsreader4.netcologne.de>
<sntqnn$e9f$1@dont-email.me> <so0b8i$7d5$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 28 Nov 2021 18:35:56 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="fec1b91bd94e14a8c151a2f0f9556a21";
logging-data="16339"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18ITIOBqI/tnyFY+LDk8ykV"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.2
Cancel-Lock: sha1:N582YW/5ae392y+WONwSlZ4MGZc=
In-Reply-To: <so0b8i$7d5$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: BGB - Sun, 28 Nov 2021 18:35 UTC

On 11/28/2021 10:37 AM, Thomas Koenig wrote:
> BGB <cr88192@gmail.com> schrieb:
>> On 11/27/2021 4:28 AM, Thomas Koenig wrote:
>>> BGB <cr88192@gmail.com> schrieb:
>>>> On 11/27/2021 2:04 AM, Quadibloc wrote:
>>>>> On Saturday, November 27, 2021 at 12:01:19 AM UTC-7, BGB wrote:
>>>>>
>>>>
>>>> I did more testing, and noted that for dividing sufficiently large
>>>> numbers, my approach:
>>>> r = ((1<<32)+(d-1))/d;
>>>> q = (n*r)>>32;
>>>
>>> Get "Hacker's Delight" and read the stuff about division by constants.
>>> It's quite well written.
>>>
>>
>> Yeah, I don't have this.
>
> It is addressed to compiler writers, so it is definitely something
> that could benefit you. (As you roll your own architecture, you
> could even implement a "signed division by 2" instruction, or at
> least POWER's ADDZE so you only need two instructions for that).
>

I did look it up, and noted that it is apparently freely available
online (well, as opposed to needing to order a book).

But, yeah, such an instruction could be possible.

As-is, signed division by power-of-2 is 4 instructions:
MOV Rs, Rt | CMPGE 0, Rs
ADD?F (d-1), Rt
SHAD Rt, -k, Rd

Where: k=log2(d), and Rt is a scratch register.

> As far as division by constands is concerned,
> https://oeis.org/A346496 has a sequence for 32-bit unsigned
> division.
>

OK. I had done more testing and noted that my revised form still failed
for larger values with various divisors (7, 14, 19, 28, 31, ...).

I added logic to detect these cases and disable the optimization if it
would lead to unsafe results (and noted that these cases can be detected
without needing to brute-force the entire range).

Also switched the calculation for the shift to:
k = 31+ceil(log2(d));

The table-driven divider also now has a few extra checks for this.

It is possible I could add runtime calls for a 64*64->128 multiply with
a 128-bit right shift, which could potentially be used as a fallback case.

TBD.

Ironically, I did note recently when attempting to run a profiler on my
emulator (having to resort to WSL + gprof as none of the Windows
profilers seem to want to work right now), that apparently:
MOV (IR / 2R), ADD/SUB, SHAD/SHLD, ..., are actually some of the most
common instructions being executed, but are partly hidden away in my
other stats given that they are also typically executed in parallel with
other instructions, and thus partly avoid being counted in the
per-instruction clock-cycle budget.

So, say:
ADD 4, R5 | MOV.L (SP, 24), R4

Will only count the 'MOV.L' for its clock-cycles, and ignore the 'ADD'
(since its clock-cycles are effectively hidden behind the 'MOV.L').

For compiler stats:
~ 40% of F8 block instructions are in Lane 2/3;
~ 20% of F2 block instructions are in Lane 2/3;
~ 13% of F0 block instructions are in Lane 2/3;

None of the F1 block is in these lanes, partly as there is nothing in
the F1 block that is allowed in these lanes.

Annoyingly, some amount of the recent debugging did adversely effect the
Dhrystone score, but alas...

Re: Odd-length Data Once Again

<so0jbm$6nh$1@dont-email.me>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22170&group=comp.arch#22170

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Sun, 28 Nov 2021 12:55:48 -0600
Organization: A noiseless patient Spider
Lines: 71
Message-ID: <so0jbm$6nh$1@dont-email.me>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<snsl3s$q31$1@dont-email.me>
<ab458dda-e101-4519-a2f8-77c6cbd77692n@googlegroups.com>
<snsvo7$nnd$1@dont-email.me> <snt18o$uvn$1@newsreader4.netcologne.de>
<sntqnn$e9f$1@dont-email.me> <so0g0l$1vpr$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Nov 2021 18:55:50 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="fec1b91bd94e14a8c151a2f0f9556a21";
logging-data="6897"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/DLg5pLX2Nr+wbBgWsll1i"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.3.2
Cancel-Lock: sha1:yoh8SgaUIZtroY6axGxgduL3cmc=
In-Reply-To: <so0g0l$1vpr$1@gioia.aioe.org>
Content-Language: en-US
 by: BGB - Sun, 28 Nov 2021 18:55 UTC

On 11/28/2021 11:58 AM, Terje Mathisen wrote:
> BGB wrote:
>> On 11/27/2021 4:28 AM, Thomas Koenig wrote:
>>> BGB <cr88192@gmail.com> schrieb:
>>>> On 11/27/2021 2:04 AM, Quadibloc wrote:
>>>>> On Saturday, November 27, 2021 at 12:01:19 AM UTC-7, BGB wrote:
>>>>
>>>> I did more testing, and noted that for dividing sufficiently large
>>>> numbers, my approach:
>>>>     r = ((1<<32)+(d-1))/d;
>>>>     q = (n*r)>>32;
>>>
>>> Get "Hacker's Delight" and read the stuff about division by constants.
>>> It's quite well written.
>>>
>>
>> Yeah, I don't have this.
>>
>> My initial algo was basically something I threw together
>> experimentally, and did basic tests, and it seemed to work so I used it.
>>
>> However, I didn't really test it "exhaustively", as brute-forcing the
>> entire 32/32 space is a bit of a problem. Trying to brute-force up to
>> the 32-bit limit for the numerator did reveal some problems though
>> (occasional off-by-one errors for large values).
>>
>>
>> The modified algo seemed to work and avoided this issue. But, as
>> noted, can't test over the entire space as this seems like it would
>> take an implausibly long time.
>
> Not really: You only need to test the boundary values, i.e. those N/M
> where N is very close to a multiple of M, so you check (N-1)/M and N/M,
> and you should start with the largest possible N that fits in 32 bits.
>
> Doing that for all divisors is quite feasible for 32-bit values.
> somewhat less so for 64-bit. :-)
>

My initial tests were brute forcing the range.

My first stage testing of the revised algorithm was only checking up to
2^31, but changing it to 2^32 showed that some divisors were still
failing (7, 14, 19, 28, 31, ...).

Did observe though that in the cases where it started failing, it would
also tend to fail within 2*D of the maximum value (which can generally
be checked a bit more quickly).

But, yeah, checking in the immediate vicinity of an integer multiple
could further narrow the check.

Say:
Find T=((2^32)-1)/D
P=T*D
Search, say, (P-10)..(P+10), and see if a mismatch happens.

My initial (earlier) tests for the original division algo were more like:
Brute force the 64K * 64K space;
Check a bunch of randomly generated numbers;
Call it good if it passed.

But, as noted, it turns out this wasn't quite good enough.
Didn't account for an error rate that got bigger the further one gets
from zero.

> Terje
>

Re: Odd-length Data Once Again

<2021Nov29.134003@mips.complang.tuwien.ac.at>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22172&group=comp.arch#22172

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Mon, 29 Nov 2021 12:40:03 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 25
Message-ID: <2021Nov29.134003@mips.complang.tuwien.ac.at>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com> <ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com> <345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com> <fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com> <20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com> <bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com> <095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com> <so0fi9$1olt$1@gioia.aioe.org>
Injection-Info: reader02.eternal-september.org; posting-host="ef6c9b24978eca151723df781a349157";
logging-data="9281"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19AghCHuaQrFUhNnWrMv8Gz"
Cancel-Lock: sha1:Nps9OGtiXCL2WupM3jw69WJ2Ldo=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Mon, 29 Nov 2021 12:40 UTC

Terje Mathisen <terje.mathisen@tmsw.no> writes:
>Since then every compiler where I have checked the asm/machine code
>output will use effectively the same approach, sometimes simplifying it
>slightly for smaller divisors.

For u/7 on AMD64 (latency measured on Skylake) there are at least the
following sequences:

gcc Robison/fish ertl
latency=8c latency=6.25c latency=6c
%Cl=0x4924924924924925
%C=0x2492492492492493 %C=0x9249249249249249 %Ch=0x2492492492492492
movabs $C,%rdx movabs $C,%rax movabs $Cl,%rax
mov %rdi,%rax mov %rax,%rcx mul %rdi
mul %rdx mul %rdi mov %rdx,%rcx
sub %rdx,%rdi add %rcx,%rax movabs $Ch,%rax
shr %rdi adc $0x0,%rdx mul %rdi
lea (%rdx,%rdi),%rax shr $0x2,%rdx add %rcx,%rax
shr $0x2,%rax mov %rdx,%rax adc $0x0,%rdx
mov %rdx,%rax

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Odd-length Data Once Again

<so2r6a$lhn$1@gioia.aioe.org>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=22173&group=comp.arch#22173

  copy link   Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!ppYixYMWAWh/woI8emJOIQ.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Odd-length Data Once Again
Date: Mon, 29 Nov 2021 16:21:46 +0100
Organization: Aioe.org NNTP Server
Message-ID: <so2r6a$lhn$1@gioia.aioe.org>
References: <9ec1b1f6-d685-4e27-83c1-506f90f6d3f2n@googlegroups.com>
<ee025a51-ec51-4c05-afba-14dac73a7bben@googlegroups.com>
<345a3478-acfc-4a13-a11b-4756b560a66an@googlegroups.com>
<fd376748-c164-4449-8215-29c4c92bf34dn@googlegroups.com>
<20bc194b-6eb2-41ed-b8f5-cf98d000b134n@googlegroups.com>
<bacef91a-d240-4e8b-9e36-53956f58dd40n@googlegroups.com>
<095a1b86-b4ad-438c-9051-747c8dbd8252n@googlegroups.com>
<so0fi9$1olt$1@gioia.aioe.org> <2021Nov29.134003@mips.complang.tuwien.ac.at>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="22071"; posting-host="ppYixYMWAWh/woI8emJOIQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Mon, 29 Nov 2021 15:21 UTC

Anton Ertl wrote:
> Terje Mathisen <terje.mathisen@tmsw.no> writes:
>> Since then every compiler where I have checked the asm/machine code
>> output will use effectively the same approach, sometimes simplifying it
>> slightly for smaller divisors.
>
> For u/7 on AMD64 (latency measured on Skylake) there are at least the
> following sequences:
>
> gcc Robison/fish ertl
> latency=8c latency=6.25c latency=6c
> %Cl=0x4924924924924925
> %C=0x2492492492492493 %C=0x9249249249249249 %Ch=0x2492492492492492
> movabs $C,%rdx movabs $C,%rax movabs $Cl,%rax
> mov %rdi,%rax mov %rax,%rcx mul %rdi
> mul %rdx mul %rdi mov %rdx,%rcx
> sub %rdx,%rdi add %rcx,%rax movabs $Ch,%rax
> shr %rdi adc $0x0,%rdx mul %rdi
> lea (%rdx,%rdi),%rax shr $0x2,%rdx add %rcx,%rax
> shr $0x2,%rax mov %rdx,%rax adc $0x0,%rdx
> mov %rdx,%rax

The main difference between the two last approaches is the extra MUL vs
ADD/ADC to add in the 65th bit of the reciprocal, we used the first of
these but as long as the multiplier is properly pipelined your double
wide mul is very nice.

I suspect that for divisors with less than max significant bits, your
version will also generate a sufficiently accurate factional part which,
after multiplication, can generate the remainder from the division.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Pages:12
server_pubkey.txt

rocksolid light 0.9.8
clearnet tor