Re: Encoding 20 and 40 bit instructions in 128 bits

<bb043f0b-3e4d-421a-b1db-3261089fa20en@googlegroups.com>

 by: JimBrakefield - Wed, 2 Feb 2022 22:41 UTC

On Wednesday, February 2, 2022 at 4:17:14 PM UTC-6, MitchAlsup wrote:
> On Wednesday, February 2, 2022 at 3:39:15 PM UTC-6, JimBrakefield wrote:
>
> > Some comments:
> > A) In the modern uP arch, instruction flow is separated from data flow. Thus there is negligible hardware cost in having instruction granularity different from data granularity.
> <
> Indeed--clever observation
> <
> > B) Why use the remaining bits to re-sync the instruction addresses? E.g., why not have 21-bit instruction granularity instead of 20-bit? Re-sync to always occur at the start of a 64- or 128- or 256-bit instruction block?
> <
> Two (2) bits are not used: 21×6 = 126.
> <
> > C) Opens a new arena for uP architecture, eg instruction formats not tied to data granularity.
> <
> Indeed
> <
> > D) Need a name for the concept. Maybe there is already an ISA of this nature (other than stack machines with very short instructions)?
> <
> Mill instruction formats and layout are similar, but are not based on fixed-size containers and are also model-specific.
> <
> > E) Need yet another name for the concept of using some of the remaining instruction block bits for (usually) decode purposes?
> <
> Instruction Containerization ! (And I give permission for you to steal this name)

Containerized ISAs !
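
As a quick check of the arithmetic in point B, here is a minimal sketch that computes how many slots of a given width fit in a container and how many bits are left over. The function name is made up for illustration; nothing here is a proposal from the thread beyond the 20-bit and 21-bit widths and the 128-bit container.

```python
def slots_and_spare(container_bits: int, slot_bits: int) -> tuple:
    """Return (number of whole slots, leftover bits) for one container."""
    slots = container_bits // slot_bits
    spare = container_bits - slots * slot_bits
    return slots, spare

# 20-bit slots leave 8 bits of a 128-bit container free (room for a header);
# 21-bit slots leave only 2.
for width in (20, 21):
    slots, spare = slots_and_spare(128, width)
    print(f"{width}-bit slots: {slots} per 128-bit container, {spare} bits spare")
```

Either width gives six instructions per container, so the choice is between a wider instruction and a larger header.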

Re: Encoding 20 and 40 bit instructions in 128 bits

<b5560d4d-ef07-4b0e-b8a8-6fd979c954dcn@googlegroups.com>

 by: Quadibloc - Thu, 3 Feb 2022 18:18 UTC

On Wednesday, February 2, 2022 at 10:32:49 AM UTC-7, MitchAlsup wrote:
> On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
> > On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
> > > And one
> > > quibble; you may need the address of the next instruction in the bundle
> > > to handle the case where the current instruction is a CALL-type instruction
> > > that puts the address of the next instruction in a register.
> <
> > That is true; if you can branch into the middle of a bundle, then presumably
> > branches out with returns in the middle will be allowed. Here, though, I
> > think the issue will be encoding rather than calculation. If a 128-bit bundle
> > is divided into 20-bit packets, presumably packet n will be given an
> > address aligned on 16-bit boundaries which will be...
> <
> I am surprised at you--you are making the mistake that control flow addressing
> is identical to memory addressing--it is not (in this case). Instructions are located
> on 20-bit boundaries, and whether or not an instruction starts in this container
> is controlled by the header to the bundle. HW has no problem decoding to 20-bit
> boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
> (instead of [0..7]) !!

This is interesting, because my reply to Stephen Fuld was based on perceiving
him as having made that mistake.

What I was assuming was that when the control flow escapes from the internal
circuitry of the processor, and you actually have a sequence of bits that serves
as an address - _then_ it's going to be a memory address.

If I'm going to return from a subroutine, I *need* the control flow address of
the return point to quickly and easily supply me with the memory address of
the code to which I'm returning.

And so a code address might have the form (128-bit aligned memory address)
followed by (instruction number from 0 to 5). But in order to grab the memory address
really fast, a 'divide by six' numerical operation will be avoided. Plus, explaining it as
a 16-bit aligned memory address, which happens to point to a fictional location
for the instruction... just makes it easier to understand and less confusing for
the programmer (who I am assuming to have a very conventional and old-fashioned
mindset).

John Savard
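
The two address forms discussed here can be compared with a small sketch. It assumes byte-addressed memory, a 16-byte (128-bit) container, and that slot n is reported at the fictional 16-bit-aligned address container_base + 2*n. That mapping is only a guess at where the truncated sentence quoted above was heading, and all names are hypothetical.

```python
CONTAINER_BYTES = 16   # one 128-bit container
SLOTS = 6              # six 20-bit instructions per container

def to_fictional_address(container_base: int, slot: int) -> int:
    """Pack (container, slot) into a 16-bit-aligned byte address."""
    if container_base % CONTAINER_BYTES != 0:
        raise ValueError("container base must be 128-bit aligned")
    if not 0 <= slot < SLOTS:
        raise ValueError("slot must be in 0..5")
    return container_base + 2 * slot

def from_fictional_address(addr: int) -> tuple:
    """Recover (container_base, slot) with masks and shifts, no divide by six."""
    if addr & 1:
        raise ValueError("code addresses are 16-bit aligned")
    slot = (addr >> 1) & 0x7
    if slot >= SLOTS:
        raise ValueError("halfwords 6 and 7 of a container are not instruction starts")
    return addr & ~(CONTAINER_BYTES - 1), slot
```

Under these assumptions the memory address of the container falls out of the code address by clearing the low four bits, which is the "grab the memory address really fast" property asked for above.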

Re: Encoding 20 and 40 bit instructions in 128 bits

<d3c95730-ff9c-4b34-9b87-2d48a7f76478n@googlegroups.com>

 by: MitchAlsup - Thu, 3 Feb 2022 18:48 UTC

On Thursday, February 3, 2022 at 12:18:54 PM UTC-6, Quadibloc wrote:
> On Wednesday, February 2, 2022 at 10:32:49 AM UTC-7, MitchAlsup wrote:
> > On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
> > > On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
> > > > And one
> > > > quibble; you may need the address of the next instruction in the bundle
> > > > to handle the case where the current instruction is a CALL-type instruction
> > > > that puts the address of the next instruction in a register.
> > <
> > > That is true; if you can branch into the middle of a bundle, then presumably
> > > branches out with returns in the middle will be allowed. Here, though, I
> > > think the issue will be encoding rather than calculation. If a 128-bit bundle
> > > is divided into 20-bit packets, presumably packet n will be given an
> > > address aligned on 16-bit boundaries which will be...
> > <
> > I am surprised at you--you are making the mistake that control flow addressing
> > is identical to memory addressing--it is not (in this case). Instructions are located
> > on 20-bit boundaries, and whether or not an instruction starts in this container
> > is controlled by the header to the bundle. HW has no problem decoding to 20-bit
> > boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
> > (instead of [0..7]) !!
> This is interesting, because my reply to Stephen Fuld was based on perceiving
> him as having made that mistake.
>
> What I was assuming was that when the control flow escapes from the internal
> circuitry of the processor, and you actually have a sequence of bits that serves
> as an address - _then_ it's going to be a memory address.
<
But the high-order bits (HIOBs) of the address access the container (128 bits),
while the low-order bits (LOBs) have a range of [0..5] and access the 20-bit instruction start.
>
> If I'm going to return from a subroutine, I *need* the control flow address of
> the return point to quickly and easily supply me with the memory address of
> the code to which I'm returning.
<
Sure, just use the above algorithm, and take some kind of exception if the
LOBs are not in range [0..5].
>
> And so a code address might have the form (128-bit aligned memory address)
> followed by (instruction number from 0 to 5). But in order to grab the memory address
> really fast, a 'divide by six' numerical operation will be avoided.
<
> Plus, explaining it as
> a 16-bit aligned memory address, which happens to point to a fictional location
> for the instruction... just makes it easier to understand and less confusing for
> the programmer (who I am assuming to have a very conventional and old-fashioned
> mindset).
<
Why is the <ahem> programmer concerned with the size of individual instructions?
They are not, on machines as diverse as B5500-to-x86-to-IBM z-series.
<
Certainly the assembler and linker need to be concerned, but the programmer ???
>
> John Savard
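
For the CALL case raised at the top of this exchange, the return address is simply the next slot. A rough sketch, assuming the IP holds a container index in its high-order bits and a 3-bit slot number in its low-order bits (the field width, the names, and the exception type are assumptions for illustration). A 40-bit instruction would presumably advance by two slots; that case is left out.

```python
SLOT_BITS = 3      # low-order bits (LOBs) of the IP select the slot
SLOTS = 6          # valid slot numbers are 0..5

class BadInstructionPointer(Exception):
    """Stands in for whatever exception the hardware would take."""

def split_ip(ip: int) -> tuple:
    """Split an IP into (container index, slot), rejecting slots 6 and 7."""
    container, slot = ip >> SLOT_BITS, ip & ((1 << SLOT_BITS) - 1)
    if slot >= SLOTS:
        raise BadInstructionPointer(f"slot {slot} out of range 0..5")
    return container, slot

def next_ip(ip: int) -> int:
    """IP of the following 20-bit instruction, e.g. the return address of a CALL."""
    container, slot = split_ip(ip)
    if slot + 1 == SLOTS:              # last slot: continue in the next container
        return (container + 1) << SLOT_BITS
    return (container << SLOT_BITS) | (slot + 1)
```

Incrementing is a small modulo-6 counter on the low bits with a carry into the container index, so no general division is involved.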

Re: Encoding 20 and 40 bit instructions in 128 bits

<abf3e0db-5e9c-4942-84b5-c8749f414ac5n@googlegroups.com>

 by: Quadibloc - Fri, 4 Feb 2022 06:07 UTC

On Thursday, February 3, 2022 at 11:48:54 AM UTC-7, MitchAlsup wrote:

> Why is the <ahem> programmer concerned with the size of individual instructions?
> They are not, on machines as diverse as B5500-to-x86-to-IBM z-series.
> <
> Certainly the assembler and linker need to be concerned, but the programmer ???

Um, it depends on what you mean by "concerned". The programmer doesn't have to
calculate stuff by hand that the assembler does, but should still have a basic
general idea of how the machine being programmed works.

John Savard

Re: Encoding 20 and 40 bit instructions in 128 bits

<stir8k$jam$1@gioia.aioe.org>

 by: Terje Mathisen - Fri, 4 Feb 2022 09:22 UTC

MitchAlsup wrote:
> On Thursday, February 3, 2022 at 12:18:54 PM UTC-6, Quadibloc wrote:
>> On Wednesday, February 2, 2022 at 10:32:49 AM UTC-7, MitchAlsup wrote:
>>> On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
>>>> On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
>>>>> And one
>>>>> quibble; you may need the address of the next instruction in the bundle
>>>>> to handle the case where the current instruction is a CALL-type instruction
>>>>> that puts the address of the next instruction in a register.
>>> <
>>>> That is true; if you can branch into the middle of a bundle, then presumably
>>>> branches out with returns in the middle will be allowed. Here, though, I
>>>> think the issue will be encoding rather than calculation. If a 128-bit bundle
>>>> is divided into 20-bit packets, presumably packet n will be given an
>>>> address aligned on 16-bit boundaries which will be...
>>> <
>>> I am surprised at you--you are making the mistake that control flow addressing
>>> is identical to memory addressing--it is not (in this case). Instructions are located
>>> on 20-bit boundaries, and whether or not an instruction starts in this container
>>> is controlled by the header to the bundle. HW has no problem decoding to 20-bit
>>> boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
>>> (instead of [0..7]) !!
>> This is interesting, because my reply to Stephen Fuld was based on perceiving
>> him as having made that mistake.
>>
>> What I was assuming was that when the control flow escapes from the internal
>> circuitry of the processor, and you actually have a sequence of bits that serves
>> as an address - _then_ it's going to be a memory address.
> <
> But the high-order bits (HIOBs) of the address access the container (128 bits),
> while the low-order bits (LOBs) have a range of [0..5] and access the 20-bit instruction start.
>>
>> If I'm going to return from a subroutine, I *need* the control flow address of
>> the return point to quickly and easily supply me with the memory address of
>> the code to which I'm returning.
> <
> Sure, just use the above algorithm, and take some kind of exception if the
> LOBs are not in range [0..5].

I agree.

Extracting the bottom three bits, trapping if they are 11x, while at the
same time multiplying them by 20 and the top bits by 16 (trivial lookup
or shift/add for the former, just a routing issue for the top bits)
should only add one or two gate delays, right?

Terje
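
The address arithmetic described above can be written out as follows. This is only a sketch: it assumes a 3-bit slot field below a container index, measures the bit offset from the first instruction slot without saying where the header bits sit, and uses hypothetical names throughout.

```python
def decode_ip(ip: int) -> tuple:
    """Return (container byte address, bit offset of the slot in the instruction area)."""
    slot = ip & 0b111
    if (slot & 0b110) == 0b110:             # low bits are 11x: slots 6 and 7 do not exist
        raise ValueError("trap: slot out of range")
    bit_offset = (slot << 4) + (slot << 2)  # slot * 20 as shift/add (16 + 4)
    byte_address = (ip >> 3) << 4           # top bits * 16 bytes per container
    return byte_address, bit_offset
```

The multiply by 16 is pure wiring, and with only six legal slot values the multiply by 20 could equally be a six-entry lookup.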

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Encoding 20 and 40 bit instructions in 128 bits

<24fb4429-3be2-4f48-bead-d78f9d70aa29n@googlegroups.com>

 by: Quadibloc - Fri, 4 Feb 2022 10:35 UTC

On Friday, February 4, 2022 at 2:22:31 AM UTC-7, Terje Mathisen wrote:

> Extracting the bottom three bits, trapping if they are 11x, while at the
> same time multiplying them by 20 and the top bits by 16 (trivial lookup
> or shift/add for the former, just a routing issue for the top bits)
> should only add one or two gate delays, right?

Why would you _ever_ need to multiply anything by 20?

I mean, the instruction decode unit would be hardwired to find the instructions
in whatever part of the block they are put.

John Savard

Re: Encoding 20 and 40 bit instructions in 128 bits

<741d4f56-d006-4cba-b7c5-c3cbebe4dcffn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23278&group=comp.arch#23278

 by: MitchAlsup - Fri, 4 Feb 2022 14:37 UTC

On Friday, February 4, 2022 at 3:22:31 AM UTC-6, Terje Mathisen wrote:
> MitchAlsup wrote:
> > On Thursday, February 3, 2022 at 12:18:54 PM UTC-6, Quadibloc wrote:
> >> On Wednesday, February 2, 2022 at 10:32:49 AM UTC-7, MitchAlsup wrote:
> >>> On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
> >>>> On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
> >>>>> And one
> >>>>> quibble; you may need the address of the next instruction in the bundle
> >>>>> to handle the case of the current instruction is a CALL type instruction
> >>>>> that puts the address of the next instruction in a register.
> >>> <
> >>>> That is true; if you can branch into the middle of a bundle, then presumably
> >>>> branches out with returns in the middle will be allowed. Here, though, I
> >>>> think the issue will be encoding rather than calculation. If a 128-bit bundle
> >>>> is divided into 20-bit packets, presumably packet n will be given an
> >>>> address aligned on 16-bit boundaries which will be...
> >>> <
> >>> I am surprised at you--you are making the mistake that control flow addressing
> >>> is identical to memory addressing--it is not (in this case). Instructions are located
> >>> on 20-bit boundaries, and whether or not an instruction starts in this container
> >>> is controlled by the header to the bundle. HW has no problem decoding to 20-bit
> >>> boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
> >>> (instead of [0..7]) !!
> >> This is interesting, because my reply to Stephen Fuld was based on perceiving
> >> him as having made that mistake.
> >>
> >> What I was assuming was that when the control flow escapes from the internal
> >> circuitry of the processor, and you actually have a sequence of bits that serves
> >> as an address - _then_ it's going to be a memory address.
> > <
> > But the HIOBs of the address access the container (128-bits)
> > while the LOBs have a range from [0..5] and access the 20-bit instruction start.
> >>
> >> If I'm going to return from a subroutine, I *need* the control flow address of
> >> the return point to quickly and easily supply me with the memory address of
> >> the code to which I'm returning.
> > <
> > Sure, just use the above algorithm, and take some kind of exception if the
> > LOBs are not in range [0..5].
> I agree.
>
> Extracting the bottom three bits, trapping if they are 11x, while at the
> same time multiplying them by 20 and the top bits by 16 (trivial lookup
> or shift/add for the former, just a routing issue for the top bits)
> should only add one or two gate delays, right?
<
There is no multiply, there is a multiplexer/shifter that selects 20-bit fields
instead of selecting 16-bit fields. The decoder can even be wired up to
assert the out-of-range exception.
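
A rough software model of that multiplexer, as a sketch only. It assumes the 128-bit bundle is held as one integer with an 8-bit header in the low bits and the six 20-bit fields above it (8 + 6*20 = 128). That bit ordering is an assumption for illustration, not something fixed in this thread. The table of constant shift amounts plays the role of the hardwired field selection.

```python
FIELD_MASK = (1 << 20) - 1

# Assumed layout: header in bits 0..7, slot n in bits 8+20n .. 27+20n.
# One fixed shift amount per slot, like the fixed wiring of a 6-way mux.
SHIFTS = (8, 28, 48, 68, 88, 108)


def pack_bundle(header, fields):
    """Build a 128-bit bundle (as an int) from a header byte and six fields."""
    if len(fields) != 6:
        raise ValueError("a bundle holds exactly six 20-bit fields")
    bundle = header & 0xFF
    for shift, field in zip(SHIFTS, fields):
        bundle |= (field & FIELD_MASK) << shift
    return bundle


def select_field(bundle, slot):
    """Pick the 20-bit field for a slot; slots 6 and 7 raise the exception."""
    if not 0 <= slot <= 5:
        raise ValueError("slot out of range [0..5]")
    return (bundle >> SHIFTS[slot]) & FIELD_MASK
```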
<
>
> Terje
>
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

Re: Encoding 20 and 40 bit instructions in 128 bits

<stjpp1$19ip$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=23279&group=comp.arch#23279

 by: Terje Mathisen - Fri, 4 Feb 2022 18:03 UTC

Quadibloc wrote:
> On Friday, February 4, 2022 at 2:22:31 AM UTC-7, Terje Mathisen wrote:
>
>> Extracting the bottom three bits, trapping if they are 11x, while at the
>> same time multiplying them by 20 and the top bits by 16 (trivial lookup
>> or shift/add for the former, just a routing issue for the top bits)
>> should only add one or two gate delays, right?
>
> Why would you _ever_ need to multiply anything by 20?
>
> I mean, the instruction decode unit would be hardwired to find the instructions
> in whatever part of the block they are put.

I wrote "trivial lookup", how is that different from "hardwired"? :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Encoding 20 and 40 bit instructions in 128 bits

<b38538d0-7394-439b-a227-ede56b4b4040n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23280&group=comp.arch#23280

 by: JimBrakefield - Fri, 4 Feb 2022 18:34 UTC

On Wednesday, February 2, 2022 at 3:39:15 PM UTC-6, JimBrakefield wrote:
> On Wednesday, February 2, 2022 at 11:32:49 AM UTC-6, MitchAlsup wrote:
> > On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
> > > On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
> > > > And one
> > > > quibble; you may need the address of the next instruction in the bundle
> > > > to handle the case of the current instruction is a CALL type instruction
> > > > that puts the address of the next instruction in a register.
> > <
> > > That is true; if you can branch into the middle of a bundle, then presumably
> > > branches out with returns in the middle will be allowed. Here, though, I
> > > think the issue will be encoding rather than calculation. If a 128-bit bundle
> > > is divided into 20-bit packets, presumably packet n will be given an
> > > address aligned on 16-bit boundaries which will be...
> > <
> > I am surprised at you--you are making the mistake that control flow addressing
> > is identical to memory addressing--it is not (in this case). Instructions are located
> > on 20-bit boundaries, and whether or not an instruction starts in this container
> > is controlled by the header to the bundle. HW has no problem decoding to 20-bit
> > boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
> > (instead of [0..7]) !!
> > <
> > > associated with one
> > > of the packets. The encoding scheme will presumably make those addresses
> > > consecutive, to simplify the calculation that needs to be done.
> > >
> > > John Savard
> Some comments:
> A) In the modern uP arch, instruction flow is separated from data flow. Thus there is negligible hardware cost in having instruction granularity different from data granularity.
> B) Why use the remaining bits to re-sync the instruction addresses? E.g. why not have 21 bit instruction granularity instead of 20 bit? Re-sync to always occur at the start of 64 or 128 or 256-bit instruction block?
> C) Opens a new arena for uP architecture, eg instruction formats not tied to data granularity.
> D) Need a name for the concept. Maybe there is already an ISA of this nature (other than stack machines with very short instructions)?
> E) Need yet another name for the concept of using some of the remaining instruction block bits for (usually) decode purposes?

The next thing needing discussion is relative branches. In order to avoid division by small odd numbers, the branch displacement also needs to have the container plus packet location format. The branch address is then a modulo add with carry between the current instruction location and the relative displacement. Leading to the conclusion that instruction addresses and displacements need to be a distinct category or "type" from data addresses and offsets (which use whole binary numbers).
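
A small sketch of such a modulo add with carry, assuming an instruction address is a (container, slot) pair with slot in [0..5] and a displacement is a (container delta, slot delta) pair whose slot delta is also in [0..5]. Backward branches use a negative container delta. The representation and the function name are illustrative assumptions.

```python
SLOTS_PER_CONTAINER = 6


def add_displacement(address, displacement):
    """Add a (container, slot) displacement to a (container, slot) address."""
    container, slot = address
    d_container, d_slot = displacement
    # The slot field is added modulo 6; the carry ripples into the container.
    carry, new_slot = divmod(slot + d_slot, SLOTS_PER_CONTAINER)
    return container + d_container + carry, new_slot
```

For example, one slot backward from (10, 4) is written as the displacement (-1, 5): the slot sum 9 gives slot 3 with a carry of 1, which cancels the -1 and lands on (10, 3).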

Re: Encoding 20 and 40 bit instructions in 128 bits

<_IhLJ.4280$0vE9.17@fx17.iad>

https://www.novabbs.com/devel/article-flat.php?id=23282&group=comp.arch#23282

 by: EricP - Fri, 4 Feb 2022 22:23 UTC

JimBrakefield wrote:
> On Wednesday, February 2, 2022 at 3:39:15 PM UTC-6, JimBrakefield wrote:
>> On Wednesday, February 2, 2022 at 11:32:49 AM UTC-6, MitchAlsup wrote:
>>> On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
>>>> On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
>>>>> And one
>>>>> quibble; you may need the address of the next instruction in the bundle
>>>>> to handle the case of the current instruction is a CALL type instruction
>>>>> that puts the address of the next instruction in a register.
>>> <
>>>> That is true; if you can branch into the middle of a bundle, then presumably
>>>> branches out with returns in the middle will be allowed. Here, though, I
>>>> think the issue will be encoding rather than calculation. If a 128-bit bundle
>>>> is divided into 20-bit packets, presumably packet n will be given an
>>>> address aligned on 16-bit boundaries which will be...
>>> <
>>> I am surprised at you--you are making the mistake that control flow addressing
>>> is identical to memory addressing--it is not (in this case). Instructions are located
>>> on 20-bit boundaries, and whether or not an instruction starts in this container
>>> is controlled by the header to the bundle. HW has no problem decoding to 20-bit
>>> boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
>>> (instead of [0..7]) !!
>>> <
>>>> associated with one
>>>> of the packets. The encoding scheme will presumably make those addresses
>>>> consecutive, to simplify the calculation that needs to be done.
>>>>
>>>> John Savard
>> Some comments:
>> A) In the modern uP arch, instruction flow is separated from data flow. Thus there is negligible hardware cost in having instruction granularity different from data granularity.
>> B) Why use the remaining bits to re-sync the instruction addresses? E.g. why not have 21 bit instruction granularity instead of 20 bit? Re-sync to always occur at the start of 64 or 128 or 256-bit instruction block?
>> C) Opens a new arena for uP architecture, eg instruction formats not tied to data granularity.
>> D) Need a name for the concept. Maybe there is already an ISA of this nature (other than stack machines with very short instructions)?
>> E) Need yet another name for the concept of using some of the remaining instruction block bits for (usually) decode purposes?
>
> The next thing needing discussion is relative branches. In order to avoid division by small odd numbers, the branch displacement also needs to have the container plus packet location format. The branch address is then a modulo add with carry between the current instruction location and the relative displacement. Leading to the conclusion that instruction addresses and displacements need to be a distinct category or "type" from data addresses and offsets (which use whole binary numbers).

The problems with relative branches can be avoided by changing the layout
of the bundle. Instead of 20 contiguous bits that start on bit boundaries,
have six 16-bit instruction words starting on byte boundaries,
followed by six 4-bit fields in 3 bytes which extend those words,
followed by 1 byte with the start bits.

This means the address of the start of each instruction is byte aligned,
and the offset from an instruction to another is always in integer bytes.
And this is the same as RIP relative offsets to data.

The byte increment from instruction to instruction inside a bundle is 2,
except for the last instruction in slot[5] which is 6.
But this is just like a variable length instruction set except it is
addresses that end in 0xA that appear to be a longer format.
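
The increment rule can be sketched as below. This assumes 16-byte-aligned bundles and treats every slot as the start of a 20-bit instruction; 40-bit instructions and the start bits are ignored for simplicity. The function name is hypothetical.

```python
BUNDLE_BYTES = 16
LAST_SLOT_OFFSET = 0xA   # byte offset of slot[5] within the bundle


def next_instruction_address(address):
    """Byte address of the instruction word following the one at address."""
    offset = address % BUNDLE_BYTES
    if offset % 2 or offset > LAST_SLOT_OFFSET:
        raise ValueError("address is not the start of an instruction word")
    # Slots 0..4 step by 2 bytes; slot 5 also skips the 3 extension bytes
    # and the start-bits byte, so it steps by 6 into the next bundle.
    return address + (6 if offset == LAST_SLOT_OFFSET else 2)
```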

Re: Encoding 20 and 40 bit instructions in 128 bits

<stka8c$o82$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23283&group=comp.arch#23283

 by: Thomas Koenig - Fri, 4 Feb 2022 22:44 UTC

JimBrakefield <jim.brakefield@ieee.org> schrieb:

> The next thing needing discussion is relative branches. In order
> to avoid division by small odd numbers, the branch displacement
> also needs to have the container plus packet location format.

In principle, yes.

> The branch address is then a modulo add with carry between the
> current instruction location and the relative displacement.

No need for that.

I would envisage a relative branch as the difference between
bundles, plus three bits for the slot.

> Leading
> to the conclusion that instruction addresses and displacements need
> to be a distinct category or "type" from data addresses and offsets
> (which use whole binary numbers).

For four-byte instructions, the offset of a relative branch should
also be multiplied by four before adding to the PC.
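
A sketch of how a branch field of the "bundle difference plus three slot bits" form might be decoded. It assumes the field is a signed two's-complement bundle difference in the upper bits with the target slot in the low three bits; the field width is a free parameter, since no width is given here. Names are illustrative.

```python
from typing import Tuple


def branch_target(branch_bundle: int, field: int, field_bits: int) -> Tuple[int, int]:
    """Return (target bundle number, target slot) for a relative branch."""
    slot = field & 0b111                     # low three bits: target slot
    if slot > 5:
        raise ValueError("slot out of range [0..5]")
    delta_bits = field_bits - 3
    delta = (field >> 3) & ((1 << delta_bits) - 1)
    if delta & (1 << (delta_bits - 1)):      # sign-extend the bundle difference
        delta -= 1 << delta_bits
    return branch_bundle + delta, slot
```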

Re: Encoding 20 and 40 bit instructions in 128 bits

<c75e0e4a-0a17-4429-9282-2f928600f77en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23284&group=comp.arch#23284

 by: JimBrakefield - Fri, 4 Feb 2022 22:44 UTC

On Friday, February 4, 2022 at 4:25:03 PM UTC-6, EricP wrote:
> JimBrakefield wrote:
> > On Wednesday, February 2, 2022 at 3:39:15 PM UTC-6, JimBrakefield wrote:
> >> On Wednesday, February 2, 2022 at 11:32:49 AM UTC-6, MitchAlsup wrote:
> >>> On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
> >>>> On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
> >>>>> And one
> >>>>> quibble; you may need the address of the next instruction in the bundle
> >>>>> to handle the case of the current instruction is a CALL type instruction
> >>>>> that puts the address of the next instruction in a register.
> >>> <
> >>>> That is true; if you can branch into the middle of a bundle, then presumably
> >>>> branches out with returns in the middle will be allowed. Here, though, I
> >>>> think the issue will be encoding rather than calculation. If a 128-bit bundle
> >>>> is divided into 20-bit packets, presumably packet n will be given an
> >>>> address aligned on 16-bit boundaries which will be...
> >>> <
> >>> I am surprised at you--you are making the mistake that control flow addressing
> >>> is identical to memory addressing--it is not (in this case). Instructions are located
> >>> on 20-bit boundaries, and whether or not an instruction starts in this container
> >>> is controlled by the header to the bundle. HW has no problem decoding to 20-bit
> >>> boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
> >>> (instead of [0..7]) !!
> >>> <
> >>>> associated with one
> >>>> of the packets. The encoding scheme will presumably make those addresses
> >>>> consecutive, to simplify the calculation that needs to be done.
> >>>>
> >>>> John Savard
> >> Some comments:
> >> A) In the modern uP arch, instruction flow is separated from data flow. Thus there is negligible hardware cost in having instruction granularity different from data granularity.
> >> B) Why use the remaining bits to re-sync the instruction addresses? E.g. why not have 21 bit instruction granularity instead of 20 bit? Re-sync to always occur at the start of 64 or 128 or 256-bit instruction block?
> >> C) Opens a new arena for uP architecture, eg instruction formats not tied to data granularity.
> >> D) Need a name for the concept. Maybe there is already an ISA of this nature (other than stack machines with very short instructions)?
> >> E) Need yet another name for the concept of using some of the remaining instruction block bits for (usually) decode purposes?
> >
> > The next thing needing discussion is relative branches. In order to avoid division by small odd numbers, the branch displacement also needs to have the container plus packet location format. The branch address is then a modulo add with carry between the current instruction location and the relative displacement. Leading to the conclusion that instruction addresses and displacements need to be a distinct category or "type" from data addresses and offsets (which use whole binary numbers).
> The problems with relative branches can be avoided by changing the layout
> of the bundle. Instead of 20 contiguous bits that start on bit boundaries,
> have six 16-bit instruction words starting on byte boundaries,
> followed by six 4-bit fields in 3 bytes which extend those words,
> followed by 1 byte with the start bits.
>
> This means the address of the start of each instruction is byte aligned,
> and the offset from an instruction to another is always in integer bytes.
> And this is the same as RIP relative offsets to data.
>
> The byte increment from instruction to instruction inside a bundle is 2,
> except for the last instruction in slot[5] which is 6.
> But this is just like a variable length instruction set except it is
> addresses that end in 0xA that appear to be a longer format.

Slick, and not slick at the same time:
Avoids the distinction between instruction and data addresses ! (mostly)
Complicates (in a minor way) extraction of instructions.
Instruction address adder/incrementer still needs to skip over part of the instruction block or deal with variable instruction lengths?
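
To make the extraction cost concrete, here is a sketch for the split layout. The details are assumptions for illustration: little-endian 16-bit words in bytes 0..11, the 4-bit extensions packed two per byte in bytes 12..14 (even slot in the low nibble), the extension forming the top four bits of the 20-bit instruction, and the start-bits byte at offset 15 left unused here.

```python
def extract_instruction(bundle: bytes, slot: int) -> int:
    """Reassemble the 20-bit instruction for a slot from a 16-byte bundle."""
    if len(bundle) != 16:
        raise ValueError("a bundle is exactly 16 bytes")
    if not 0 <= slot <= 5:
        raise ValueError("slot out of range [0..5]")
    # Low 16 bits: the byte-aligned instruction word.
    word = int.from_bytes(bundle[2 * slot:2 * slot + 2], "little")
    # High 4 bits: the matching nibble from the extension bytes.
    ext_byte = bundle[12 + slot // 2]
    ext = (ext_byte >> 4) if slot % 2 else (ext_byte & 0xF)
    return (ext << 16) | word
```

Under these assumptions the extra work over a plain 16-bit fetch is one more byte select and a nibble select per slot.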

Re: Encoding 20 and 40 bit instructions in 128 bits

<2770ff66-6a4b-4775-82a3-363a82e883d1n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23285&group=comp.arch#23285

 by: MitchAlsup - Fri, 4 Feb 2022 23:02 UTC

Relative branches are dealt with by making the displacement relative to the start of
the container the branch instruction resides in.

So you do not branch based on the IP of the branch instruction, but on the
address of its container. The 3 LOBs of the displacement are a non-relative slot
offset into the targeted container.

Presto, simple in HW, only a bit complicated for SW.
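
The SW side might look like the sketch below: an assembler-style encoder that builds the displacement from container numbers only, with the target slot placed directly in the low three bits. The field width, the two's-complement encoding of the container difference, and the function name are assumptions for illustration.

```python
def encode_branch(branch_container, target_container, target_slot, delta_bits):
    """Encode a container-relative branch displacement field."""
    if not 0 <= target_slot <= 5:
        raise ValueError("slot out of range [0..5]")
    # The slot of the branch itself never enters the calculation.
    delta = target_container - branch_container
    lowest = -(1 << (delta_bits - 1))
    highest = (1 << (delta_bits - 1)) - 1
    if not lowest <= delta <= highest:
        raise ValueError("target container out of branch range")
    # Two's-complement container difference above a non-relative 3-bit slot.
    return ((delta & ((1 << delta_bits) - 1)) << 3) | target_slot
```

The complication for software is that two branches in the same container to the same target get the same displacement, while moving a branch across a container boundary changes it.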

Re: Encoding 20 and 40 bit instructions in 128 bits

<PukLJ.638$Gu79.233@fx26.iad>

https://www.novabbs.com/devel/article-flat.php?id=23287&group=comp.arch#23287

 by: EricP - Sat, 5 Feb 2022 01:23 UTC

JimBrakefield wrote:
> On Friday, February 4, 2022 at 4:25:03 PM UTC-6, EricP wrote:
>> JimBrakefield wrote:
>>> On Wednesday, February 2, 2022 at 3:39:15 PM UTC-6, JimBrakefield wrote:
>>>> On Wednesday, February 2, 2022 at 11:32:49 AM UTC-6, MitchAlsup wrote:
>>>>> On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
>>>>>> On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
>>>>>>> And one
>>>>>>> quibble; you may need the address of the next instruction in the bundle
>>>>>>> to handle the case of the current instruction is a CALL type instruction
>>>>>>> that puts the address of the next instruction in a register.
>>>>> <
>>>>>> That is true; if you can branch into the middle of a bundle, then presumably
>>>>>> branches out with returns in the middle will be allowed. Here, though, I
>>>>>> think the issue will be encoding rather than calculation. If a 128-bit bundle
>>>>>> is divided into 20-bit packets, presumably packet n will be given an
>>>>>> address aligned on 16-bit boundaries which will be...
>>>>> <
>>>>> I am surprised at you--you are making the mistake that control flow addressing
>>>>> is identical to memory addressing--it is not (in this case). Instructions are located
>>>>> on 20-bit boundaries, and whether or not an instruction starts in this container
>>>>> is controlled by the header to the bundle. HW has no problem decoding to 20-bit
>>>>> boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
>>>>> (instead of [0..7]) !!
>>>>> <
>>>>>> associated with one
>>>>>> of the packets. The encoding scheme will presumably make those addresses
>>>>>> consecutive, to simplify the calculation that needs to be done.
>>>>>>
>>>>>> John Savard
>>>> Some comments:
>>>> A) In the modern uP arch, instruction flow is separated from data flow. Thus there is negligible hardware cost in having instruction granularity different from data granularity.
>>>> B) Why use the remaining bits to re-sync the instruction addresses? E.g. why not have 21 bit instruction granularity instead of 20 bit? Re-sync to always occur at the start of 64 or 128 or 256-bit instruction block?
>>>> C) Opens a new arena for uP architecture, e.g. instruction formats not tied to data granularity.
>>>> D) Need a name for the concept. Maybe there is already an ISA of this nature (other than stack machines with very short instructions)?
>>>> E) Need yet another name for the concept of using some of the remaining instruction block bits for (usually) decode purposes?
>>> The next thing needing discussion is relative branches. In order to avoid division by small odd numbers, the branch displacement also needs to have the container plus packet location format. The branch address is then a modulo add with carry between the current instruction location and the relative displacement. Leading to the conclusion that instruction addresses and displacements need to be a distinct category or "type" from data addresses and offsets (which use whole binary numbers).
>> The problems with relative branches can be avoided by changing the layout
>> of the bundle. Instead of 20 contiguous bits that start on bit boundaries,
>> have six 16-bit instruction words starting on byte boundaries,
>> followed by six 4-bit fields in 3 bytes which extend those words,
>> followed by 1 byte with the start bits.
>>
>> This means the address of the start of each instruction is byte aligned,
>> and the offset from an instruction to another is always in integer bytes.
>> And this is the same as RIP relative offsets to data.
>>
>> The byte increment from instruction to instruction inside a bundle is 2,
>> except for the last instruction in slot[5] which is 6.
>> But this is just like a variable length instruction set except it is
>> addresses that end in 0xA that appear to be a longer format.
>
> Slick, and not slick at the same time:

:-) It's a floor wax AND a dessert topping!

> Avoids the distinction between instruction and data addresses ! (mostly)
> Complicates (in a minor way) extraction of instructions.
> Instruction address adder/incrementer still needs to skip over part of the instruction block or deal with variable instruction lengths?

Instead of the decoder looking at an opcode to decide how much to
add to the RIP, it looks at the RIP address last hex digit.

The only place I can think of that might care how the RIP increment works
would be the linker since offsets are usually relative to the incremented
address so that offset=0 means the next sequential instruction.
Also for dynamic patching DLL linkages.

One could also make the offset relative to the branch instruction address.
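
The increment rule for the split layout quoted above can be sketched roughly as follows. This is only an illustration: it assumes the six 16-bit words sit at byte offsets 0, 2, ..., 10 of a 16-byte aligned bundle, it ignores 40-bit instructions that occupy two slots, and `next_rip` is a made-up helper name.

```python
BUNDLE_BYTES = 16        # 128-bit bundle, assumed to be 16-byte aligned
LAST_SLOT_OFFSET = 0xA   # slot[5] starts at byte 10 of the bundle


def next_rip(rip: int) -> int:
    """Sequential successor of an instruction address (hypothetical helper)."""
    offset = rip % BUNDLE_BYTES
    if offset > LAST_SLOT_OFFSET or offset % 2:
        raise ValueError(f"{rip:#x} is not the start of an instruction word")
    # Slots 0..4 are followed directly by the next 16-bit word. Slot 5 must
    # skip the 3 extension bytes and the start-bit byte, so it steps by 6.
    return rip + (6 if offset == LAST_SLOT_OFFSET else 2)
```

The only input to the decision is the low hex digit of the address, which is the point made above: no opcode bits are needed to find the next instruction.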

Re: Encoding 20 and 40 bit instructions in 128 bits

<QukLJ.639$Gu79.628@fx26.iad>

https://www.novabbs.com/devel/article-flat.php?id=23288&group=comp.arch#23288
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx26.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
References: <ssu0r5$p2m$1@newsreader4.netcologne.de> <ssuf80$i60$1@dont-email.me> <ssulkf$7n0$1@newsreader4.netcologne.de> <ssun38$imq$1@dont-email.me> <c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com> <stec4m$kg0$1@dont-email.me> <de6ecdef-8e30-40aa-838f-df08d10389e7n@googlegroups.com> <2fd5e668-fbe5-4399-bf74-f5e509d669ebn@googlegroups.com> <4fca5742-1815-4b31-8ea9-2da1592f3456n@googlegroups.com> <b38538d0-7394-439b-a227-ede56b4b4040n@googlegroups.com> <_IhLJ.4280$0vE9.17@fx17.iad> <c75e0e4a-0a17-4429-9282-2f928600f77en@googlegroups.com> <2770ff66-6a4b-4775-82a3-363a82e883d1n@googlegroups.com>
In-Reply-To: <2770ff66-6a4b-4775-82a3-363a82e883d1n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 19
Message-ID: <QukLJ.639$Gu79.628@fx26.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sat, 05 Feb 2022 01:34:40 UTC
Date: Fri, 04 Feb 2022 20:34:18 -0500
X-Received-Bytes: 2036
 by: EricP - Sat, 5 Feb 2022 01:34 UTC

MitchAlsup wrote:
> Relative are dealt with by making the relative branch relative to the start of
> the container the relative branch reside in.
>
> So you do not branch based on the IP of the branch instruction, but the
> address of the container. The 3-LoB of the relative are non-relative offset
> into the targeted container.
>
> Presto, simple in HW, only a bit complicated for SW.

I thought of that, but it makes offsets for instructions different
from offsets for data.

inst_offset = ((dest_addr & (~7)) - (branch_addr & (~7))) | (dest_addr & 7)

which makes the instruction offset calculation non-commutative arithmetic
so that ((a + b) - b) != a
and I was concerned that might cause other problems.
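
A small sketch of why this offset does not behave like ordinary subtraction. `inst_offset` is the formula above; `apply_offset` is an assumed inverse (container-relative high part, absolute packet index in the low 3 bits) that the post does not spell out.

```python
LOW_MASK = 7  # low 3 bits select the packet inside the target container


def inst_offset(dest_addr: int, branch_addr: int) -> int:
    """Offset as given in the formula above."""
    return ((dest_addr & ~LOW_MASK) - (branch_addr & ~LOW_MASK)) | (dest_addr & LOW_MASK)


def apply_offset(branch_addr: int, offset: int) -> int:
    """Assumed hardware side: container-relative add, then absolute low bits."""
    return ((branch_addr & ~LOW_MASK) + (offset & ~LOW_MASK)) | (offset & LOW_MASK)


off = inst_offset(0x115, 0x103)           # 0x15
target = apply_offset(0x103, off)         # 0x115, round trip works
same_target = apply_offset(0x101, off)    # also 0x115: branch low bits are ignored
plain_sum = 0x103 + off                   # 0x118, ordinary addition misses the target
```

Because the low bits of the branch address never enter the result, the branch address cannot be recovered from the target and the offset, which is the kind of asymmetry the concern above is about.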

Re: Encoding 20 and 40 bit instructions in 128 bits

<3%tLJ.35102$t2Bb.34664@fx98.iad>

https://www.novabbs.com/devel/article-flat.php?id=23291&group=comp.arch#23291
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!npeer.as286.net!npeer-ng0.as286.net!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx98.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
References: <ssu0r5$p2m$1@newsreader4.netcologne.de> <ssuf80$i60$1@dont-email.me> <ssulkf$7n0$1@newsreader4.netcologne.de> <ssun38$imq$1@dont-email.me> <c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com> <stec4m$kg0$1@dont-email.me> <de6ecdef-8e30-40aa-838f-df08d10389e7n@googlegroups.com> <2fd5e668-fbe5-4399-bf74-f5e509d669ebn@googlegroups.com> <4fca5742-1815-4b31-8ea9-2da1592f3456n@googlegroups.com> <b38538d0-7394-439b-a227-ede56b4b4040n@googlegroups.com> <_IhLJ.4280$0vE9.17@fx17.iad>
In-Reply-To: <_IhLJ.4280$0vE9.17@fx17.iad>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 92
Message-ID: <3%tLJ.35102$t2Bb.34664@fx98.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sat, 05 Feb 2022 12:23:27 UTC
Date: Sat, 05 Feb 2022 07:22:48 -0500
X-Received-Bytes: 6282
 by: EricP - Sat, 5 Feb 2022 12:22 UTC

EricP wrote:
> JimBrakefield wrote:
>> On Wednesday, February 2, 2022 at 3:39:15 PM UTC-6, JimBrakefield wrote:
>>> On Wednesday, February 2, 2022 at 11:32:49 AM UTC-6, MitchAlsup wrote:
>>>> On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
>>>>> On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld
>>>>> wrote:
>>>>>> And one quibble; you may need the address of the next instruction
>>>>>> in the bundle to handle the case of the current instruction is a
>>>>>> CALL type instruction that puts the address of the next
>>>>>> instruction in a register.
>>>> <
>>>>> That is true; if you can branch into the middle of a bundle, then
>>>>> presumably branches out with returns in the middle will be allowed.
>>>>> Here, though, I think the issue will be encoding rather than
>>>>> calculation. If a 128-bit bundle is divided into 20-bit packets,
>>>>> presumably packet n will be given an address aligned on 16-bit
>>>>> boundaries which will be...
>>>> < I am surprised at you--you are making the mistake that control
>>>> flow addressing is identical to memory addressing--it is not (in
>>>> this case). Instructions are located on 20-bit boundaries, and
>>>> whether or not an instruction starts in this container is controlled
>>>> by the header to the bundle. HW has no problem decoding to 20-bit
>>>> boundaries, so the only constraint is that the range in the LOBs of
>>>> the IP is [0..5] (instead of [0..7]) !! <
>>>>> associated with one of the packets. The encoding scheme will
>>>>> presumably make those addresses consecutive, to simplify the
>>>>> calculation that needs to be done.
>>>>> John Savard
>>> Some comments: A) In the modern uP arch, instruction flow is
>>> separated from data flow. Thus there is negligible hardware cost in
>>> having instruction granularity different from data granularity. B)
>>> Why use the remaining bits to re-sync the instruction addresses? E.g.
>>> why not have 21 bit instruction granularity instead of 20 bit?
>>> Re-sync to always occur at the start of 64 or 128 or 256-bit
>>> instruction block? C) Opens a new arena for uP architecture, e.g.
>>> instruction formats not tied to data granularity. D) Need a name for
>>> the concept. Maybe there is already an ISA of this nature (other
>>> than stack machines with very short instructions)? E) Need yet
>>> another name for the concept of using some of the remaining
>>> instruction block bits for (usually) decode purposes?
>>
>> The next thing needing discussion is relative branches. In order to
>> avoid division by small odd numbers, the branch displacement also
>> needs to have the container plus packet location format. The branch
>> address is then a modulo add with carry between the current
>> instruction location and the relative displacement. Leading to the
>> conclusion that instruction addresses and displacements need to be a
>> distinct category or "type" from data addresses and offsets (which use
>> whole binary numbers).
>
> The problems with relative branches can be avoided by changing the layout
> of the bundle. Instead of 20 contiguous bits that start on bit boundaries,
> have six 16-bit instruction words starting on byte boundaries,
> followed by six 4-bit fields in 3 bytes which extend those words,
> followed by 1 byte with the start bits.
>
> This means the address of the start of each instruction is byte aligned,
> and the offset from an instruction to another is always in integer bytes.
> And this is the same as RIP relative offsets to data.
>
> The byte increment from instruction to instruction inside a bundle is 2,
> except for the last instruction in slot[5] which is 6.
> But this is just like a variable length instruction set except it is
> addresses that end in 0xA that appear to be a longer format.

Or how about this...

The bundle is an aligned 16 byte packet so the high 60 bits are the
bundle number and low 4 bits are the internal byte offset.

When reading or writing bundle bytes we use the normal 64 bit address.

When executing instructions the RIP address has the same high 60 bit
bundle address and the low 4 bits denotes the instruction.
However we pretend that the 6 instructions start at bytes 0..5
and values in the range 6..15 are an illegal instruction trap.
We completely ignore the internal structure of the bundle
and leave that up to the decoder to sort out.
It's as though a bundle consists of five 1-byte and one 11-byte instructions.

That leaves RIP addresses and branch offset as normal 64-bit integers,
all the arithmetic is normal and there is no field masking or shifting.

This probably should have some test scenarios... I can see that having
the instruction size appear to change based on its address might have
some unexpected consequences.
Things to test like how compiler generates conditional branches, linker,
maybe see how one would generate a trampoline on the stack,
calculate the address into a register, and branch to that register address.
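
As a first test scenario, the pseudo-address rule could be modelled like this. It is a sketch under the assumptions stated in the post (low 4 bits 0..5 name a slot, 6..15 trap); the function and exception names are invented for illustration, and 40-bit instructions spanning two slots are not modelled.

```python
SLOTS_PER_BUNDLE = 6
BUNDLE_BYTES = 16


class IllegalInstructionTrap(Exception):
    """Raised when the low 4 bits of RIP fall in the range 6..15."""


def check_rip(rip: int) -> int:
    if (rip & 0xF) >= SLOTS_PER_BUNDLE:
        raise IllegalInstructionTrap(hex(rip))
    return rip


def next_pseudo_rip(rip: int) -> int:
    """Five apparent 1-byte instructions, then one apparent 11-byte one."""
    slot = check_rip(rip) & 0xF
    step = 1 if slot < SLOTS_PER_BUNDLE - 1 else BUNDLE_BYTES - (SLOTS_PER_BUNDLE - 1)
    return rip + step


def branch_target(rip: int, offset: int) -> int:
    """Plain 64-bit add, no masking or shifting; only the result is checked."""
    return check_rip(rip + offset)
```

A linker or JIT would still have to avoid producing offsets that land on 6..15, which is one of the things worth testing.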

Re: Encoding 20 and 40 bit instructions in 128 bits

<sto41r$4tj$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23293&group=comp.arch#23293
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd6-3f18-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Sun, 6 Feb 2022 09:23:07 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sto41r$4tj$1@newsreader4.netcologne.de>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
<ssuf80$i60$1@dont-email.me> <ssulkf$7n0$1@newsreader4.netcologne.de>
<ssun38$imq$1@dont-email.me>
<c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com>
<stec4m$kg0$1@dont-email.me>
<de6ecdef-8e30-40aa-838f-df08d10389e7n@googlegroups.com>
<2fd5e668-fbe5-4399-bf74-f5e509d669ebn@googlegroups.com>
<4fca5742-1815-4b31-8ea9-2da1592f3456n@googlegroups.com>
<b38538d0-7394-439b-a227-ede56b4b4040n@googlegroups.com>
<_IhLJ.4280$0vE9.17@fx17.iad> <3%tLJ.35102$t2Bb.34664@fx98.iad>
Injection-Date: Sun, 6 Feb 2022 09:23:07 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd6-3f18-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd6:3f18:0:7285:c2ff:fe6c:992d";
logging-data="5043"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sun, 6 Feb 2022 09:23 UTC

EricP <ThatWouldBeTelling@thevillage.com> schrieb:
> Or how about this...
>
> The bundle is an aligned 16 byte packet so the high 60 bits are the
> bundle number and low 4 bits are the internal byte offset.
>
> When reading or writing bundle bytes we use the normal 64 bit address.
>
> When executing instructions the RIP address has the same high 60 bit
> bundle address and the low 4 bits denotes the instruction.
> However we pretend that the 6 instructions start at bytes 0..5
> and values in the range 6..15 are an illegal instruction trap.
> We completely ignore the internal structure of the bundle
> and leave that up to the decoder to sort out.
> It's as though a bundle consists of five 1-byte and one 11-byte instructions.

> That leaves RIP addresses and branch offset as normal 64-bit integers,
> all the arithmetic is normal and there is no field masking or shifting.

The question is what to do on a branch without wasting entropy.
I think that scheme would waste one bit of entropy if you also wanted
to calculate relative branches.

If you want to use "normal" address arithmetic, it would probably
be better to pretend that the instructions start at a two-byte
boundary and shift relative offsets by one (same as you would
in an instruction set with a minimum 16-bit instruction size).
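
A minimal sketch of that alternative, assuming instructions appear to start at byte offsets 0, 2, ..., 10 within a bundle and that the displacement is taken relative to the branch instruction's own address (both are assumptions for illustration, as are the function names):

```python
def encode_disp(branch_addr: int, target_addr: int) -> int:
    """Displacement field in units of two bytes."""
    delta = target_addr - branch_addr
    if delta % 2:
        raise ValueError("instruction addresses must be two-byte aligned")
    return delta >> 1  # exact, because delta is even


def branch_target(branch_addr: int, disp: int) -> int:
    """Hardware side: shift the field left by one and add."""
    return branch_addr + (disp << 1)
```

The low address bit is never stored in the displacement field, so that bit is not wasted in the encoding.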

> This probably should have some test scenarios... I can see that having
> the instruction size appear to change based on its address might have
> some unexpected consequences.
> Things to test like how compiler generates conditional branches, linker,
> maybe see how one would generate a trampoline on the stack,
> calculate the address into a register, and branch to that register address.

If you want to have a trampoline, you can prescribe that it is
always aligned to a 128-bit boundary. Set the bits so that
jumping to the start would be allowed, but not jumping into
the middle.

Re: Encoding 20 and 40 bit instructions in 128 bits

<9de2cef4-0cfc-4a6b-a96a-fc7cbc836966n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23294&group=comp.arch#23294
X-Received: by 2002:a05:6214:20ee:: with SMTP id 14mr7997414qvk.38.1644162859930;
Sun, 06 Feb 2022 07:54:19 -0800 (PST)
X-Received: by 2002:a05:6830:2b20:: with SMTP id l32mr3023275otv.333.1644162859687;
Sun, 06 Feb 2022 07:54:19 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!3.eu.feeder.erje.net!2.us.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 6 Feb 2022 07:54:19 -0800 (PST)
In-Reply-To: <sto41r$4tj$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:3091:7685:381f:65ca;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:3091:7685:381f:65ca
References: <ssu0r5$p2m$1@newsreader4.netcologne.de> <ssuf80$i60$1@dont-email.me>
<ssulkf$7n0$1@newsreader4.netcologne.de> <ssun38$imq$1@dont-email.me>
<c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com> <stec4m$kg0$1@dont-email.me>
<de6ecdef-8e30-40aa-838f-df08d10389e7n@googlegroups.com> <2fd5e668-fbe5-4399-bf74-f5e509d669ebn@googlegroups.com>
<4fca5742-1815-4b31-8ea9-2da1592f3456n@googlegroups.com> <b38538d0-7394-439b-a227-ede56b4b4040n@googlegroups.com>
<_IhLJ.4280$0vE9.17@fx17.iad> <3%tLJ.35102$t2Bb.34664@fx98.iad> <sto41r$4tj$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9de2cef4-0cfc-4a6b-a96a-fc7cbc836966n@googlegroups.com>
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 06 Feb 2022 15:54:19 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 52
 by: MitchAlsup - Sun, 6 Feb 2022 15:54 UTC

On Sunday, February 6, 2022 at 3:23:11 AM UTC-6, Thomas Koenig wrote:
> EricP <ThatWould...@thevillage.com> schrieb:
> > Or how about this...
> >
> > The bundle is an aligned 16 byte packet so the high 60 bits are the
> > bundle number and low 4 bits are the internal byte offset.
> >
> > When reading or writing bundle bytes we use the normal 64 bit address.
> >
> > When executing instructions the RIP address has the same high 60 bit
> > bundle address and the low 4 bits denotes the instruction.
> > However we pretend that the 6 instructions start at bytes 0..5
> > and values in the range 6..15 are an illegal instruction trap.
> > We completely ignore the internal structure of the bundle
> > and leave that up to the decoder to sort out.
> > It's as though a bundle consists of five 1-byte and one 11-byte instructions.
>
> > That leaves RIP addresses and branch offset as normal 64-bit integers,
> > all the arithmetic is normal and there is no field masking or shifting.
> The question is what to do on a branch without wasting entropy.
> I think that scheme would waste one bit of entropy if you also wanted
> to calculate relative branches.
<
Most RISC machines are already wasting 2-bits of entropy--the bits addressing
Bytes and HalfWords. Nobody seems to care.
>
> If you want to use "normal" address arithmetic, it would probably
> be better to pretend that the instructions start at a two-byte
> boundary and shift relative offsets by one (same as you would
> in an instruction set with a minimum 16-bit instruction size).
<
Since this is HW, you could always define each 20-bit container as
two fields: a 16-bit part holding the more important 16 bits of the
instruction and starting on a halfword boundary, and a 4-bit extension
located in the last word and selected by the same index as the 16-bit
part, just at 4-bit granularity. This would be horrible for SW and
trivial for HW. But here, all instructions start on 16-bit boundaries!
<
I hate it (and so would assemblers and linkers and JITs), but it can be made
to work. It is, however, unnecessary.
<
> > This probably should have some test scenarios... I can see that having
> > the instruction size appear to change based on its address might have
> > some unexpected consequences.
> > Things to test like how compiler generates conditional branches, linker,
> > maybe see how one would generate a trampoline on the stack,
> > calculate the address into a register, and branch to that register address.
> If you want to have a trampoline, you can prescribe that it is
> always aligned to a 128-bit boundary. Set the bits so that
> jumping to the start would be allowed, but not jumping into
> the middle.
<
All entry points should be container aligned.
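
For reference, extracting one instruction from the split layout described above (16-bit part on a halfword boundary, 4-bit extension in the last word) might look like the sketch below. Little-endian byte order, the nibble order within the last word, and the name `split_slot` are all assumptions made for the example.

```python
def split_slot(bundle: bytes, index: int):
    """Return (16-bit part, 4-bit extension) for slot `index` of a bundle."""
    if len(bundle) != 16:
        raise ValueError("a bundle is 128 bits")
    if not 0 <= index < 6:
        raise IndexError("slots are numbered 0..5")
    # 16-bit part: halfword `index` of the first 12 bytes.
    main16 = int.from_bytes(bundle[2 * index:2 * index + 2], "little")
    # 4-bit extension: nibble `index` of the last 32-bit word; the top
    # byte of that word is left over for the header / start bits.
    last_word = int.from_bytes(bundle[12:16], "little")
    ext4 = (last_word >> (4 * index)) & 0xF
    return main16, ext4
```

Both fields are selected by the same index, so in hardware this is two fixed multiplexers rather than a variable shift.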

Re: Encoding 20 and 40 bit instructions in 128 bits

<stuoqv$97e$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23312&group=comp.arch#23312
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd6-3f18-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Tue, 8 Feb 2022 21:54:39 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <stuoqv$97e$1@newsreader4.netcologne.de>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
<ssuf80$i60$1@dont-email.me> <ssulkf$7n0$1@newsreader4.netcologne.de>
<ssun38$imq$1@dont-email.me>
<c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com>
<stec4m$kg0$1@dont-email.me>
<de6ecdef-8e30-40aa-838f-df08d10389e7n@googlegroups.com>
<2fd5e668-fbe5-4399-bf74-f5e509d669ebn@googlegroups.com>
<4fca5742-1815-4b31-8ea9-2da1592f3456n@googlegroups.com>
<b38538d0-7394-439b-a227-ede56b4b4040n@googlegroups.com>
<_IhLJ.4280$0vE9.17@fx17.iad> <3%tLJ.35102$t2Bb.34664@fx98.iad>
<sto41r$4tj$1@newsreader4.netcologne.de>
<9de2cef4-0cfc-4a6b-a96a-fc7cbc836966n@googlegroups.com>
Injection-Date: Tue, 8 Feb 2022 21:54:39 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd6-3f18-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd6:3f18:0:7285:c2ff:fe6c:992d";
logging-data="9454"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Tue, 8 Feb 2022 21:54 UTC

I looked a bit at what to include in the 20-bit instruction subset.

Looking at the biggest piece of bloat^H^H^H^H^Hsoftware I can find
on POWER, chromium plus supporting shared libraries, a bit more
than 58 million instructions, I found the following instruction
frequency. The first column is the instruction, the second the
number of occurrences, the third the percentage, and the fourth
the cumulative percentage. The frequencies will probably surprise
no one here:

insn       count      %  cum %  description
ld       5837117  11.55  11.55  (load 64-bit with offset)
mr       5590458  11.06  22.61  (move register)
addi     4750662   9.40  32.00  (add with 16-bit constant)
std      3739464   7.40  39.40  (store 64-bit with offset)
bl       3390274   6.71  46.11  (branch and link)
li       2486296   4.92  51.03  (load immediate)
addis    1668924   3.30  54.33  (add immediate shifted)
b        1534024   3.03  57.36  (branch)
beq      1507796   2.98  60.34  (branch if equal)
add      1266393   2.51  62.85  (add)
cmpdi    1009923   2.00  64.85  (compare immediate)
ori       822801   1.63  66.48  (or immediate)
lwz       810375   1.60  68.08  (load word and zero)
bne       799486   1.58  69.66  (branch if not equal)
stdu      691161   1.37  71.03  (store double with update)
stw       690701   1.37  72.39  (store word)
cmpwi     642714   1.27  73.66  (compare word immediate)
mflr      587134   1.16  74.83  (move from link register)
lbz       472965   0.94  75.76  (load byte zero)
extsw     465193   0.92  76.68  (extend sign)
subf      435575   0.86  77.54  (subtract)
sldi      370387   0.73  78.28  (shift left)
blr       358973   0.71  78.99  (branch to link register)
mtctr     350664   0.69  79.68  (move to counter register)
rlwinm    327689   0.65  80.33  (word shifting)
stb       326003   0.64  80.97  (store byte with offset)

Not all offsets would fit into a 20-bit container, of course.
Looking at an instruction format consisting of

- one four-bit opcode
- two five-bit registers
- one six-bit constant

it would be possible to fit (original data too long to post
here)

ld 9.72
addi 2.26
std 6.50
li 3.87
cmpdi 1.30
lwz 1.02
sldi 0.62

into that format, close to 25.3% of instructions.
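
To make the 4 + 5 + 5 + 6 = 20 bit budget concrete, here is a rough pack/unpack sketch. The field order, the signed interpretation of the six-bit constant, and the function names are assumptions for illustration only; the post does not fix any of them.

```python
def encode(opcode: int, ra: int, rb: int, imm: int) -> int:
    """Pack one instruction into 20 bits: opcode | ra | rb | imm."""
    if not 0 <= opcode < 16:
        raise ValueError("opcode needs 4 bits")
    if not (0 <= ra < 32 and 0 <= rb < 32):
        raise ValueError("registers need 5 bits each")
    if not -32 <= imm < 32:
        raise ValueError("constant does not fit in 6 signed bits")
    return (opcode << 16) | (ra << 11) | (rb << 6) | (imm & 0x3F)


def decode(word: int):
    """Inverse of encode(); sign-extends the 6-bit constant."""
    opcode = (word >> 16) & 0xF
    ra = (word >> 11) & 0x1F
    rb = (word >> 6) & 0x1F
    imm = word & 0x3F
    if imm >= 32:
        imm -= 64
    return opcode, ra, rb, imm
```

An instruction whose constant falls outside the six-bit range would have to use the 40-bit form, which is why only part of each mnemonic's count appears in the list above.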

For a two or three-register format,

mr 11.06
add 2.51
extsw 0.92
subf 0.86

would give 15.3% on top.

For branches, I would say that 5 bits of POWER offset would correspond
to 6 bits of the combined address, which would give a percent or two.
So 40% of half-length instructions sounds reasonable.

In other words: Recoding POWER into 20 and 40 bit chunks without
using any of the additional freedom gained by 40-bit instructions
would actually be a gain in code density, without any restrictions
in what registers to choose (such as having special instructions
for the stack pointer).

Trying to compress this into 16 bits would be much more difficult.
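
A back-of-the-envelope check of the density estimate, using the percentages listed above. It assumes every instruction is either one 20-bit slot or two, and it ignores packing constraints at bundle boundaries. The second figure, which charges the 8 header bits of each 128-bit bundle to its six slots (128/6 bits per slot), is an accounting assumption of this sketch, not something stated in the post.

```python
# Shares (percent of all instructions) taken from the lists above.
CONSTANT_FORMAT = {"ld": 9.72, "addi": 2.26, "std": 6.50, "li": 3.87,
                   "cmpdi": 1.30, "lwz": 1.02, "sldi": 0.62}
REGISTER_FORMAT = {"mr": 11.06, "add": 2.51, "extsw": 0.92, "subf": 0.86}


def average_bits(short_fraction: float, bits_per_slot: float = 20.0) -> float:
    """Mean instruction size when short ones take 1 slot and the rest take 2."""
    return bits_per_slot * (short_fraction + 2.0 * (1.0 - short_fraction))


constant_share = sum(CONSTANT_FORMAT.values())   # about 25.3
register_share = sum(REGISTER_FORMAT.values())   # about 15.3
raw_bits = average_bits(0.40)                    # about 32 bits, same as POWER
with_header = average_bits(0.40, 128 / 6)        # about 34.1 bits
```

Under these assumptions the raw 20/40-bit count breaks even with 32-bit POWER at 40% short instructions and improves beyond that, while the header-inclusive count breaks even at 50%.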

Re: Encoding 20 and 40 bit instructions in 128 bits

<stv51f$da9$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23313&group=comp.arch#23313
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Tue, 8 Feb 2022 19:22:52 -0600
Organization: A noiseless patient Spider
Lines: 167
Message-ID: <stv51f$da9$1@dont-email.me>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
<ssuf80$i60$1@dont-email.me> <ssulkf$7n0$1@newsreader4.netcologne.de>
<ssun38$imq$1@dont-email.me>
<c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com>
<stec4m$kg0$1@dont-email.me>
<de6ecdef-8e30-40aa-838f-df08d10389e7n@googlegroups.com>
<2fd5e668-fbe5-4399-bf74-f5e509d669ebn@googlegroups.com>
<4fca5742-1815-4b31-8ea9-2da1592f3456n@googlegroups.com>
<b38538d0-7394-439b-a227-ede56b4b4040n@googlegroups.com>
<_IhLJ.4280$0vE9.17@fx17.iad> <3%tLJ.35102$t2Bb.34664@fx98.iad>
<sto41r$4tj$1@newsreader4.netcologne.de>
<9de2cef4-0cfc-4a6b-a96a-fc7cbc836966n@googlegroups.com>
<stuoqv$97e$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 9 Feb 2022 01:22:55 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="2394e7d82d878e21280076653c8a4a16";
logging-data="13641"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+2MLbUw7vASLC1rMMKG2Oa"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.1
Cancel-Lock: sha1:cjB+4c06/5hC387zegcYUIwDM9I=
In-Reply-To: <stuoqv$97e$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: BGB - Wed, 9 Feb 2022 01:22 UTC

On 2/8/2022 3:54 PM, Thomas Koenig wrote:
> I looked a bit at what to include in the 20-bit instruction subset.
>
> Looking at the biggest piece of bloat^H^H^H^H^Hsoftware I can find
> on POWER, chromium plus supporting shared libraries, a bit more
> than 58 million instructions, I found the following instruction
> frequency. The first column is the instruction, the second the
> number of instructions, the third one percentage, and the fourth
> one cumulative percentage. The frequency will probably surprise
> no one here:
>
> ld 5837117 11.55 11.55 (load 64-bit with offset)
> mr 5590458 11.06 22.61 (move register)
> addi 4750662 9.40 32.00 (add with 16-bit constant)
> std 3739464 7.40 39.40 (store 64-bit with offset)
> bl 3390274 6.71 46.11 (branch and link)
> li 2486296 4.92 51.03 (load immediate)
> addis 1668924 3.30 54.33 (add immediate and shift)
> b 1534024 3.03 57.36 (branch)
> beq 1507796 2.98 60.34 (branch if equal)
> add 1266393 2.51 62.85 (add)
> cmpdi 1009923 2.00 64.85 (compare immediate)
> ori 822801 1.63 66.48 (or immediate)
> lwz 810375 1.60 68.08 (load word and zero)
> bne 799486 1.58 69.66 (branch if not equal)
> stdu 691161 1.37 71.03 (store double with update)
> stw 690701 1.37 72.39 (store word)
> cmpwi 642714 1.27 73.66 (compare word immediate)
> mflr 587134 1.16 74.83 (move from link register)
> lbz 472965 0.94 75.76 (load byte zero)
> extsw 465193 0.92 76.68 (extend sign)
> subf 435575 0.86 77.54 (subtract)
> sldi 370387 0.73 78.28 (shift left)
> blr 358973 0.71 78.99 (branch to link register)
> mtctr 350664 0.69 79.68 (move to counter register)
> rlwinm 327689 0.65 80.33 (word shifting)
> stb 326003 0.64 80.97 (store byte with offset)
>
> Not all offsets would fit into a 20-bit container, of course.
> Looking at an instruction format consisting of
>
> - one four-bit opcode
> - two five-bit registers
> - one six-bit constant
>
> it would be possible to fit (original data too long to post
> here)
>
> ld 9.72
> addi 2.26
> std 6.50
> li 3.87
> cmpdi 1.30
> lwz 1.02
> sldi 0.62
>
> into that format, close to 25.3% of instructions.
>
> For a two or three-register format,
>
> mr 11.06
> add 2.51
> extsw 0.92
> subf 0.86
>
> would give 15.3% on top.
>
> For branches, I would say that 5 bit of POWER offset would correspond
> to 6 bits of the combined address, which would give a percent or two.
> So 40% of half-length instructions sounds reasonable.
>
> In other words: Recoding POWER into 20 and 40 bit chunks without
> using any of the additional freedom gained by 40-bit instructions
> would actually be a gain in code density, without any restrictions
> in what registers to choose (such as having special instructions
> for the stack pointer).
>
> Trying to compress this into 16 bits would be much more difficult.

A decent chunk of the common instructions can be crammed into 16-bit
encodings on BJX2, sorta...

But, yeah, the overall rankings in my case seem to be vaguely similar.

Top ranking instructions from Doom (aggregated by mnemonic):
MOV.Q (Load/Store QWord)
MOV.L (Load/Store DWord)
BF (Branch if False)
MOVU.L (Load, Unsigned DWord)
BT (Branch if True)
MOVU.B (Load, Unsigned Byte)
MOV (Move Reg, Reg)
MOVU.W (Load, Unsigned Word)
MOV.W (Load/Store, Word)
ADD (Add, 64-bit)
TST (Bit Test, ((A&B)==0))
BRA (Unconditional Branch)
ADDS.L (Add, 32-bit, Sign-Extending)
MOV.X (Load/Store, 128-bit)
SHLD (Logical Shift, 32-bit)
OR (Bitwise OR)
CMPQGT (Compare Greater, 64-bit)
LDIZ (Load Immediate, Zero-Extended)
...

Or, aggregating by category:
Load/Store ops
Branch Ops
Common ALU ops

Within 16-bit encodings, the bulk of the most commonly encoded:
MOV (Reg, Reg)
Branch ops (Disp8)
ADD (Imm8, Reg)
LDI (Imm12, R0)
LDI (Imm8, Reg)
Load/Store with (SP, Disp4), Various
CMPEQ (Imm4, Rn)

Within 32-bit encodings:
Branch Ops (Disp20)
Load/Store Ops (Disp9)
ALU Ops (Rm, Imm9, Rn)
LDI/ADD (Imm16, Rn)
...

There is much less of a showing for Load/Store with a non-SP base
register in 16-bit land, but the likely reason here is there is a very
limited selection of displacement encodings (these encodings are very
common in terms of 32-bit encodings).

Say, for example, one wants:
4b Base Register
4b Dest Register
3b Disp
3b Format

Then, one is already looking at 14 bits of encoding space.

Could in theory cram it down to 12 bits of encoding space by using 3-bit
register fields, but this would be somewhat limiting.
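
The bit budget above, spelled out as a trivial sketch (the function name and the "bits left over" calculation are only illustrative; how the remaining bits would be split between opcode and length markers is not specified here):

```python
def payload_bits(reg_bits: int, disp_bits: int = 3, format_bits: int = 3) -> int:
    """Base register + destination register + displacement + format selector."""
    return 2 * reg_bits + disp_bits + format_bits


# Bits left over for everything else at each instruction width.
left_in_16 = 16 - payload_bits(4)         # 2 bits with 4-bit register fields
left_in_16_small = 16 - payload_bits(3)   # 4 bits with 3-bit register fields
left_in_24 = 24 - payload_bits(4)         # 10 bits in a 24-bit encoding
```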

This can fit a little better into a 24-bit encoding, but in a past
experiment, the savings from 24-bit Load/Store encodings and similar
were overall fairly modest vs plain 16/32.

In the current form of the ISA, the encoding space originally assigned
to 24-bit encodings was reused for the XGPR encodings: 32-bit
instruction encodings with 6-bit register fields, covering a "common"
ISA subset. Cases which don't fit into the 32-bit encoding fall back to
a 64-bit instruction format. That is not necessarily the most efficient
possibility, but it is rare enough that it doesn't really matter.

But, in general, a 20 or 24 bit encoding could fit a better range of
Load/Store encodings, which could be worthwhile (though, in this case,
I would probably go for a byte-aligned 16/24/32 encoding, rather than
a bundle-based 20/40 encoding).

....

Re: Encoding 20 and 40 bit instructions in 128 bits

<stvf02$v7b$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23314&group=comp.arch#23314

From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Wed, 9 Feb 2022 04:12:50 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 171
Message-ID: <stvf02$v7b$1@dont-email.me>
 by: Brett - Wed, 9 Feb 2022 04:12 UTC

BGB <cr88192@gmail.com> wrote:
> On 2/8/2022 3:54 PM, Thomas Koenig wrote:
>> I looked a bit at what to include in the 20-bit instruction subset.
>>
>> Looking at the biggest piece of bloat^H^H^H^H^Hsoftware I can find
>> on POWER, chromium plus supporting shared libraries, a bit more
>> than 58 million instructions, I found the following instruction
>> frequency. The first column is the instruction, the second the
>> number of instructions, the third the percentage, and the fourth
>> the cumulative percentage. The frequency will probably surprise
>> no one here:
>>
>> ld 5837117 11.55 11.55 (load 64-bit with offset)
>> mr 5590458 11.06 22.61 (move register)
>> addi 4750662 9.40 32.00 (add with 16-bit constant)
>> std 3739464 7.40 39.40 (store 64-bit with offset)
>> bl 3390274 6.71 46.11 (branch and link)
>> li 2486296 4.92 51.03 (load immediate)
>> addis 1668924 3.30 54.33 (add immediate and shift)
>> b 1534024 3.03 57.36 (branch)
>> beq 1507796 2.98 60.34 (branch if equal)
>> add 1266393 2.51 62.85 (add)
>> cmpdi 1009923 2.00 64.85 (compare immediate)
>> ori 822801 1.63 66.48 (or immediate)
>> lwz 810375 1.60 68.08 (load word and zero)
>> bne 799486 1.58 69.66 (branch if not equal)
>> stdu 691161 1.37 71.03 (store double with update)
>> stw 690701 1.37 72.39 (store word)
>> cmpwi 642714 1.27 73.66 (compare word immediate)
>> mflr 587134 1.16 74.83 (move from link register)
>> lbz 472965 0.94 75.76 (load byte zero)
>> extsw 465193 0.92 76.68 (extend sign)
>> subf 435575 0.86 77.54 (subtract)
>> sldi 370387 0.73 78.28 (shift left)
>> blr 358973 0.71 78.99 (branch to link register)
>> mtctr 350664 0.69 79.68 (move to counter register)
>> rlwinm 327689 0.65 80.33 (word shifting)
>> stb 326003 0.64 80.97 (store byte with offset)
>>
>> Not all offsets would fit into a 20-bit container, of course.
>> Looking at an instruction format consisting of
>>
>> - one four-bit opcode
>> - two five-bit registers
>> - one six-bit constant
>>
>> it would be possible to fit (original data too long to post
>> here)
>>
>> ld 9.72
>> addi 2.26
>> std 6.50
>> li 3.87
>> cmpdi 1.30
>> lwz 1.02
>> sldi 0.62
>>
>> into that format, close to 25.3% of instructions.
>>
>> For a two or three-register format,
>>
>> mr 11.06
>> add 2.51
>> extsw 0.92
>> subf 0.86
>>
>> would give 15.3% on top.
>>
>> For branches, I would say that 5 bits of POWER offset would correspond
>> to 6 bits of the combined address, which would give a percent or two.
>> So 40% of half-length instructions sounds reasonable.
>>
>> In other words: Recoding POWER into 20 and 40 bit chunks without
>> using any of the additional freedom gained by 40-bit instructions
>> would actually be a gain in code density, without any restrictions
>> in what registers to choose (such as having special instructions
>> for the stack pointer).
>>
>> Trying to compress this into 16 bits would be much more difficult.
>
> A decent chunk of the common instructions can be crammed into 16-bit
> encodings on BJX2, sorta...
>
> But, yeah, the overall rankings in my case seem to be vaguely similar.
>
>
> Top ranking instructions from Doom (aggregated by mnemonic):
> MOV.Q (Load/Store QWord)
> MOV.L (Load/Store DWord)
> BF (Branch if False)
> MOVU.L (Load, Unsigned DWord)
> BT (Branch if True)
> MOVU.B (Load, Unsigned Byte)
> MOV (Move Reg, Reg)
> MOVU.W (Load, Unsigned Word)
> MOV.W (Load/Store, Word)
> ADD (Add, 64-bit)
> TST (Bit Test, ((A&B)==0))
> BRA (Unconditional Branch)
> ADDS.L (Add, 32-bit, Sign-Extending)
> MOV.X (Load/Store, 128-bit)
> SHLD (Logical Shift, 32-bit)
> OR (Bitwise OR)
> CMPQGT (Compare Greater, 64-bit)
> LDIZ (Load Immediate, Zero-Extended)
> ...
>
> Or, aggregating by category:
> Load/Store ops
> Branch Ops
> Common ALU ops
>
>
> Within 16-bit encodings, the bulk of the most commonly encoded:
> MOV (Reg, Reg)
> Branch ops (Disp8)
> ADD (Imm8, Reg)
> LDI (Imm12, R0)
> LDI (Imm8, Reg)
> Load/Store with (SP, Disp4), Various
> CMPEQ (Imm4, Rn)
>
> Within 32-bit encodings:
> Branch Ops (Disp20)
> Load/Store Ops (Disp9)
> ALU Ops (Rm, Imm9, Rn)
> LDI/ADD (Imm16, Rn)
> ...
>
>
> There is much less of a showing for Load/Store with a non-SP base
> register in 16-bit land, but the likely reason here is there is a very
> limited selection of displacement encodings (these encodings are very
> common in terms of 32-bit encodings).
>
> Say, for example, one wants:
> 4b Base Register
> 4b Dest Register
> 3b Disp
> 3b Format
>
> Then, one is already looking at 14 bits of encoding space.
>
> Could in theory cram it down to 12 bits of encoding space by using 3-bit
> register fields, but this would be somewhat limiting.

If you use a preferred split into non-overlapping address and data
registers, as in the short-form 8086 encodings, then 16 bits is plenty.
Only the long form needs access to all registers.

This is before you add a short belt, which cuts instruction sizes more.

> This can fit a little better into a 24-bit encoding, but in a past
> experiment, the savings from 24-bit Load/Store encodings and similar
> were overall fairly modest vs plain 16/32.
>
> In the current form of the ISA, the encoding space that was originally
> assigned to 24-bit encodings has been reused for the XGPR encodings
> (namely, 32-bit instruction encodings with 6-bit register fields, covering
> a "common" subset of the ISA). Cases which don't fit into the 32-bit
> encoding fall back to a 64-bit instruction format; this is not necessarily
> the most efficient possibility, but it is rare enough that it doesn't
> really matter.
>
> But, in general, a 20 or 24 bit encoding could fit a better range of
> Load/Store encodings, which could be worthwhile (though, in this case, I
> would probably go for a byte-aligned 16/24/32 encoding, rather than a
> bundle-based 20/40 encoding).

Re: Encoding 20 and 40 bit instructions in 128 bits

<stvi9q$d03$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23315&group=comp.arch#23315

From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Tue, 8 Feb 2022 21:09:14 -0800
Organization: A noiseless patient Spider
Lines: 91
Message-ID: <stvi9q$d03$1@dont-email.me>
 by: Stephen Fuld - Wed, 9 Feb 2022 05:09 UTC

On 2/8/2022 1:54 PM, Thomas Koenig wrote:
> I looked a bit at what to include in the 20-bit instruction subset.
>
> Looking at the biggest piece of bloat^H^H^H^H^Hsoftware I can find
> on POWER, chromium plus supporting shared libraries, a bit more
> than 58 million instructions, I found the following instruction
> frequency. The first column is the instruction, the second the
> number of instructions, the third the percentage, and the fourth
> the cumulative percentage. The frequency will probably surprise
> no one here:
>
> ld 5837117 11.55 11.55 (load 64-bit with offset)
> mr 5590458 11.06 22.61 (move register)
> addi 4750662 9.40 32.00 (add with 16-bit constant)
> std 3739464 7.40 39.40 (store 64-bit with offset)
> bl 3390274 6.71 46.11 (branch and link)
> li 2486296 4.92 51.03 (load immediate)
> addis 1668924 3.30 54.33 (add immediate and shift)
> b 1534024 3.03 57.36 (branch)
> beq 1507796 2.98 60.34 (branch if equal)
> add 1266393 2.51 62.85 (add)
> cmpdi 1009923 2.00 64.85 (compare immediate)
> ori 822801 1.63 66.48 (or immediate)
> lwz 810375 1.60 68.08 (load word and zero)
> bne 799486 1.58 69.66 (branch if not equal)
> stdu 691161 1.37 71.03 (store double with update)
> stw 690701 1.37 72.39 (store word)
> cmpwi 642714 1.27 73.66 (compare word immediate)
> mflr 587134 1.16 74.83 (move from link register)
> lbz 472965 0.94 75.76 (load byte zero)
> extsw 465193 0.92 76.68 (extend sign)
> subf 435575 0.86 77.54 (subtract)
> sldi 370387 0.73 78.28 (shift left)
> blr 358973 0.71 78.99 (branch to link register)
> mtctr 350664 0.69 79.68 (move to counter register)
> rlwinm 327689 0.65 80.33 (word shifting)
> stb 326003 0.64 80.97 (store byte with offset)
>
> Not all offsets would fit into a 20-bit container, of course.
> Looking at an instruction format consisting of
>
> - one four-bit opcode
> - two five-bit registers
> - one six-bit constant
>
> it would be possible to fit (original data too long to post
> here)
>
> ld 9.72
> addi 2.26
> std 6.50
> li 3.87
> cmpdi 1.30
> lwz 1.02
> sldi 0.62
>
> into that format, close to 25.3% of instructions.
>
> For a two or three-register format,
>
> mr 11.06
> add 2.51
> extsw 0.92
> subf 0.86
>
> would give 15.3% on top.
>
> For branches, I would say that 5 bits of POWER offset would correspond
> to 6 bits of the combined address, which would give a percent or two.
> So 40% of half-length instructions sounds reasonable.
>
> In other words: Recoding POWER into 20 and 40 bit chunks without
> using any of the additional freedom gained by 40-bit instructions
> would actually be a gain in code density, without any restrictions
> in what registers to choose (such as having special instructions
> for the stack pointer).

I am probably missing something, but using your 40% figure

(.4 * 20) + (.6 * 40) = 8 + 24 = 32

Then you have to add say on average one bit per instruction for the 8
bit overhead.

So ISTM that at best, you break even on code density, and probably lose
a small amount.
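
The arithmetic above can be checked with a short Python sketch. It assumes, as the thread title and the "8 bit overhead" suggest, a 128-bit bundle with 120 payload bits (six 20-bit slots) and 8 bits of overhead, and it spreads that overhead proportionally over the payload. It ignores any padding lost when a 40-bit instruction does not fit the remaining slots, so it is optimistic. The function names are hypothetical.

```python
def avg_payload_bits(short_frac, short_bits=20, long_bits=40):
    """Mean instruction size when short_frac of all instructions are short."""
    return short_frac * short_bits + (1.0 - short_frac) * long_bits


def effective_bits(short_frac, bundle_bits=128, payload_bits=120):
    """Mean size including the per-bundle overhead (assumed 8 bits in 128)."""
    return avg_payload_bits(short_frac) * bundle_bits / payload_bits


def break_even_fraction(target_bits=32.0, bundle_bits=128, payload_bits=120,
                        short_bits=20, long_bits=40):
    """Share of short instructions needed to match a fixed target size."""
    payload_target = target_bits * payload_bits / bundle_bits
    return (long_bits - payload_target) / (long_bits - short_bits)


if __name__ == "__main__":
    print(avg_payload_bits(0.4))   # payload only
    print(effective_bits(0.4))     # with bundle overhead
    print(break_even_fraction())   # share needed to match 32-bit code
```

Under these assumptions, 40% short instructions gives 32 payload bits per instruction and about 34.1 bits once the overhead is included, and roughly half of all instructions would have to be short to match a fixed 32-bit encoding.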

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Encoding 20 and 40 bit instructions in 128 bits

<stvmm6$s6c$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23316&group=comp.arch#23316

From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Wed, 9 Feb 2022 06:24:06 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <stvmm6$s6c$1@newsreader4.netcologne.de>
 by: Thomas Koenig - Wed, 9 Feb 2022 06:24 UTC

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> schrieb:
> On 2/8/2022 1:54 PM, Thomas Koenig wrote:
>> [...]
>
> I am probably missing something, but using your 40% figure
>
> (.4 * 20) + (.6 * 40) = 8 + 24 = 32

Density is the number of instructions per unit of storage, so you have
to add the inverses of the numbers of bits, weighted by frequency:

1/(0.4 / 20 + 0.6 / 40) = 1/ (0.02 + 0.015) = 28.57...

(Same as when calculating the average density of a mixture).

Re: Encoding 20 and 40 bit instructions in 128 bits

<stvsik$bt5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23317&group=comp.arch#23317

From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Wed, 9 Feb 2022 02:04:34 -0600
Organization: A noiseless patient Spider
Lines: 211
Message-ID: <stvsik$bt5$1@dont-email.me>
 by: BGB - Wed, 9 Feb 2022 08:04 UTC

On 2/8/2022 11:09 PM, Stephen Fuld wrote:
> On 2/8/2022 1:54 PM, Thomas Koenig wrote:
>> [...]
>
> I am probably missing something, but using your 40% figure
>
>     (.4 * 20) + (.6 * 40) = 8 + 24 = 32
>
> Then you have to add say on average one bit per instruction for the 8
> bit overhead.
>
> So ISTM that at best, you break even on code density, and probably lose
> a small amount.
>

I suspect that in the average case, an ISA with 16/24/32 instructions
could probably beat one with a 20/40 encoding in terms of code-density.

So, trying to come up with something that loosely organizes instruction
encodings by probability...

Say, for example, we had an ISA like:
32x GPRs;
16/24/32 bit instructions;
Smaller encodings may use 4-bit registers in some forms.

So, length encoded in bits 2:0 of the first byte, transposed here:
0..7: 16-bit
8..B: 24-bit
C..F: 32-bit

Or:
ZZZZ-mmmm-nnnn-ZZZ0 (2R, 16b, typical)
Znmt-ZZZZ-tttt-ssss-nnnn-ZZ01 (3R, 24b, typical)
ZZZZ-ZZZZ-Znmt-ZZZZ-tttt-ssss-nnnn-ZZ11 (3R, 32b, typical)

ZZZZ-mmmm-nnnn-ZZZ0 (2R, 16b, typical)
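
As a rough sketch of how a decoder might find instruction boundaries in such a scheme: going by the three patterns above, the length appears to be determined by the low two bits of the rightmost nibble (xxx0 = 16-bit, xx01 = 24-bit, xx11 = 32-bit). The Python below assumes that this nibble sits in the low bits of the first byte in memory; the function names are hypothetical.

```python
def insn_length(first_byte):
    """Instruction length in bytes, from the low bits of the first byte."""
    if (first_byte & 0x1) == 0:
        return 2          # xxx0: 16-bit encoding
    if (first_byte & 0x3) == 0x1:
        return 3          # xx01: 24-bit encoding
    return 4              # xx11: 32-bit encoding


def split_stream(code):
    """Split a byte string into instruction-sized chunks."""
    chunks = []
    pos = 0
    while pos < len(code):
        length = insn_length(code[pos])
        if pos + length > len(code):
            raise ValueError("truncated instruction at offset %d" % pos)
        chunks.append(code[pos:pos + length])
        pos += length
    return chunks


if __name__ == "__main__":
    stream = bytes([0x04, 0x00, 0x05, 0x00, 0x00, 0x07, 0x00, 0x00, 0x00])
    print([len(chunk) for chunk in split_stream(stream)])
```

Because only two bits of the first byte are inspected, the length of each instruction is known before the rest of it is fetched, which is the main attraction of a byte-aligned 16/24/32 layout over a bundle.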

Within 16b space:
0SSS-mmmm-nnnn-0000 Store (Rm, 0)
1SSS-mmmm-nnnn-0000 Load (Rm, 0)
0SSS-dddd-nnnn-0010 Store (SP, Disp4)
1SSS-dddd-nnnn-0010 Load (SP, Disp4)

SSS: 000=SB, 001=SW, 010=SL, 011=Q, 100=UB, 101=UW, 110=UL, 111=X
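
For illustration, the four 16-bit Load/Store patterns above could be decoded as follows. This Python sketch assumes that the leftmost character of each pattern is bit 15 and the rightmost is bit 0, and it returns the Disp4 field raw, since no scaling is specified above. The function and dictionary keys are hypothetical.

```python
# Size/sign codes in SSS order, as listed above.
SIZE_NAMES = ["SB", "SW", "SL", "Q", "UB", "UW", "UL", "X"]


def decode_ldst16(word):
    """Decode a 16-bit Load/Store word, or return None if it is not one."""
    low = word & 0xF
    if low not in (0x0, 0x2):
        return None
    fields = {
        "op": "load" if (word >> 15) & 0x1 else "store",
        "size": SIZE_NAMES[(word >> 12) & 0x7],
        "rn": (word >> 4) & 0xF,
    }
    mid = (word >> 8) & 0xF
    if low == 0x0:
        # (Rm, 0) form: the middle field is the base register.
        fields["base"] = "R%d" % mid
        fields["disp"] = 0
    else:
        # (SP, Disp4) form: the middle field is the raw displacement.
        fields["base"] = "SP"
        fields["disp"] = mid
    return fields


if __name__ == "__main__":
    print(decode_ldst16(0xB530))
    print(decode_ldst16(0x0A72))
```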

ZZZZ-mmmm-nnnn-0100 ALU Ops (Rm, Rn)
ZZZZ-mmmm-nnnn-0110 ALU Ops (Imm4u, Rn)
0000=ADD, 0001=SUB, 0010=ADDL, 0011=SUBL
0100=TST, 0101=AND, 0110=OR , 0111=XOR
1000=MOV, 1001=MOV, 1010=MOV, 1011=MOV (5-bit Reg)
1100=CMPEQ, 1101=CMPGT
1110=SHAD, 1111=SHLD

ADDL/SUBL=ADD/SUB with 32-bit sign extension.
SHAD/SHLD=Shift

ZZZn-ZZZZ-nnnn-1000 1R Ops (5-bit register)
ZZZZ-ZZZZ-ZZZZ-1010 0R Ops
dddd-dddd-dd00-1100 BRA Disp10s (+/- 512B)
dddd-dddd-dd01-1100 BSR Disp10s
dddd-dddd-dd10-1100 BT Disp10s
dddd-dddd-dd11-1100 BF Disp10s
dddd-dddd-dd00-1110 ADD Disp10u, SP
dddd-dddd-dd01-1110 ADD Disp10n, SP
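
The branch ranges quoted for these forms follow from the displacement widths. A small Python sketch, assuming byte-granular displacements (which is what "+/- 512B" for Disp10s suggests) and the same leftmost-is-bit-15 convention as the patterns; the helper names are hypothetical:

```python
BRANCH_OPS = ["BRA", "BSR", "BT", "BF"]


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def signed_range(bits):
    """Smallest and largest value of a signed field of the given width."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def decode_branch16(word):
    """Decode dddd-dddd-ddOO-1100 into (mnemonic, signed displacement)."""
    if (word & 0xF) != 0xC:
        return None
    op = BRANCH_OPS[(word >> 4) & 0x3]
    disp = sign_extend((word >> 6) & 0x3FF, 10)
    return op, disp


if __name__ == "__main__":
    for width in (10, 14, 20):
        print(width, signed_range(width))
    print(decode_branch16(0xFFAC))
```

The three widths give ranges of 512, 8192 and 524288 bytes in either direction, matching the +/- 512B, 8K and 512K figures given for Disp10s, Disp14s and Disp20s.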

Within 24b space:
0nmd-0SSS-dddd-mmmm-nnnn-0001 Store (Disp5)
1nmi-0SSS-iiii-mmmm-nnnn-0001 Store (Index)
0nmd-1SSS-dddd-mmmm-nnnn-0001 Load (Disp5)
1nmi-1SSS-iiii-mmmm-nnnn-0001 Load (Index)
0nmt-ZZZZ-tttt-ssss-nnnn-0101 ALU (3R)
1nmi-ZZZZ-iiii-ssss-nnnn-0101 ALU (3RI, Imm5)
...

0nmZ-ZZZZ-ZZZZ-mmmm-nnnn-1001 2R Space
1nmZ-ZZdd-dddd-mmmm-nnnn-1001 -

0ndd-0000-dddd-dddd-nnnn-1101 ADD Imm10u, Rn
0ndd-0001-dddd-dddd-nnnn-1101 ADD Imm10n, Rn
0ndd-0010-dddd-dddd-nnnn-1101 LDI Imm10u, Rn
0ndd-0011-dddd-dddd-nnnn-1101 LDI Imm10n, Rn
0ndd-0100-dddd-dddd-nnnn-1101 TST Imm10u, Rn
0ndd-0101-dddd-dddd-nnnn-1101 TST Imm10n, Rn

0ndd-1000-dddd-dddd-nnnn-1101 CMPEQ Imm10u, Rn
0ndd-1001-dddd-dddd-nnnn-1101 CMPEQ Imm10n, Rn
0ndd-1010-dddd-dddd-nnnn-1101 CMPGT Imm10u, Rn
0ndd-1011-dddd-dddd-nnnn-1101 CMPGT Imm10n, Rn
0ndd-1100-dddd-dddd-nnnn-1101 CMPGE Imm10u, Rn
0ndd-1101-dddd-dddd-nnnn-1101 CMPGE Imm10n, Rn
0ndd-1110-dddd-dddd-nnnn-1101 CMPHI Imm10u, Rn
0ndd-1111-dddd-dddd-nnnn-1101 CMPHS Imm10u, Rn

10dd-0000-dddd-dddd-dddd-1101 BRA Disp14s (+/- 8K)
10dd-0001-dddd-dddd-dddd-1101 BSR Disp14s
10dd-0010-dddd-dddd-dddd-1101 BT Disp14s
10dd-0011-dddd-dddd-dddd-1101 BF Disp14s
10dd-0100-dddd-dddd-dddd-1101 ADD Disp14u, SP (16K)
10dd-0101-dddd-dddd-dddd-1101 ADD Disp14n, SP (16K)

...

Within 32b space:
PP00-0000-0nmi-0SSS-iiii-mmmm-nnnn-0011 Store (Index)
PP00-dddd-1nmd-0SSS-dddd-mmmm-nnnn-0011 Store (Disp9u)
PP00-0000-0nmi-1SSS-iiii-mmmm-nnnn-0011 Load (Index)
PP00-dddd-1nmd-1SSS-dddd-mmmm-nnnn-0011 Load (Disp9u)
PP00-ZZZZ-0nmt-ZZZZ-tttt-ssss-nnnn-0111 ALU (3R Space)
PP00-dddd-1nmd-ZZZZ-dddd-ssss-nnnn-0111 ALU (3RI, Imm9u)
...
PPdd-dddd-0ndd-0000-dddd-dddd-nnnn-1111 ADD Imm16u, Rn
PPdd-dddd-0ndd-0001-dddd-dddd-nnnn-1111 ADD Imm16n, Rn
PPdd-dddd-0ndd-0010-dddd-dddd-nnnn-1111 LDIZ Imm16u, Rn
PPdd-dddd-0ndd-0011-dddd-dddd-nnnn-1111 LDIN Imm16n, Rn
PPdd-dddd-0ndd-0100-dddd-dddd-nnnn-1111 TST Imm16u, Rn
PPdd-dddd-0ndd-0101-dddd-dddd-nnnn-1111 LDISH Imm16u, Rn
...
PPdd-dddd-10dd-0000-dddd-dddd-dddd-1111 BRA Disp20s (+/- 512K)
PPdd-dddd-10dd-0001-dddd-dddd-dddd-1111 BSR Disp20s

ZZdd-dddd-11dd-dddd-dddd-dddd-dddd-1111 Jumbo Group

...

PP: 00=Always, 01=Wide-Execute Hint, 10=If-True, 11=If-False
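
A minimal Python sketch of how the PP field might be read, assuming it occupies bits 31:30 of a 32-bit word (the leftmost two positions in the patterns above). That If-True/If-False test a single true/false status flag, and that the wide-execute hint is otherwise unconditional, are both assumptions made here for illustration and are not stated above.

```python
PREDICATE_NAMES = {
    0b00: "always",
    0b01: "wide-execute hint",
    0b10: "if-true",
    0b11: "if-false",
}


def predicate_mode(word32):
    """Name of the PP mode held in the top two bits of a 32-bit encoding."""
    return PREDICATE_NAMES[(word32 >> 30) & 0x3]


def would_execute(word32, t_flag):
    """Whether the instruction runs, given an assumed true/false flag."""
    mode = predicate_mode(word32)
    if mode == "if-true":
        return bool(t_flag)
    if mode == "if-false":
        return not t_flag
    return True  # "always" and the hint are treated as unconditional here


if __name__ == "__main__":
    print(predicate_mode(0x80000003), would_execute(0x80000003, False))
```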

....

Tweaks are possible, this is just what I have come up with at the moment
(kind of a bit-confetti design, but alas...).

Re: Encoding 20 and 40 bit instructions in 128 bits

<stvvqd$12h$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23318&group=comp.arch#23318

From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Wed, 9 Feb 2022 08:59:57 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <stvvqd$12h$1@newsreader4.netcologne.de>
 by: Thomas Koenig - Wed, 9 Feb 2022 08:59 UTC

Thomas Koenig <tkoenig@netcologne.de> schrieb:
> Stephen Fuld <sfuld@alumni.cmu.edu.invalid> schrieb:

>> I am probably missing something, but using your 40% figure
>>
>> (.4 * 20) + (.6 * 40) = 8 + 24 = 32
>
> Density is the number of instructions per unit of storage, so you have
> to add the inverses of the numbers of bits, weighted by frequency:
>
> 1/(0.4 / 20 + 0.6 / 40) = 1/ (0.02 + 0.015) = 28.57...
>
> (Same as when calculating the average density of a mixture).

Ah, never mind. You are right, 40% is not quite enough.
Have to stick in some compare immediate and branch instructions
as well.
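
The difference between the two calculations comes down to what the 40% is a fraction of. A small Python sketch (function names hypothetical): the weighted arithmetic mean applies when 40% is the share of instructions, while the harmonic form applies when it is the share of code bits taken up by short instructions.

```python
def mean_bits_by_count(short_frac, short_bits=20, long_bits=40):
    """Mean size when short_frac is the share of *instructions* that are short."""
    return short_frac * short_bits + (1.0 - short_frac) * long_bits


def mean_bits_by_size(short_share, short_bits=20, long_bits=40):
    """Mean size when short_share is the share of *code bits* that are short."""
    return 1.0 / (short_share / short_bits + (1.0 - short_share) / long_bits)


def size_share_from_count(short_frac, short_bits=20, long_bits=40):
    """Convert a share of instructions into the matching share of code bits."""
    return short_frac * short_bits / mean_bits_by_count(
        short_frac, short_bits, long_bits)


if __name__ == "__main__":
    print(mean_bits_by_count(0.4))
    print(mean_bits_by_size(0.4))
    print(mean_bits_by_size(size_share_from_count(0.4)))
```

Since the 40% figure was derived from instruction counts, the first function is the one that applies. Feeding the harmonic form the matching size share (25%) gives the same answer as the arithmetic mean.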
