Re: Encoding 20 and 40 bit instructions in 128 bits

Encoding 20 and 40 bit instructions in 128 bits

<ssu0r5$p2m$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23163&group=comp.arch#23163

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-b65-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Encoding 20 and 40 bit instructions in 128 bits
Date: Thu, 27 Jan 2022 11:48:53 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ssu0r5$p2m$1@newsreader4.netcologne.de>
Injection-Date: Thu, 27 Jan 2022 11:48:53 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-b65-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:b65:0:7285:c2ff:fe6c:992d";
logging-data="25686"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Thu, 27 Jan 2022 11:48 UTC

I played around a little bit with the following encoding scheme:

Instructions come in bundles of 128 bits. Each bundle has eight
bits encoding the format, followed by six slots of 20 bits each.

There can be either a 40-bit instruction in two consecutive slots,
or a 20-bit instruction in a single slot. If there is nothing
useful to do for the last single slot, a NOP is inserted there.
This actually only gives 13 possibilities, which can be efficiently
encoded in four bits. The rest would be reserved.
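
As a quick check of the count above, the legal bundle layouts can be enumerated directly. This is only a sketch: it assumes a layout is any way of filling the six slots with 1-slot (20-bit) and 2-slot (40-bit) instructions, and the name `layouts` is made up for illustration.

```python
def layouts(slots=6):
    """Yield every way to fill `slots` 20-bit slots with 1- or 2-slot instructions."""
    if slots == 0:
        yield ()
        return
    for width in (1, 2):
        if width <= slots:
            for rest in layouts(slots - width):
                yield (width,) + rest


all_layouts = list(layouts())
print(len(all_layouts))  # → 13
```

Thirteen layouts fit into a four-bit field (16 codes), which leaves three codes spare.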

The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
instructions, plus a few common loads and stores.

The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
15 bits for the opcode, assuming 5 bits for register number.
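
The bit budget in the two paragraphs above is simple arithmetic. A small sketch makes it easy to try other instruction shapes; the helper name `opcode_bits` is hypothetical, and it assumes 5-bit register numbers and no fields other than registers and opcode.

```python
def opcode_bits(instr_bits, registers, reg_bits=5):
    """Bits left for the opcode after the register numbers are placed."""
    left = instr_bits - registers * reg_bits
    if left < 0:
        raise ValueError("register fields do not fit in the instruction")
    return left


print(opcode_bits(40, 5))  # (Ra,Rb) = op(Rc,Rd,Re) in 40 bits → 15
print(opcode_bits(40, 3))  # Ra = op(Rb,Rc) in 40 bits → 25
print(opcode_bits(20, 2))  # Ra = op(Ra,Rb) in 20 bits → 10
```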

I have checked how this does for instruction density vs having the
same number of instructions in 32 bits, assuming that instructions come
in randomly (the compiler is not reordering instructions to fit
a bundle) and that a 20-bit instruction and a 40-bit instruction
each do the same work as a 32-bit instruction.

Assuming one gets 40% of 20-bit instructions in the stream, this
would lead to around 10-12% of total reduction in size. This is
not a very significant advantage, but at least it is no drawback.
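
One way to experiment with this kind of estimate is a small simulation. The sketch below is not the calculation used in the post. It assumes greedy in-order packing and treats a stranded last slot as a NOP. The names `bundles_needed` and `bits_per_instruction` are illustrative. The outcome depends strongly on the assumed instruction mix, so it is meant for trying out different fractions rather than for reproducing the figure quoted above.

```python
import random

BUNDLE_BITS = 128
SLOTS = 6


def bundles_needed(widths):
    """Greedy in-order packing; each width is a slot count (1 or 2)."""
    bundles = 0
    free = 0
    for w in widths:
        if w > free:          # does not fit: leftover slot (if any) becomes a NOP
            bundles += 1
            free = SLOTS
        free -= w
    return bundles


def bits_per_instruction(p_short, n=100_000, seed=0):
    """Average bundle bits spent per instruction for a random stream.

    p_short is the assumed fraction of 20-bit instructions.
    """
    rng = random.Random(seed)
    widths = [1 if rng.random() < p_short else 2 for _ in range(n)]
    return bundles_needed(widths) * BUNDLE_BITS / n


for p in (0.4, 0.6, 0.8):
    print(p, round(bits_per_instruction(p), 2), "bits vs. 32 for a fixed encoding")
```

The two extremes are easy to check by hand: a stream of only 20-bit instructions costs 128/6 bits each, and a stream of only 40-bit instructions costs 128/3 bits each.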

Comments? From past remarks by Ivan, I have probably elaborated
something close to a subset of what the Mill does, but I thought it
an interesting exercise anyway.

Re: Encoding 20 and 40 bit instructions in 128 bits

<ssuf80$i60$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23166&group=comp.arch#23166

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Thu, 27 Jan 2022 07:54:38 -0800
Organization: A noiseless patient Spider
Lines: 45
Message-ID: <ssuf80$i60$1@dont-email.me>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 27 Jan 2022 15:54:40 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="130ee9313f06db75502d276c77e53968";
logging-data="18624"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+dHH1sxYO+8oYnDafAPVvBqHxZeyX6UFo="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:JFnl8ZYrKLIFLWXhQE3zDFY5cj0=
In-Reply-To: <ssu0r5$p2m$1@newsreader4.netcologne.de>
Content-Language: en-US

by: Stephen Fuld - Thu, 27 Jan 2022 15:54 UTC

On 1/27/2022 3:48 AM, Thomas Koenig wrote:
> I played around a little bit with the following encoding scheme:
>
> Instructions come in bundles of 128 bits. Each bundle has eight
> bits encoding the format, followed by six slots of 20 bits each.
>
> There can be either a 40-bit instruction in two consecutive slots,
> or a 20-bit instruction in a single slot. If there is nothing
> useful to do for the last single slot, a NOP is inserted there.
> This actually only gives 13 possibilities, which can be efficiently
> encoded in four bits. The rest would be reserved.
>
> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
> instructions, plus a few common loads and stores.
>
> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
> 15 bits for the opcode, assuming 5 bits for register number.
>
> I have checked how this does for instruction density vs having the
> same number of instructions in 32 bits, assuming that instructions come
> in randomly (the compiler is not reordering instructions to fit
> a bundle) and that a 20-bit instruction and a 40-bit instruction
> each do the same work as a 32-bit instruction.
>
> Assuming one gets 40% of 20-bit instructions in the stream, this
> would lead to around 10-12% of total reduction in size. This is
> not a very significant advantage, but at least it is no drawback.
>
> Comments?

Presuming that you can't branch into the middle of a bundle there is an
additional advantage and an additional disadvantage.

The disadvantage is that you waste space at the end of each block
preceding a branch target. This reduces your 10-12% space advantage.

The advantage is that you need fewer bits for encoding branch
displacements, or alternatively can branch farther without resorting to
a register to hold the displacement.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Encoding 20 and 40 bit instructions in 128 bits

<ssuj8j$gph$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23167&group=comp.arch#23167

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Thu, 27 Jan 2022 09:03:15 -0800
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <ssuj8j$gph$1@dont-email.me>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 27 Jan 2022 17:03:15 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="5eba3a068417600d975b3f62a6b3aba5";
logging-data="17201"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19GYb7sBXh12yzCeQzk0CfV"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:nTWYk+MH3uKFxKaYrj0NhgEapMA=
In-Reply-To: <ssu0r5$p2m$1@newsreader4.netcologne.de>
Content-Language: en-US

by: Ivan Godard - Thu, 27 Jan 2022 17:03 UTC

Not all that similar - Mill has a variable length bundle and bundle
assignment is significant, while you have a fixed size bundle and bundle
assignment is only grouping without semantic effect.

Re: Encoding 20 and 40 bit instructions in 128 bits

<9bd86aee-38c6-47eb-9bb6-e305e3bb5794n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23168&group=comp.arch#23168

X-Received: by 2002:ac8:704f:: with SMTP id y15mr3541618qtm.550.1643305337567;
Thu, 27 Jan 2022 09:42:17 -0800 (PST)
X-Received: by 2002:a9d:7745:: with SMTP id t5mr2772534otl.254.1643305337325;
Thu, 27 Jan 2022 09:42:17 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 27 Jan 2022 09:42:17 -0800 (PST)
In-Reply-To: <ssu0r5$p2m$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:fdab:9770:403c:c245;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:fdab:9770:403c:c245
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9bd86aee-38c6-47eb-9bb6-e305e3bb5794n@googlegroups.com>
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Thu, 27 Jan 2022 17:42:17 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 43

by: MitchAlsup - Thu, 27 Jan 2022 17:42 UTC

On Thursday, January 27, 2022 at 5:48:56 AM UTC-6, Thomas Koenig wrote:
> I played around a little bit with the following encoding scheme:
>
> Instructions come in bundles of 128 bits. Each bundle has eight
> bits encoding the format, followed by six slots of 20 bits each.
>
> There can be either a 40-bit instruction in two consecutive slots,
> or a 20-bit instruction in a single slot. If there is nothing
> useful to do for the last single slot, a NOP is inserted there.
> This actually only gives 13 possibilities, which can be efficiently
> encoded in four bits. The rest would be reserved.
>
> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
> instructions, plus a few common loads and stores.
>
> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
> 15 bits for the opcode, assuming 5 bits for register number.
>
> I have checked how this does for instruction density vs having the
> same number of instructions in 32 bits, assuming that instructions come
> in randomly (the compiler is not reordering instructions to fit
> a bundle) and that a 20-bit instruction and a 40-bit instruction
> each do the same work as a 32-bit instruction.
>
> Assuming one gets 40% of 20-bit instructions in the stream, this
> would lead to around 10-12% of total reduction in size. This is
> not a very significant advantage, but at least it is no drawback.
>
> Comments? From past remarks by Ivan, I have probably elaborated
> something close to a subset of what the Mill does, but I thought it
> an interesting exercise anyway.
<
H&P data shows that somewhere around 9% one manufactures a
constant {immediate or displacement} using instructions in DLX
and MIPS (32-bit data). My 66000 ISA spends no instructions
manufacturing data. So, I am curious as to how you generate constants
that do not fit in an instruction container ?
<
Secondarily, My 66000 ABI uses ENTER and EXIT instructions instead
of a series of STs and LDs to setup for and then leave called subroutines.
How many instructions does your ISA use to enter and leave a subroutine
that needs to save ½ of the register file ?

Re: Encoding 20 and 40 bit instructions in 128 bits

<ssulkf$7n0$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23169&group=comp.arch#23169

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-b65-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Thu, 27 Jan 2022 17:43:43 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ssulkf$7n0$1@newsreader4.netcologne.de>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
<ssuf80$i60$1@dont-email.me>
Injection-Date: Thu, 27 Jan 2022 17:43:43 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-b65-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:b65:0:7285:c2ff:fe6c:992d";
logging-data="7904"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Thu, 27 Jan 2022 17:43 UTC

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
> On 1/27/2022 3:48 AM, Thomas Koenig wrote:
>> I played around a little bit with the following encoding scheme:
>>
>> Instructions come in bundles of 128 bits. Each bundle has eight
>> bits encoding the format, followed by six slots of 20 bits each.
>>
>> There can be either a 40-bit instruction in two consecutive slots,
>> or a 20-bit instruction in a single slot. If there is nothing
>> useful to do for the last single slot, a NOP is inserted there.
>> This actually only gives 13 possibilities, which can be efficiently
>> encoded in four bits. The rest would be reserved.
>>
>> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
>> instructions, plus a few common loads and stores.
>>
>> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
>> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
>> 15 bits for the opcode, assuming 5 bits for register number.
>>
>> I have checked how this does for instruction density vs having the
>> same number of instructions in 32 bits, assuming that instructions come
>> in randomly (the compiler is not reordering instructions to fit
>> a bundle) and that a 20-bit instruction and a 40-bit instruction
>> each do the same work as a 32-bit instruction.
>>
>> Assuming one gets 40% of 20-bit instructions in the stream, this
>> would lead to around 10-12% of total reduction in size. This is
>> not a very significant advantage, but at least it is no drawback.
>>
>> Comments?
>
> Presuming that you can't branch into the middle of a bundle there is an
> additional advantage and an additional disadvantage.

The plan would be to allow branching into a bundle.

The address would then be the address of the bundle, plus the
number of the slot. Relative branches would also be difference
in bundle number + number of the slot.

Branching to a bundle which does not have the right decoding, or to
the second half of a 40-bit instruction, would raise an exception.
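
The target check described above can be sketched as follows. This is an illustration only: it assumes a bundle's format has already been decoded into a tuple of slot widths (1 for a 20-bit instruction, 2 for a 40-bit one), and the names `instruction_starts`, `check_branch_target` and `IllegalBranchTarget` are hypothetical.

```python
SLOTS_PER_BUNDLE = 6


class IllegalBranchTarget(Exception):
    """Stands in for the exception the hardware would raise."""


def instruction_starts(layout):
    """Slot numbers at which an instruction begins, for a decoded layout."""
    starts, slot = set(), 0
    for width in layout:
        starts.add(slot)
        slot += width
    return starts


def check_branch_target(layout, slot):
    """Return `slot` if it is a legal branch target inside this bundle."""
    if sum(layout) != SLOTS_PER_BUNDLE:
        raise ValueError("layout does not fill the bundle")
    if slot not in instruction_starts(layout):
        raise IllegalBranchTarget(f"slot {slot} is not the start of an instruction")
    return slot


# Layout 20, 40, 20, 40: instructions start at slots 0, 1, 3 and 4.
print(sorted(instruction_starts((1, 2, 1, 2))))  # → [0, 1, 3, 4]
```

With that layout, a branch to slot 2 lands in the second half of a 40-bit instruction and is rejected.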

Because only 13 out of 16 possibilities are valid, it would be
easy to make an all-zero first nibble an illegal bundle.

Functions would probably start on a 128-bit boundary, there
could also be a "jump far" instruction.

> The disadvantage is that you waste space at the end of each block
> preceding a branch target. This reduces your 10-12% space advantage.
>
> The advantage is that you need fewer bits for encoding branch
> displacements, or alternatively can branch farther without resorting to
> a register to hold the displacement.

Let's look at branch on bit set for a 64-bit register.

Six bit major opcode, six bit for bit selection, five bit source,
leaves 23 bits for relative address, four of which would address
the slot.
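
The field budget above can be worked through numerically. The sketch assumes a 40-bit instruction, a signed bundle displacement, and 16-byte bundles. The signedness and the constant names are assumptions for illustration, not part of the proposal.

```python
INSTR_BITS = 40
OPCODE_BITS = 6
BIT_SELECT_BITS = 6    # picks one of the 64 bits of the register
SOURCE_REG_BITS = 5
SLOT_BITS = 4          # as proposed in the post
BUNDLE_BYTES = 16      # 128-bit bundles

target_bits = INSTR_BITS - OPCODE_BITS - BIT_SELECT_BITS - SOURCE_REG_BITS
bundle_disp_bits = target_bits - SLOT_BITS
reach_bundles = 2 ** (bundle_disp_bits - 1)   # assumed signed displacement
reach_bytes = reach_bundles * BUNDLE_BYTES

print(target_bits)       # → 23
print(bundle_disp_bits)  # → 19
print(reach_bytes)       # → 4194304, i.e. about +/- 4 MiB
```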

Re: Encoding 20 and 40 bit instructions in 128 bits

<ssumn7$7n0$2@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23170&group=comp.arch#23170

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-b65-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Thu, 27 Jan 2022 18:02:15 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <ssumn7$7n0$2@newsreader4.netcologne.de>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
<9bd86aee-38c6-47eb-9bb6-e305e3bb5794n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 27 Jan 2022 18:02:15 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-b65-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:b65:0:7285:c2ff:fe6c:992d";
logging-data="7904"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Thu, 27 Jan 2022 18:02 UTC

MitchAlsup <MitchAlsup@aol.com> wrote:

> H&P data shows that somewhere around 9% one manufactures a
> constant {immediate or displacement} using instructions in DLX
> and MIPS (32-bit data). My 66000 ISA spends no instructions
> manufacturing data. So, I am curious as to how you generate constants
> that do not fit in an instruction container ?

Not yet worked out.

I see two possibilities, in principle: There are still four bits
left in the first byte of the bundle. Where to find constants
could be encoded there.

This could prove problematic if constants did not fit into
a bundle, and could lead to NOP hell.

Another possibility would be to have an instruction which stored
a constant into a register, either implied in the ISA or a temporary
register which could only be used by the following instruction.
This would be an overhead of eight bits per constant for 32-bit
constants, and probably two for a 64-bit constant.

This would increase the size somewhat, but probably not to the
extent that it would be worse than a standard 32-bit encoding.

> Secondarily, My 66000 ABI uses ENTER and EXIT instructions instead
> of a series of STs and LDs to setup for and then leave called subroutines.
> How many instructions does your ISA use to enter and leave a subroutine
> that needs to save ½ of the register file ?

I am about as far from a finished ISA as it is possible to be :-)
The current status is probably best described as "toying with ideas".

If it ever gets that far (which I somewhat doubt) I would probably
take a page out of your book.

Re: Encoding 20 and 40 bit instructions in 128 bits

<ssun38$imq$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23171&group=comp.arch#23171

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Thu, 27 Jan 2022 10:08:38 -0800
Organization: A noiseless patient Spider
Lines: 93
Message-ID: <ssun38$imq$1@dont-email.me>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
<ssuf80$i60$1@dont-email.me> <ssulkf$7n0$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 27 Jan 2022 18:08:40 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="130ee9313f06db75502d276c77e53968";
logging-data="19162"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1978uYqNkHoFEXgm7B82+WcsyM/4r8bsLE="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:0HrmFAFB81jIAd1/gd9MkYh7kmM=
In-Reply-To: <ssulkf$7n0$1@newsreader4.netcologne.de>
Content-Language: en-US

by: Stephen Fuld - Thu, 27 Jan 2022 18:08 UTC

On 1/27/2022 9:43 AM, Thomas Koenig wrote:
> Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
>> On 1/27/2022 3:48 AM, Thomas Koenig wrote:
>>> I played around a little bit with the following encoding scheme:
>>>
>>> Instructions come in bundles of 128 bits. Each bundle has eight
>>> bits encoding the format, followed by six slots of 20 bits each.
>>>
>>> There can be either a 40-bit instruction in two consecutive slots,
>>> or a 20-bit instruction in a single slot. If there is nothing
>>> useful to do for the last single slot, a NOP is inserted there.
>>> This actually only gives 13 possibilities, which can be efficiently
>>> encoded in four bits. The rest would be reserved.
>>>
>>> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
>>> instructions, plus a few common loads and stores.
>>>
>>> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
>>> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
>>> 15 bits for the opcode, assuming 5 bits for register number.
>>>
>>> I have checked how this does for instruction density vs having the
>>> same number of instructions in 32 bits, assuming that instructions come
>>> in randomly (the compiler is not reordering instructions to fit
>>> a bundle) and that a 20-bit instruction and a 40-bit instruction
>>> each do the same work as a 32-bit instruction.
>>>
>>> Assuming one gets 40% of 20-bit instructions in the stream, this
>>> would lead to around 10-12% of total reduction in size. This is
>>> not a very significant advantage, but at least it is no drawback.
>>>
>>> Comments?
>>
>> Presuming that you can't branch into the middle of a bundle there is an
>> additional advantage and an additional disadvantage.
>
> The plan would be to allow branching into a bundle.

OK. But that has its own costs. See below.

> The address would then be the address of the bundle, plus the
> number of the slot. Relative branches would also be difference
> in bundle number + number of the slot.
>
> Branching to a bundle which does not have the right decoding, or to
> the second half of a 40-bit instruction, would raise an exception.

So you have to fetch the whole bundle into the CPU in order to check
that, even if you aren't going to execute the instructions at the
beginning of the bundle.

It also slightly complicates getting the address of the next instruction
to execute, since instead of a simple add, you have to keep track of
when you are executing the last instruction of a bundle in order to not
do the simple add, but add one to the bundle number and reset the
instruction within bundle number to zero.
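
The bookkeeping described here can be sketched in a few lines. This assumes the program counter is kept as a (bundle number, slot number) pair with six slots per bundle. The function name `next_pc` is made up for illustration.

```python
SLOTS_PER_BUNDLE = 6


def next_pc(bundle, slot, width):
    """Advance a (bundle, slot) program counter past one instruction.

    width is 1 for a 20-bit instruction and 2 for a 40-bit one.
    """
    if width not in (1, 2):
        raise ValueError("width must be 1 or 2 slots")
    end = slot + width
    if end > SLOTS_PER_BUNDLE:
        raise ValueError("instruction would cross a bundle boundary")
    if end == SLOTS_PER_BUNDLE:
        return bundle + 1, 0      # last instruction: step to the next bundle
    return bundle, end            # otherwise a simple add within the bundle


print(next_pc(7, 3, 2))  # → (7, 5)
print(next_pc(7, 4, 2))  # → (8, 0)
```

The extra compare against the bundle size is the cost being pointed out, relative to a plain byte-address increment.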

> Because only 13 out of 16 possibilities are valid, it would be
> easy to make an all-zero first nibble an illegal bundle.

While I agree, I don't see how that helps you when branching into the
middle of a bundle.

>
> Functions would probably start on a 128-bit boundary, there
> could also be a "jump far" instruction.
>
>> The disadvantage is that you waste space at the end of each block
>> preceding a branch target. This reduces your 10-12% space advantage.
>>
>> The advantage is that you need fewer bits for encoding branch
>> displacements, or alternatively can branch farther without resorting to
>> a register to hold the displacement.
>
> Let's look at branch on bit set for a 64-bit register.
>
> Six bit major opcode, six bit for bit selection, five bit source,
> leaves 23 bits for relative address, four of which would address
> the slot.

Don't you want the relatively frequent small displacement branches to be
in the 20 bit category?

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Encoding 20 and 40 bit instructions in 128 bits

<7e473bbb-3696-4a13-be2c-ac06eeb110fbn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23172&group=comp.arch#23172

by: MitchAlsup - Thu, 27 Jan 2022 18:20 UTC

On Thursday, January 27, 2022 at 11:43:46 AM UTC-6, Thomas Koenig wrote:
> Stephen Fuld <sf...@alumni.cmu.edu.invalid> wrote:
> > On 1/27/2022 3:48 AM, Thomas Koenig wrote:
> >> I played around a little bit with the following encoding scheme:
> >>
> >> Instructions come in bundles of 128 bits. Each bundle has eight
> >> bits encoding the format, followed by six slots of 20 bits each.
> >>
> >> There can be either a 40-bit instruction in two consecutive slots,
> >> or a 20-bit instruction in a single slot. If there is nothing
> >> useful to do for the last single slot, a NOP is inserted there.
> >> This actually only gives 13 possibilities, which can be efficiently
> >> encoded in four bits. The rest would be reserved.
> >>
> >> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
> >> instructions, plus a few common loads and stores.
> >>
> >> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
> >> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
> >> 15 bits for the opcode, assuming 5 bits for register number.
> >>
> >> I have checked how this does for instruction density vs having the
> >> same number of instructions in 32 bits, that instructions come
> >> in randomly (the compiler is not reordering instructions to fit
> >> a bundle) and that a 20-bit instruction and a 40-bit instruction
> >> does the same work as a 32-bit instruction
> >>
> >> Assuming one gets 40% of 20-bit instructions in the stream, this
> >> would lead to around 10-12% of total reduction in size. This is
> >> not a very significant advantage, but at least it is no drawback.
> >>
> >> Comments?
> >
> > Presuming that you can't branch into the middle of a bundle there is an
> > additional advantage and an additional disadvantage.
> The plan would be to allow branching into a bundle.
>
> The address would then be the address of the bundle, plus the
> number of the slot. Relative branches would also be difference
> in bundle number + number of the slot.
>
> Branching to a bundle which does not have the right decoding, or to
> the second half of a 40-bit instruction, would raise an exception.
>
> Because only 13 out of 16 possibilities are valid, it would be
> easy to make an all-zero first nibble an illegal bundle.
>
> Functions would probably start on a 128-bit boundary, there
> could also be a "jump far" instruction.
> > The disadvantage is that you waste space at the end of each block
> > preceding a branch target. This reduces your 10-12% space advantage.
> >
> > The advantage is that you need fewer bits for encoding branch
> > displacements, or alternatively can branch farther without resorting to
> > a register to hold the displacement.
> Let's look at branch on bit set for a 64-bit register.
>
> Six bit major opcode, six bit for bit selection, five bit source,
> leaves 23 bits for relative address, four of which would address
> the slot.
<
In My 66000, I used a pair of major OpCodes and used the LoB of the
major field as the HoB of the bit number, so the cost of the OpCode
field is only 5 bits.
So the count becomes
major = 5, select = 6, operand = 5, displacement = 16 (±1/8 MB).
Yours would have 4 more displacement bits, for ±2 MB.
<
BTW, Brian's compiler has yet to encounter a conditional branch needing more.
<
Direct Branches and Calls have 26-bit displacements (±1/8 GB)
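
The reach figures quoted above can be checked with a few lines of arithmetic. This is only a sketch, assuming the displacement is signed and counts 32-bit (4-byte) instruction words, which is my reading of how the ±1/8 MB figure comes about; `reach_bytes` is a hypothetical helper name.

```python
MB = 1 << 20

def reach_bytes(disp_bits, unit_bytes):
    """Approximate reach in one direction of a signed displacement counted in units."""
    return (1 << (disp_bits - 1)) * unit_bytes

print(reach_bytes(16, 4) / MB)   # 0.125  (the ±1/8 MB conditional branch)
print(reach_bytes(20, 4) / MB)   # 2.0    (four more bits: ±2 MB)
print(reach_bytes(26, 4) / MB)   # 128.0  (±1/8 GB for direct branches and calls)
```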

Re: Encoding 20 and 40 bit instructions in 128 bits

<ssuo1h$9rb$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23173&group=comp.arch#23173

by: Thomas Koenig - Thu, 27 Jan 2022 18:24 UTC

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
> On 1/27/2022 9:43 AM, Thomas Koenig wrote:
>> Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
>>> On 1/27/2022 3:48 AM, Thomas Koenig wrote:
>>>> I played around a little bit with the following encoding scheme:
>>>>
>>>> Instructions come in bundles of 128 bits. Each bundle has eight
>>>> bits encoding the format, followed by six slots of 20 bits each.
>>>>
>>>> There can be either a 40-bit instruction in two consecutive slots,
>>>> or a 20-bit instruction in a single slot. If there is nothing
>>>> useful to do for the last single slot, a NOP is inserted there.
>>>> This actually only gives 13 possibilities, which can be efficiently
>>>> encoded in four bits. The rest would be reserved.
>>>>
>>>> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
>>>> instructions, plus a few common loads and stores.
>>>>
>>>> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
>>>> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
>>>> 15 bits for the opcode, assuming 5 bits for register number.
>>>>
>>>> I have checked how this does for instruction density vs having the
>>>> same number of instructions in 32 bits, that instructions come
>>>> in randomly (the compiler is not reordering instructions to fit
>>>> a bundle) and that a 20-bit instruction and a 40-bit instruction
>>>> does the same work as a 32-bit instruction
>>>>
>>>> Assuming one gets 40% of 20-bit instructions in the stream, this
>>>> would lead to around 10-12% of total reduction in size. This is
>>>> not a very significant advantage, but at least it is no drawback.
>>>>
>>>> Comments?
>>>
>>> Presuming that you can't branch into the middle of a bundle there is an
>>> additional advantage and an additional disadvantage.
>>
>> The plan would be to allow branching into a bundle.
>
> OK. But that has its own costs. See below.
>
>
>> The address would then be the address of the bundle, plus the
>> number of the slot. Relative branches would also be difference
>> in bundle number + number of the slot.
>>
>> Branching to a bundle which does not have the right decoding, or to
>> the second half of a 40-bit instruction, would raise an exception.
>
>
> So you have to fetch the whole bundle into the CPU in order to check
> that, even if you aren't going to execute the instructions at the
> beginning of the bundle.

Not necessarily.

It can be seen from the first byte (half-byte, really) if the
branch is valid or not.
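
A sketch of how the format nibble alone could settle whether a branch target is legal. The enumeration confirms that six slots filled with 1-slot and 2-slot instructions give 13 layouts. The mapping of codes 1..13 to layouts (with code 0 left illegal) is purely my assumption for illustration, as are the names `layouts`, `FORMATS` and `valid_target`.

```python
SLOTS = 6

def layouts(n=SLOTS):
    """All ways to fill n slots with 1-slot (20-bit) and 2-slot (40-bit) instructions."""
    if n == 0:
        return [()]
    result = [(1,) + rest for rest in layouts(n - 1)]
    if n >= 2:
        result += [(2,) + rest for rest in layouts(n - 2)]
    return result

# Hypothetical code assignment: 0 stays illegal, 1..13 each name one layout.
FORMATS = {code: lay for code, lay in enumerate(layouts(), start=1)}

def valid_target(format_code, slot):
    """True if `slot` is the first slot of an instruction in this bundle format."""
    lay = FORMATS.get(format_code)
    if lay is None:              # reserved or illegal format code
        return False
    start = 0
    for width in lay:
        if start == slot:
            return True
        start += width
    return False

print(len(layouts()))            # 13
```

In hardware this would presumably be a small lookup table indexed by the nibble and the slot number rather than a loop.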

> It also slightly complicates getting the address of the next instruction
> to execute, since instead of a simple add, you have to keep track of
> when you are executing the last instruction of a bundle in order to not
> do the simple add, but add one to the bundle number and reset the
> instruction within bundle number to zero.

I would expect the decoder to expand the instructions upon loading,
the short instructions into long ones, or straight into micro-ops.

But yes, the handling of the PC logic would be a bit more complex.

>> Because only 13 out of 16 possibilities are valid, it would be
>> easy to make an all-zero first nibble an illegal bundle.
>
> While I agree, I don't see how that helps you when branching into the
> middle of a bundle.

More of a general remark about branching :-)

>>
>> Functions would probably start on a 128-bit boundary, there
>> could also be a "jump far" instruction.
>>
>>> The disadvantage is that you waste space at the end of each block
>>> preceding a branch target. This reduces your 10-12% space advantage.
>>>
>>> The advantage is that you need fewer bits for encoding branch
>>> displacements, or alternatively can branch farther without resorting to
>>> a register to hold the displacement.
>>
>> Let's look at branch on bit set for a 64-bit register.
>>
>> Six bit major opcode, six bit for bit selection, five bit source,
>> leaves 23 bits for relative address, four of which would address
>> the slot.
>
> Don't you want the relatively frequent small displacement branches to be
> in the 20 bit category?

Sure, but bits are scarce there :-) so some statistics about existing
programs would have to go into what conditions to check and what range
to cover. Possibly, only BEQ, BNE, BMI and BPL would make the cut.

Re: Encoding 20 and 40 bit instructions in 128 bits

<f03bba8f-4aac-423c-a62e-6c509ea9c751n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23174&group=comp.arch#23174

by: MitchAlsup - Thu, 27 Jan 2022 18:28 UTC

On Thursday, January 27, 2022 at 12:08:43 PM UTC-6, Stephen Fuld wrote:
> On 1/27/2022 9:43 AM, Thomas Koenig wrote:
> > Stephen Fuld <sf...@alumni.cmu.edu.invalid> wrote:
> >> On 1/27/2022 3:48 AM, Thomas Koenig wrote:
> >>> I played around a little bit with the following encoding scheme:
> >>>
> >>> Instructions come in bundles of 128 bits. Each bundle has eight
> >>> bits encoding the format, followed by six slots of 20 bits each.
> >>>
> >>> There can be either a 40-bit instruction in two consecutive slots,
> >>> or a 20-bit instruction in a single slot. If there is nothing
> >>> useful to do for the last single slot, a NOP is inserted there.
> >>> This actually only gives 13 possibilities, which can be efficiently
> >>> encoded in four bits. The rest would be reserved.
> >>>
> >>> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
> >>> instructions, plus a few common loads and stores.
> >>>
> >>> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
> >>> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
> >>> 15 bits for the opcode, assuming 5 bits for register number.
> >>>
> >>> I have checked how this does for instruction density vs having the
> >>> same number of instructions in 32 bits, that instructions come
> >>> in randomly (the compiler is not reordering instructions to fit
> >>> a bundle) and that a 20-bit instruction and a 40-bit instruction
> >>> does the same work as a 32-bit instruction
> >>>
> >>> Assuming one gets 40% of 20-bit instructions in the stream, this
> >>> would lead to around 10-12% of total reduction in size. This is
> >>> not a very significant advantage, but at least it is no drawback.
> >>>
> >>> Comments?
> >>
> >> Presuming that you can't branch into the middle of a bundle there is an
> >> additional advantage and an additional disadvantage.
> >
> > The plan would be to allow branching into a bundle.
> OK. But that has its own costs. See below.
> > The address would then be the address of the bundle, plus the
> > number of the slot. Relative branches would also be difference
> > in bundle number + number of the slot.
> >
> > Branching to a bundle which does not have the right decoding, or to
> > the second half of a 40-bit instruction, would raise an exception.
> So you have to fetch the whole bundle into the CPU in order to check
> that, even if you aren't going to execute the instructions at the
> beginning of the bundle.
<
I do not think fetching a block of 128 bits is a problem; that is probably
the MINIMUM size of an instruction cache fetch anyway (hint: I fetch
this wide in the smallest My 66000 implementation).
>
> It also slightly complicates getting the address of the next instruction
> to execute, since instead of a simple add, you have to keep track of
> when you are executing the last instruction of a bundle in order to not
> do the simple add, but add one to the bundle number and reset the
> instruction within bundle number to zero.
<
You keep track of where you are in a bundle in unary form not in
2s-complement form. When the shifter produces a bit outside of
the shift range, you fetch the next bundle. It might not be what a
SW person would do, but it is quite straightforward in HW.
<
But you COULD keep it in binary form (2s-c) and use a decoder that
spits out {8,28,48,68,88,108} as bit offsets into the bundle.
<
Believe me this is not a problem at all for HW.
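
A software sketch of both options, assuming "unary" here means a one-hot slot pointer (my reading) and that the 8 format bits come first in the bundle, which is what the offsets {8,28,48,68,88,108} imply. The names `bit_offset` and `advance` are hypothetical.

```python
SLOTS = 6
HEADER_BITS = 8
SLOT_BITS = 20

def bit_offset(slot):
    """Binary slot number -> bit offset of that slot inside the 128-bit bundle."""
    return HEADER_BITS + SLOT_BITS * slot

def advance(one_hot, slots_used):
    """Shift a one-hot slot pointer by 1 (20-bit) or 2 (40-bit) slots.

    Returns (new_pointer, fetch_next_bundle). When the set bit leaves the
    6-bit range, the pointer resets to slot 0 of the next bundle.
    """
    shifted = one_hot << slots_used
    if shifted >> SLOTS:
        return 1, True
    return shifted, False

print([bit_offset(s) for s in range(SLOTS)])   # [8, 28, 48, 68, 88, 108]
```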
<
> > Because only 13 out of 16 possibilities are valid, it would be
> > easy to make an all-zero first nibble an illegal bundle.
> While I agree, I don't see how that helps you when branching into the
> middle of a bundle.
<
If you have to fetch the bundle, looking at the format code there seems
completely plausible while fields are being decoded. So, I see this as
nothing more than a red herring.
> >
> > Functions would probably start on a 128-bit boundary, there
> > could also be a "jump far" instruction.
> >
> >> The disadvantage is that you waste space at the end of each block
> >> preceding a branch target. This reduces your 10-12% space advantage.
> >>
> >> The advantage is that you need fewer bits for encoding branch
> >> displacements, or alternatively can branch farther without resorting to
> >> a register to hold the displacement.
> >
> > Let's look at branch on bit set for a 64-bit register.
> >
> > Six bit major opcode, six bit for bit selection, five bit source,
> > leaves 23 bits for relative address, four of which would address
> > the slot.
> Don't you want the relatively frequent small displacement branches to be
> in the 20 bit category?
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Encoding 20 and 40 bit instructions in 128 bits

<cf272be7-c7c2-4213-9b8d-6c2c29e2837cn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23179&group=comp.arch#23179

by: MitchAlsup - Thu, 27 Jan 2022 21:29 UTC

On Thursday, January 27, 2022 at 5:48:56 AM UTC-6, Thomas Koenig wrote:
> I played around a little bit with the following encoding scheme:
>
> Instructions come in bundles of 128 bits. Each bundle has eight
> bits encoding the format, followed by six slots of 20 bits each.
>
> There can be either a 40-bit instruction in two consecutive slots,
> or a 20-bit instruction in a single slot. If there is nothing
> useful to do for the last single slot, a NOP is inserted there.
> This actually only gives 13 possibilities, which can be efficiently
> encoded in four bits. The rest would be reserved.
<
if you used six (6) of these bits, you could directly specify which of the
20-bit containers contained the first ½ of an instruction (or all of an
instruction if it is 20-bits.) This gets rid of needing a NoOp, and also
facilitates placing constants in the unused instruction slots, while
satisfying the check of only branching into a bundle at an instruction
boundary.
>
> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
> instructions, plus a few common loads and stores.
>
> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
> 15 bits for the opcode, assuming 5 bits for register number.
<
Separate or combined register files ?
>
> I have checked how this does for instruction density vs having the
> same number of instructions in 32 bits, that instructions come
> in randomly (the compiler is not reordering instructions to fit
> a bundle) and that a 20-bit instruction and a 40-bit instruction
> does the same work as a 32-bit instruction
>
> Assuming one gets 40% of 20-bit instructions in the stream, this
> would lead to around 10-12% of total reduction in size. This is
> not a very significant advantage, but at least it is no drawback.
<
So you have only broken even with std encoding. That 8-bit bundle
header is keeping you from gaining in the code density department.
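
For anyone who wants to rerun the density estimate, here is a rough sketch. It assumes in-order greedy packing with no compiler reordering, and that a 40-bit instruction facing a single free slot wastes that slot on a NOP. The fraction of 20-bit instructions is a parameter, and all names (`bundles_needed`, `bits_per_instruction`) are hypothetical.

```python
import random

HEADER_BITS, SLOT_BITS, SLOTS = 8, 20, 6
BUNDLE_BITS = HEADER_BITS + SLOT_BITS * SLOTS   # 128

def bundles_needed(widths):
    """Greedy in-order packing; each width is 1 (20-bit) or 2 (40-bit) slots."""
    bundles, free = 0, 0
    for w in widths:
        if w > free:       # no room left: any remaining slot becomes a NOP
            bundles += 1
            free = SLOTS
        free -= w
    return bundles

def bits_per_instruction(widths):
    """Average storage per instruction, including header and NOP padding."""
    return bundles_needed(widths) * BUNDLE_BITS / len(widths)

rng = random.Random(0)
stream = [1 if rng.random() < 0.4 else 2 for _ in range(100_000)]
# Compare the result against 32 bits per instruction for a fixed-width encoding.
print(bits_per_instruction(stream))
```

The two extremes are easy to check by hand: all 20-bit instructions give 128/6 bits each, all 40-bit instructions give 128/3.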
>
> Comments? From past remarks by Ivan, I have probably elaborated
> something close to a subset of what the Mill does, but I thought it
> an interesting exercise anyway.

Re: Encoding 20 and 40 bit instructions in 128 bits

<st194b$t78$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23195&group=comp.arch#23195

by: Thomas Koenig - Fri, 28 Jan 2022 17:28 UTC

MitchAlsup <MitchAlsup@aol.com> wrote:
> On Thursday, January 27, 2022 at 5:48:56 AM UTC-6, Thomas Koenig wrote:
>> I played around a little bit with the following encoding scheme:
>>
>> Instructions come in bundles of 128 bits. Each bundle has eight
>> bits encoding the format, followed by six slots of 20 bits each.
>>
>> There can be either a 40-bit instruction in two consecutive slots,
>> or a 20-bit instruction in a single slot. If there is nothing
>> useful to do for the last single slot, a NOP is inserted there.
>> This actually only gives 13 possibilities, which can be efficiently
>> encoded in four bits. The rest would be reserved.
><
> if you used six (6) of these bits, you could directly specify which of the
> 20-bit containers contained the first ½ of an instruction (or all of an
> instruction if it is 20-bits.)

You mean being able to split an instruction across two bundles?

> This gets rid of needing a NoOp, and also
> facilitates placing constants in the unused instruction slots, while
> satisfying the check of only branching into a bundle at an instruction
> boundary.

Yes.

>>
>> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
>> instructions, plus a few common loads and stores.
>>
>> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
>> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
>> 15 bits for the opcode, assuming 5 bits for register number.
><
> Separate or combined register files ?

I'm not that far yet :-)

>>
>> I have checked how this does for instruction density vs having the
>> same number of instructions in 32 bits, that instructions come
>> in randomly (the compiler is not reordering instructions to fit
>> a bundle) and that a 20-bit instruction and a 40-bit instruction
>> does the same work as a 32-bit instruction
>>
>> Assuming one gets 40% of 20-bit instructions in the stream, this
>> would lead to around 10-12% of total reduction in size. This is
>> not a very significant advantage, but at least it is no drawback.
><
> So you have only broken even with std encoding. That 8-bit bundle
> header is keeping you from gaining in the code density department.

Could be.

However, the main idea was not to improve instruction density per se,
but rather to allow more freedom in encoding by having 40-bit
instructions.

Re: Encoding 20 and 40 bit instructions in 128 bits

<st19v1$dmb$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23196&group=comp.arch#23196

by: Stephen Fuld - Fri, 28 Jan 2022 17:42 UTC

On 1/27/2022 10:28 AM, MitchAlsup wrote:
> On Thursday, January 27, 2022 at 12:08:43 PM UTC-6, Stephen Fuld wrote:
>> On 1/27/2022 9:43 AM, Thomas Koenig wrote:

snip

>>> The plan would be to allow branching into a bundle.
>> OK. But that has its own costs. See below.
>>> The address would then be the address of the bundle, plus the
>>> number of the slot. Relative branches would also be difference
>>> in bundle number + number of the slot.
>>>
>>> Branching to a bundle which does not have the right decoding, or to
>>> the second half of a 40-bit instruction, would raise an exception.
>> So you have to fetch the whole bundle into the CPU in order to check
>> that, even if you aren't going to execute the instructions at the
>> beginning of the bundle.
> <
> I do not think fetching a block of 128 bits is a problem--that is probably
> the MINIMUM size of an instruction cache fetch anyway (hint I fetch
> this wide in the smallest My 66000 implementation).
>>
>> It also slightly complicates getting the address of the next instruction
>> to execute, since instead of a simple add, you have to keep track of
>> when you are executing the last instruction of a bundle in order to not
>> do the simple add, but add one to the bundle number and reset the
>> instruction within bundle number to zero.
> <
> You keep track of where you are in a bundle in unary form not in
> 2s-complement form. When the shifter produces a bit outside of
> the shift range, you fetch the next bundle. It might not be what a
> SW person would do, but it is quite straightforward in HW.
> <
> But you COULD keep it in binary form (2s-c) and use a decoder that
> spits out {8,28,48,68,88,108} as bit offsets into the bundle.
> <
> Believe me this is not a problem at all for HW.

Interesting. Thanks.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Encoding 20 and 40 bit instructions in 128 bits

<31ea65b9-93d6-4f3e-8847-d0174e5b7fefn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23197&group=comp.arch#23197

by: MitchAlsup - Fri, 28 Jan 2022 18:49 UTC

On Friday, January 28, 2022 at 11:28:47 AM UTC-6, Thomas Koenig wrote:
> MitchAlsup <Mitch...@aol.com> wrote:
> > On Thursday, January 27, 2022 at 5:48:56 AM UTC-6, Thomas Koenig wrote:
> >> I played around a little bit with the following encoding scheme:
> >>
> >> Instructions come in bundles of 128 bits. Each bundle has eight
> >> bits encoding the format, followed by six slots of 20 bits each.
> >>
> >> There can be either a 40-bit instruction in two consecutive slots,
> >> or a 20-bit instruction in a single slot. If there is nothing
> >> useful to do for the last single slot, a NOP is inserted there.
> >> This actually only gives 13 possibilities, which can be efficiently
> >> encoded in four bits. The rest would be reserved.
> ><
> > if you used six (6) of these bits, you could directly specify which of the
> > 20-bit containers contained the first ½ of an instruction (or all of an
> > instruction if it is 20-bits.)
> You mean being able to split an instruction across two bundles?
<
No, I was thinking about how the parser identifies instructions. A set bit (1)
indicates it's an instruction; a clear bit (0) indicates either the second ½ of
an instruction or a 20-bit constant, depending on what the previous
bit contained. Then if the first ½ is not an instruction and the second
½ is also not an instruction, then you have a 40-bit constant, and so
on.
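
A small sketch of the six-bit marker idea, assuming bit i of the mask corresponds to slot i (my numbering, not stated in the post). It only covers the branch-target check; telling a second half apart from a constant needs information from the preceding instruction, which is left out here. `is_branch_target` and `classify` are hypothetical names.

```python
SLOTS = 6

def is_branch_target(start_mask, slot):
    """A slot is a legal branch target only if its marker bit is set."""
    return 0 <= slot < SLOTS and bool((start_mask >> slot) & 1)

def classify(start_mask):
    """Label each slot: 'start' of an instruction, or 'other'
    (second half of an instruction, or constant data)."""
    return ["start" if (start_mask >> s) & 1 else "other" for s in range(SLOTS)]

# Example: slots 0, 1 and 3 begin instructions; 2, 4 and 5 do not.
mask = 0b001011
print(classify(mask))
print(is_branch_target(mask, 2), is_branch_target(mask, 3))   # False True
```

Compared with the 4-bit format code, the check becomes a single bit select, at the cost of two more header bits.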
<
> > This gets rid of needing a NoOp, and also
> > facilitates placing constants in the unused instruction slots, while
> > satisfying the check of only branching into a bundle at an instruction
> > boundary.
> Yes.
> >>
> >> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
> >> instructions, plus a few common loads and stores.
> >>
> >> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
> >> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
> >> 15 bits for the opcode, assuming 5 bits for register number.
> ><
> > Separate or combined register files ?
> I'm not that far yet :-)
> >>
> >> I have checked how this does for instruction density vs having the
> >> same number of instructions in 32 bits, that instructions come
> >> in randomly (the compiler is not reordering instructions to fit
> >> a bundle) and that a 20-bit instruction and a 40-bit instruction
> >> does the same work as a 32-bit instruction
> >>
> >> Assuming one gets 40% of 20-bit instructions in the stream, this
> >> would lead to around 10-12% of total reduction in size. This is
> >> not a very significant advantage, but at least it is no drawback.
> ><
> > So you have only broken even with std encoding. That 8-bit bundle
> > header is keeping you from gaining in the code density department.
> Could be.
>
> However, the main idea was not to improve instruction density per se,
> but rather to allow more freedom in encoding by having 40-bit
> instructions.
<
Well, I am on record as saying the right size for encoding an instruction
is 36-bits......

Re: Encoding 20 and 40 bit instructions in 128 bits

<st1l7k$6fq$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=23200&group=comp.arch#23200

by: Thomas Koenig - Fri, 28 Jan 2022 20:55 UTC

MitchAlsup <MitchAlsup@aol.com> wrote:
> On Friday, January 28, 2022 at 11:28:47 AM UTC-6, Thomas Koenig wrote:
>> MitchAlsup <Mitch...@aol.com> wrote:
>> > On Thursday, January 27, 2022 at 5:48:56 AM UTC-6, Thomas Koenig wrote:
>> >> I played around a little bit with the following encoding scheme:
>> >>
>> >> Instructions come in bundles of 128 bits. Each bundle has eight
>> >> bits encoding the format, followed by six slots of 20 bits each.
>> >>
>> >> There can be either a 40-bit instruction in two consecutive slots,
>> >> or a 20-bit instruction in a single slot. If there is nothing
>> >> useful to do for the last single slot, a NOP is inserted there.
>> >> This actually only gives 13 possibilities, which can be efficiently
>> >> encoded in four bits. The rest would be reserved.
>> ><
>> > if you used six (6) of these bits, you could directly specify which of the
>> > 20-bit containers contained the first ½ of an instruction (or all of an
>> > instruction if it is 20-bits.)
>> You mean being able to split an instruction across two bundles?
><
> No, I was thinking about how the parser identifies instructions. set (1)
> indicates it's an instruction, clear (0) indicates either the second ½ of
> an instruction or a 20-bit constant, depending on what the previous
> bit contained. Then if the first ½ is not-an-instruction and the second
> ½ is also not an instruction, then you have a 40-bit constant, and so
> on.

Ah, I think I see what you mean (which took some time :-).

In effect, whether a slot is interpreted as a constant or as
part of the instruction is somewhat arbitrary. The six-bit encoding
would give maximum flexibility in that sense.

(Hm, come to think of it, assuming an instruction was always at
the start of a bundle, then five bits would be enough).
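
As a rough illustration of the start-bit idea, here is a minimal Python
sketch. The name group_slots and the representation of the mask as a
tuple of six 0/1 values are assumptions made for this example; the thread
does not fix a bit order. A set bit opens a new group, and every clear
bit that follows extends that group, whether it holds the second half of
a 40-bit instruction or a constant. The sketch also assumes that slot 0
always starts an instruction, which is the case where five bits suffice.

```python
from itertools import product

SLOTS_PER_BUNDLE = 6  # six 20-bit slots, as in the proposal


def group_slots(start_bits):
    """Group slots into (first_slot, slot_count) runs.

    start_bits[i] == 1 means slot i starts an instruction; 0 means the
    slot continues the previous group (second half or constant).
    Assumption: slot 0 must start an instruction.
    """
    if len(start_bits) != SLOTS_PER_BUNDLE:
        raise ValueError("expected one start bit per slot")
    groups = []
    for slot, is_start in enumerate(start_bits):
        if is_start:
            groups.append([slot, 1])
        elif groups:
            groups[-1][1] += 1
        else:
            raise ValueError("slot 0 must start an instruction in this sketch")
    return [tuple(g) for g in groups]


def masks_with_fixed_first_slot():
    """All masks in which slot 0 is a start: only five bits are free."""
    return [(1,) + rest for rest in product((0, 1), repeat=SLOTS_PER_BUNDLE - 1)]


def branch_targets(start_bits):
    """Slots that may be branched to: exactly the ones with a set bit."""
    return [slot for slot, is_start in enumerate(start_bits) if is_start]
```

For example, group_slots((1, 0, 1, 1, 0, 0)) returns
[(0, 2), (2, 1), (3, 3)]: a two-slot group, a single-slot instruction,
and an instruction followed by two slots of constant or second-half data.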

>> However, the main idea was not per se to save instruction density,
>> but rather to allow for more freedom in encoding by having 40-bit
>> instructions.
><
> Well, I am on record as saying the right size for encoding an instruction
> is 36-bits......

I remember ;-)

An alternative might be to do away with the header and split the bundle
into seven 18-bit chunks, which would leave two unused bits. The
boundaries between instructions would then be parsed individually by
the decoder, same as in any variable-length ISA, as long as every
instruction length is a multiple of the slot size. 18 bits should be
enough to encode many simple instructions for better code density.

As for the first bits (or the leftover bits in the header in the
20-bit scheme) - this could signal if the bundle contains valid
jump targets for a branch, or if a jump through a register to the
start or into this bundle is valid. Replace four bytes of endbr64
with a bit... not too bad :-)
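
To make the two layouts concrete, here is a small Python sketch that
splits a 128-bit bundle into a header and equal-sized slots. The function
name split_bundle, the choice to put the header (or the two leftover
bits) in the most significant positions, and the slot numbering from high
bits to low bits are all assumptions for illustration; nothing in the
thread pins these down.

```python
BUNDLE_BITS = 128


def split_bundle(bundle, header_bits, slot_bits, slot_count):
    """Return (header, slots) for a bundle held as a Python int.

    Assumed layout: header in the top bits, then slot 0, slot 1, ...
    towards the least significant end.
    """
    if header_bits + slot_bits * slot_count != BUNDLE_BITS:
        raise ValueError("layout does not fill 128 bits exactly")
    if not 0 <= bundle < (1 << BUNDLE_BITS):
        raise ValueError("bundle must fit in 128 bits")
    header = bundle >> (BUNDLE_BITS - header_bits)
    mask = (1 << slot_bits) - 1
    slots = [
        (bundle >> (slot_bits * (slot_count - 1 - i))) & mask
        for i in range(slot_count)
    ]
    return header, slots


# 8-bit header plus six 20-bit slots: 8 + 6 * 20 == 128
# 2 leftover bits plus seven 18-bit chunks: 2 + 7 * 18 == 128
```

With this, split_bundle(b, 8, 20, 6) handles the original scheme and
split_bundle(b, 2, 18, 7) the headerless variant, where the two-bit
"header" could carry something like the jump-target flags mentioned
above.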

Re: Encoding 20 and 40 bit instructions in 128 bits

<d025f8a5-3148-4efc-884f-f2cd8a09b14an@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=23203&group=comp.arch#23203


X-Received: by 2002:ac8:594b:: with SMTP id 11mr7580592qtz.463.1643406567268;
Fri, 28 Jan 2022 13:49:27 -0800 (PST)
X-Received: by 2002:a05:6808:1a0c:: with SMTP id bk12mr11172140oib.64.1643406567049;
Fri, 28 Jan 2022 13:49:27 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 28 Jan 2022 13:49:26 -0800 (PST)
In-Reply-To: <st1l7k$6fq$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:6542:406c:754e:48de;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:6542:406c:754e:48de
References: <ssu0r5$p2m$1@newsreader4.netcologne.de> <cf272be7-c7c2-4213-9b8d-6c2c29e2837cn@googlegroups.com>
<st194b$t78$1@newsreader4.netcologne.de> <31ea65b9-93d6-4f3e-8847-d0174e5b7fefn@googlegroups.com>
<st1l7k$6fq$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d025f8a5-3148-4efc-884f-f2cd8a09b14an@googlegroups.com>
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 28 Jan 2022 21:49:27 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 75

by: MitchAlsup - Fri, 28 Jan 2022 21:49 UTC

On Friday, January 28, 2022 at 2:55:19 PM UTC-6, Thomas Koenig wrote:
> MitchAlsup <Mitch...@aol.com> wrote:
> > On Friday, January 28, 2022 at 11:28:47 AM UTC-6, Thomas Koenig wrote:
> >> MitchAlsup <Mitch...@aol.com> wrote:
> >> > On Thursday, January 27, 2022 at 5:48:56 AM UTC-6, Thomas Koenig wrote:
> >> >> I played around a little bit with the following encoding scheme:
> >> >>
> >> >> Instructions come in bundles of 128 bits. Each bundle has eight
> >> >> bits encoding the format, followed by six slots of 20 bits each.
> >> >>
> >> >> There can be either a 40-bit instruction in two consecutive slots,
> >> >> or a 20-bit instruction in a single slot. If there is nothing
> >> >> useful to do for the last single slot, a NOP is inserted there.
> >> >> This actually only gives 13 possibilities, which can be efficiently
> >> >> encoded in four bits. The rest would be reserved.
> >> ><
> >> > if you used six (6) of these bits, you could directly specify which of the
> >> > 20-bit containers contained the first ½ of an instruction (or all of an
> >> > instruction if it is 20-bits.)
> >> You mean being able to split an instruction across two bundles?
> ><
> > No, I was thinking about how the parser identifies instructions. set (1)
> > indicates it's an instruction, clear (0) indicates either the second ½ of
> > an instruction or a 20-bit constant, depending on what the previous
> > bit contained. Then if the first ½ is not-an-instruction and the second
> > ½ is also not an instruction, then you have a 40-bit constant, and so
> > on.
> Ah, I think I see what you mean (which took some time :-).
>
> In effect, whether a slot is interpreted as a constant or as
> part of the instruction is somewhat arbitrary. The six-bit encoding
> would give maximum flexibility in that sense.
>
> (Hm, come to think of it, assuming an instruction was always at
> the start of a bundle, then five bits would be enough).
<
Unless you allow second ½s or constants to fall over a bundle boundary.
<
> >> However, the main idea was not per se to save instruction density,
> >> but rather to allow for more freedom in encoding by having 40-bit
> >> instructions.
> ><
> > Well, I am on record as saying the right size for encoding an instruction
> > is 36-bits......
<
> I remember ;-)
>
> An alternative might be to do away with the header and split the bundle
> into seven 18-bit chunks, which would leave two unused bits. The
> boundaries between instructions would then be parsed individually by
> the decoder, same as in any variable-length ISA, as long as every
> instruction length is a multiple of the slot size. 18 bits should be
> enough to encode many simple instructions for better code density.
>
> As for the first bits (or the leftover bits in the header in the
> 20-bit scheme) - this could signal if the bundle contains valid
> jump targets for a branch, or if a jump through a register to the
> start or into this bundle is valid. Replace four bytes of endbr64
> with a bit... not too bad :-)
<
Once you learn the decoder trick (several posts above by me) you can
jump into the middle of a bundle without issues.

Re: Encoding 20 and 40 bit instructions in 128 bits

<st32mr$qha$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=23208&group=comp.arch#23208


Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Sat, 29 Jan 2022 03:51:21 -0600
Organization: A noiseless patient Spider
Lines: 137
Message-ID: <st32mr$qha$1@dont-email.me>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
<cf272be7-c7c2-4213-9b8d-6c2c29e2837cn@googlegroups.com>
<st194b$t78$1@newsreader4.netcologne.de>
<31ea65b9-93d6-4f3e-8847-d0174e5b7fefn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 29 Jan 2022 09:51:23 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1d21dc40a1333712d24716d567c24071";
logging-data="27178"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1945DBxps0G0z2xVl0f6l+u"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:vRazAMauyYTKjgm78gd6Frod6VE=
In-Reply-To: <31ea65b9-93d6-4f3e-8847-d0174e5b7fefn@googlegroups.com>
Content-Language: en-US

by: BGB - Sat, 29 Jan 2022 09:51 UTC

On 1/28/2022 12:49 PM, MitchAlsup wrote:
> On Friday, January 28, 2022 at 11:28:47 AM UTC-6, Thomas Koenig wrote:
>> MitchAlsup <Mitch...@aol.com> wrote:
>>> On Thursday, January 27, 2022 at 5:48:56 AM UTC-6, Thomas Koenig wrote:
>>>> I played around a little bit with the following encoding scheme:
>>>>
>>>> Instructions come in bundles of 128 bits. Each bundle has eight
>>>> bits encoding the format, followed by six slots of 20 bits each.
>>>>
>>>> There can be either a 40-bit instruction in two consecutive slots,
>>>> or a 20-bit instruction in a single slot. If there is nothing
>>>> useful to do for the last single slot, a NOP is inserted there.
>>>> This actually only gives 13 possibilities, which can be efficiently
>>>> encoded in four bits. The rest would be reserved.
>>> <
>>> if you used six (6) of these bits, you could directly specify which of the
>>> 20-bit containers contained the first ½ of an instruction (or all of an
>>> instruction if it is 20-bits.)
>> You mean being able to split an instruction across two bundles?
> <
> No, I was thinking about how the parser identifies instructions. set (1)
> indicates it's an instruction, clear (0) indicates either the second ½ of
> an instruction or a 20-bit constant, depending on what the previous
> bit contained. Then if the first ½ is not-an-instruction and the second
> ½ is also not an instruction, then you have a 40-bit constant, and so
> on.
> <
>>> This gets rid of needing a NoOp, and also
>>> facilitates placing constants in the unused instruction slots, while
>>> satisfying the check of only branching into a bundle at an instruction
>>> boundary.
>> Yes.
>>>>
>>>> The 20-bit instructions would probably comprise most Ra = op(Ra,Rb)
>>>> instructions, plus a few common loads and stores.
>>>>
>>>> The 40-bit instructions have more than enough room for Ra = op(Rb,Rc),
>>>> even a five-register instruction (Ra,Rb) = op (Rc,Rd,Re) would leave
>>>> 15 bits for the opcode, assuming 5 bits for register number.
>>> <
>>> Separate or combined register files ?
>> I'm not that far yet :-)
>>>>
>>>> I have checked how this does for instruction density vs having the
>>>> same number of instructions in 32 bits, that instructions come
>>>> in randomly (the compiler is not reordering instructions to fit
>>>> a bundle) and that a 20-bit instruction and a 40-bit instruction
>>>> does the same work as a 32-bit instruction
>>>>
>>>> Assuming one gets 40% of 20-bit instructions in the stream, this
>>>> would lead to around 10-12% of total reduction in size. This is
>>>> not a very significant advantage, but at least it is no drawback.
>>> <
>>> So you have only broken even with std encoding. That 8-bit bundle
>>> header is keeping you from gaining in the code density department.
>> Could be.
>>
>> However, the main idea was not per se to save instruction density,
>> but rather to allow for more freedom in encoding by having 40-bit
>> instructions.
> <
> Well, I am on record as saying the right size for encoding an instruction
> is 36-bits......

Some of this brought up a random idea...

Idle thought here:
128-bit bundle, encoding between 1 and 3 instructions.

So, say, bundle format:
(127:120): Tag
(119: 80): Instruction C
( 79: 40): Instruction B
( 39: 0): Instruction A

Then, Sub-Forms
Form A-I:
(127:120): Tag
(119: 40): Immed (80b with 64b Immed)
( 39: 0): Instruction A
Form C-I:
(127:120): Tag
(119: 80): Immed (40b with 33b Immed)
( 79: 40): Instruction B
( 39: 0): Instruction A

Then, say, registers are 7 bit:
00..3F: R0..R63 (GPRs, each is 64 bits)
40..5F: C0..C31 (CR)
60..7F: V0..V31 (Virtual Registers)

Say, for example:
(39:28): Opcode
(27:24): ?
(23:21): Predicate
(20:14): Rn
(13: 7): Rs
( 6: 0): Rt

Or, if Immed:
(39:33): Opcode
(32: 0): Immed (33 bit)

The Immed instruction behaves like a NOP, but its value can be used by
other instructions in the bundle via virtual registers:
V0: ZR (Zero Register)
V1: IMM_B (Imm33 in B)
V2: IMM_C (Imm33 in C)
V3: IMM_BC (Imm64 in B and C)

Then, say, top-level Ops (6 bit):
00: NOP/Imm
01: LD/ST
02: ALU
...
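
A quick Python sketch of how the 40-bit register form and the 7-bit
register numbering above might be pulled apart. The names decode_op and
register_name, and the label "unassigned" for the (27:24) field marked
"?", are made up for this example; only the bit ranges and register
ranges come from the layout listed above.

```python
def decode_op(word):
    """Split a 40-bit register-form instruction into its fields."""
    if not 0 <= word < (1 << 40):
        raise ValueError("instruction word must fit in 40 bits")
    return {
        "opcode": (word >> 28) & 0xFFF,      # bits 39:28
        "unassigned": (word >> 24) & 0xF,    # bits 27:24, marked "?" above
        "predicate": (word >> 21) & 0x7,     # bits 23:21
        "rn": (word >> 14) & 0x7F,           # bits 20:14
        "rs": (word >> 7) & 0x7F,            # bits 13:7
        "rt": word & 0x7F,                   # bits 6:0
    }


def register_name(num):
    """Map a 7-bit register number to R0..R63, C0..C31 or V0..V31."""
    if not 0 <= num <= 0x7F:
        raise ValueError("register number must fit in 7 bits")
    if num < 0x40:
        return f"R{num}"
    if num < 0x60:
        return f"C{num - 0x40}"
    return f"V{num - 0x60}"
```

The field widths add up to 12 + 4 + 3 + 7 + 7 + 7 = 40 bits, so the
register form fills an instruction slot exactly.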

For extra fun, maybe we eliminate interlocks. If a pipeline stall is
needed, the stall is encoded explicitly ("Wait N cycles before
proceeding"), however, it doesn't actually stall the pipeline, but
instead stuffs NOP bundles into the pipeline for N cycles.

Maybe also branches have a delay slot, so the branch needs to be
initiated 1 or 2 cycles before it takes effect.

....

Pros:
Simplistic Decoder.

Cons:
Code density would be terrible.

Re: Encoding 20 and 40 bit instructions in 128 bits

<64261f26-e596-40d9-a6af-36ee1493f48fn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=23244&group=comp.arch#23244


X-Received: by 2002:a05:622a:192:: with SMTP id s18mr15256661qtw.43.1643643878275;
Mon, 31 Jan 2022 07:44:38 -0800 (PST)
X-Received: by 2002:a05:6808:1598:: with SMTP id t24mr17683433oiw.50.1643643878013;
Mon, 31 Jan 2022 07:44:38 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 31 Jan 2022 07:44:37 -0800 (PST)
In-Reply-To: <ssu0r5$p2m$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=64.26.97.60; posting-account=6JNn0QoAAAD-Scrkl0ClrfutZTkrOS9S
NNTP-Posting-Host: 64.26.97.60
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <64261f26-e596-40d9-a6af-36ee1493f48fn@googlegroups.com>
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
From: paaroncl...@gmail.com (Paul A. Clayton)
Injection-Date: Mon, 31 Jan 2022 15:44:38 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 188

by: Paul A. Clayton - Mon, 31 Jan 2022 15:44 UTC

On Thursday, January 27, 2022 at 6:48:56 AM UTC-5, Thomas Koenig wrote:
> I played around a little bit with the following encoding scheme:
> Instructions come in bundles of 128 bits. Each bundle has eight
> bits encoding the format, followed by six slots of 20 bits each.
>
> There can be either a 40-bit instruction in two consecutive slots,
> or a 20-bit instruction in a single slot. If there is nothing
> useful to do for the last single slot, a NOP is inserted there.
> This actually only gives 13 possibilities, which can be efficiently
> encoded in four bits. The rest would be reserved.
[snip]
> Comments?

I interpret "the main idea was not per se to save instruction
density, but rather to allow for more freedom in encoding by
having 40-bit instructions" to mean that 20-bit instructions
are provided to compensate for the lost code density from
40-bit instructions and the aligned 128-bit bundles are
intended to facilitate wide parsing of a stream of operations
into individual operations. These may not be the best design
goals for a general purpose high performance processor given
current implementation tradeoffs.

(If one just wants 40-bit instructions, one does not
necessarily have to fill power-of-two chunks of storage; IBM
produced a 40-bit fixed instruction size design for HPC
research that used word addressing for instructions.)

I suspect the latency cost of narrower (pre)decoding on
retrieval from an off-chip memory and the extra energy cost of
generating this decode metadata on last level cache misses and
storing the metadata would not be substantial costs. Off-chip
memory access is already so high latency and energy (and
usually reduced bandwidth) that extra decode costs may be
minor. (A denser in-memory format *might* reduce bandwidth
utilization and thus total energy)

One advantage of a bundle encoding constraint would be that a
cache block chunk of code could be predecoded outside of
control flow determination. (Speculative target generation
might be wrong if not kept coherent or if compressed — e.g.,
a partial tag BTB with offset target encoding — so that
entry points would have to be confirmed and predecoding redone
if wrong entry points were assumed. Parts of a cache block that
control flow has not yet passed through would also have unknown
entry points; such cache hits would have to be predecoded with
assumed entry points.)

However, that advantage does not require a trivially
interpreted (relatively sparse) encoding as seems to be implied
for the 8-bit template field. (Aside: this seems reminiscent of
Itanium with its 128-bit bundles and 5-bit template field,
though the template was used for instruction type/routing and
stop indicators with 41-bit instructions with a few cases
where two instruction fields could be merged.)
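
To put a number on how sparse the 8-bit field is, the following Python
sketch enumerates the ways six slots can be filled with 20-bit (one-slot)
and 40-bit (two-slot) instructions. The function name layouts is an
assumption for this example. The count comes out at 13, matching the
figure quoted above, so 4 bits are enough and 243 of the 256 header
values would be unused.

```python
def layouts(slots=6):
    """All ways to fill `slots` slots with 1-slot and 2-slot instructions.

    Each layout is a tuple of instruction lengths in slots, for example
    (2, 1, 1, 2) for a 40-bit, two 20-bit and another 40-bit instruction.
    """
    if slots == 0:
        return [()]
    result = [(1,) + rest for rest in layouts(slots - 1)]
    if slots >= 2:
        result += [(2,) + rest for rest in layouts(slots - 2)]
    return result


all_layouts = layouts(6)
count = len(all_layouts)                 # 13 layouts
bits_needed = (count - 1).bit_length()   # 4 bits to number them 0..12
unused_header_values = 2 ** 8 - count    # 243 spare codes in an 8-bit field
```

A denser header is therefore possible, at the cost of a small lookup
from layout number to slot boundaries instead of reading them off
directly.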

How much and how variably predecoding will expand an instruction
bundle seems a significant design consideration. If the storage
space for this extra metadata has other potential uses,
variability may not be as important (with proper consideration
of the impact of complexity on design effort and validation).

The expansion relative to data should also be considered. If
one has strong motivation for associating a comparable amount
of metadata with non-code memory in cache (with similar access
expectations), this might not be critical but this does
introduce potential binding of microarchitecture to architecture.
(If translation costs are modest for all reasonable expected
implementations, the binding might not be too tight.) A shared
instruction/data L1 cache can still be a reasonable design, so
storage compatibility is important. One would also need to
consider the tradeoffs of retranslation when data is to be used
as instructions.

I would also argue that with a bundle-based encoding the
internal parsing of components may not benefit as much from
fixed operation-based formatting. This is effectively just
extending the context-based interpretation of fields to the
entire bundle. A classic RISC might have a major opcode field
that determined whether additional fields were interpreted as
literal data, register names, or opcode extensions; a bundle-
based encoding might have a tree of interpretation bits,
perhaps with a trunk in the middle of the bundle and limbs
distributed in the bundle if such helps.

(One need not even force operations to be listed in execution
order. A traditional sequential encoding *feels* suboptimal,
but I do not know even if an encoding more suitable — for
efficient current and near future hardware implementation
executing general-purpose code — is possible.)

There might be advantages to bunching together dynamic
operand (register) names (which fields might be alternatively
interpreted as opcode extensions or immediate components,
implying that those roles would influence placement).

I suspect *operation* routing is more friendly to dynamic
generation than implied by VLIW/EPIC designs (especially with
in-order execution and uniform operand access), but operand
locality may be less friendly to dynamic generation. This
would seem to have implications for what information should be
included and where the information should be placed (and how
it should be encoded).

(One wacky idea I had was using bit lending/borrowing, where
unused bits in preceding instructions are placed in a buffer
(bank☺) from which following instructions can borrow. Overflow
of the buffer would just lose bits; underflow could provide
default bits. A similar method might apply to bundles where an
otherwise unusable parcel might be 'appended' to the following
bundle presumably with the constraint that following bundles
can be predecoded without reference to buffer contents. [The
buffer might also be saved and restored at call boundaries or
simply cleared on calls. Indirect jumps might clear the buffer.]
More significant immediate bits might fit this best, but
opcode extensions that do not concern operation routing might
be acceptable use. x86 uses prefixes to add bits to a 'following
instruction' and SPARC64 VIIIfx used a "Set eXtended Arithmetic
Register" instruction to add bits to the following two
instructions — an SIMD indicator bit and three bits to four
register fields for each instruction — but such are not
examples of elegant greenfield encoding practice but of
extending a legacy encoding.)
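
A toy Python model of the lending/borrowing buffer, only to show the
overflow and underflow behaviour described above. The class name BitBank,
the capacity, the first-in-first-out order, and the choice to drop the
newest bits on overflow are all assumptions; the description leaves
these open.

```python
from collections import deque


class BitBank:
    """Small buffer of spare bits lent by earlier instructions."""

    def __init__(self, capacity=8, default_bit=0):
        self.capacity = capacity
        self.default_bit = default_bit
        self.bits = deque()

    def lend(self, bits):
        """Deposit unused bits; bits that do not fit are simply lost."""
        for bit in bits:
            if len(self.bits) < self.capacity:
                self.bits.append(bit)

    def borrow(self, count):
        """Withdraw `count` bits; underflow yields the default bit."""
        return [
            self.bits.popleft() if self.bits else self.default_bit
            for _ in range(count)
        ]

    def clear(self):
        """Empty the buffer, e.g. on a call or an indirect jump."""
        self.bits.clear()
```

With a capacity of 4, lending [1, 1, 0, 1, 1] keeps only the first four
bits, and a following borrow(6) returns [1, 1, 0, 1, 0, 0], the last two
being defaults.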

With respect to instruction encoding, one might also want to
consider the impact of atomic operations. Being able to
atomically insert a debug exception instruction (break) or
function call or jump with a single store instruction/operation
can be useful.

Constraining function calls to aligned bundles may exploit
bundle-granular encoding to provide something like My 66000's
ENTER instruction with slightly greater code density than an
independent instruction encoding — possibly saving an entire
major opcode for all but one of the instructions, truly
massive savings☺.

(If one had distinct 'instruction space' stores, it might be
practical to store privileged call keys in jump instructions
with the key insertion checked at the time of the store. The
'lock' could be encoded at the target, allowing immediate-style
encoding of a call gate. The owner of the function might be
derived based on the function address encoded in the call
instruction. Key revocation might be painful. This is not
terribly dissimilar to Itanium's Enter Privileged Code, which
entered privileged mode if the EPC instruction was in a page
marked as allowing such, but provides little extra value (more
diversity of privilege) and seems to add even more
complications.)

In terms of bundle size, 256 bits (32 bytes) might be more
appropriate. Cache blocks are unlikely to be smaller than that
(though flash memory fetch chunks for microcontrollers might
be smaller?) and that seems to be the size that matters if
some predecoding is done at cache insertion.

Placing the least significant bits of branch/jump targets
(sufficient to index L1 cache) in a fixed location *might* be
useful, but a more variable encoding might be almost as easy
to decode and provide other advantages.

Density in memory, last level cache, and L1 cache, ease of
predecoding to fit the specific core (and possibly even the
mode of that core — a power-saving mode might have different
encoding tradeoffs, e.g.), ability to suit information
availability with criticality (immediates used in arithmetic
or to be stored will generally be less critical as would
functional unit control information), utility of information
for processing (e.g., can forwarding paths be reduced if the
encoding makes certain guarantees or is specifically biased),
and presumably other factors (such as compiler generation,
checking of compiler output, debuggability, and even
resilience to errors — MIPS' zero nop slide was a mistake)
would be considered in designing an instruction encoding.

The translation overhead across heterogeneous cores is another
consideration. A huge core can better afford some cache fill
translation overhead (and is likely to have a separate Icache)
and code primarily targeting a small core is likely to be less
performance critical. Architectural incompatibility (constrained
connection to accelerators or I/O devices is similar; latency
or bandwidth factors present another form of incompatibility
both for processor actions and communication between agents)
or even cost of accessing extra-core resources (including cache
migration overhead, on-chip network bandwidth use, and hop
count between communicating nodes) can be considerations with
respect to migration of processing.


Re: Encoding 20 and 40 bit instructions in 128 bits

<c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=23250&group=comp.arch#23250


X-Received: by 2002:a05:622a:153:: with SMTP id v19mr22684544qtw.323.1643794980955;
Wed, 02 Feb 2022 01:43:00 -0800 (PST)
X-Received: by 2002:a05:6808:1707:: with SMTP id bc7mr3977387oib.179.1643794976672;
Wed, 02 Feb 2022 01:42:56 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 2 Feb 2022 01:42:56 -0800 (PST)
In-Reply-To: <ssun38$imq$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:e422:8397:757e:6665;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:e422:8397:757e:6665
References: <ssu0r5$p2m$1@newsreader4.netcologne.de> <ssuf80$i60$1@dont-email.me>
<ssulkf$7n0$1@newsreader4.netcologne.de> <ssun38$imq$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com>
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Wed, 02 Feb 2022 09:43:00 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 19

by: Quadibloc - Wed, 2 Feb 2022 09:42 UTC

On Thursday, January 27, 2022 at 11:08:43 AM UTC-7, Stephen Fuld wrote:

> So you have to fetch the whole bundle into the CPU in order to check
> that, even if you aren't going to execute the instructions at the
> beginning of the bundle.
>
> It also slightly complicates getting the address of the next instruction
> to execute, since instead of a simple add, you have to keep track of
> when you are executing the last instruction of a bundle in order to not
> do the simple add, but add one to the bundle number and reset the
> instruction within bundle number to zero.

Because you have to fetch the whole bundle into the CPU to execute
any of the instructions in it, the _second_ objection goes away. After
you execute one instruction in the bundle, you execute the next
instruction in the bundle; _nothing_ has to be calculated, until you
get to the end of the bundle - and then the calculation is simple, 128 bits
past the beginning of the current bundle.

John Savard

Re: Encoding 20 and 40 bit instructions in 128 bits

<stec4m$kg0$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=23254&group=comp.arch#23254


Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
Date: Wed, 2 Feb 2022 08:39:50 -0800
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <stec4m$kg0$1@dont-email.me>
References: <ssu0r5$p2m$1@newsreader4.netcologne.de>
<ssuf80$i60$1@dont-email.me> <ssulkf$7n0$1@newsreader4.netcologne.de>
<ssun38$imq$1@dont-email.me>
<c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 2 Feb 2022 16:39:50 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="e4e3628219f143f9b44608cbec386121";
logging-data="20992"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/3/Y30O15UfJG3ZFZqtvKQRZSqqlJ9uHg="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.1
Cancel-Lock: sha1:3rQpMQZsQsn80hWV0G4mVPE3Og4=
In-Reply-To: <c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com>
Content-Language: en-US

by: Stephen Fuld - Wed, 2 Feb 2022 16:39 UTC

On 2/2/2022 1:42 AM, Quadibloc wrote:
> On Thursday, January 27, 2022 at 11:08:43 AM UTC-7, Stephen Fuld wrote:
>
>> So you have to fetch the whole bundle into the CPU in order to check
>> that, even if you aren't going to execute the instructions at the
>> beginning of the bundle.
>>
>> It also slightly complicates getting the address of the next instruction
>> to execute, since instead of a simple add, you have to keep track of
>> when you are executing the last instruction of a bundle in order to not
>> do the simple add, but add one to the bundle number and reset the
>> instruction within bundle number to zero.
>
> Because you have to fetch the whole bundle into the CPU to execute
> any of the instructions in it, the _second_ objection goes away. After
> you execute one instruction in the bundle, you execute the next
> instruction in the bundle; _nothing_ has to be calculated, until you
> get to the end of the bundle - and then the calculation is simple, 128 bits
> past the beginning of the current bundle.

Good observation. First let me acknowledge Mitch's observation that in
hardware, keeping track of the next instruction is trivial. And one
quibble; you may need the address of the next instruction in the bundle
to handle the case where the current instruction is a CALL-type instruction
that puts the address of the next instruction in a register.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Encoding 20 and 40 bit instructions in 128 bits

<de6ecdef-8e30-40aa-838f-df08d10389e7n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=23255&group=comp.arch#23255


X-Received: by 2002:a05:622a:1105:: with SMTP id e5mr23465216qty.190.1643821482654;
Wed, 02 Feb 2022 09:04:42 -0800 (PST)
X-Received: by 2002:a05:6830:2807:: with SMTP id w7mr10198439otu.94.1643821482412;
Wed, 02 Feb 2022 09:04:42 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 2 Feb 2022 09:04:42 -0800 (PST)
In-Reply-To: <stec4m$kg0$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:419a:8d43:7988:efe9;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:419a:8d43:7988:efe9
References: <ssu0r5$p2m$1@newsreader4.netcologne.de> <ssuf80$i60$1@dont-email.me>
<ssulkf$7n0$1@newsreader4.netcologne.de> <ssun38$imq$1@dont-email.me>
<c9f3b05a-cc34-4ec9-a7da-88c1fb31614dn@googlegroups.com> <stec4m$kg0$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <de6ecdef-8e30-40aa-838f-df08d10389e7n@googlegroups.com>
Subject: Re: Encoding 20 and 40 bit instructions in 128 bits
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Wed, 02 Feb 2022 17:04:42 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 14

by: Quadibloc - Wed, 2 Feb 2022 17:04 UTC

On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
> And one
> quibble; you may need the address of the next instruction in the bundle
> to handle the case of the current instruction is a CALL type instruction
> that puts the address of the next instruction in a register.

That is true; if you can branch into the middle of a bundle, then presumably
branches out with returns in the middle will be allowed. Here, though, I
think the issue will be encoding rather than calculation. If a 128-bit bundle
is divided into 20-bit packets, presumably packet n will be given an
address aligned on 16-bit boundaries which will be... associated with one
of the packets. The encoding scheme will presumably make those addresses
consecutive, to simplify the calculation that needs to be done.
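
One way to read the scheme described above, as a hedged sketch: each of the six packets is given one of the bundle's 16-bit-aligned addresses, and the six addresses are consecutive. Which six of the eight halfword addresses get used is not stated; the sketch assumes the first six, and the function names are hypothetical.

```python
BUNDLE_BYTES = 16   # one 128-bit bundle
PACKETS = 6         # six 20-bit packets per bundle


def packet_to_address(bundle_base, n):
    """16-bit-aligned address assigned to packet n (assumed: base + 2*n)."""
    if bundle_base % BUNDLE_BYTES:
        raise ValueError("bundle base must be 128-bit aligned")
    if not 0 <= n < PACKETS:
        raise ValueError("packet number out of range")
    return bundle_base + 2 * n


def address_to_packet(addr):
    """Inverse mapping; rejects addresses that name no packet."""
    bundle_base = addr - addr % BUNDLE_BYTES
    halfword, odd = divmod(addr - bundle_base, 2)
    if odd or halfword >= PACKETS:
        raise ValueError("address does not correspond to a packet")
    return bundle_base, halfword
```

Under this assumption the last two halfword addresses of every bundle (offsets 12 and 14) name no packet, so the address after packet 5 is the base of the next bundle.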

John Savard

Re: Encoding 20 and 40 bit instructions in 128 bits

<c792bdc9-c884-4d34-b433-63ee3ee2be3bn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23256&group=comp.arch#23256

by: MitchAlsup - Wed, 2 Feb 2022 17:29 UTC

On Wednesday, February 2, 2022 at 10:39:54 AM UTC-6, Stephen Fuld wrote:
> On 2/2/2022 1:42 AM, Quadibloc wrote:
> > On Thursday, January 27, 2022 at 11:08:43 AM UTC-7, Stephen Fuld wrote:
> >
> >> So you have to fetch the whole bundle into the CPU in order to check
> >> that, even if you aren't going to execute the instructions at the
> >> beginning of the bundle.
> >>
> >> It also slightly complicates getting the address of the next instruction
> >> to execute, since instead of a simple add, you have to keep track of
> >> when you are executing the last instruction of a bundle in order to not
> >> do the simple add, but add one to the bundle number and reset the
> >> instruction within bundle number to zero.
> >
> > Because you have to fetch the whole bundle into the CPU to execute
> > any of the instructions in it, the _second_ objection goes away. After
> > you execute one instruction in the bundle, you execute the next
> > instruction in the bundle; _nothing_ has to be calculated, until you
> > get to the end of the bundle - and then the calculation is simple, 128 bits
> > past the beginning of the current bundle.
<
> Good observation. First let me acknowledge Mitch's observation that in
> hardware, keeping track of the next instruction is trivial. And one
> quibble; you may need the address of the next instruction in the bundle
> to handle the case of the current instruction is a CALL type instruction
> that puts the address of the next instruction in a register.
<
Given only 6 potential locations for instructions in a container, the HOBs
of the address are the address of the return point container, and the LOBs
are the instruction within the container with range [0..5]. Let HW do all of
the decoding work.
<
If control is transferred to a point that is not an instruction starting point
then raise CONTROL-TRANSFER-ERROR.
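
A rough software model of the check described above, offered as a sketch only. It assumes the slot number occupies the three low-order bits of the IP and that the set of slots where an instruction starts has already been decoded from the container header; the header format itself is not given here, so `start_slots` is a stand-in, and the names are hypothetical.

```python
class ControlTransferError(Exception):
    """Stands in for the CONTROL-TRANSFER-ERROR exception."""


SLOT_FIELD_BITS = 3   # assumption: three LOBs hold the slot number
SLOTS = 6             # valid slot numbers are 0..5


def split_ip(ip):
    """Split an IP into (container number, slot number)."""
    return ip >> SLOT_FIELD_BITS, ip & ((1 << SLOT_FIELD_BITS) - 1)


def check_target(ip, start_slots):
    """Validate a control-transfer target.

    start_slots: slots in the target container where an instruction
    begins (assumed to come from decoding the container header).
    """
    container, slot = split_ip(ip)
    if slot >= SLOTS or slot not in start_slots:
        raise ControlTransferError(f"bad target: container {container}, slot {slot}")
    return container, slot
```

For example, if a container holds a 40-bit instruction in slots 1 and 2, then slot 2 is not a starting point and a transfer to it raises the error.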
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Encoding 20 and 40 bit instructions in 128 bits

<2fd5e668-fbe5-4399-bf74-f5e509d669ebn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23257&group=comp.arch#23257

by: MitchAlsup - Wed, 2 Feb 2022 17:32 UTC

On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
> On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
> > And one
> > quibble; you may need the address of the next instruction in the bundle
> > to handle the case of the current instruction is a CALL type instruction
> > that puts the address of the next instruction in a register.
<
> That is true; if you can branch into the middle of a bundle, then presumably
> branches out with returns in the middle will be allowed. Here, though, I
> think the issue will be encoding rather than calculation. If a 128-bit bundle
> is divided into 20-bit packets, presumably packet n will be given an
> address aligned on 16-bit boundaries which will be...
<
I am surprised at you--you are making the mistake that control flow addressing
is identical to memory addressing--it is not (in this case). Instructions are located
on 20-bit boundaries, and whether or not an instruction starts in this container
is controlled by the header to the bundle. HW has no problem decoding to 20-bit
boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
(instead of [0..7])!!
<
> associated with one
> of the packets. The encoding scheme will presumably make those addresses
> consecutive, to simplify the calculation that needs to be done.
>
> John Savard
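
To illustrate the distinction drawn above between control-flow addressing and memory addressing, here is a small sketch that pulls 20-bit slots out of a 128-bit container by slot number. The bit layout is purely an assumption for the example (an 8-bit header in the low bits, slot k in the next 20 bits up); the thread does not fix where the header sits, and the function names are hypothetical.

```python
HEADER_BITS = 8    # assumption: header occupies the low 8 bits
SLOT_BITS = 20
SLOTS = 6          # 8 + 6 * 20 = 128


def pack_bundle(header, slots):
    """Build a 128-bit container (as an int) from a header and six slots."""
    if len(slots) != SLOTS:
        raise ValueError("need exactly six slots")
    bundle = header & ((1 << HEADER_BITS) - 1)
    for k, value in enumerate(slots):
        bundle |= (value & ((1 << SLOT_BITS) - 1)) << (HEADER_BITS + SLOT_BITS * k)
    return bundle


def extract_slot(bundle, slot):
    """Return the 20-bit field for slot number 0..5."""
    if not 0 <= bundle < (1 << 128):
        raise ValueError("bundle must fit in 128 bits")
    if not 0 <= slot < SLOTS:
        raise ValueError("slot number must be in 0..5")
    shift = HEADER_BITS + SLOT_BITS * slot
    return (bundle >> shift) & ((1 << SLOT_BITS) - 1)
```

In this layout slot 1 begins at bit 28 of the container, which is not a byte boundary; the slot number in the IP selects a field and is never used as a byte offset.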

Re: Encoding 20 and 40 bit instructions in 128 bits

<4fca5742-1815-4b31-8ea9-2da1592f3456n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23258&group=comp.arch#23258

by: JimBrakefield - Wed, 2 Feb 2022 21:39 UTC

On Wednesday, February 2, 2022 at 11:32:49 AM UTC-6, MitchAlsup wrote:
> On Wednesday, February 2, 2022 at 11:04:44 AM UTC-6, Quadibloc wrote:
> > On Wednesday, February 2, 2022 at 9:39:54 AM UTC-7, Stephen Fuld wrote:
> > > And one
> > > quibble; you may need the address of the next instruction in the bundle
> > > to handle the case of the current instruction is a CALL type instruction
> > > that puts the address of the next instruction in a register.
> <
> > That is true; if you can branch into the middle of a bundle, then presumably
> > branches out with returns in the middle will be allowed. Here, though, I
> > think the issue will be encoding rather than calculation. If a 128-bit bundle
> > is divided into 20-bit packets, presumably packet n will be given an
> > address aligned on 16-bit boundaries which will be...
> <
> I am surprised at you--you are making the mistake that control flow addressing
> is identical to memory addressing--it is not (in this case). Instructions are located
> on 20-bit boundaries, and whether or not an instruction starts in this container
> is controlled by the header to the bundle. HW has no problem decoding to 20-bit
> boundaries, so the only constraint is that the range in the LOBs of the IP is [0..5]
> (instead of [0..7])!!
> <
> > associated with one
> > of the packets. The encoding scheme will presumably make those addresses
> > consecutive, to simplify the calculation that needs to be done.
> >
> > John Savard

Some comments:
A) In the modern uP arch, instruction flow is separated from data flow. Thus there is negligible hardware cost in having instruction granularity different from data granularity.
B) Why use the remaining bits to re-sync the instruction addresses? E.g. why not have 21 bit instruction granularity instead of 20 bit? Re-sync to always occur at the start of 64 or 128 or 256-bit instruction block?
C) Opens a new arena for uP architecture, e.g. instruction formats not tied to data granularity.
D) Need a name for the concept. Maybe there is already an ISA of this nature (other than stack machines with very short instructions)?
E) Need yet another name for the concept of using some of the remaining instruction block bits for (usually) decode purposes?
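
As a quick check on the arithmetic behind point B, the sketch below computes how many instruction slots fit in a block and how many bits are left over, for the block sizes and granularities mentioned. The function name is made up for the example.

```python
def slots_and_spare(block_bits, granule_bits):
    """Return (number of whole slots, leftover bits) for one block."""
    return divmod(block_bits, granule_bits)


if __name__ == "__main__":
    for block in (64, 128, 256):
        for granule in (20, 21):
            slots, spare = slots_and_spare(block, granule)
            print(f"{block}-bit block, {granule}-bit slots: {slots} slots, {spare} spare bits")
```

A 128-bit block holds six slots either way: 20-bit slots leave 8 bits spare, while 21-bit slots leave 2.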

Re: Encoding 20 and 40 bit instructions in 128 bits

<9a3958f2-0b31-47a2-9403-fcaada993190n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23259&group=comp.arch#23259

by: MitchAlsup - Wed, 2 Feb 2022 22:17 UTC

On Wednesday, February 2, 2022 at 3:39:15 PM UTC-6, JimBrakefield wrote:

> Some comments:
> A) In the modern uP arch, instruction flow is separated from data flow. Thus there is negligible hardware cost in having instruction granularity different from data granularity.
<
Indeed--clever observation
<
> B) Why use the remaining bits to re-sync the instruction addresses? E.g. why not have 21 bit instruction granularity instead of 20 bit? Re-sync to always occur at the start of 64 or 128 or 256-bit instruction block?
<
Two (2) bits are not used: 21×6 = 126.
<
> C) Opens a new arena for uP architecture, e.g. instruction formats not tied to data granularity.
<
Indeed
<
> D) Need a name for the concept. Maybe there is already an ISA of this nature (other than stack machines with very short instructions)?
<
Mill instruction formats and layout are similar, but not based on fixed sized containers and are also model specific.
<
> E) Need yet another name for the concept of using some of the remaining instruction block bits for (usually) decode purposes?
<
Instruction Containerization! (And I give permission for you to steal this name.)
