devel / comp.arch / Variable branch length encoding?

Would it be possible or just a bad idea to allow some form of
variable/fp encoding of branch offsets?

I.e. all small (byte) offsets would be encoded directly, but for longer
ones you could require the target to be word/dword/qword aligned, and
therefore allow a much greater range.

It would probably only be useful for inter-module calls/branches...

The main problem is probably that the target address adder is already in
the critical path, so you can't just add a little shifter there as well.

(Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this,
right?)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Variable branch length encoding?

by: Quadibloc - Wed, 1 Mar 2023 12:17 UTC

On Wednesday, March 1, 2023 at 1:14:22 AM UTC-7, Terje Mathisen wrote:
> Would it be possible or just a bad idea to allow some form of
> variable/fp encoding of branch offsets?
>
> I.e. all small (byte) offsets would be encoded directly, but for longer
> ones you could require the target to be word/dword/qword aligned, and
> therefore allow a much greater range.

Some instruction sets _do_ include two types of branch instructions.

One has a single-byte relative offset, and the other uses the normal
form of memory addressing; usually, the instructions are 16 bits and
32 bits in length respectively.

The alignment restriction is the same in both cases, that of an
instruction. So the one-byte relative offset would be a multiple of
16 bits, for example, while the full-length offset would be in bytes,
but an odd value would be invalid (wasting a bit).

John Savard

Terje Mathisen wrote:
> Would it be possible or just a bad idea to allow some form of
> variable/fp encoding of branch offsets?
>
> I.e. all small (byte) offsets would be encoded directly, but for longer
> ones you could require the target to be word/dword/qword aligned, and
> therefore allow a much greater range.
>
> It would probably only be useful for inter-module calls/branches...
>
> The main problem is probably that the target address adder is already in
> the critical path, so you can't just add a little shifter there as well.
>
> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this,
> right?)
>
> Terje

As with any variable length instruction, extracting its size
in the parse or decode stage is critical for concurrent decode.
So ideally this offset *won't* be an "fp-like" data item with an
"exponent" field appended to the opcode, but the instruction size
in words is in the first byte and the internal instruction format
is sorted out by decode.

RIP-relative offsets are often relative to the incremented RIP
and include the variable instruction's length. The alignment
is for both the 'from' and 'to' addresses and aligning just the
destination to 32 or 64 bits won't allow encoding a larger range.
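As a minimal sketch of the "relative to the incremented RIP" convention, assuming an x86-style rule in which the displacement is added to the address of the following instruction (the function and parameter names here are hypothetical):

```python
def relative_target(insn_addr: int, insn_len: int, disp: int) -> int:
    """Target of a branch whose displacement is relative to the next instruction."""
    next_ip = insn_addr + insn_len  # the "incremented RIP"
    return next_ip + disp


# A 2-byte branch with displacement -2 lands back on itself.
self_loop = relative_target(0x1000, 2, -2)
```

Because the instruction's own length is folded into the base, the encoder has to know the final instruction size before it can compute the displacement.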

The issue I had is when instructions are not byte aligned,
in my case 16-bit words. For branches the immediate field is a
relative _word_ count to get one extra bit of offset range.
But what to do when the immediate field is a 64-bit value:
should that still be a word offset count,
or should it switch to being a byte offset count?
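The trade-off between the two conventions can be sketched as follows, assuming 16-bit instruction words and a signed immediate; the helper names and the choice of base address are illustrative assumptions, not part of any particular ISA:

```python
WORD_BYTES = 2  # assumption: instructions are aligned to 16-bit words


def branch_target(base: int, imm: int, imm_is_word_count: bool = True) -> int:
    """Add a signed immediate to the base, scaled if it counts words."""
    scale = WORD_BYTES if imm_is_word_count else 1
    return base + imm * scale


def forward_reach_bytes(imm_bits: int, scale: int) -> int:
    """Largest forward displacement, in bytes, of a signed imm_bits field."""
    return ((1 << (imm_bits - 1)) - 1) * scale
```

With a 16-bit field the word count reaches 65534 bytes forward against 32767 for a byte count. With a 64-bit field the extra bit buys nothing in practice, which is why the question of switching to a byte count arises.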

For LD and ST the immediate offset is always a byte offset,
even if it is RIP-relative addressing.

In my naming convention, Branch is to relative offset,
Jump is to an absolute address.

I have a "BR reg" with the offset in a register,
intended for use in position-independent SWITCH statements.
A scaled-indexed LD of the offset into a register, then BR reg.
But should that register offset be a byte count or a word count?
Or should I have two "BR reg" instructions, one for each?

Re: Variable branch length encoding?

by: Stephen Fuld - Wed, 1 Mar 2023 17:49 UTC

On 3/1/2023 8:06 AM, EricP wrote:
> Terje Mathisen wrote:
>> Would it be possible or just a bad idea to allow some form of
>> variable/fp encoding of branch offsets?
>>
>> I.e. all small (byte) offsets would be encoded directly, but for
>> longer ones you could require the target to be word/dword/qword
>> aligned, and therefore allow a much greater range.
>>
>> It would probably only be useful for inter-module calls/branches...
>>
>> The main problem is probably that the target address adder is already
>> in the critical path, so you can't just add a little shifter there as
>> well.
>>
>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
>> this, right?)
>>
>> Terje
>
> As with any variable length instruction, extracting its size
> in the parse or decode stage is critical for concurrent decode.
> So ideally this offset *won't* be an "fp-like" data item with an
> "exponent" field appended to the opcode, but the instruction size
> in words is in the first byte and the internal instruction format
> is sorted out by decode.
>
> RIP-relative offsets are often relative to the incremented RIP
> and include the variable instruction's length. The alignment
> is for both the 'from' and 'to' addresses and aligning just the
> destination to 32 or 64 bits won't allow encoding a larger range.
>
> The issue I had is when instructions are not byte aligned,
> in my case 16-bit words. For branches the immediate field is a
> relative _word_ count to get one extra bit of offset range.
> But what to do when the immediate field is a 64-bit value:
> should that still be a word offset count,
> or should it switch to being a byte offset count?
>
> For LD and ST the immediate offset is always a byte offset,
> even if it is RIP-relative addressing.
>
> In my naming convention, Branch is to relative offset,
> Jump is to an absolute address.
>
> I have a "BR reg" with the offset in a register,
> intended for use in position-independent SWITCH statements.
> A scaled-indexed LD of the offset onto a register, then BR reg.
> But should that register offset be a byte count or a word count?
> Or should I have two "BR reg" instructions, one for each?

I, perhaps mistakenly, interpreted Terje's suggestion differently. I
don't think he was suggesting variable length instructions, but an
orthogonal idea. Let's take an example. Suppose you have 10 bits
available for a branch offset. If the high order bit is zero, the low
order nine bits give a byte offset. But if the high order bit is one,
you get the offset by appending, say, 3 zero bits to the low order 9
bits, giving a possible displacement of 12 bits, but restricted to an
8-byte aligned address.
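
A minimal sketch of that 10-bit example, treating the displacement as unsigned for simplicity (the post does not say how the sign would be handled, and `decode_offset` is a hypothetical name):

```python
def decode_offset(field: int) -> int:
    """Decode a 10-bit branch field into a byte displacement."""
    assert 0 <= field < (1 << 10)
    low9 = field & 0x1FF
    if field & 0x200:      # high-order bit set: scaled form
        return low9 << 3   # append three zero bits, giving a multiple of 8
    return low9            # high-order bit clear: plain byte displacement
```

The scaled form tops out at 511 * 8 = 4088, which fits in 12 bits, while the unscaled form still reaches every byte within the first 511.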

Of course, this idea isn't limited to byte-aligned instructions,
though its benefit diminishes as you increase the instruction alignment
requirements.

Whether this is a good idea or not, I don't know.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Stephen Fuld wrote:
> On 3/1/2023 8:06 AM, EricP wrote:
>> Terje Mathisen wrote:
>>> Would it be possible or just a bad idea to allow some form of
>>> variable/fp encoding of branch offsets?
>>>
>>> I.e. all small (byte) offsets would be encoded directly, but for
>>> longer ones you could require the target to be word/dword/qword
>>> aligned, and therefore allow a much greater range.
>>>
>>> It would probably only be useful for inter-module calls/branches...
>>>
>>> The main problem is probably that the target address adder is already
>>> in the critical path, so you can't just add a little shifter there as
>>> well.
>>>
>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
>>> this, right?)
>>>
>>> Terje
>>
>> As with any variable length instruction, extracting its size
>> in the parse or decode stage is critical for concurrent decode.
>> So ideally this offset *won't* be an "fp-like" data item with an
>> "exponent" field appended to the opcode, but the instruction size
>> in words is in the first byte and the internal instruction format
>> is sorted out by decode.
>>
>> RIP-relative offsets are often relative to the incremented RIP
>> and include the variable instruction's length. The alignment
>> is for both the 'from' and 'to' addresses and aligning just the
>> destination to 32 or 64 bits won't allow encoding a larger range.
>>
>> The issue I had is when instructions are not byte aligned,
>> in my case 16-bit words. For branches the immediate field is a
>> relative _word_ count to get one extra bit of offset range.
>> But what to do when the immediate field is a 64-bit value:
>> should that still be a word offset count,
>> or should it switch to being a byte offset count?
>>
>> For LD and ST the immediate offset is always a byte offset,
>> even if it is RIP-relative addressing.
>>
>> In my naming convention, Branch is to relative offset,
>> Jump is to an absolute address.
>>
>> I have a "BR reg" with the offset in a register,
>> intended for use in position-independent SWITCH statements.
>> A scaled-indexed LD of the offset onto a register, then BR reg.
>> But should that register offset be a byte count or a word count?
>> Or should I have two "BR reg" instructions, one for each?
>
> I, perhaps mistakenly, interpreted Terje's suggestion differently. I
> don't think he was suggesting variable length instructions, but an
> orthogonal idea. Let's take an example. Suppose you have 10 bits
> available for a branch offset. If the high order bit is zero, the low
> order nine bits give a byte address. But if the high order bit is one,
> you get the address by appending say 3 zero bits to the low order 9
> bits, giving a possible displacement of 12 bits, but restricted to an 8
> byte aligned address.
>
> Of course, this idea isn't applicable to just byte aligned instructions,
> though its benefit diminishes as you increase the instruction alignment
> requirements.
>
> Whether this is a good idea or not, I don't know.

Ok, I understand. So the immediate field is a fixed size: it has
scale bits indicating whether it counts int8, int16, or int32 units,
plus a signed offset count in units of that size.

The scale would be that of the smallest alignment of the 'from' and 'to'
instructions, which is scaled up by inserting alignment NOPs before both.

I suppose it depends on the size of the immediate field as to how
often it would have to insert NOPs so that it can use a larger scale
to bring a destination back into range.

Re: Variable branch length encoding?

by: MitchAlsup - Wed, 1 Mar 2023 18:33 UTC

On Wednesday, March 1, 2023 at 2:14:22 AM UTC-6, Terje Mathisen wrote:
> Would it be possible or just a bad idea to allow some form of
> variable/fp encoding of branch offsets?
>
> I.e. all small (byte) offsets would be encoded directly, but for longer
> ones you could require the target to be word/dword/qword aligned, and
> therefore allow a much greater range.
>
> It would probably only be useful for inter-module calls/branches...
<
It is generally OK to place entry points on cache line boundaries (about 64-byte
alignment), so you get 6 low-order bits (of which word-instruction machines
already use the lowest 2 bits): a gain of 4 bits.
<
It is generally not OK to require cache line alignment on labels within a
subroutine.
<
It is generally a good idea that BR and CALL use the same multiplexer
and address generator. So my first paragraph above (aligned entry points
for CALL) is mainly in conflict with the leading sentence of this paragraph.
>
> The main problem is probably that the target address adder is already in
> the critical path, so you can't just add a little shifter there as well.
>
> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this,
> right?)
<
In a pipelined microarchitecture, the adder calculating branch targets is
seldom the adder performing memory address generation.
>
> Terje
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

Re: Variable branch length encoding?

by: MitchAlsup - Wed, 1 Mar 2023 18:44 UTC

On Wednesday, March 1, 2023 at 10:06:45 AM UTC-6, EricP wrote:
> Terje Mathisen wrote:
> > Would it be possible or just a bad idea to allow some form of
> > variable/fp encoding of branch offsets?
> >
> > I.e. all small (byte) offsets would be encoded directly, but for longer
> > ones you could require the target to be word/dword/qword aligned, and
> > therefore allow a much greater range.
> >
> > It would probably only be useful for inter-module calls/branches...
> >
> > The main problem is probably that the target address adder is already in
> > the critical path, so you can't just add a little shifter there as well.
> >
> > (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this,
> > right?)
> >
> > Terje
> As with any variable length instruction, extracting its size
> in the parse or decode stage is critical for concurrent decode.
> So ideally this offset *won't* be an "fp-like" data item with an
> "exponent" field appended to the opcode, but the instruction size
> in words is in the first byte and the internal instruction format
> is sorted out by decode.
>
> RIP-relative offsets are often relative to the incremented RIP
> and include the variable instruction's length. The alignment
> is for both the 'from' and 'to' addresses and aligning just the
> destination to 32 or 64 bits won't allow encoding a larger range.
>
> The issue I had is when instructions are not byte aligned,
> in my case 16-bit words. For branches the immediate field is a
> relative _word_ count to get one extra bit of offset range.
> But what to do when the immediate field is a 64-bit value:
> should that still be a word offset count,
> or should it switch to being a byte offset count?
<
Item 1
>
> For LD and ST the immediate offset is always a byte offset,
> even if it is RIP-relative addressing.
>
> In my naming convention, Branch is to relative offset,
> Jump is to an absolute address.
>
> I have a "BR reg" with the offset in a register,
> intended for use in position-independent SWITCH statements.
> A scaled-indexed LD of the offset onto a register, then BR reg.
> But should that register offset be a byte count or a word count?
> Or should I have two "BR reg" instructions, one for each?
<
I agree that Branches should be to displacements (IP = IP+DISP)
I agree that JUMPs should be to absolute locations (IP = value)
<
The timing in the pipeline where reg becomes available is in conflict
with the time when the next fetch address is needed. Reading and
forwarding take significantly longer than IP+DISP calculation. So,
this typically adds 1 cycle of delay to BR reg forms.
<
Item 1::
<
My BR and CALL instructions have a 2^28-byte range from a 26-bit
displacement.
But when I switch to the forms with constants, I switch to
absolute addressing.
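
The arithmetic behind the 26-bit figure can be sketched as below, assuming the displacement is signed and counts 32-bit instruction words (an inference from the numbers, not something the post states); the function names are hypothetical:

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def br_target(ip: int, disp26: int) -> int:
    """IP = IP + DISP, with DISP a signed 26-bit word count (assumed)."""
    return ip + (sign_extend(disp26, 26) << 2)


# Total span in bytes between the farthest backward and forward targets.
span = br_target(0, 0x1FFFFFF) - br_target(0, 0x2000000) + 4
```

Under those assumptions the reachable window is 2^28 bytes wide, split evenly on either side of the branch.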

Re: Variable branch length encoding?

by: Stephen Fuld - Wed, 1 Mar 2023 18:45 UTC

On 3/1/2023 10:28 AM, EricP wrote:
> Stephen Fuld wrote:
>> On 3/1/2023 8:06 AM, EricP wrote:
>>> Terje Mathisen wrote:
>>>> Would it be possible or just a bad idea to allow some form of
>>>> variable/fp encoding of branch offsets?
>>>>
>>>> I.e. all small (byte) offsets would be encoded directly, but for
>>>> longer ones you could require the target to be word/dword/qword
>>>> aligned, and therefore allow a much greater range.
>>>>
>>>> It would probably only be useful for inter-module calls/branches...
>>>>
>>>> The main problem is probably that the target address adder is
>>>> already in the critical path, so you can't just add a little shifter
>>>> there as well.
>>>>
>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
>>>> this, right?)
>>>>
>>>> Terje
>>>
>>> As with any variable length instruction, extracting its size
>>> in the parse or decode stage is critical for concurrent decode.
>>> So ideally this offset *won't* be an "fp-like" data item with an
>>> "exponent" field appended to the opcode, but the instruction size
>>> in words is in the first byte and the internal instruction format
>>> is sorted out by decode.
>>>
>>> RIP-relative offsets are often relative to the incremented RIP
>>> and include the variable instruction's length. The alignment
>>> is for both the 'from' and 'to' addresses and aligning just the
>>> destination to 32 or 64 bits won't allow encoding a larger range.
>>>
>>> The issue I had is when instructions are not byte aligned,
>>> in my case 16-bit words. For branches the immediate field is a
>>> relative _word_ count to get one extra bit of offset range.
>>> But what to do when the immediate field is a 64-bit value:
>>> should that still be a word offset count,
>>> or should it switch to being a byte offset count?
>>>
>>> For LD and ST the immediate offset is always a byte offset,
>>> even if it is RIP-relative addressing.
>>>
>>> In my naming convention, Branch is to relative offset,
>>> Jump is to an absolute address.
>>>
>>> I have a "BR reg" with the offset in a register,
>>> intended for use in position-independent SWITCH statements.
>>> A scaled-indexed LD of the offset onto a register, then BR reg.
>>> But should that register offset be a byte count or a word count?
>>> Or should I have two "BR reg" instructions, one for each?
>>
>> I, perhaps mistakenly, interpreted Terje's suggestion differently. I
>> don't think he was suggesting variable length instructions, but an
>> orthogonal idea. Let's take an example. Suppose you have 10 bits
>> available for a branch offset. If the high order bit is zero, the low
>> order nine bits give a byte address. But if the high order bit is
>> one, you get the address by appending say 3 zero bits to the low order
>> 9 bits, giving a possible displacement of 12 bits, but restricted to
>> an 8 byte aligned address.
>>
>> Of course, this idea isn't applicable to just byte aligned
>> instructions, though its benefit diminishes as you increase the
>> instruction alignment requirements.
>>
>> Whether this is a good idea or not, I don't know.
>
> Ok, I understand. So the immediate field is fixed size and has
> scale data type bits indicating it is a count of int8,int16,int32,
> and a signed offset count of those sized words.

Yes, although I doubt you need to support all three sizes.

> The scale would be that of the smallest alignment of the 'from' and 'to'
> instructions, which is scaled up by inserting alignment NOPs before both.

No need to align the "from" instruction.

> I suppose it depends on the size of the immediate field as to how
> often it would have to insert NOPs so that it can use a larger scale
> to bring a destination back into range.

Yes. And also the minimum size of an instruction. E.g. for CPUs with
32 bit instructions, you "automatically" get a factor of four in the
range of the branch instructions.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Stephen Fuld wrote:
> On 3/1/2023 10:28 AM, EricP wrote:
>> Stephen Fuld wrote:
>>> On 3/1/2023 8:06 AM, EricP wrote:
>>>> Terje Mathisen wrote:
>>>>> Would it be possible or just a bad idea to allow some form of
>>>>> variable/fp encoding of branch offsets?
>>>>>
>>>>> I.e. all small (byte) offsets would be encoded directly, but for
>>>>> longer ones you could require the target to be word/dword/qword
>>>>> aligned, and therefore allow a much greater range.
>>>>>
>>>>> It would probably only be useful for inter-module calls/branches...
>>>>>
>>>>> The main problem is probably that the target address adder is
>>>>> already in the critical path, so you can't just add a little
>>>>> shifter there as well.
>>>>>
>>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
>>>>> this, right?)
>>>>>
>>>>> Terje
>>>>
>>>> As with any variable length instruction, extracting its size
>>>> in the parse or decode stage is critical for concurrent decode.
>>>> So ideally this offset *won't* be an "fp-like" data item with an
>>>> "exponent" field appended to the opcode, but the instruction size
>>>> in words is in the first byte and the internal instruction format
>>>> is sorted out by decode.
>>>>
>>>> RIP-relative offsets are often relative to the incremented RIP
>>>> and include the variable instruction's length. The alignment
>>>> is for both the 'from' and 'to' addresses and aligning just the
>>>> destination to 32 or 64 bits won't allow encoding a larger range.
>>>>
>>>> The issue I had is when instructions are not byte aligned,
>>>> in my case 16-bit words. For branches the immediate field is a
>>>> relative _word_ count to get one extra bit of offset range.
>>>> But what to do when the immediate field is a 64-bit value:
>>>> should that still be a word offset count,
>>>> or should it switch to being a byte offset count?
>>>>
>>>> For LD and ST the immediate offset is always a byte offset,
>>>> even if it is RIP-relative addressing.
>>>>
>>>> In my naming convention, Branch is to relative offset,
>>>> Jump is to an absolute address.
>>>>
>>>> I have a "BR reg" with the offset in a register,
>>>> intended for use in position-independent SWITCH statements.
>>>> A scaled-indexed LD of the offset onto a register, then BR reg.
>>>> But should that register offset be a byte count or a word count?
>>>> Or should I have two "BR reg" instructions, one for each?
>>>
>>> I, perhaps mistakenly, interpreted Terje's suggestion differently. I
>>> don't think he was suggesting variable length instructions, but an
>>> orthogonal idea. Let's take an example. Suppose you have 10 bits
>>> available for a branch offset. If the high order bit is zero, the low
>>> order nine bits give a byte address. But if the high order bit is
>>> one, you get the address by appending say 3 zero bits to the low
>>> order 9 bits, giving a possible displacement of 12 bits, but
>>> restricted to an 8 byte aligned address.
>>>
>>> Of course, this idea isn't applicable to just byte aligned
>>> instructions, though its benefit diminishes as you increase the
>>> instruction alignment requirements.
>>>
>>> Whether this is a good idea or not, I don't know.
>>
>> Ok, I understand. So the immediate field is fixed size and has
>> scale data type bits indicating it is a count of int8,int16,int32,
>> and a signed offset count of those sized words.
>
> Yes, although I doubt you need to support all three sizes.

If we have a 16-bit immediate field,
and bit[15]==0 means bits [14:0] are a signed int8 offset count,
and bits [15:14]==10 means int16 offset count

then the 15-bit int8 count is redundant with the 14-bit int16 count.

So that prefix should select a different word size, maybe int24?
Or just have two scale sizes, int8 and int32.
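
To make the two-scale variant concrete, here is a minimal sketch. The layout is my assumption for illustration only (bit 15 selects the scale, bits [14:0] hold the signed count); it is not taken from any real ISA, and the function names are made up. The helper `byte_range` also shows why an int16 scale adds nothing over the byte scale.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def decode_offset(imm16: int) -> int:
    """Return the byte displacement encoded by a 16-bit immediate field.

    Hypothetical layout (an assumption, not a real encoding):
      bit 15 == 0: bits [14:0] are a signed count of bytes (int8 units)
      bit 15 == 1: bits [14:0] are a signed count of 4-byte units (int32)
    """
    count = sign_extend(imm16, 15)
    scale = 4 if imm16 & 0x8000 else 1
    return count * scale


def byte_range(count_bits: int, scale: int) -> tuple:
    """Smallest and largest byte displacement for a signed, scaled count."""
    lowest = -(1 << (count_bits - 1)) * scale
    highest = ((1 << (count_bits - 1)) - 1) * scale
    return (lowest, highest)


# 15-bit byte count versus 14-bit int16 count: essentially the same reach.
print(byte_range(15, 1))   # byte-scaled
print(byte_range(14, 2))   # int16-scaled
print(byte_range(15, 4))   # int32-scaled with a 1-bit prefix
```

With these assumptions the byte-scaled and int16-scaled forms both reach about 16 KB in either direction, while the int32-scaled form reaches about 64 KB.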

>> The scale would be that of the smallest alignment of the 'from' and 'to'
>> instructions, which is scaled up by inserting alignment NOPs before both.
>
> No need to align the "from" instruction.

If you want the offset to be a count of int32's
then both 'from' and 'to' need the same mod-4 alignment.

That is, in order to use an int32 count as the offset then
both BR and target instruction addresses bits [1:0] must be the same.
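
A small sketch of that constraint, assuming the offset is added to the branch's own address (real designs often use the incremented IP instead) and that padding is done with 1-byte NOPs. All names here are hypothetical.

```python
def same_mod4(from_addr: int, to_addr: int) -> bool:
    """True when both addresses have identical bits [1:0]."""
    return (from_addr & 3) == (to_addr & 3)


def encode_int32_count(from_addr: int, to_addr: int):
    """Return the signed count of 4-byte units, or None if not encodable.

    target = from + 4 * count, so the byte distance must be a multiple of 4.
    """
    delta = to_addr - from_addr
    if delta % 4 != 0:
        return None
    return delta // 4


def nop_bytes_before_target(from_addr: int, to_addr: int) -> int:
    """Bytes of padding to insert before the target so that it matches the
    branch's mod-4 alignment (assumes padding does not move the branch)."""
    return (from_addr - to_addr) % 4


print(encode_int32_count(0x1003, 0x2003))       # encodable
print(encode_int32_count(0x1003, 0x2004))       # not encodable
print(nop_bytes_before_target(0x1003, 0x2004))  # padding needed
```

The check on the byte distance and the check on bits [1:0] are the same condition stated two ways.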

>> I suppose it depends on the size of the immediate field as to how
>> often it would have to insert NOPs so that it can use a larger scale
>> to bring a destination back into range.
>
> Yes. And also the minimum size of an instruction. E.g. for CPUs with
> 32 bit instructions, you "automatically" get a factor of four in the
> range of the branch instructions.

Re: Variable branch length encoding?

<54098589-0ce6-4c7f-86c6-d0516c7c3be9n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30999&group=comp.arch#30999

X-Received: by 2002:ac8:701e:0:b0:3bf:d313:40e with SMTP id x30-20020ac8701e000000b003bfd313040emr2131400qtm.13.1677705043952;
Wed, 01 Mar 2023 13:10:43 -0800 (PST)
X-Received: by 2002:a05:6808:1509:b0:37a:2bed:5758 with SMTP id
u9-20020a056808150900b0037a2bed5758mr9066680oiw.2.1677705043703; Wed, 01 Mar
2023 13:10:43 -0800 (PST)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 1 Mar 2023 13:10:43 -0800 (PST)
In-Reply-To: <cONLL.947025$8_id.671747@fx09.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:b87c:65a9:ad3c:d9a7;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:b87c:65a9:ad3c:d9a7
References: <ttn1gq$3taue$1@dont-email.me> <lKKLL.884821$gGD7.549447@fx11.iad>
<tto37q$pil$1@dont-email.me> <PPMLL.37185$qpNc.1967@fx03.iad>
<tto6fe$163j$1@dont-email.me> <cONLL.947025$8_id.671747@fx09.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <54098589-0ce6-4c7f-86c6-d0516c7c3be9n@googlegroups.com>
Subject: Re: Variable branch length encoding?
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Wed, 01 Mar 2023 21:10:43 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 6861

by: MitchAlsup - Wed, 1 Mar 2023 21:10 UTC

On Wednesday, March 1, 2023 at 1:35:40 PM UTC-6, EricP wrote:
> Stephen Fuld wrote:
> > On 3/1/2023 10:28 AM, EricP wrote:
> >> Stephen Fuld wrote:
> >>> On 3/1/2023 8:06 AM, EricP wrote:
> >>>> Terje Mathisen wrote:
> >>>>> Would it be possible or just a bad idea to allow some form of
> >>>>> variable/fp encoding of branch offsets?
> >>>>>
> >>>>> I.e. all small (byte) offsets would be encoded directly, but for
> >>>>> longer ones you could require the target to be word/dword/qword
> >>>>> aligned, and therefore allow a much greater range.
> >>>>>
> >>>>> It would probably only be useful for inter-module calls/branches...
> >>>>>
> >>>>> The main problem is probably that the target address adder is
> >>>>> already in the critical path, so you can't just add a little
> >>>>> shifter there as well.
> >>>>>
> >>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
> >>>>> this, right?)
> >>>>>
> >>>>> Terje
> >>>>
> >>>> As with any variable length instruction, extracting its size
> >>>> in the parse or decode stage is critical for concurrent decode.
> >>>> So ideally this offset *won't* be an "fp-like" data item with an
> >>>> "exponent" field appended to the opcode, but the instruction size
> >>>> in words is in the first byte and the internal instruction format
> >>>> is sorted out by decode.
> >>>>
> >>>> RIP-relative offsets are often relative to the incremented RIP
> >>>> and include the variable instruction's length. The alignment
> >>>> is for both the 'from' and 'to' addresses and aligning just the
> >>>> destination to 32 or 64 bits won't allow encoding a larger range.
> >>>>
> >>>> The issue I had is when instructions are not byte aligned,
> >>>> in my case 16-bit words. For branches the immediate field is a
> >>>> relative _word_ count to get one extra bit of offset range.
> >>>> But what to do when the immediate field is a 64-bit value:
> >>>> should that still be a word offset count,
> >>>> or should it switch to being a byte offset count?
> >>>>
> >>>> For LD and ST the immediate offset is always a byte offset,
> >>>> even if it is RIP-relative addressing.
> >>>>
> >>>> In my naming convention, Branch is to relative offset,
> >>>> Jump is to an absolute address.
> >>>>
> >>>> I have a "BR reg" with the offset in a register,
> >>>> intended for use in position-independent SWITCH statements.
> >>>> A scaled-indexed LD of the offset onto a register, then BR reg.
> >>>> But should that register offset be a byte count or a word count?
> >>>> Or should I have two "BR reg" instructions, one for each?
> >>>
> >>> I, perhaps mistakenly, interpreted Terje's suggestion differently. I
> >>> don't think he was suggesting variable length instructions, but an
> >>> orthogonal idea. Let's take an example. Suppose you have 10 bits
> >>> available for a branch offset. If the high order bit is zero, the low
> >>> order nine bits give a byte address. But if the high order bit is
> >>> one, you get the address by appending say 3 zero bits to the low
> >>> order 9 bits, giving a possible displacement of 12 bits, but
> >>> restricted to an 8 byte aligned address.
> >>>
> >>> Of course, this idea isn't applicable to just byte aligned
> >>> instructions, though its benefit diminishes as you increase the
> >>> instruction alignment requirements.
> >>>
> >>> Whether this is a good idea or not, I don't know.
> >>
> >> Ok, I understand. So the immediate field is fixed size and has
> >> scale data type bits indicating it is a count of int8,int16,int32,
> >> and a signed offset count of those sized words.
> >
> > Yes, although I doubt you need to support all three sizes.
> If we have a 16-bit immediate field,
> and bit[15]==0 means bits [14:0] are a signed int8 offset count,
> and bits [15:14]==10 means int16 offset count
>
> then the 15-bit int8 count is redundant with the 14-bit int16 count.
>
> So that prefix should select a different word size, maybe int24?
> Or just have two scale sizes, int8 and int32.
> >> The scale would be that of the smallest alignment of the 'from' and 'to'
> >> instructions, which is scaled up by inserting alignment NOPs before both.
> >
> > No need to align the "from" instruction.
> If you want the offset to be a count of int32's
> then both 'from' and 'to' need the same mod-4 alignment.
<
M88K, MIPS, SPARC, Alpha all had the IP<1..0> == 2b'00 as an alignment
condition.
>
> That is, in order to use an int32 count as the offset then
> both BR and target instruction addresses bits [1:0] must be the same.
<
At the time control transfer address is generated, IP<1..0> == 0 and
target_address<1..0> == 0. Thus, one can always shift a constant field
up by 2 bits and gain displacement range at no real cost.
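
As a rough sketch of that shift, assuming a 16-bit signed displacement field and 4-byte aligned instructions (the field width and function names are assumptions for illustration, and the base is taken to be the branch's own IP):

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of `value` as a two's-complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def branch_target(ip: int, disp_field: int, field_bits: int = 16) -> int:
    """Target address when the displacement field counts 4-byte instructions."""
    assert ip % 4 == 0, "instructions are assumed to be 4-byte aligned"
    disp = sign_extend(disp_field, field_bits)
    return ip + (disp << 2)   # the shift is just wiring, no extra adder stage


def max_forward_reach(field_bits: int, shift: int) -> int:
    """Largest forward byte displacement for a signed field shifted left."""
    return ((1 << (field_bits - 1)) - 1) << shift


print(hex(branch_target(0x1000, 0x0001)))  # one instruction forward
print(hex(branch_target(0x1000, 0xFFFF)))  # one instruction back
print(max_forward_reach(16, 0), max_forward_reach(16, 2))
```

Because both IP and the shifted displacement have zero low bits, the sum is always a valid instruction address.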
<
> >> I suppose it depends on the size of the immediate field as to how
> >> often it would have to insert NOPs so that it can use a larger scale
> >> to bring a destination back into range.
> >
> > > Yes. And also the minimum size of an instruction. E.g. for CPUs with
> > 32 bit instructions, you "automatically" get a factor of four in the
> > range of the branch instructions.

Re: Variable branch length encoding?

<ttof1c$267e$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=31000&group=comp.arch#31000

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Variable branch length encoding?
Date: Wed, 1 Mar 2023 13:11:07 -0800
Organization: A noiseless patient Spider
Lines: 113
Message-ID: <ttof1c$267e$1@dont-email.me>
References: <ttn1gq$3taue$1@dont-email.me> <lKKLL.884821$gGD7.549447@fx11.iad>
<tto37q$pil$1@dont-email.me> <PPMLL.37185$qpNc.1967@fx03.iad>
<tto6fe$163j$1@dont-email.me> <cONLL.947025$8_id.671747@fx09.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 1 Mar 2023 21:11:08 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="f7cb52a965043efe1c5a2e95c0f402fb";
logging-data="71918"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+cNgxptc6gQXhmpel2Ccy5pNjxeWZr/mM="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.8.0
Cancel-Lock: sha1:JMykttp5CoKSgeZymYkN56dy8XA=
In-Reply-To: <cONLL.947025$8_id.671747@fx09.iad>
Content-Language: en-US

by: Stephen Fuld - Wed, 1 Mar 2023 21:11 UTC

On 3/1/2023 11:35 AM, EricP wrote:
> Stephen Fuld wrote:
>> On 3/1/2023 10:28 AM, EricP wrote:
>>> Stephen Fuld wrote:
>>>> On 3/1/2023 8:06 AM, EricP wrote:
>>>>> Terje Mathisen wrote:
>>>>>> Would it be possible or just a bad idea to allow some form of
>>>>>> variable/fp encoding of branch offsets?
>>>>>>
>>>>>> I.e. all small (byte) offsets would be encoded directly, but for
>>>>>> longer ones you could require the target to be word/dword/qword
>>>>>> aligned, and therefore allow a much greater range.
>>>>>>
>>>>>> It would probably only be useful for inter-module calls/branches...
>>>>>>
>>>>>> The main problem is probably that the target address adder is
>>>>>> already in the critical path, so you can't just add a little
>>>>>> shifter there as well.
>>>>>>
>>>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
>>>>>> this, right?)
>>>>>>
>>>>>> Terje
>>>>>
>>>>> As with any variable length instruction, extracting its size
>>>>> in the parse or decode stage is critical for concurrent decode.
>>>>> So ideally this offset *won't* be an "fp-like" data item with an
>>>>> "exponent" field appended to the opcode, but the instruction size
>>>>> in words is in the first byte and the internal instruction format
>>>>> is sorted out by decode.
>>>>>
>>>>> RIP-relative offsets are often relative to the incremented RIP
>>>>> and include the variable instruction's length. The alignment
>>>>> is for both the 'from' and 'to' addresses and aligning just the
>>>>> destination to 32 or 64 bits won't allow encoding a larger range.
>>>>>
>>>>> The issue I had is when instructions are not byte aligned,
>>>>> in my case 16-bit words. For branches the immediate field is a
>>>>> relative _word_ count to get one extra bit of offset range.
>>>>> But what to do when the immediate field is a 64-bit value:
>>>>> should that still be a word offset count,
>>>>> or should it switch to being a byte offset count?
>>>>>
>>>>> For LD and ST the immediate offset is always a byte offset,
>>>>> even if it is RIP-relative addressing.
>>>>>
>>>>> In my naming convention, Branch is to relative offset,
>>>>> Jump is to an absolute address.
>>>>>
>>>>> I have a "BR reg" with the offset in a register,
>>>>> intended for use in position-independent SWITCH statements.
>>>>> A scaled-indexed LD of the offset onto a register, then BR reg.
>>>>> But should that register offset be a byte count or a word count?
>>>>> Or should I have two "BR reg" instructions, one for each?
>>>>
>>>> I, perhaps mistakenly, interpreted Terje's suggestion differently.
>>>> I don't think he was suggesting variable length instructions, but an
>>>> orthogonal idea. Let's take an example. Suppose you have 10 bits
>>>> available for a branch offset. If the high order bit is zero, the
>>>> low order nine bits give a byte address. But if the high order bit
>>>> is one, you get the address by appending say 3 zero bits to the low
>>>> order 9 bits, giving a possible displacement of 12 bits, but
>>>> restricted to an 8 byte aligned address.
>>>>
>>>> Of course, this idea isn't applicable to just byte aligned
>>>> instructions, though its benefit diminishes as you increase the
>>>> instruction alignment requirements.
>>>>
>>>> Whether this is a good idea or not, I don't know.
>>>
>>> Ok, I understand. So the immediate field is fixed size and has
>>> scale data type bits indicating it is a count of int8,int16,int32,
>>> and a signed offset count of those sized words.
>>
>> Yes, although I doubt you need to support all three sizes.
>
> If we have a 16-bit immediate field,
> and bit[15]==0 means bits [14:0] are a signed int8 offset count,
> and bits [15:14]==10 means int16 offset count
>
> then the 15-bit int8 count is redundant with the 14-bit int16 count.
>
> So that prefix should select a different word size, maybe int24?
> Or just have two scale sizes, int8 and int32.

That is what I would suggest.

>>> The scale would be that of the smallest alignment of the 'from' and 'to'
>>> instructions, which is scaled up by inserting alignment NOPs before
>>> both.
>>
>> No need to align the "from" instruction.
>
> If you want the offset to be a count of int32's
> then both 'from' and 'to' need the same mod-4 alignment.

Or, you have the rule that the address of the "from" instruction is
rounded up to the nearest size used for the displacement field. i.e. if
the address of the from instruction is say xxx03, and the displacement
is scaled to 4 byte values, then the xxx03 is "rounded up" to xxx04.
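
A minimal sketch of that rule, assuming a power-of-two scale; the function names are hypothetical:

```python
def round_up(addr: int, scale: int = 4) -> int:
    """Round addr up to the next multiple of scale (scale is a power of two)."""
    return (addr + scale - 1) & ~(scale - 1)


def target_with_rounded_base(from_addr: int, count: int, scale: int = 4) -> int:
    """Branch target when the 'from' address is first rounded up to the scale."""
    return round_up(from_addr, scale) + count * scale


print(hex(round_up(0x1003)))                      # base used for the add
print(hex(target_with_rounded_base(0x1003, 2)))   # forward branch
print(hex(target_with_rounded_base(0x1003, -1)))  # backward branch
```

Under this rule only the target has to be aligned to the scale; the branch itself can sit at any byte address.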

> That is, in order to use an int32 count as the offset then
> both BR and target instruction addresses bits [1:0] must be the same.

Not necessarily. See above.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Variable branch length encoding?

<bafd7e03-efba-4a3b-9028-db6971311aedn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=31001&group=comp.arch#31001

X-Received: by 2002:a05:6214:4a43:b0:56e:9197:4ccd with SMTP id ph3-20020a0562144a4300b0056e91974ccdmr25339qvb.0.1677710523696;
Wed, 01 Mar 2023 14:42:03 -0800 (PST)
X-Received: by 2002:a05:6870:5b15:b0:175:cd67:dfd6 with SMTP id
ds21-20020a0568705b1500b00175cd67dfd6mr1435225oab.0.1677710523488; Wed, 01
Mar 2023 14:42:03 -0800 (PST)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 1 Mar 2023 14:42:03 -0800 (PST)
In-Reply-To: <ttof1c$267e$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:b87c:65a9:ad3c:d9a7;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:b87c:65a9:ad3c:d9a7
References: <ttn1gq$3taue$1@dont-email.me> <lKKLL.884821$gGD7.549447@fx11.iad>
<tto37q$pil$1@dont-email.me> <PPMLL.37185$qpNc.1967@fx03.iad>
<tto6fe$163j$1@dont-email.me> <cONLL.947025$8_id.671747@fx09.iad> <ttof1c$267e$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <bafd7e03-efba-4a3b-9028-db6971311aedn@googlegroups.com>
Subject: Re: Variable branch length encoding?
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Wed, 01 Mar 2023 22:42:03 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 7072

by: MitchAlsup - Wed, 1 Mar 2023 22:42 UTC

On Wednesday, March 1, 2023 at 3:11:12 PM UTC-6, Stephen Fuld wrote:
> On 3/1/2023 11:35 AM, EricP wrote:
> > Stephen Fuld wrote:
> >> On 3/1/2023 10:28 AM, EricP wrote:
> >>> Stephen Fuld wrote:
> >>>> On 3/1/2023 8:06 AM, EricP wrote:
> >>>>> Terje Mathisen wrote:
> >>>>>> Would it be possible or just a bad idea to allow some form of
> >>>>>> variable/fp encoding of branch offsets?
> >>>>>>
> >>>>>> I.e. all small (byte) offsets would be encoded directly, but for
> >>>>>> longer ones you could require the target to be word/dword/qword
> >>>>>> aligned, and therefore allow a much greater range.
> >>>>>>
> > >>>>>> It would probably only be useful for inter-module calls/branches...
> >>>>>>
> >>>>>> The main problem is probably that the target address adder is
> >>>>>> already in the critical path, so you can't just add a little
> >>>>>> shifter there as well.
> >>>>>>
> >>>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
> >>>>>> this, right?)
> >>>>>>
> >>>>>> Terje
> >>>>>
> >>>>> As with any variable length instruction, extracting its size
> >>>>> in the parse or decode stage is critical for concurrent decode.
> >>>>> So ideally this offset *won't* be an "fp-like" data item with an
> >>>>> "exponent" field appended to the opcode, but the instruction size
> >>>>> in words is in the first byte and the internal instruction format
> >>>>> is sorted out by decode.
> >>>>>
> >>>>> RIP-relative offsets are often relative to the incremented RIP
> >>>>> and include the variable instruction's length. The alignment
> >>>>> is for both the 'from' and 'to' addresses and aligning just the
> >>>>> destination to 32 or 64 bits won't allow encoding a larger range.
> >>>>>
> >>>>> The issue I had is when instructions are not byte aligned,
> >>>>> in my case 16-bit words. For branches the immediate field is a
> >>>>> relative _word_ count to get one extra bit of offset range.
> >>>>> But what to do when the immediate field is a 64-bit value:
> >>>>> should that still be a word offset count,
> >>>>> or should it switch to being a byte offset count?
> >>>>>
> >>>>> For LD and ST the immediate offset is always a byte offset,
> >>>>> even if it is RIP-relative addressing.
> >>>>>
> >>>>> In my naming convention, Branch is to relative offset,
> >>>>> Jump is to an absolute address.
> >>>>>
> >>>>> I have a "BR reg" with the offset in a register,
> >>>>> intended for use in position-independent SWITCH statements.
> >>>>> A scaled-indexed LD of the offset onto a register, then BR reg.
> >>>>> But should that register offset be a byte count or a word count?
> >>>>> Or should I have two "BR reg" instructions, one for each?
> >>>>
> >>>> I, perhaps mistakenly, interpreted Terje's suggestion differently.
> >>>> I don't think he was suggesting variable length instructions, but an
> >>>> orthogonal idea. Let's take an example. Suppose you have 10 bits
> >>>> available for a branch offset. If the high order bit is zero, the
> >>>> low order nine bits give a byte address. But if the high order bit
> >>>> is one, you get the address by appending say 3 zero bits to the low
> >>>> order 9 bits, giving a possible displacement of 12 bits, but
> >>>> restricted to an 8 byte aligned address.
> >>>>
> >>>> Of course, this idea isn't applicable to just byte aligned
> >>>> instructions, though its benefit diminishes as you increase the
> >>>> instruction alignment requirements.
> >>>>
> >>>> Whether this is a good idea or not, I don't know.
> >>>
> >>> Ok, I understand. So the immediate field is fixed size and has
> >>> scale data type bits indicating it is a count of int8,int16,int32,
> >>> and a signed offset count of those sized words.
> >>
> >> Yes, although I doubt you need to support all three sizes.
> >
> > If we have a 16-bit immediate field,
> > and bit[15]==0 means bits [14:0] are a signed int8 offset count,
> > and bits [15:14]==10 means int16 offset count
> >
> > then the 15-bit int8 count is redundant with the 14-bit int16 count.
> >
> > So that prefix should select a different word size, maybe int24?
> > Or just have two scale sizes, int8 and int32.
> That is what I would suggest.
> >>> The scale would be that of the smallest alignment of the 'from' and 'to'
> >>> instructions, which is scaled up by inserting alignment NOPs before
> >>> both.
> >>
> >> No need to align the "from" instruction.
> >
> > If you want the offset to be a count of int32's
> > then both 'from' and 'to' need the same mod-4 alignment.
<
> Or, you have the rule that the address of the "from" instruction is
> rounded up to the nearest size used for the displacement field. i.e. if
> the address of the from instruction is say xxx03, and the displacement
> is scaled to 4 byte values, then the xxx03 is "rounded up" to xxx04.
<
Rounding is significantly (several gates of delay) more expensive than
truncation (free).
<
The majority of architectures past and present require that "what comes
out of the branch adder" is what can be used to access instructions.
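
A software-level illustration of the difference, as a sketch only (Python cannot model gate delay, so this just counts which address bits change; names are hypothetical):

```python
def truncate(addr: int, scale: int = 4) -> int:
    """Clear the low bits: in hardware this is just ignoring those wires."""
    return addr & ~(scale - 1)


def round_up(addr: int, scale: int = 4) -> int:
    """Round up to the next multiple of scale: needs an incrementer."""
    return (addr + scale - 1) & ~(scale - 1)


def bits_changed(before: int, after: int) -> int:
    """Number of bit positions that differ between two addresses."""
    return bin(before ^ after).count("1")


addr = 0x0FFF
print(hex(truncate(addr)), bits_changed(addr, truncate(addr)))
print(hex(round_up(addr)), bits_changed(addr, round_up(addr)))
```

Truncation only ever touches the low bits, while rounding up can carry into arbitrarily high bits, which is where the extra delay in front of the branch adder would come from.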
<
> > That is, in order to use an int32 count as the offset then
> > both BR and target instruction addresses bits [1:0] must be the same.
> Not necessarily. See above.
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: Variable branch length encoding?

<ttom2p$30u8$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=31002&group=comp.arch#31002

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Variable branch length encoding?
Date: Wed, 1 Mar 2023 15:11:18 -0800
Organization: A noiseless patient Spider
Lines: 115
Message-ID: <ttom2p$30u8$1@dont-email.me>
References: <ttn1gq$3taue$1@dont-email.me> <lKKLL.884821$gGD7.549447@fx11.iad>
<tto37q$pil$1@dont-email.me> <PPMLL.37185$qpNc.1967@fx03.iad>
<tto6fe$163j$1@dont-email.me> <cONLL.947025$8_id.671747@fx09.iad>
<ttof1c$267e$1@dont-email.me>
<bafd7e03-efba-4a3b-9028-db6971311aedn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 1 Mar 2023 23:11:21 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="fd4b42794d234373d0e29aeba4c0750f";
logging-data="99272"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+GN3Q3gDdnYsDwCH0w1EDrmT3xfbM/n/I="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.8.0
Cancel-Lock: sha1:iTLFqBNMZd1gt5Sn8/3GTVo9puk=
In-Reply-To: <bafd7e03-efba-4a3b-9028-db6971311aedn@googlegroups.com>
Content-Language: en-US

by: Stephen Fuld - Wed, 1 Mar 2023 23:11 UTC

On 3/1/2023 2:42 PM, MitchAlsup wrote:
> On Wednesday, March 1, 2023 at 3:11:12 PM UTC-6, Stephen Fuld wrote:
>> On 3/1/2023 11:35 AM, EricP wrote:
>>> Stephen Fuld wrote:
>>>> On 3/1/2023 10:28 AM, EricP wrote:
>>>>> Stephen Fuld wrote:
>>>>>> On 3/1/2023 8:06 AM, EricP wrote:
>>>>>>> Terje Mathisen wrote:
>>>>>>>> Would it be possible or just a bad idea to allow some form of
>>>>>>>> variable/fp encoding of branch offsets?
>>>>>>>>
>>>>>>>> I.e. all small (byte) offsets would be encoded directly, but for
>>>>>>>> longer ones you could require the target to be word/dword/qword
>>>>>>>> aligned, and therefore allow a much greater range.
>>>>>>>>
>>>>>>>> It would probably only be useful for inter-module calls/branches...
>>>>>>>>
>>>>>>>> The main problem is probably that the target address adder is
>>>>>>>> already in the critical path, so you can't just add a little
>>>>>>>> shifter there as well.
>>>>>>>>
>>>>>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
>>>>>>>> this, right?)
>>>>>>>>
>>>>>>>> Terje
>>>>>>>
>>>>>>> As with any variable length instruction, extracting its size
>>>>>>> in the parse or decode stage is critical for concurrent decode.
>>>>>>> So ideally this offset *won't* be an "fp-like" data item with an
>>>>>>> "exponent" field appended to the opcode, but the instruction size
>>>>>>> in words is in the first byte and the internal instruction format
>>>>>>> is sorted out by decode.
>>>>>>>
>>>>>>> RIP-relative offsets are often relative to the incremented RIP
>>>>>>> and include the variable instruction's length. The alignment
>>>>>>> is for both the 'from' and 'to' addresses and aligning just the
>>>>>>> destination to 32 or 64 bits won't allow encoding a larger range.
>>>>>>>
>>>>>>> The issue I had is when instructions are not byte aligned,
>>>>>>> in my case 16-bit words. For branches the immediate field is a
>>>>>>> relative _word_ count to get one extra bit of offset range.
>>>>>>> But what to do when the immediate field is a 64-bit value:
>>>>>>> should that still be a word offset count,
>>>>>>> or should it switch to being a byte offset count?
>>>>>>>
>>>>>>> For LD and ST the immediate offset is always a byte offset,
>>>>>>> even if it is RIP-relative addressing.
>>>>>>>
>>>>>>> In my naming convention, Branch is to relative offset,
>>>>>>> Jump is to an absolute address.
>>>>>>>
>>>>>>> I have a "BR reg" with the offset in a register,
>>>>>>> intended for use in position-independent SWITCH statements.
>>>>>>> A scaled-indexed LD of the offset onto a register, then BR reg.
>>>>>>> But should that register offset be a byte count or a word count?
>>>>>>> Or should I have two "BR reg" instructions, one for each?
>>>>>>
>>>>>> I, perhaps mistakenly, interpreted Terje's suggestion differently.
>>>>>> I don't think he was suggesting variable length instructions, but an
>>>>>> orthogonal idea. Let's take an example. Suppose you have 10 bits
>>>>>> available for a branch offset. If the high order bit is zero, the
>>>>>> low order nine bits give a byte address. But if the high order bit
>>>>>> is one, you get the address by appending say 3 zero bits to the low
>>>>>> order 9 bits, giving a possible displacement of 12 bits, but
>>>>>> restricted to an 8 byte aligned address.
>>>>>>
>>>>>> Of course, this idea isn't applicable to just byte aligned
>>>>>> instructions, though its benefit diminishes as you increase the
>>>>>> instruction alignment requirements.
>>>>>>
>>>>>> Whether this is a good idea or not, I don't know.
>>>>>
>>>>> Ok, I understand. So the immediate field is fixed size and has
>>>>> scale data type bits indicating it is a count of int8,int16,int32,
>>>>> and a signed offset count of those sized words.
>>>>
>>>> Yes, although I doubt you need to support all three sizes.
>>>
>>> If we have a 16-bit immediate field,
>>> and bit[15]==0 means bits [14:0] are a signed int8 offset count,
>>> and bits [15:14]==10 means int16 offset count
>>>
>>> then the 15-bit int8 count is redundant with the 14-bit int16 count.
>>>
>>> So that prefix should select a different word size, maybe int24?
>>> Or just have two scale sizes, int8 and int32.
>> That is what I would suggest.
>>>>> The scale would be that of the smallest alignment of the 'from' and 'to'
>>>>> instructions, which is scaled up by inserting alignment NOPs before
>>>>> both.
>>>>
>>>> No need to align the "from" instruction.
>>>
>>> If you want the offset to be a count of int32's
>>> then both 'from' and 'to' need the same mod-4 alignment.
> <
>> Or, you have the rule that the address of the "from" instruction is
>> rounded up to the nearest size used for the displacement field. i.e. if
>> the address of the from instruction is say xxx03, and the displacement
>> is scaled to 4 byte values, then the xxx03 is "rounded up" to xxx04.
> <
> Rounding is significantly (several gates of delay) more expensive than
> truncation (free).

Fair enough. I think this whole thing is a little overblown, as it
really applies only to systems with byte aligned (or perhaps half word
aligned without a pairing requirement) instructions, which, other than
X86, I think are pretty rare. If you stick with 32 bit instructions, then
I think Terje's original idea is just not worth it.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Variable branch length encoding?

<63b23795-83c4-4519-b272-7b12cb22189fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=31003&group=comp.arch#31003

X-Received: by 2002:ac8:4049:0:b0:3b7:fda5:e05 with SMTP id j9-20020ac84049000000b003b7fda50e05mr2421688qtl.9.1677732211060;
Wed, 01 Mar 2023 20:43:31 -0800 (PST)
X-Received: by 2002:a05:6808:987:b0:383:db64:65 with SMTP id
a7-20020a056808098700b00383db640065mr400038oic.4.1677732210801; Wed, 01 Mar
2023 20:43:30 -0800 (PST)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 1 Mar 2023 20:43:30 -0800 (PST)
In-Reply-To: <ttom2p$30u8$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=99.251.79.92; posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 99.251.79.92
References: <ttn1gq$3taue$1@dont-email.me> <lKKLL.884821$gGD7.549447@fx11.iad>
<tto37q$pil$1@dont-email.me> <PPMLL.37185$qpNc.1967@fx03.iad>
<tto6fe$163j$1@dont-email.me> <cONLL.947025$8_id.671747@fx09.iad>
<ttof1c$267e$1@dont-email.me> <bafd7e03-efba-4a3b-9028-db6971311aedn@googlegroups.com>
<ttom2p$30u8$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <63b23795-83c4-4519-b272-7b12cb22189fn@googlegroups.com>
Subject: Re: Variable branch length encoding?
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Thu, 02 Mar 2023 04:43:31 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 8122

by: robf...@gmail.com - Thu, 2 Mar 2023 04:43 UTC

On Wednesday, March 1, 2023 at 6:11:25 PM UTC-5, Stephen Fuld wrote:
> On 3/1/2023 2:42 PM, MitchAlsup wrote:
> > On Wednesday, March 1, 2023 at 3:11:12 PM UTC-6, Stephen Fuld wrote:
> >> On 3/1/2023 11:35 AM, EricP wrote:
> >>> Stephen Fuld wrote:
> >>>> On 3/1/2023 10:28 AM, EricP wrote:
> >>>>> Stephen Fuld wrote:
> >>>>>> On 3/1/2023 8:06 AM, EricP wrote:
> >>>>>>> Terje Mathisen wrote:
> >>>>>>>> Would it be possible or just a bad idea to allow some form of
> >>>>>>>> variable/fp encoding of branch offsets?
> >>>>>>>>
> >>>>>>>> I.e. all small (byte) offsets would be encoded directly, but for
> >>>>>>>> longer ones you could require the target to be word/dword/qword
> >>>>>>>> aligned, and therefore allow a much greater range.
> >>>>>>>>
> > >>>>>>>> It would probably only be useful for inter-module calls/branches...
> >>>>>>>>
> >>>>>>>> The main problem is probably that the target address adder is
> >>>>>>>> already in the critical path, so you can't just add a little
> >>>>>>>> shifter there as well.
> >>>>>>>>
> >>>>>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
> >>>>>>>> this, right?)
> >>>>>>>>
> >>>>>>>> Terje
> >>>>>>>
> >>>>>>> As with any variable length instruction, extracting its size
> >>>>>>> in the parse or decode stage is critical for concurrent decode.
> >>>>>>> So ideally this offset *won't* be an "fp-like" data item with an
> >>>>>>> "exponent" field appended to the opcode, but the instruction size
> >>>>>>> in words is in the first byte and the internal instruction format
> >>>>>>> is sorted out by decode.
> >>>>>>>
> >>>>>>> RIP-relative offsets are often relative to the incremented RIP
> >>>>>>> and include the variable instruction's length. The alignment
> >>>>>>> is for both the 'from' and 'to' addresses and aligning just the
> >>>>>>> destination to 32 or 64 bits won't allow encoding a larger range.
> >>>>>>>
> >>>>>>> The issue I had is when instructions are not byte aligned,
> >>>>>>> in my case 16-bit words. For branches the immediate field is a
> >>>>>>> relative _word_ count to get one extra bit of offset range.
> >>>>>>> But what to do when the immediate field is a 64-bit value:
> >>>>>>> should that still be a word offset count,
> >>>>>>> or should it switch to being a byte offset count?
> >>>>>>>
> >>>>>>> For LD and ST the immediate offset is always a byte offset,
> >>>>>>> even if it is RIP-relative addressing.
> >>>>>>>
> >>>>>>> In my naming convention, Branch is to relative offset,
> >>>>>>> Jump is to an absolute address.
> >>>>>>>
> >>>>>>> I have a "BR reg" with the offset in a register,
> >>>>>>> intended for use in position-independent SWITCH statements.
> >>>>>>> A scaled-indexed LD of the offset onto a register, then BR reg.
> >>>>>>> But should that register offset be a byte count or a word count?
> >>>>>>> Or should I have two "BR reg" instructions, one for each?
> >>>>>>
> >>>>>> I, perhaps mistakenly, interpreted Terje's suggestion differently.
> >>>>>> I don't think he was suggesting variable length instructions, but an
> >>>>>> orthogonal idea. Let's take an example. Suppose you have 10 bits
> >>>>>> available for a branch offset. If the high order bit is zero, the
> >>>>>> low order nine bits give a byte address. But if the high order bit
> >>>>>> is one, you get the address by appending say 3 zero bits to the low
> >>>>>> order 9 bits, giving a possible displacement of 12 bits, but
> >>>>>> restricted to an 8 byte aligned address.
> >>>>>>
> >>>>>> Of course, this idea isn't applicable to just byte aligned
> >>>>>> instructions, though its benefit diminishes as you increase the
> >>>>>> instruction alignment requirements.
> >>>>>>
> >>>>>> Whether this is a good idea or not, I don't know.
> >>>>>
> >>>>> Ok, I understand. So the immediate field is fixed size and has
> >>>>> scale data type bits indicating it is a count of int8,int16,int32,
> >>>>> and a signed offset count of those sized words.
> >>>>
> >>>> Yes, although I doubt you need to support all three sizes.
> >>>
> >>> If we have a 16-bit immediate field,
> >>> and bit[15]==0 means bits [14:0] are a signed int8 offset count,
> >>> and bits [15:14]==10 means int16 offset count
> >>>
> >>> then the 15-bit int8 count is redundant with the 14-bit int16 count.
> >>>
> >>> So that prefix should select a different word size, maybe int24?
> >>> Or just have two scale sizes, int8 and int32.
> >> That is what I would suggest.
> >>>>> The scale would be that of the smallest alignment of the 'from' and 'to'
> >>>>> instructions, which is scaled up by inserting alignment NOPs before
> >>>>> both.
> >>>>
> >>>> No need to align the "from" instruction.
> >>>
> >>> If you want the offset to be a count of int32's
> >>> then both 'from' and 'to' need the same mod-4 alignment.
> > <
> >> Or, you have the rule that the address of the "from" instruction is
> >> rounded up to the nearest size used for the displacement field. i.e. if
> >> the address of the from instruction is say xxx03, and the displacement
> >> is scaled to 4 byte values, then the xxx03 is "rounded up" to xxx04.
> > <
> > Rounding is significantly (several gates of delay) more expensive than
> > truncation (free).
> Fair enough. I think this whole thing is a little overblown, as it
> really applies only to systems with byte aligned (or perhaps half word
> aligned without a pairing requirement) instructions, which, other than X86 I
> think are pretty rare. If you stick with 32 bit instructions, then I
> think Terje's original idea is just not worth it.
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

I agree. With 32-bit or more instructions the displacement field can be large
enough to cover practically all branch cases. There are seldom code modules
larger than 16MB meaning 24-bits is probably adequate. The branch
displacement does not need to be able to cover the whole physical address
space when virtual addressing is available. For conditional branches a
displacement of 16-bits covers virtually all cases. If there is the odd case
where the branch displacement is not large enough it is possible to load a
value into a register and branch to that.
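
A minimal sketch of that fallback, assuming a byte-granular signed displacement field. The names fits_signed and select_branch_form are hypothetical, and the 16-bit default only mirrors the figure given above for conditional branches:

```python
def fits_signed(value: int, bits: int) -> bool:
    """True if value is representable as a two's-complement field of `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi


def select_branch_form(pc: int, target: int, disp_bits: int = 16) -> str:
    """Pick a (hypothetical) encoding for a branch from pc to target.

    Use the direct form when the byte displacement fits in the field,
    otherwise fall back to loading the target into a register and
    branching through it.
    """
    disp = target - pc
    return "direct" if fits_signed(disp, disp_bits) else "load+branch-reg"
```

An assembler or linker could make this choice per branch, so the rare long branch costs an extra instruction and the common case keeps the short encoding.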

Re: Variable branch length encoding?

<ttpfsm$81ki$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=31004&group=comp.arch#31004

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Variable branch length encoding?
Date: Wed, 1 Mar 2023 22:31:48 -0800
Organization: A noiseless patient Spider
Lines: 137
Message-ID: <ttpfsm$81ki$1@dont-email.me>
References: <ttn1gq$3taue$1@dont-email.me> <lKKLL.884821$gGD7.549447@fx11.iad>
<tto37q$pil$1@dont-email.me> <PPMLL.37185$qpNc.1967@fx03.iad>
<tto6fe$163j$1@dont-email.me> <cONLL.947025$8_id.671747@fx09.iad>
<ttof1c$267e$1@dont-email.me>
<bafd7e03-efba-4a3b-9028-db6971311aedn@googlegroups.com>
<ttom2p$30u8$1@dont-email.me>
<63b23795-83c4-4519-b272-7b12cb22189fn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 2 Mar 2023 06:31:50 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="fcdb962efcdf496d0bf6a2b25d2c3fb7";
logging-data="263826"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/SxkB69ZEJXXsvn+d+gIADsiyUN3BFsFM="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.8.0
Cancel-Lock: sha1:PEd3IPzK7130CQod+5KuF0QlMpA=
Content-Language: en-US
In-Reply-To: <63b23795-83c4-4519-b272-7b12cb22189fn@googlegroups.com>

by: Stephen Fuld - Thu, 2 Mar 2023 06:31 UTC

On 3/1/2023 8:43 PM, robf...@gmail.com wrote:
> On Wednesday, March 1, 2023 at 6:11:25 PM UTC-5, Stephen Fuld wrote:
>> On 3/1/2023 2:42 PM, MitchAlsup wrote:
>>> On Wednesday, March 1, 2023 at 3:11:12 PM UTC-6, Stephen Fuld wrote:
>>>> On 3/1/2023 11:35 AM, EricP wrote:
>>>>> Stephen Fuld wrote:
>>>>>> On 3/1/2023 10:28 AM, EricP wrote:
>>>>>>> Stephen Fuld wrote:
>>>>>>>> On 3/1/2023 8:06 AM, EricP wrote:
>>>>>>>>> Terje Mathisen wrote:
>>>>>>>>>> Would it be possible or just a bad idea to allow some form of
>>>>>>>>>> variable/fp encoding of branch offsets?
>>>>>>>>>>
>>>>>>>>>> I.e. all small (byte) offsets would be encoded directly, but for
>>>>>>>>>> longer ones you could require the target to be word/dword/qword
>>>>>>>>>> aligned, and therefore allow a much greater range.
>>>>>>>>>>
>>>>>>>>>> It would probably only be useful for inter-module calls/branches...
>>>>>>>>>>
>>>>>>>>>> The main problem is probably that the target address adder is
>>>>>>>>>> already in the critical path, so you can't just add a little
>>>>>>>>>> shifter there as well.
>>>>>>>>>>
>>>>>>>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
>>>>>>>>>> this, right?)
>>>>>>>>>>
>>>>>>>>>> Terje
>>>>>>>>>
>>>>>>>>> As with any variable length instruction, extracting its size
>>>>>>>>> in the parse or decode stage is critical for concurrent decode.
>>>>>>>>> So ideally this offset *won't* be an "fp-like" data item with an
>>>>>>>>> "exponent" field appended to the opcode, but the instruction size
>>>>>>>>> in words is in the first byte and the internal instruction format
>>>>>>>>> is sorted out by decode.
>>>>>>>>>
>>>>>>>>> RIP-relative offsets are often relative to the incremented RIP
>>>>>>>>> and include the variable instruction's length. The alignment
>>>>>>>>> is for both the 'from' and 'to' addresses and aligning just the
>>>>>>>>> destination to 32 or 64 bits won't allow encoding a larger range.
>>>>>>>>>
>>>>>>>>> The issue I had is when instructions are not byte aligned,
>>>>>>>>> in my case 16-bit words. For branches the immediate field is a
>>>>>>>>> relative _word_ count to get one extra bit of offset range.
>>>>>>>>> But what to do when the immediate field is a 64-bit value:
>>>>>>>>> should that still be a word offset count,
>>>>>>>>> or should it switch to being a byte offset count?
>>>>>>>>>
>>>>>>>>> For LD and ST the immediate offset is always a byte offset,
>>>>>>>>> even if it is RIP-relative addressing.
>>>>>>>>>
>>>>>>>>> In my naming convention, Branch is to relative offset,
>>>>>>>>> Jump is to an absolute address.
>>>>>>>>>
>>>>>>>>> I have a "BR reg" with the offset in a register,
>>>>>>>>> intended for use in position-independent SWITCH statements.
>>>>>>>>> A scaled-indexed LD of the offset onto a register, then BR reg.
>>>>>>>>> But should that register offset be a byte count or a word count?
>>>>>>>>> Or should I have two "BR reg" instructions, one for each?
>>>>>>>>
>>>>>>>> I, perhaps mistakenly, interpreted Terje's suggestion differently.
>>>>>>>> I don't think he was suggesting variable length instructions, but an
>>>>>>>> orthogonal idea. Let's take an example. Suppose you have 10 bits
>>>>>>>> available for a branch offset. If the high order bit is zero, the
>>>>>>>> low order nine bits give a byte address. But if the high order bit
>>>>>>>> is one, you get the address by appending say 3 zero bits to the low
>>>>>>>> order 9 bits, giving a possible displacement of 12 bits, but
>>>>>>>> restricted to an 8 byte aligned address.
>>>>>>>>
>>>>>>>> Of course, this idea isn't applicable to just byte aligned
>>>>>>>> instructions, though its benefit diminishes as you increase the
>>>>>>>> instruction alignment requirements.
>>>>>>>>
>>>>>>>> Whether this is a good idea or not, I don't know.
>>>>>>>
>>>>>>> Ok, I understand. So the immediate field is fixed size and has
>>>>>>> scale data type bits indicating it is a count of int8,int16,int32,
>>>>>>> and a signed offset count of those sized words.
>>>>>>
>>>>>> Yes, although I doubt you need to support all three sizes.
>>>>>
>>>>> If we have a 16-bit immediate field,
>>>>> and bit[15]==0 means bits [14:0] are a signed int8 offset count,
>>>>> and bits [15:14]==10 means int16 offset count
>>>>>
>>>>> then the 15-bit int8 count is redundant with the 14-bit int16 count.
>>>>>
>>>>> So that prefix should select a different word size, maybe int24?
>>>>> Or just have two scale sizes, int8 and int32.
>>>> That is what I would suggest.
>>>>>>> The scale would be that of the smallest alignment of the 'from' and 'to'
>>>>>>> instructions, which is scaled up by inserting alignment NOPs before
>>>>>>> both.
>>>>>>
>>>>>> No need to align the "from" instruction.
>>>>>
>>>>> If you want the offset to be a count of int32's
>>>>> then both 'from' and 'to' need the same mod-4 alignment.
>>> <
>>>> Or, you have the rule that the address of the "from" instruction is
>>>> rounded up to the nearest size used for the displacement field. i.e. if
>>>> the address of the from instruction is say xxx03, and the displacement
>>>> is scaled to 4 byte values, then the xxx03 is "rounded up" to xxx04.
>>> <
>>> Rounding is significantly (several gates of delay) more expensive than
>>> truncation (free).
>> Fair enough. I think this whole thing is a little overblown, as it
>> really applies only to systems with byte aligned (or perhaps half word
>> aligned without a pairing requirement) instructions, which, other than X86 I
>> think are pretty rare. If you stick with 32 bit instructions, then I
>> think Terje's original idea is just not worth it.
>> --
>> - Stephen Fuld
>> (e-mail address disguised to prevent spam)
>
> I agree. With 32-bit or more instructions the displacement field can be large
> enough to cover practically all branch cases. There are seldom code modules
> larger than 16MB meaning 24-bits is probably adequate.

Remember, if all instructions are on 32 bit boundaries, you "gain" two
bits of range by appending two zero bits to the displacement. So 16MB
only requires 22 bits, not 24.
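
The arithmetic can be sketched as follows, assuming the displacement counts units of the instruction alignment and that the span is the total distance the field has to cover. The function names are made up for illustration:

```python
def displacement_bits(span_bytes: int, align_bytes: int) -> int:
    """Field width needed to distinguish every aligned slot in span_bytes
    when the displacement is counted in units of align_bytes."""
    slots = span_bytes // align_bytes
    return (slots - 1).bit_length()


def reach_bytes(field_bits: int, align_bytes: int) -> int:
    """Total span covered by a field of field_bits counting align_bytes units."""
    return (1 << field_bits) * align_bytes


MB = 1 << 20
byte_granular = displacement_bits(16 * MB, 1)   # byte displacement
word_granular = displacement_bits(16 * MB, 4)   # 32-bit aligned instructions
```

For a 16 MB span the byte-granular field comes out two bits wider than the one counting 32-bit words, which is the "gain" from the two appended zero bits.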

> The branch
> displacement does not need to be able to cover the whole physical address
> space when virtual addressing is available. For conditional branches a
> displacement of 16-bits covers virtually all cases. If there is the odd case
> where the branch displacement is not large enough it is possible to load a
> value into a register and branch to that.

Yes, I agree.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Variable branch length encoding?

<ttpg6a$826k$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=31005&group=comp.arch#31005

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Variable branch length encoding?
Date: Wed, 1 Mar 2023 22:36:58 -0800
Organization: A noiseless patient Spider
Lines: 138
Message-ID: <ttpg6a$826k$1@dont-email.me>
References: <ttn1gq$3taue$1@dont-email.me> <lKKLL.884821$gGD7.549447@fx11.iad>
<tto37q$pil$1@dont-email.me> <PPMLL.37185$qpNc.1967@fx03.iad>
<tto6fe$163j$1@dont-email.me> <cONLL.947025$8_id.671747@fx09.iad>
<ttof1c$267e$1@dont-email.me>
<bafd7e03-efba-4a3b-9028-db6971311aedn@googlegroups.com>
<ttom2p$30u8$1@dont-email.me>
<63b23795-83c4-4519-b272-7b12cb22189fn@googlegroups.com>
<ttpfsm$81ki$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 2 Mar 2023 06:36:58 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="fcdb962efcdf496d0bf6a2b25d2c3fb7";
logging-data="264404"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19PwEBlPfLF8pnTalaBbtiUtCziKTRTDj4="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.8.0
Cancel-Lock: sha1:9o9S1pNbpq4khRc/nobNTuSUv10=
In-Reply-To: <ttpfsm$81ki$1@dont-email.me>
Content-Language: en-US

by: Stephen Fuld - Thu, 2 Mar 2023 06:36 UTC

On 3/1/2023 10:31 PM, Stephen Fuld wrote:
> On 3/1/2023 8:43 PM, robf...@gmail.com wrote:
>> On Wednesday, March 1, 2023 at 6:11:25 PM UTC-5, Stephen Fuld wrote:
>>> On 3/1/2023 2:42 PM, MitchAlsup wrote:
>>>> On Wednesday, March 1, 2023 at 3:11:12 PM UTC-6, Stephen Fuld wrote:
>>>>> On 3/1/2023 11:35 AM, EricP wrote:
>>>>>> Stephen Fuld wrote:
>>>>>>> On 3/1/2023 10:28 AM, EricP wrote:
>>>>>>>> Stephen Fuld wrote:
>>>>>>>>> On 3/1/2023 8:06 AM, EricP wrote:
>>>>>>>>>> Terje Mathisen wrote:
>>>>>>>>>>> Would it be possible or just a bad idea to allow some form of
>>>>>>>>>>> variable/fp encoding of branch offsets?
>>>>>>>>>>>
>>>>>>>>>>> I.e. all small (byte) offsets would be encoded directly, but for
>>>>>>>>>>> longer ones you could require the target to be word/dword/qword
>>>>>>>>>>> aligned, and therefore allow a much greater range.
>>>>>>>>>>>
>>>>>>>>>>> It would probably only be useful for inter-module
>>>>>>>>>>> calls/branches...
>>>>>>>>>>>
>>>>>>>>>>> The main problem is probably that the target address adder is
>>>>>>>>>>> already in the critical path, so you can't just add a little
>>>>>>>>>>> shifter there as well.
>>>>>>>>>>>
>>>>>>>>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing
>>>>>>>>>>> some of
>>>>>>>>>>> this, right?)
>>>>>>>>>>>
>>>>>>>>>>> Terje
>>>>>>>>>>
>>>>>>>>>> As with any variable length instruction, extracting its size
>>>>>>>>>> in the parse or decode stage is critical for concurrent decode.
>>>>>>>>>> So ideally this offset *won't* be an "fp-like" data item with an
>>>>>>>>>> "exponent" field appended to the opcode, but the instruction size
>>>>>>>>>> in words is in the first byte and the internal instruction format
>>>>>>>>>> is sorted out by decode.
>>>>>>>>>>
>>>>>>>>>> RIP-relative offsets are often relative to the incremented RIP
>>>>>>>>>> and include the variable instruction's length. The alignment
>>>>>>>>>> is for both the 'from' and 'to' addresses and aligning just the
>>>>>>>>>> destination to 32 or 64 bits won't allow encoding a larger range.
>>>>>>>>>>
>>>>>>>>>> The issue I had is when instructions are not byte aligned,
>>>>>>>>>> in my case 16-bit words. For branches the immediate field is a
>>>>>>>>>> relative _word_ count to get one extra bit of offset range.
>>>>>>>>>> But what to do when the immediate field is a 64-bit value:
>>>>>>>>>> should that still be a word offset count,
>>>>>>>>>> or should it switch to being a byte offset count?
>>>>>>>>>>
>>>>>>>>>> For LD and ST the immediate offset is always a byte offset,
>>>>>>>>>> even if it is RIP-relative addressing.
>>>>>>>>>>
>>>>>>>>>> In my naming convention, Branch is to relative offset,
>>>>>>>>>> Jump is to an absolute address.
>>>>>>>>>>
>>>>>>>>>> I have a "BR reg" with the offset in a register,
>>>>>>>>>> intended for use in position-independent SWITCH statements.
>>>>>>>>>> A scaled-indexed LD of the offset onto a register, then BR reg.
>>>>>>>>>> But should that register offset be a byte count or a word count?
>>>>>>>>>> Or should I have two "BR reg" instructions, one for each?
>>>>>>>>>
>>>>>>>>> I, perhaps mistakenly, interpreted Terje's suggestion differently.
>>>>>>>>> I don't think he was suggesting variable length instructions,
>>>>>>>>> but an
>>>>>>>>> orthogonal idea. Let's take an example. Suppose you have 10 bits
>>>>>>>>> available for a branch offset. If the high order bit is zero, the
>>>>>>>>> low order nine bits give a byte address. But if the high order bit
>>>>>>>>> is one, you get the address by appending say 3 zero bits to the
>>>>>>>>> low
>>>>>>>>> order 9 bits, giving a possible displacement of 12 bits, but
>>>>>>>>> restricted to an 8 byte aligned address.
>>>>>>>>>
>>>>>>>>> Of course, this idea isn't applicable to just byte aligned
>>>>>>>>> instructions, though its benefit diminishes as you increase the
>>>>>>>>> instruction alignment requirements.
>>>>>>>>>
>>>>>>>>> Whether this is a good idea or not, I don't know.
>>>>>>>>
>>>>>>>> Ok, I understand. So the immediate field is fixed size and has
>>>>>>>> scale data type bits indicating it is a count of int8,int16,int32,
>>>>>>>> and a signed offset count of those sized words.
>>>>>>>
>>>>>>> Yes, although I doubt you need to support all three sizes.
>>>>>>
>>>>>> If we have a 16-bit immediate field,
>>>>>> and bit[15]==0 means bits [14:0] are a signed int8 offset count,
>>>>>> and bits [15:14]==10 means int16 offset count
>>>>>>
>>>>>> then the 15-bit int8 count is redundant with the 14-bit int16 count.
>>>>>>
>>>>>> So that prefix should select a different word size, maybe int24?
>>>>>> Or just have two scale sizes, int8 and int32.
>>>>> That is what I would suggest.
>>>>>>>> The scale would be that of the smallest alignment of the 'from'
>>>>>>>> and 'to'
>>>>>>>> instructions, which is scaled up by inserting alignment NOPs before
>>>>>>>> both.
>>>>>>>
>>>>>>> No need to align the "from" instruction.
>>>>>>
>>>>>> If you want the offset to be a count of int32's
>>>>>> then both 'from' and 'to' need the same mod-4 alignment.
>>>> <
>>>>> Or, you have the rule that the address of the "from" instruction is
>>>>> rounded up to the nearest size used for the displacement field.
>>>>> i.e. if
>>>>> the address of the from instruction is say xxx03, and the displacement
>>>>> is scaled to 4 byte values, then the xxx03 is "rounded up" to xxx04.
>>>> <
>>>> Rounding is significantly (several gates of delay) more expensive than
>>>> truncation (free).
>>> Fair enough. I think this whole thing is a little overblown, as it
>>> really applies only to systems with byte aligned (or perhaps half word
>>> aligned without a pairing requirement) instructions, which, other than X86 I
>>> think are pretty rare. If you stick with 32 bit instructions, then I
>>> think Terje's original idea is just not worth it.
>>> --
>>> - Stephen Fuld
>>> (e-mail address disguised to prevent spam)
>>
>> I agree. With 32-bit or more instructions the displacement field can
>> be large
>> enough to cover practically all branch cases. There are seldom code
>> modules
>> larger than 16MB meaning 24-bits is probably adequate.
>
> Remember, if all instructions are on 32 bit boundaries, you "gain" two
> bits of range by appending two zero bits to the displacement. So 16MB
> only requires 22 bits, not 24.

Re: Variable branch length encoding?

<ttpk5u$8dtu$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=31006&group=comp.arch#31006

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Variable branch length encoding?
Date: Thu, 2 Mar 2023 08:45:01 +0100
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <ttpk5u$8dtu$1@dont-email.me>
References: <ttn1gq$3taue$1@dont-email.me>
<f7d4bf44-d8cb-4359-8d32-dbe06bb247ebn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 2 Mar 2023 07:45:02 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="aba2ed9c8c39f127d5d9d7f87751415f";
logging-data="276414"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+TdT1hqrvmOoVB9pQKhyGMNv6zzrK6PANHpuLW9OKhuQ=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.15
Cancel-Lock: sha1:kW/SG/27mztON6UmxQ1ozbHk04o=
In-Reply-To: <f7d4bf44-d8cb-4359-8d32-dbe06bb247ebn@googlegroups.com>

by: Terje Mathisen - Thu, 2 Mar 2023 07:45 UTC

MitchAlsup wrote:
> On Wednesday, March 1, 2023 at 2:14:22 AM UTC-6, Terje Mathisen wrote:
>> Would it be possible or just a bad idea to allow some form of
>> variable/fp encoding of branch offsets?
>>
>> I.e. all small (byte) offsets would be encoded directly, but for longer
>> ones you could require the target to be word/dword/qword aligned, and
>> therefore allow a much greater range.
>>
>> It would probably only be useful for inter-module calls/branches...
> <
> It is generally OK to place entry-points on cache line boundaries (about 64Byte
> alignments) so you get 8 low order bits (which word instruction machines
> already use the lowest 2-bits.) A gain of 6-bits.
> <
> It is generally not OK to require cache line alignment on labels within a
> subroutine.

Yeah, that was my assumption (see above), so that this would only be
applicable for OS/library interfaces, but I suspect that it is far
better to simply allow large/arbitrary sized immediates to be able to
reach wherever the target is located.
> <
> It is generally a good idea that BR and CALL use the same multiplexer
> and address generator. So, mainly my first paragraph above is in conflict
> with the leading sentence of this paragraph.
>>
>> The main problem is probably that the target address adder is already in
>> the critical path, so you can't just add a little shifter there as well.
>>
>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this,
>> right?)
> <
> In a pipeline microarchitecture, the adder calculating branch targets is
> seldom the adder performing memory address generation.

??? I don't think I understand this paragraph?

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Variable branch length encoding?

<ttpklt$8fdv$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=31007&group=comp.arch#31007

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Variable branch length encoding?
Date: Thu, 2 Mar 2023 08:53:32 +0100
Organization: A noiseless patient Spider
Lines: 122
Message-ID: <ttpklt$8fdv$1@dont-email.me>
References: <ttn1gq$3taue$1@dont-email.me> <lKKLL.884821$gGD7.549447@fx11.iad>
<tto37q$pil$1@dont-email.me> <PPMLL.37185$qpNc.1967@fx03.iad>
<tto6fe$163j$1@dont-email.me> <cONLL.947025$8_id.671747@fx09.iad>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 2 Mar 2023 07:53:33 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="aba2ed9c8c39f127d5d9d7f87751415f";
logging-data="277951"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18LvJtATMpcdzva1hPvMySJ9njZYxdBBkMYuu5Lsw6bpw=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.15
Cancel-Lock: sha1:1Ovdezj7OaG5ImHgZyhY75kAZq8=
In-Reply-To: <cONLL.947025$8_id.671747@fx09.iad>

by: Terje Mathisen - Thu, 2 Mar 2023 07:53 UTC

EricP wrote:
> Stephen Fuld wrote:
>> On 3/1/2023 10:28 AM, EricP wrote:
>>> Stephen Fuld wrote:
>>>> On 3/1/2023 8:06 AM, EricP wrote:
>>>>> Terje Mathisen wrote:
>>>>>> Would it be possible or just a bad idea to allow some form of
>>>>>> variable/fp encoding of branch offsets?
>>>>>>
>>>>>> I.e. all small (byte) offsets would be encoded directly, but for
>>>>>> longer ones you could require the target to be word/dword/qword
>>>>>> aligned, and therefore allow a much greater range.
>>>>>>
>>>>>> It would probably only be useful for inter-module calls/branches...
>>>>>>
>>>>>> The main problem is probably that the target address adder is
>>>>>> already in the critical path, so you can't just add a little
>>>>>> shifter there as well.
>>>>>>
>>>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of
>>>>>> this, right?)
>>>>>>
>>>>>> Terje
>>>>>
>>>>> As with any variable length instruction, extracting its size
>>>>> in the parse or decode stage is critical for concurrent decode.
>>>>> So ideally this offset *won't* be an "fp-like" data item with an
>>>>> "exponent" field appended to the opcode, but the instruction size
>>>>> in words is in the first byte and the internal instruction format
>>>>> is sorted out by decode.
>>>>>
>>>>> RIP-relative offsets are often relative to the incremented RIP
>>>>> and include the variable instruction's length. The alignment
>>>>> is for both the 'from' and 'to' addresses and aligning just the
>>>>> destination to 32 or 64 bits won't allow encoding a larger range.
>>>>>
>>>>> The issue I had is when instructions are not byte aligned,
>>>>> in my case 16-bit words. For branches the immediate field is a
>>>>> relative _word_ count to get one extra bit of offset range.
>>>>> But what to do when the immediate field is a 64-bit value:
>>>>> should that still be a word offset count,
>>>>> or should it switch to being a byte offset count?
>>>>>
>>>>> For LD and ST the immediate offset is always a byte offset,
>>>>> even if it is RIP-relative addressing.
>>>>>
>>>>> In my naming convention, Branch is to relative offset,
>>>>> Jump is to an absolute address.
>>>>>
>>>>> I have a "BR reg" with the offset in a register,
>>>>> intended for use in position-independent SWITCH statements.
>>>>> A scaled-indexed LD of the offset onto a register, then BR reg.
>>>>> But should that register offset be a byte count or a word count?
>>>>> Or should I have two "BR reg" instructions, one for each?
>>>>
>>>> I, perhaps mistakenly, interpreted Terje's suggestion differently.
>>>> I don't think he was suggesting variable length instructions, but an
>>>> orthogonal idea. Let's take an example. Suppose you have 10 bits
>>>> available for a branch offset. If the high order bit is zero, the
>>>> low order nine bits give a byte address. But if the high order bit
>>>> is one, you get the address by appending say 3 zero bits to the low
>>>> order 9 bits, giving a possible displacement of 12 bits, but
>>>> restricted to an 8 byte aligned address.
>>>>
>>>> Of course, this idea isn't applicable to just byte aligned
>>>> instructions, though its benefit diminishes as you increase the
>>>> instruction alignment requirements.
>>>>
>>>> Whether this is a good idea or not, I don't know.
>>>
>>> Ok, I understand. So the immediate field is fixed size and has
>>> scale data type bits indicating it is a count of int8,int16,int32,
>>> and a signed offset count of those sized words.
>>
>> Yes, although I doubt you need to support all three sizes.
>
> If we have a 16-bit immediate field,
> and bit[15]==0 means bits [14:0] are a signed int8 offset count,
> and bits [15:14]==10 means int16 offset count
>
> then the 15-bit int8 count is redundant with the 14-bit int16 count.

Right, so this is obviously not feasible for a single-bit scaling.
Using 4/8/16-byte alignment makes a bit more sense. (But probably still
a bad idea.)
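
To make the redundancy concrete, here is a small sketch using the 16-bit field from the quoted example: a 15-bit count of bytes versus a 14-bit count of larger units. The helper span_bytes and the gains table are illustrative only:

```python
def span_bytes(count_bits: int, scale: int) -> int:
    """Total span in bytes reachable by a count of `count_bits` bits
    where each count step is `scale` bytes."""
    return (1 << count_bits) * scale


# Unscaled form: 1 prefix bit + 15-bit byte count.
byte_span = span_bytes(15, 1)

# Scaled form: 2 prefix bits + 14-bit count of `scale`-byte units.
# gains[scale] is how many times further the scaled form reaches.
gains = {scale: span_bytes(14, scale) // byte_span for scale in (2, 4, 8, 16)}
```

With a scale of 2 the scaled form reaches exactly as far as the plain byte count, so it buys nothing; each doubling of the alignment beyond that doubles the reach.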

>
> So that prefix should select a different word size, maybe int24?
> Or just have two scale sizes, int8 and int32.
>
>>> The scale would be that of the smallest alignment of the 'from' and 'to'
>>> instructions, which is scaled up by inserting alignment NOPs before
>>> both.
>>
>> No need to align the "from" instruction.
>
> If you want the offset to be a count of int32's
> then both 'from' and 'to' need the same mod-4 alignment.
>
> That is, in order to use an int32 count as the offset then
> both BR and target instruction addresses bits [1:0] must be the same.

I did consider this and realized that for a dword-aligned target, the
source address would simply ignore any lower-order bytes and instead use
the truncated dword as the source. (Same for 8/16/32/64-byte aligned
destinations, whichever you decide that it makes sense to support.)
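
A minimal sketch of that truncation rule, assuming the scale is a power of two and the displacement is a signed count of scale-sized units. The function names are hypothetical:

```python
def encode_scaled(pc: int, target: int, scale: int) -> int:
    """Displacement (in units of `scale` bytes) from the truncated branch
    address to an aligned target. `scale` must be a power of two."""
    if target % scale != 0:
        raise ValueError("target is not aligned to the displacement scale")
    base = pc & ~(scale - 1)        # drop the low-order bits of the source
    return (target - base) // scale


def decode_scaled(pc: int, disp: int, scale: int) -> int:
    """Recover the target: truncate the source address, add the scaled count."""
    return (pc & ~(scale - 1)) + disp * scale
```

Because the source is masked rather than rounded, the hardware only has to ignore the low address bits, and the branch itself can sit at any byte address.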

Terje
>
>>> I suppose it depends on the size of the immediate field as to how
>>> often it would have to insert NOPs so that it can use a larger scale
>>> to bring a destination back into range.
>>
>> Yes. And also the minimum size of an instruction. E.g. for CPUs with
>> 32 bit instructions, you "automatically" get a factor of four in the
>> range of the branch instructions.
>

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Variable branch length encoding?

<71ad5eec-2d02-4dee-84b3-e489924233den@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=31008&group=comp.arch#31008

X-Received: by 2002:a05:620a:8413:b0:733:4e2d:7834 with SMTP id pc19-20020a05620a841300b007334e2d7834mr2514515qkn.4.1677754208683;
Thu, 02 Mar 2023 02:50:08 -0800 (PST)
X-Received: by 2002:a05:6830:26e8:b0:68b:e0dc:abc7 with SMTP id
m40-20020a05683026e800b0068be0dcabc7mr3298237otu.4.1677754208469; Thu, 02 Mar
2023 02:50:08 -0800 (PST)
Path: i2pn2.org!rocksolid2!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 2 Mar 2023 02:50:08 -0800 (PST)
In-Reply-To: <ttn1gq$3taue$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=128.76.247.189; posting-account=tYjOgQoAAACRs74arwcusKjVVQt_fFMX
NNTP-Posting-Host: 128.76.247.189
References: <ttn1gq$3taue$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <71ad5eec-2d02-4dee-84b3-e489924233den@googlegroups.com>
Subject: Re: Variable branch length encoding?
From: agf...@dtu.dk (Agner Fog)
Injection-Date: Thu, 02 Mar 2023 10:50:08 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2094

by: Agner Fog - Thu, 2 Mar 2023 10:50 UTC

Terje wrote:
>The main problem is probably that the target address adder is already in
>the critical path, so you can't just add a little shifter there as well.

It is not in a critical path if you have branch prediction, and the target address
is needed only at the end of a long pipeline.

The main problem is that you don't know if a target address in a different module
is aligned by a higher power of 2. A misalignment error will only be detected at
the link stage.
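
To make the misalignment concern concrete, here is a minimal sketch of what a linker-side encoder for such a scaled offset might look like. The field layout (a 2-bit scale exponent plus an 8-bit signed mantissa), the choice to count from the branch address rounded down to the scale, and the function names are all assumptions made for illustration; none of them come from the thread.

```python
# Hypothetical "fp-like" branch field: 2-bit exponent, 8-bit signed mantissa.
MANT_BITS = 8
MAX_EXP = 3  # scales of 1, 2, 4 and 8 bytes


def encode_scaled_branch(pc, target):
    """Return (exp, mantissa) using the smallest scale that reaches target.

    Raises ValueError when no scale works, which is the error that would
    only surface at link time for a target in another module.
    """
    lo = -(1 << (MANT_BITS - 1))
    hi = (1 << (MANT_BITS - 1)) - 1
    for exp in range(MAX_EXP + 1):
        scale = 1 << exp
        if target % scale != 0:
            break  # larger scales need even stricter alignment
        base = pc & ~(scale - 1)  # branch address rounded down to the scale
        mant = (target - base) >> exp
        if lo <= mant <= hi:
            return exp, mant
    raise ValueError("target is misaligned for the scale needed, or out of range")


def decode_scaled_branch(pc, exp, mant):
    """Inverse of encode_scaled_branch under the same assumptions."""
    scale = 1 << exp
    return (pc & ~(scale - 1)) + (mant << exp)
```

With these assumptions, a far target that happens to be 8-byte aligned encodes fine, while the same target moved by one byte cannot be encoded at all, and the encoder can only find that out once the final addresses are known.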

>(Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this, right?)

The LEA instruction in x86 is implemented in the ALU, rather than the AGU in modern
processors. Sometimes, it requires 2 µops. Some old processors that have LEA in
the AGU have an extra delay because the address calculation is in a different
pipeline stage than the ALU.

Some cases of reading from a memory operand have an
extra delay if the address calculation is complex.
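
For reference, the arithmetic behind the LEA operand quoted above is one shift and two additions. A minimal sketch, assuming 64-bit registers with wraparound; the function name `lea` and its parameters are only illustrative:

```python
MASK64 = (1 << 64) - 1


def lea(base, index, scale, disp):
    """Effective address base + index*scale + disp, wrapped to 64 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 SIB scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & MASK64
```

For the quoted operand, `lea(r2, r3, 8, 0x12345678)` would model `[r2+8*r3+0x12345678]`.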

Re: Variable branch length encoding?

by: MitchAlsup - Thu, 2 Mar 2023 18:50 UTC

On Thursday, March 2, 2023 at 1:45:05 AM UTC-6, Terje Mathisen wrote:
> MitchAlsup wrote:
> > On Wednesday, March 1, 2023 at 2:14:22 AM UTC-6, Terje Mathisen wrote:
> >> Would it be possible or just a bad idea to allow some form of
> >> variable/fp encoding of branch offsets?
> >>
> >> I.e. all small (byte) offsets would be encoded directly, but for longer
> >> ones you could require the target to be word/dword/qword aligned, and
> >> therefore allow a much greater range.
> >>
> >> It would probably only be useful for inter-module calls/branches...
> > <
> > It is generally OK to place entry-points on cache line boundaries (about 64Byte
> > alignments) so you get 8 low order bits (which word instruction machines
> > already use the lowest 2-bits.) A gain of 6-bits.
> > <
> > It is generally not OK to require cache line alignment on labels within a
> > subroutine.
> Yeah, that was my assumption (see above), so that this would only be
> applicable for OS/library interfaces, but I suspect that it is far
> better to simply allow large/arbitrary sized immediates to be able to
> reach wherever the target is located.
> > <
> > It is generally a good idea that BR and CALL use the same multiplexer
> > and address generator. So, mainly my first paragraph above is in conflict
> > with the leading sentence of this paragraph.
> >>
> >> The main problem is probably that the target address adder is already in
> >> the critical path, so you can't just add a little shifter there as well.
> >>
> >> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this,
> >> right?)
> > <
> > In a pipeline microarchitecture, the added calculating branch targets is
> > seldom the adder performing memory address generation.
<
> ??? I don't think I understand this paragraph?
<
There is an adder dedicated to branch target calculations that exists in the
DECODE stage of the pipeline. There is the data AGEN adder that exists in
the EXECUTE stage of the pipeline. You don't move logic across stage
boundaries.
<
> Terje
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

Re: Variable branch length encoding?

by: Terje Mathisen - Thu, 2 Mar 2023 19:47 UTC

MitchAlsup wrote:
> On Thursday, March 2, 2023 at 1:45:05 AM UTC-6, Terje Mathisen wrote:
>> MitchAlsup wrote:
>>> On Wednesday, March 1, 2023 at 2:14:22 AM UTC-6, Terje Mathisen wrote:
>>>> Would it be possible or just a bad idea to allow some form of
>>>> variable/fp encoding of branch offsets?
>>>>
>>>> I.e. all small (byte) offsets would be encoded directly, but for longer
>>>> ones you could require the target to be word/dword/qword aligned, and
>>>> therefore allow a much greater range.
>>>>
>>>> It would probably only be useful for inter-module calls/branches...
>>> <
>>> It is generally OK to place entry-points on cache line boundaries (about 64Byte
>>> alignments) so you get 8 low order bits (which word instruction machines
>>> already use the lowest 2-bits.) A gain of 6-bits.
>>> <
>>> It is generally not OK to require cache line alignment on labels within a
>>> subroutine.
>> Yeah, that was my assumption (see above), so that this would only be
>> applicable for OS/library interfaces, but I suspect that it is far
>> better to simply allow large/arbitrary sized immediates to be able to
>> reach wherever the target is located.
>>> <
>>> It is generally a good idea that BR and CALL use the same multiplexer
>>> and address generator. So, mainly my first paragraph above is in conflict
>>> with the leading sentence of this paragraph.
>>>>
>>>> The main problem is probably that the target address adder is already in
>>>> the critical path, so you can't just add a little shifter there as well.
>>>>
>>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this,
>>>> right?)
>>> <
>>> In a pipeline microarchitecture, the added calculating branch targets is
>>> seldom the adder performing memory address generation.
> <
>> ??? I don't think I understand this paragraph?
> <
> There is an adder dedicated to branch target calculations that exists in the
> DECODE stage of the pipeline. There is the data AGEN adder that exists in
> the EXECUTE stage of the pipeline. You don't move logic across stage
> boundaries.

OK, that makes sense. It was the small typo where you wrote "the added
calculating branch targets" which I now understand should have been "the
adder calculating..." which caused my confusion.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Variable branch length encoding?

by: Thomas Koenig - Fri, 3 Mar 2023 06:39 UTC

Agner Fog <agfo@dtu.dk> wrote:
> Terje wrote:
>>The main problem is probably that the target address adder is already in
>>the critical path, so you can't just add a little shifter there as well.
>
> It is not in a critical path if you have branch prediction, and the target address
> is needed only at the end of a long pipeline.
>
> The main problem is that you don't know if a target address in a different module
> is aligned by a higher power of 2. A misalignment error will only be detected at
> the link stage.

That can be addressed in the ABI prescribing the alignment of functions
in general.

It would also be possible to prescribe different alignment for internal
and external functions, to save space. But that would have to address
the question of call via function pointers (e.g. a vtab). And
if the compiler does devirtualization... that can get messy.

Re: Variable branch length encoding?

by: Quadibloc - Fri, 3 Mar 2023 17:20 UTC

On Thursday, March 2, 2023 at 11:39:30 PM UTC-7, Thomas Koenig wrote:
> Agner Fog <ag...@dtu.dk> wrote:
> > Terje wrote:
> >>The main problem is probably that the target address adder is already in
> >>the critical path, so you can't just add a little shifter there as well.
> >
> > It is not in a critical path if you have branch prediction, and the target address
> > is needed only at the end of a long pipeline.
> >
> > The main problem is that you don't know if a target address in a different module
> > is aligned by a higher power of 2. A misalignment error will only be detected at
> > the link stage.
> That can be addressed in the ABI prescribing the alignment of functions
> in general.
>
> It would also be possible to prescribe different alignment for internal
> and external functions, to save space. But that would have to address
> the question of call via function pointers (e.g. a vtab). And
> if the compiler does devirtualization... that can get messy.

Of course, in the typical case which I've seen actually existing in the
wild, this issue simply does not arise.

There is the short form of branch, which is 16 bits long, and contains
an 8-bit relative offset. This is a multiple of 16 bits, since instructions
are aligned on 16 bit boundaries, as it's important not to waste bits
in such a short address.

There is the long form of branch, which is 32 bits long. There is a
16-bit address field in the instruction. This is in bytes; the last bit is
therefore wasted, this is for consistency with all other memory
reference instructions, where the address field is also in bytes.

Obviously, all jumps to external addresses must be in the long form.
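
The reach of the two forms described above can be worked out directly. This is a rough sketch, assuming the 8-bit short offset is signed and counted from the branch's own address, neither of which is stated above; the function names are illustrative.

```python
HALFWORD = 2  # instructions are aligned on 16-bit boundaries


def short_form_reach(field_bits=8):
    """Byte reach (lowest, highest) of a signed offset counted in halfwords."""
    half = 1 << (field_bits - 1)
    return (-half * HALFWORD, (half - 1) * HALFWORD)


def long_form_targets(field_bits=16):
    """Distinct halfword-aligned targets of a byte-granular address field."""
    return (1 << field_bits) // HALFWORD


def pick_form(branch_addr, target_addr):
    """Choose the short form when the target is within reach, else long."""
    if branch_addr % HALFWORD or target_addr % HALFWORD:
        raise ValueError("instruction addresses must be 16-bit aligned")
    lo, hi = short_form_reach()
    return "short" if lo <= target_addr - branch_addr <= hi else "long"
```

Scaling the short offset by the instruction alignment doubles its byte reach, while the byte-granular long field spends one of its 16 bits on values that can never be valid branch targets.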

Attempts to... basically recompile programs in the link stage... are
a bad idea, I would think. Of course, some new advanced architecture
might do things differently from traditional ones.

John Savard

Re: Variable branch length encoding?

by: BGB - Fri, 3 Mar 2023 18:32 UTC

On 3/1/2023 12:44 PM, MitchAlsup wrote:
> On Wednesday, March 1, 2023 at 10:06:45 AM UTC-6, EricP wrote:
>> Terje Mathisen wrote:
>>> Would it be possible or just a bad idea to allow some form of
>>> variable/fp encoding of branch offsets?
>>>
>>> I.e. all small (byte) offsets would be encoded directly, but for longer
>>> ones you could require the target to be word/dword/qword aligned, and
>>> therefore allow a much greater range.
>>>
>>> It would probably only be useful for inter-module calls/branches...
>>>
>>> The main problem is probably that the target address adder is already in
>>> the critical path, so you can't just add a little shifter there as well.
>>>
>>> (Even though LEA r1,[r2+8*r3+0x12345678] is already doing some of this,
>>> right?)
>>>
>>> Terje
>> As with any variable length instruction, extracting its size
>> in the parse or decode stage is critical for concurrent decode.
>> So ideally this offset *won't* be an "fp-like" data item with an
>> "exponent" field appended to the opcode, but the instruction size
>> in words is in the first byte and the internal instruction format
>> is sorted out by decode.
>>
>> RIP-relative offsets are often relative to the incremented RIP
>> and include the variable instruction's length. The alignment
>> is for both the 'from' and 'to' addresses and aligning just the
>> destination to 32 or 64 bits won't allow encoding a larger range.
>>
>> The issue I had is when instructions are not byte aligned,
>> in my case 16-bit words. For branches the immediate field is a
>> relative _word_ count to get one extra bit of offset range.
>> But what to do when the immediate field is a 64-bit value:
>> should that still be a word offset count,
>> or should it switch to being a byte offset count?
> <
> Item 1
>>
>> For LD and ST the immediate offset is always a byte offset,
>> even if it is RIP-relative addressing.
>>
>> In my naming convention, Branch is to relative offset,
>> Jump is to an absolute address.
>>
>> I have a "BR reg" with the offset in a register,
>> intended for use in position-independent SWITCH statements.
>> A scaled-indexed LD of the offset onto a register, then BR reg.
>> But should that register offset be a byte count or a word count?
>> Or should I have two "BR reg" instructions, one for each?
> <
> I agree that Branches should be to displacements (IP = IP+DISP)
> I agree that JUMPs should be to absolute locations (IP = value)
> <
> The timing in the pipeline where reg becomes available is in conflict
> with the time when the next fetch address is needed. Reading and
> forwarding take significantly longer than IP+DISP calculation. So,
> this typically adds 1 cycle of delay to BR reg forms.
> <
> Item 1::
> <
> My BR and CALL instructions have a 2^28-byte range from 26-bit
> displacement.
> But when I switch to the forms with constants, I switch to
> absolute addressing.

My BRA and BSR instructions are 20 bits (relative).
There are also absolute-48 bit versions (64-bit encoding).

Technically, there are also relative 33 bit branches (also a 64-bit
encoding), but these will be skipped by the branch predictor (and thus
slower than either the Disp20 or Abs48 encodings). These will need to
always take the "slow path" (via the AGU and explicit branch mechanism).

In my case, branch displacements are always in terms of 16 bit units.
Abs48 gives a byte address, but currently the LSB is effectively MBZ
(and the instruction stream has a mandatory 16-bit alignment).
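
As a rough illustration of what these encodings can reach, here is a small sketch. It assumes the 20-bit displacement is a signed two's complement count of 16-bit units; the signedness and the helper names are assumptions, not a description of the actual hardware.

```python
DISP_BITS = 20
UNIT = 2  # displacements count 16-bit units


def disp20_reach():
    """Byte reach (lowest, highest) of a signed Disp20 field."""
    half = 1 << (DISP_BITS - 1)
    return (-half * UNIT, (half - 1) * UNIT)


def encode_disp20(delta_bytes):
    """Pack a byte delta into a 20-bit two's complement field."""
    if delta_bytes % UNIT:
        raise ValueError("delta must be a multiple of 2 bytes")
    half = 1 << (DISP_BITS - 1)
    units = delta_bytes // UNIT
    if not -half <= units < half:
        raise ValueError("delta does not fit in Disp20")
    return units & ((1 << DISP_BITS) - 1)


def abs48_valid(addr):
    """An Abs48 target is a 48-bit byte address whose LSB must be zero."""
    return 0 <= addr < (1 << 48) and (addr & 1) == 0
```

Under these assumptions Disp20 covers roughly 1 MB in either direction, which is why the longer encodings are needed for anything further away.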

In my case:
BRA disp
Is handled by the branch predictor (in ID1), and has 2 cycles of latency.

Avoiding the latency cycle would require handling branch prediction
during the IF stage.

Most conditional branches depend on whether or not the predictor
predicts them, etc. Branches will be skipped by the branch predictor if
they cross a 16 MB boundary (mostly because an adder and carry
propagation across 24 bits is faster than 48 bits).
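
The 16 MB restriction amounts to doing only a 24-bit add and giving up whenever it carries or borrows out of the low bits. A minimal sketch of that check follows; the function name is hypothetical, and which address the displacement is counted from is left as an assumption.

```python
LOW_BITS = 24                # 2**24 bytes = 16 MB
LOW_MASK = (1 << LOW_BITS) - 1


def predictor_target(pc, delta_bytes):
    """Return the predicted target, or None if the branch would cross
    a 16 MB boundary and therefore has to take the slow path."""
    low = (pc & LOW_MASK) + delta_bytes   # narrow add on the low 24 bits only
    if low < 0 or low > LOW_MASK:
        return None                       # carry or borrow out of bit 23
    return (pc & ~LOW_MASK) | low         # upper bits pass through unchanged
```

The upper bits of the address are copied rather than computed, so the predictor never waits for carry propagation beyond bit 23.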

The 'RTS' instruction (decoded as 'JMP LR'), along with 'JMP R1' are
treated as special cases by the branch predictor (the relevant registers
are fed into it via a side-channel). These will be skipped if the
register encodes a branch into a different operating mode (or if the
register's value was being modified by the EX stages).

At present, all the LR style values are encoded in Inter-ISA form (LSB
set with mode bits in the high part of the register), mostly so that
branches between function pointers in different ISA modes will work as
expected. The branch predictor verifies that the target mode matches the
current mode.

Note that 'JMP R1' is special, in that previously it assumed "LR
semantics" (always assume that the register is Inter-ISA, ignoring the
LSB), but now both 'RTS' and 'JMP R1' implicitly require the LSB to be
set (partly as a "lint" feature; if the LSB is not set, it likely means
the LR value has been corrupted).

For things like function pointers, the LSB would also be set and the
mode bits are encoded in the high order bits, partly as a way to
potentially allow function pointers to be shared between the base ISA
and XG2 modes.

There is also a possible future where the compiler can mix the ISA modes
on a per-function basis. Well, apart from the issue that 'BSR Disp20'
can't encode Inter-ISA cases, which would require effectively generating
a function pointer to the target and then calling through this, if the
modes differed. At present, BGBCC only generates binaries for a single
ISA mode (apart from its limited ability to encode RISC-V fragments).

If the branch predictor does predict a branch, the operation for the
following cycle is also automatically marked as flushed (if the
prediction turns out to be wrong, a branch to the following instruction
has to be initiated anyway, so the preemptive flush has no visible
effect in this case).

All other branches will take the slower path, having their address
calculated by the AGU, and then dispatched via the EX1 and EX2 stages
(taking around 8 clock cycles). The branch is signaled in EX1, but the
branch mechanism doesn't really start to take action until EX2, mostly
for timing/latency reasons. The mechanism effectively invalidates
anything currently in the pipeline following the branch, then signals
the L1 I$ to start fetching from a new location (crossing back through
the branch-predictor and then to the L1 I-cache).

....
