Re: Stealing a Great Idea from the 6600

<v04nai$ome3$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=37935&group=comp.arch#37935
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Sun, 21 Apr 2024 22:59:12 -0500
Organization: A noiseless patient Spider
Lines: 160
Message-ID: <v04nai$ome3$1@dont-email.me>
References: <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com>
<oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com>
<dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<v02eij$6d5b$1@dont-email.me>
<152f8504112a37d8434c663e99cb36c5@www.novabbs.org>
<e8eb2j1ftsikv6j4eeaksm8lkhc31fuipi@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Apr 2024 05:59:16 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7b1e3ac212388cea6886df46e04c8fee";
logging-data="809411"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Wu8NEarY4r0hyoBz5iAgitHde0R+T6Y8="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:s8/6vPSn0//4tpQ8a30o/vXkcU0=
In-Reply-To: <e8eb2j1ftsikv6j4eeaksm8lkhc31fuipi@4ax.com>
Content-Language: en-US
 by: BGB - Mon, 22 Apr 2024 03:59 UTC

On 4/21/2024 8:16 PM, John Savard wrote:
> On Sun, 21 Apr 2024 18:57:27 +0000, mitchalsup@aol.com (MitchAlsup1)
> wrote:
>> BGB wrote:
>
>>> Like, in-order superscalar isn't going to do crap if nearly every
>>> instruction depends on every preceding instruction. Even pipelining
>>> can't help much with this.
>
>> Pipelining CREATED this (back to back dependencies). No amount of
>> pipelining can eradicate RAW data dependencies.
>
> This is quite true. However, in case an unsophisticated individual
> might read this thread, I think that I shall clarify.
>
> Without pipelining, it is not a problem if each instruction depends on
> the one immediately previous, and so people got used to writing
> programs that way, as it was simple to write the code to do one thing
> before starting to write the code to begin doing another thing.
>

Yeah.

This is also typical of naive compiler output, say:
   y=m*x+b;
Turns into RPN as, say:
   LD(m) LD(x) MUL LD(b) ADD ST(y)
Which, in a naive compiler (though, one with register allocation) may
become, say:
   MULS R8, R9, R12
   ADD  R12, R10, R13
   MOV  R13, R11  //MUL output first goes into temporary

But, if MUL is 3c and ADD is 2c, this ends up needing 6 cycles.
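
A rough sketch of where the 6 comes from, in Python. The MUL and ADD latencies are the ones given above; the 1-cycle MOV is an assumption (it is what makes the total come out to 6), and the function name is made up for illustration.

```python
# Latencies in cycles. MULS=3 and ADD=2 are from the text; MOV=1 is assumed.
LATENCY = {"MULS": 3, "ADD": 2, "MOV": 1}

def chain_latency(ops):
    """Cycles needed when every op must wait for the result of the one before it."""
    return sum(LATENCY[op] for op in ops)

total = chain_latency(["MULS", "ADD", "MOV"])
print(total)  # 3 + 2 + 1 = 6
```

With a fully dependent chain the latencies simply add up, so nothing gets overlapped.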

The situation would be significantly worse in a compiler lacking
register allocation (would add 8 memory operations to this; similar to
what one gets with "gcc -O0").

For the most part, as can be noted, I was comparing against "gcc -O3" on
the RV64 side, with "-ffunction-sections" and "-Wl,-gc-sections" and
similar, as otherwise GCC's output is significantly larger. Though,
never mind the seemingly fairly bulky ELF metadata (PE/COFF is seemingly
a bit more compact here). Can note that "-O3" vs "-Os" also doesn't seem
to make that big of a difference for RV64.

If one has another expression, one can shuffle the operations between
the expressions together, and the latency is lower than had no shuffling
occurred; and if one can reduce dependencies enough, operations can be
run in parallel for further gain. But, all this depends on first being
able to shuffle things to break up the register-register dependencies
between instructions.
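
The effect of shuffling two expressions together can be sketched with a toy in-order model. Everything here is an assumption for illustration: single issue, fully pipelined units, results forwarded with no extra penalty, MOV taking 1 cycle, and the registers of the second expression (R14..R19) being made up.

```python
LATENCY = {"MULS": 3, "ADD": 2, "MOV": 1}  # MOV=1 is an assumption

def run_in_order(program):
    """Return the cycle at which the last result is ready.

    program is a list of (op, dst, srcs); at most one instruction issues
    per cycle, and an instruction waits until all of its sources are ready.
    """
    ready = {}    # register -> cycle its value becomes available
    cycle = 0     # earliest cycle the next instruction may issue
    finish = 0
    for op, dst, srcs in program:
        issue = max([cycle] + [ready.get(r, 0) for r in srcs])
        done = issue + LATENCY[op]
        ready[dst] = done
        cycle = issue + 1
        finish = max(finish, done)
    return finish

expr1 = [("MULS", "R12", ["R8", "R9"]),
         ("ADD", "R13", ["R12", "R10"]),
         ("MOV", "R11", ["R13"])]
expr2 = [("MULS", "R18", ["R14", "R15"]),
         ("ADD", "R19", ["R18", "R16"]),
         ("MOV", "R17", ["R19"])]

back_to_back = expr1 + expr2
interleaved = [insn for pair in zip(expr1, expr2) for insn in pair]

print(run_in_order(back_to_back))  # 12
print(run_in_order(interleaved))   # 7
```

In this model the interleaved order hides most of the MUL and ADD latency of one expression behind the other, even without issuing anything in parallel.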

In BGBCC, this part was done via the WEXifier, which imposes a lot of
annoying restrictions (partly because it starts working after code
generation has already taken place).

In size-optimized code, this doesn't happen, which results in a
performance hit. This is partly since the WEXifier can only work with
32-bit instructions, can't cross labels or relocs, and requires the
register allocator to essentially round-robin the registers to minimize
dependencies, ...

But, preferentially always allocating a new register and avoiding
reusing registers within a basic block, while it reduces dependencies,
also eats a lot more registers (with the indirect cost of increasing the
number that need to be saved/restored, though the size-impact of this is
reduced somewhat via prolog/epilog compression).
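
The register-pressure tradeoff can be sketched as follows. This is not how BGBCC does it; it is a toy allocator over made-up live ranges, comparing "reuse the lowest free register" against round-robin, with no spilling modeled.

```python
def allocate(live_ranges, num_regs, round_robin):
    """Assign a register number to each temporary.

    live_ranges is a list of (def_index, last_use_index), sorted by def_index.
    Raises StopIteration if no register is free (spilling is not modeled).
    """
    free_at = [0] * num_regs   # instruction index at which each register is free
    next_reg = 0
    assignment = []
    for start, end in live_ranges:
        if round_robin:
            candidates = [(next_reg + i) % num_regs for i in range(num_regs)]
        else:
            candidates = list(range(num_regs))
        reg = next(r for r in candidates if free_at[r] <= start)
        assignment.append(reg)
        free_at[reg] = end + 1
        if round_robin:
            next_reg = (reg + 1) % num_regs
    return assignment

# Four short-lived temporaries that never overlap (hypothetical example).
ranges = [(0, 1), (2, 3), (4, 5), (6, 7)]
print(allocate(ranges, 4, round_robin=False))  # [0, 0, 0, 0]
print(allocate(ranges, 4, round_robin=True))   # [0, 1, 2, 3]
```

The first result keeps rewriting one register, so each temporary has a false dependency on the previous one; the second is free to be reordered but touches four registers that may then need saving and restoring.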

Though, one can shuffle stuff at the 3AC level (which exists in my case
between the RPN and final code generation), but this is more hit-or-miss.

Better would have been to go from 3AC to a "virtual assembler", which
could then allow reordering before emitting the actual machine code (and
thus wouldn't be as restricted). This was originally considered, but
ended up not going this way as it seemed like more work (in terms of
internal restructuring) than to shove the logic in after the
machine-code was generated.

But, the current compiler architecture was the result of always doing
the most quick/dirty option at the time, which doesn't necessarily
result in an optimal design.

Granted, OTOH, "waterfall method" doesn't really have the best
track-record either (vs the "hack something together, hack on it some
more, ..." method).

> This remained true when the simplest original form of pipelining was
> brought in - where fetching one instruction from memory was overlapped
> with decoding the previous instruction, and executing the instruction
> before that.
>
> It's only when what was originally called "superpipelining" came
> along, where the execute stages of multiple successive instructions
> could be overlapped, that it was necessary to do something about
> dependencies in order to take advantage of the speedup that could
> provide.
>

Yeah.

Pipeline:
   PF:
     PC arrives at I$
       Selected from:
         If Branch: Branch-PC
         Else, if Branch-Predicted, Branch Pred Result
         Else, LastPC+PCStep
   IF:
     Fetches 96 bits at PC
     Figures how much to advance PC;
     Figures out if we can do superscalar here.
       Check for register clashes;
       Check for valid prefix and suffix;
       If both checks pass, go for it.
   ID:
     Unpack instruction words;
     Pipeline now splits into 3 lanes;
     Branch predictor does its thing.
   ID2/RF:
     Results come in from the registers;
     Figure out if current bundle can enter EX stages;
     Figure out if each predicated instruction should execute.
   EX1(EX1C|EX1B|EX1A):
     Do stuff: ALU, Initiate memory access, ...
   EX2(EX2C|EX2B|EX2A):
     Do stuff | results arrive.
   EX3(EX3C|EX3B|EX3A):
     Results arrive;
     Produce any final results.
   WB:
     Results are written into register file.

By EX1, it is known whether or not the branch will actually be taken, so
(if needed) it may override the former guess of the branch-predictor. By
EX2, the branch-initiation takes effect, and by EX3, the new PC reaches
the I$ (overriding whatever else would have normally arrived).
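
The PC selection priority in the PF stage can be written out as a small sketch. The function and parameter names are made up; only the priority order (resolved branch, then predicted branch, then sequential) is taken from the list above.

```python
def select_next_pc(last_pc, pc_step, branch_pc=None, predicted_pc=None):
    """Pick the PC presented to the I$.

    branch_pc:    target of a branch resolved in the EX stages, if any
    predicted_pc: target supplied by the branch predictor, if any
    """
    if branch_pc is not None:       # a resolved branch overrides everything
        return branch_pc
    if predicted_pc is not None:    # otherwise follow the predictor's guess
        return predicted_pc
    return last_pc + pc_step        # otherwise fall through sequentially

print(hex(select_next_pc(0x1000, 4)))
print(hex(select_next_pc(0x1000, 4, predicted_pc=0x2000)))
print(hex(select_next_pc(0x1000, 4, branch_pc=0x3000, predicted_pc=0x2000)))
```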

In a few cases (such as a jump between ISA modes), extra cycles may be
needed to make sure everything is caught up (so, the same PC address is
held on the I$ input for around 3 cycles in this case).

This may happen if, say:
   Jumping between Baseline, XG2, or RISC-V;
   WEXMD changing whether WEX decoding is enabled or disabled;
     If disabled, it behaves as if the WEX'ed instructions were scalar;
     Jumbo prefixes ignore this (always behaving as-if it were enabled);
   ...
Mostly to make sure that IF and ID can decode the instructions as is
correct for the mode in question.

> John Savard

Re: Stealing a Great Idea from the 6600

<v04tpb$pqus$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=37936&group=comp.arch#37936
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 07:49:30 +0200
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <v04tpb$pqus$1@dont-email.me>
References: <lge02j554ucc6h81n5q2ej0ue2icnnp7i5@4ax.com>
<e2097beb24bf27eed0a92f14596bd59e@www.novabbs.org>
<in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com>
<71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org>
<1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com>
<oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com>
<dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<v02eij$6d5b$1@dont-email.me>
<152f8504112a37d8434c663e99cb36c5@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Apr 2024 07:49:31 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="629972819d75c94d2d6f7497086954e2";
logging-data="846812"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+vLKUXIEqXZg0X8OwbQ05aOQZGrGRyWIPsEBYPxCJ4DQ=="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Firefox/91.0 SeaMonkey/2.53.18.2
Cancel-Lock: sha1:VVdKb0sQRrvUAFtp3SY0FKnOX1k=
In-Reply-To: <152f8504112a37d8434c663e99cb36c5@www.novabbs.org>
 by: Terje Mathisen - Mon, 22 Apr 2024 05:49 UTC

MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/20/2024 5:03 PM, MitchAlsup1 wrote:
>> Like, in-order superscalar isn't going to do crap if nearly every
>> instruction depends on every preceding instruction. Even pipelining
>> can't help much with this.
>
> Pipelining CREATED this (back to back dependencies). No amount of
> pipelining can eradicate RAW data dependencies.
>
>> The compiler can shuffle the instructions into an order to limit the
>> number of register dependencies and better fit the pipeline. But,
>> then, most of the "hard parts" are already done (so it doesn't take
>> much more for the compiler to flag which instructions can run in
>> parallel).
>
> Compiler scheduling works for exactly 1 pipeline implementation and
> is suboptimal for all others.

Well, yeah.

OTOH, if your (definitely not my!) compiler can schedule a 4-wide static
ordering of operations, then it will be very nearly optimal on 2-wide
and 3-wide as well. (The difference is typically in a bit more loop
setup and cleanup code than needed.)

Hand-optimizing Pentium asm code did teach me to "think like a cpu",
which is probably the only part of the experience which is still kind of
relevant. :-)

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Stealing a Great Idea from the 6600

<v054gb$r679$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=37938&group=comp.arch#37938
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 02:44:09 -0500
Organization: A noiseless patient Spider
Lines: 69
Message-ID: <v054gb$r679$1@dont-email.me>
References: <lge02j554ucc6h81n5q2ej0ue2icnnp7i5@4ax.com>
<e2097beb24bf27eed0a92f14596bd59e@www.novabbs.org>
<in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com>
<71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org>
<1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com>
<oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com>
<dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<v02eij$6d5b$1@dont-email.me>
<152f8504112a37d8434c663e99cb36c5@www.novabbs.org>
<v04tpb$pqus$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Apr 2024 09:44:11 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7b1e3ac212388cea6886df46e04c8fee";
logging-data="891113"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18qqVe6hmnslVneJ+RZBW9acL+8yWPd1mA="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:3gTeugqaUUVegIUUX07vhnfkiAA=
In-Reply-To: <v04tpb$pqus$1@dont-email.me>
Content-Language: en-US
 by: BGB - Mon, 22 Apr 2024 07:44 UTC

On 4/22/2024 12:49 AM, Terje Mathisen wrote:
> MitchAlsup1 wrote:
>> BGB wrote:
>>
>>> On 4/20/2024 5:03 PM, MitchAlsup1 wrote:
>>> Like, in-order superscalar isn't going to do crap if nearly every
>>> instruction depends on every preceding instruction. Even pipelining
>>> can't help much with this.
>>
>> Pipelining CREATED this (back to back dependencies). No amount of
>> pipelining can eradicate RAW data dependencies.
>>
>>> The compiler can shuffle the instructions into an order to limit the
>>> number of register dependencies and better fit the pipeline. But,
>>> then, most of the "hard parts" are already done (so it doesn't take
>>> much more for the compiler to flag which instructions can run in
>>> parallel).
>>
>> Compiler scheduling works for exactly 1 pipeline implementation and
>> is suboptimal for all others.
>
> Well, yeah.
>
> OTOH, if your (definitely not my!) compiler can schedule a 4-wide static
> ordering of operations, then it will be very nearly optimal on 2-wide
> and 3-wide as well. (The difference is typically in a bit more loop
> setup and cleanup code than needed.)
>
> Hand-optimizing Pentium asm code did teach me to "think like a cpu",
> which is probably the only part of the experience which is still kind of
> relevant. :-)
>

Mine is hard-pressed to even make effective use of the current pipeline,
so going wider does not make sense at present.

As I had noted before, the main merit of 3 wide in my case is that it
makes it easier to justify a 6R register file, which, unlike the 4R
register file, doesn't choke up with trying to run other instructions in
parallel with memory store and similar (which is actually a fairly
serious restriction given how much memory operations tend to clog up
Lane 1; opportunities for "ALU|ST" being more common than "ALU|ALU").

Granted, one could argue that (Reg, Disp) memory addressing could be
supported entirely within a 2R1W pattern, which while true in premise,
does not match my implementation (which always uses indexed addressing
internally, treating the Disp as a virtual register; thus eating 3
register ports).
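
The port arithmetic can be sketched like this. The per-op read counts are assumptions for illustration: 2 reads for an ALU op, and 3 for a store (base, index or virtual Disp register, and the value being stored), matching the "3 register ports" above.

```python
# Register read ports consumed per operation (assumed, see note above).
READS = {"ALU": 2, "ST": 3}

def bundle_fits(bundle, read_ports):
    """True if all ops in the bundle can read their sources in one cycle."""
    return sum(READS[op] for op in bundle) <= read_ports

for ports in (4, 6):
    print(ports, bundle_fits(["ALU", "ALU"], ports), bundle_fits(["ALU", "ST"], ports))
```

Under these assumptions "ALU|ALU" fits in 4 read ports but "ALU|ST" needs 5, which only the 6R file can supply.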

Well, and for the 4R2W configuration, the main priority is minimizing
LUT cost (which favors leaving it as-is, with the current restrictions).

Granted, some similar issues apply to 128-bit MOV.X and SIMD ops, which
as-is can only exist as scalar ops. These could potentially also be
hacked around (say, to allow ALU|SIMD or ALU|MOV.X, but the "fix" would
cost a lot of LUTs). Mostly in that variability in terms of input
routing does not come cheap.

Though, that said, the 3rd lane still gets used for a share of basic ALU
instructions, so isn't entirely going to waste either.

> Terje
>

Re: Stealing a Great Idea from the 6600

<v05bdi$sls5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=37940&group=comp.arch#37940
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 04:42:07 -0500
Organization: A noiseless patient Spider
Lines: 431
Message-ID: <v05bdi$sls5$1@dont-email.me>
References: <lge02j554ucc6h81n5q2ej0ue2icnnp7i5@4ax.com>
<e2097beb24bf27eed0a92f14596bd59e@www.novabbs.org>
<in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com>
<71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org>
<1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com>
<oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com>
<dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<v02eij$6d5b$1@dont-email.me>
<152f8504112a37d8434c663e99cb36c5@www.novabbs.org>
<v045in$hqoj$1@dont-email.me>
<631f946ee0323ccaa31fae0d7e30e2d5@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 22 Apr 2024 11:42:11 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7b1e3ac212388cea6886df46e04c8fee";
logging-data="939909"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19/W3k8c3OcfJiFFxISNpDkUGK4cuBHgS8="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:fnwZj+7pr59bUCGnj4YhfpjxE7E=
In-Reply-To: <631f946ee0323ccaa31fae0d7e30e2d5@www.novabbs.org>
Content-Language: en-US
 by: BGB - Mon, 22 Apr 2024 09:42 UTC

On 4/21/2024 6:31 PM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/21/2024 1:57 PM, MitchAlsup1 wrote:
>>> BGB wrote:
>>>
>>> One of the things that I notice with My 66000 is when you get all the
>>> constants you ever need at the calculation OpCodes, you end up with
>>> FEWER instructions that "go random places" such as instructions that
>>> <well> paste constants together. This leaves you with a data dependent
>>> string of calculations with occasional memory references. That is::
>>> universal constants get rid of the easy to pipeline extra instructions
>>> leaving the meat of the algorithm exposed.
>>>
>
>> Possibly true.
>
>> RISC-V tends to have a lot of extra instructions due to lack of big
>> constants and lack of indexed addressing.
>
> You forgot the "everyone and his brother" design of the ISA.
>
>> And, BJX2 has a lot of frivolous register-register MOV instructions.
>
> I empower you to get rid of them....

OK, I more meant that the compiler is prone to emit lots of:
MOV Reg, Reg

Rather than, say, the ISA listing being full of redundant "MOV Reg, Reg"
encodings...

But, getting rid of them has been an ongoing battle of compiler fiddling.

Many were popping up from odd corners, say:
Register allocation issues (mostly involving function call/return);
Type casts and promotions between equivalent representations (*1);
...

*1: Had eliminated some of these, by allowing temporaries to be coerced
directly into different types in some cases. But, "for reasons" doesn't
really work with some other types of variables. But, say, int->long, or
casting between pointer types, etc, can be done without needing to do
anything to the value in the register.

But, yes, performance and code density would be better with fewer
frivolous register MOVs.

> <snip>
>>>>> If you design around the notion of a 3R1W register file, FMAC and
>>>>> INSERT fall out of the encoding easily. Done right, one can switch
>>>>> it into a 4R or 4W register file for ENTER and EXIT--lessening the
>>>>> overhead of call/ret.
>>>>>
>>>
>>>> Possibly.
>>>
>>>> It looks like some savings could be possible in terms of prologs and
>>>> epilogs.
>>>
>>>> As-is, these are generally like:
>>>>    MOV    LR, R18
>>>>    MOV    GBR, R19
>>>>    ADD    -192, SP
>>>>    MOV.X  R18, (SP, 176)  //save GBR and LR
>>>>    MOV.X  ...  //save registers
>>>
>>> Why not an instruction that saves LR and GBR without wasting
>>> instructions to place them side by side prior to saving them ??
>>>
>
>> I have an optional MOV.C instruction, but would need to restructure
>> the code for generating the prologs to make use of them in this case.
>
>> Say:
>>    MOV.C  GBR, (SP, 184)
>>    MOV.C  LR, (SP, 176)
>
>> Though, MOV.C is considered optional.
>
>> There is a "MOV.C Lite" option, which saves some cost by only allowing
>> it for certain CR's (mostly LR and GBR), which also sort of overlaps
>> with (and is needed) by RISC-V mode, because these registers are in
>> GPR land for RV.
>
>> But, in any case, current compiler output shuffles them to R18 and R19
>> before saving them.
>
>
>>>>    WEXMD  2  //specify that we want 3-wide execution here
>>>
>>>>    //Reload GBR, *1
>>>>    MOV.Q  (GBR, 0), R18
>>>>    MOV    0, R0  //special reloc here
>>>>    MOV.Q  (GBR, R0), R18
>>>>    MOV    R18, GBR
>>>
>
>> Correction:
>>  >>    MOV.Q  (R18, R0), R18
>
>
>>> It is gorp like that that led me to do it in HW with ENTER and EXIT.
>>> Save registers to the stack, setup FP if desired, allocate stack on
>>> SP, and decide if EXIT also does RET or just reloads the file. This
>>> would require 2 free registers if done in pure SW, along with several
>>> MOVs...
>>>
>
>> Possibly.
>> The partial reason it loads into R0 and uses R0 as an index, was that
>> I defined this mechanism before jumbo prefixes existed, and hadn't
>> updated it to allow for jumbo prefixes.
>
> No time like the present...
>

OK. Made this change.

Only a minor change to my compiler and PE loader.

>> Well, and if I used a direct displacement for GBR (which, along with
>> PC, is always BYTE Scale), this would have created a hard limit of 64
>> DLL's per process-space (I defined it as Disp24, which allows a more
>> reasonable hard upper limit of 2M DLLs per process-space).
>
> In my case, restricting myself to 32-bit IP relative addressing, GOT can
> be anywhere within ±2GB of the accessing instruction and can be as big
> as one desires.
>

In this case:
   GBR points to the start of ".data" for a given PE image;
   This starts with a pointer to a table of GBR pointers for every DLL
   in the process;
   Each DLL is assigned an index into this table, fixed up at load time;
   The magic ritual, when performed, will get GBR pointing at the
   ".data"/".bss" sections for that particular DLL.

But, say, one loads the EXE and DLLs.

One creates a program instance by allocating memory for each of the
data/bss sections, copying the data section from the base image, and
putting it in the table. Then jumping to the entry point with the EXE's
section in GBR.

One can fire up a new instance by allocating a new set of data areas,
jumping to the entry point as before. This instance does not need to
know or care that the prior instance exists, even if both exist in the
same address space, and have all their code at the same addresses (since
the ".text" sections are shared between all instances).

Normal PC-relative GOTs can't do this. You would either need multiple
address spaces, or multiple loaded copies of each image.
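
A toy model of the scheme, in Python rather than anything resembling the actual loader. Dicts stand in for data sections, the "table" key stands in for the pointer stored at the start of ".data", and all names are made up.

```python
def create_instance(images):
    """Build one program instance.

    images is a list of (index, initial_data) pairs, one per PE image.
    Returns the per-instance table of data sections.
    """
    table = [None] * len(images)
    for index, initial_data in images:
        # Each section starts with a pointer to the table, followed by a
        # private copy of the image's initialized data.
        table[index] = {"table": table, "data": list(initial_data)}
    return table

def reload_gbr(current_gbr, image_index):
    """The reload ritual: go through the table to reach another image's section."""
    return current_gbr["table"][image_index]

images = [(0, [1, 2, 3]), (1, [10, 20])]  # index 0 stands in for the main EXE
first = create_instance(images)
second = create_instance(images)

gbr = reload_gbr(first[0], 1)   # first instance, switching to the DLL's data
gbr["data"][0] = 99
print(first[1]["data"], second[1]["data"])  # [99, 20] [10, 20]
```

The two instances share the same "code" (the functions) but never see each other's data, since everything is reached through whichever section GBR currently points at.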

>> Granted, nowhere near even the limit of 64 as of yet. But, I had noted
>> that Windows programs would often easily exceed this limit, with even
>> a fairly simple program pulling in a fairly large number of random
>> DLLs, so in any case, a larger limit was needed.
>
> Due to the way linkages work in My 66000, each DLL gets its own GOT.
> So there is essentially no bounds on how many can be present/in-use.
> A LD of a GOT[entry] gets a pointer to the external variable.
> A CALX of GOT[entry] is a call through the GOT table using std ABI.
> {{There is no PLT}}
>

OK.

Had done it a little differently:
Imported function gets a stub, generally like:
   Foo:
     MOV.Q (PC, 4), R1
     JMP R1
   _imp_Foo: .QWORD 0

Import table (or IAT / Import Address Table) points at _imp_Foo and
fixes it up to point at the imported function.

Had defined an alternate version:
   Foo:
   _imp_Foo:
     BRA Abs48

The loader would see and special-case the BRA Abs48 instruction.

But, this latter form ran into a problem:
Things will violently explode if the EXE and DLL (or one DLL and
another) are not in the same ISA mode (say, Baseline vs XG2).

Which means, I am back to the less efficient option of needing to load
then branch.

Granted:
   MOV Imm64, R1
   JMP R1
Could also work, and saves a few clock cycles.

Doesn't currently extend to global variables, but I don't really feel
this is a huge loss. Might fix eventually.

Generally, loader is hard-coded to assume import by name, as I didn't
feel it worth the bother to try to deal with importing by 16-bit ordinal
number.
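
The load-then-branch variant with import by name can be sketched as below. This is an illustration only: a dict stands in for memory, the addresses are invented, and the function names are hypothetical rather than taken from the real loader.

```python
def fix_up_imports(memory, import_table, exports):
    """Write the address of each imported function into its _imp_ slot.

    import_table: name -> address of the _imp_Name slot in the importer
    exports:      name -> address of the function in the exporting image
    """
    for name, slot_addr in import_table.items():
        if name not in exports:
            raise LookupError("unresolved import: " + name)
        memory[slot_addr] = exports[name]

def stub_target(memory, slot_addr):
    """What the stub does: load the slot (MOV.Q), then jump to it (JMP R1)."""
    return memory[slot_addr]

memory = {0x2008: 0}                 # _imp_Foo: .QWORD 0 before fix-up
import_table = {"Foo": 0x2008}       # made-up slot address
exports = {"Foo": 0x7F001000}        # made-up address in the DLL

fix_up_imports(memory, import_table, exports)
print(hex(stub_target(memory, 0x2008)))
```

Since the stub only ever loads a 64-bit address and jumps through a register, nothing in the stub itself needs to be rewritten at load time.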

>> One potential optimization here is that the main EXE will always be 0
>> in the process, so this sequence could be reduced to, potentially:
>>    MOV.Q (GBR, 0), R18
>>    MOV.C (R18, 0), GBR
>
>> Early on, I did not have the constraint that main EXE was always 0,
>> and had initially assumed it would be treated equivalently to a DLL.
>
>
>>>>    //Generate Stack Canary, *2
>>>>    MOV    0x5149, R18  //magic number (randomly generated)
>>>>    VSKG   R18, R18  //Magic (combines input with SP and magic numbers)
>>>>    MOV.Q  R18, (SP, 144)
>>>
>>>>    ...
>>>>    function-specific stuff
>>>>    ...
>>>
>>>>    MOV    0x5149, R18
>>>>    MOV.Q  (SP, 144), R19
>>>>    VSKC   R18, R19  //Validate canary
>>>>    ...
>>>
>>>
>>>> *1: This part ties into the ABI, and mostly exists so that each PE
>>>> image can get GBR reloaded back to its own ".data"/".bss" sections
>>>> (with
>>>
>>> Universal displacements make GBR unnecessary as a memory reference can
>>> be accompanied with a 16-bit, 32-bit, or 64-bit displacement. Yes,
>>> you can read GOT[#i] directly without a pointer to it.
>>>
>
>> If I were doing a more conventional ABI, I would likely use (PC,
>> Disp33s) for accessing global variables.
>
> Even those 128GB away ??
>


Re: Stealing a Great Idea from the 6600

<h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=37941&group=comp.arch#37941
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: quadib...@servername.invalid (John Savard)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 14:13:41 -0600
Organization: A noiseless patient Spider
Lines: 97
Message-ID: <h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com>
References: <lge02j554ucc6h81n5q2ej0ue2icnnp7i5@4ax.com> <e2097beb24bf27eed0a92f14596bd59e@www.novabbs.org> <in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com> <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Apr 2024 22:13:44 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="9b6945b7d48ec219e9571a56b657569c";
logging-data="1209489"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18kAq5h/mftD7N8E81kOfmVTCNNbR/uagw="
Cancel-Lock: sha1:QN/BiJP9oUTiUxqZmBUjM6rLnA8=
X-Newsreader: Forte Free Agent 3.3/32.846
 by: John Savard - Mon, 22 Apr 2024 20:13 UTC

On Sat, 20 Apr 2024 17:07:11 +0000, mitchalsup@aol.com (MitchAlsup1)
wrote:
>John Savard wrote:

>> And, hey, I'm not the first guy to get sunk because of forgetting what
>> lies under the tip of the iceberg that's above the water.
>
>> That also happened to the captain of the _Titanic_.
>
>Concer-tina-tanic !?!

Oh, dear. This discussion has inspired me to rework the basic design
of Concertina II _yet again_!

The new design, not yet online, will have the following features:

The code stream will continue to be divided into 256-bit blocks.

However, block headers will be eliminated. Instead, this functionality
will be subsumed into the instruction set.

Case I:

Indicating that from 1 to 7 32-bit instruction slots in a block are
not used for instructions, but instead may contain pseudo-immediates
will be achieved by:

Placing a two-address register-to-register operate instruction in the
first instruction slot in a block. These instructions will have a
three-bit field which, if nonzero, indicates the amount of space
reserved.

To avoid waste, when such an instruction is present in any slot other
than the first, that field will have the following function:

If nonzero, it points to an instruction slot (slots 1 through 7, in
the second through eighth positions) and a duplicate copy of the
instruction in that slot will be placed in the instruction stream
immediately following the instruction with that field.

The following special conditions apply:

If the instruction slot contains a pair of 16-bit instructions, only
the first of those instructions is so inserted for execution.

The instruction slot may not be one that is reserved for
pseudo-immediates, except that it may be the _first_ such slot, in
which case, the first 16 bits of that slot are taken as a 16-bit
instruction, with the format indicated by the first bit (as opposed to
the usual 17th bit) of that instruction slot's contents.

So it's possible to reserve an odd multiple of 16 bits for
pseudo-immediates, so as to avoid waste.

Case II:

Instructions longer than 32 bits are specified by being of the form:

The first instruction slot:

11111
00
(3 bits) length in instruction slots, from 2 to 7
(22 bits) rest of the first part of the instruction

All remaining instruction slots:

11111
(3 bits) position within instruction, from 2 to 7
(24 bits) rest of this part of the instruction
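
As a rough illustration of the two slot layouts just listed, here is a minimal Python sketch. It assumes the fields are listed most-significant bit first within a 32-bit slot, which the post does not state, and the name classify_slot is made up for this example. Under that assumption a continuation slot can never be mistaken for a first slot, because positions 2 through 7 never have 00 in their top two bits.

```python
PREFIX = 0b11111  # five leading 1 bits mark a slot belonging to a long instruction


def classify_slot(word):
    """Classify one 32-bit slot under the Case II layout.

    Assumption: fields are packed MSB first, in the order listed in the post.
    Returns (kind, field, payload).
    """
    if not 0 <= word < (1 << 32):
        raise ValueError("slot must be a 32-bit value")
    if (word >> 27) != PREFIX:
        return ("ordinary", None, None)
    if ((word >> 25) & 0b11) == 0b00:
        # First slot: 11111 | 00 | 3-bit length | 22 payload bits
        length = (word >> 22) & 0b111
        return ("first", length, word & ((1 << 22) - 1))
    # Continuation slot: 11111 | 3-bit position | 24 payload bits
    position = (word >> 24) & 0b111
    return ("continuation", position, word & ((1 << 24) - 1))


first = (PREFIX << 27) | (3 << 22) | 0x12345
cont = (PREFIX << 27) | (2 << 24) | 0xABCDEF
print(classify_slot(first))
print(classify_slot(cont))
```

The sketch does not check that the length or position actually falls in the 2 to 7 range; a real decoder would presumably treat other values as reserved.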

This mechanism, however, will _also_ be used for VLIW functionality or
prefix functionality which was formerly in block headers.

In that case, the first instruction slot, and the remaining
instruction slots, no longer need to be contiguous; instead, ordinary
32-bit instructions or pairs of 16-bit instructions can occur between
the portions of the ensemble of prefixed instructions formed by this
means.

And there is a third improvement.

When Case I above is in effect, the block in which space for
pseudo-immediates is reserved will be stored in an internal register
in the processor.

Subsequent blocks can contain operate instructions with
pseudo-immediate operands even if no space for pseudo-immediates is
reserved in those blocks. In that case, the retained copy of the last
block encountered in which pseudo-immediates were reserved shall be
referenced instead.

I think these changes will improve code density... or, at least, they
will make it appear that no space is obviously forced to be wasted,
even if no real improvement in code density results.

John Savard

Re: Stealing a Great Idea from the 6600

<hmod2jpvun417k1qjm04pvro5gvt3u21rg@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=37942&group=comp.arch#37942
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: quadib...@servername.invalid (John Savard)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 16:22:11 -0600
Organization: A noiseless patient Spider
Lines: 102
Message-ID: <hmod2jpvun417k1qjm04pvro5gvt3u21rg@4ax.com>
References: <e2097beb24bf27eed0a92f14596bd59e@www.novabbs.org> <in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com> <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 23 Apr 2024 00:22:13 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7270acb597502ebace3edfa43b122dbb";
logging-data="1265504"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+7uI5mVj81cZffKzkWrh/LDALirGuQ60k="
Cancel-Lock: sha1:GWa4IM+6N0YJT8tzApyE6EMTXls=
X-Newsreader: Forte Free Agent 3.3/32.846
 by: John Savard - Mon, 22 Apr 2024 22:22 UTC

On Mon, 22 Apr 2024 14:13:41 -0600, John Savard
<quadibloc@servername.invalid> wrote:

>On Sat, 20 Apr 2024 17:07:11 +0000, mitchalsup@aol.com (MitchAlsup1)
>wrote:
>>John Savard wrote:
>
>>> And, hey, I'm not the first guy to get sunk because of forgetting what
>>> lies under the tip of the iceberg that's above the water.
>>
>>> That also happened to the captain of the _Titanic_.
>>
>>Concer-tina-tanic !?!
>
>Oh, dear. This discussion has inspired me to rework the basic design
>of Concertina II _yet again_!
>
>The new design, not yet online, will have the following features:
>
>The code stream will continue to be divided into 256-bit blocks.
>
>However, block headers will be eliminated. Instead, this functionality
>will be subsumed into the instruction set.
>
>Case I:
>
>Indicating that from 1 to 7 32-bit instruction slots in a block are
>not used for instructions, but instead may contain pseudo-immediates
>will be achieved by:
>
>Placing a two-address register-to-register operate instruction in the
>first instruction slot in a block. These instructions will have a
>three-bit field which, if nonzero, indicates the amount of space
>reserved.
>
>To avoid waste, when such an instruction is present in any slot other
>than the first, that field will have the following function:
>
>If nonzero, it points to an instruction slot (slots 1 through 7, in
>the second through eighth positions) and a duplicate copy of the
>instruction in that slot will be placed in the instruction stream
>immediately following the instruction with that field.
>
>The following special conditions apply:
>
>If the instruction slot contains a pair of 16-bit instructions, only
>the first of those instructions is so inserted for execution.
>
>The instruction slot may not be one that is reserved for
>pseudo-immediates, except that it may be the _first_ such slot, in
>which case, the first 16 bits of that slot are taken as a 16-bit
>instruction, with the format indicated by the first bit (as opposed to
>the usual 17th bit) of that instruction slot's contents.
>
>So it's possible to reserve an odd multiple of 16 bits for
>pseudo-immediates, so as to avoid waste.
>
>Case II:
>
>Instructions longer than 32 bits are specified by being of the form:
>
>The first instruction slot:
>
>11111
>00
>(3 bits) length in instruction slots, from 2 to 7
>(22 bits) rest of the first part of the instruction
>
>All remaining instruction slots:
>
>11111
>(3 bits) position within instruction, from 2 to 7
>(24 bits) rest of this part of the instruction
>
>This mechanism, however, will _also_ be used for VLIW functionality or
>prefix functionality which was formerly in block headers.
>
>In that case, the first instruction slot, and the remaining
>instruction slots, no longer need to be contiguous; instead, ordinary
>32-bit instructions or pairs of 16-bit instructions can occur between
>the portions of the ensemble of prefixed instructions formed by this
>means.
>
>And there is a third improvement.
>
>When Case I above is in effect, the block in which space for
>pseudo-immediates is reserved will be stored in an internal register
>in the processor.
>
>Subsequent blocks can contain operate instructions with
>pseudo-immediate operands even if no space for pseudo-immediates is
>reserved in those blocks. In that case, the retained copy of the last
>block encountered in which pseudo-immediates were reserved shall be
>referenced instead.
>
>I think these changes will improve code density... or, at least, they
>will make it appear that no space is obviously forced to be wasted,
>even if no real improvement in code density results.

The page has now been updated to reflect this modified design.

John Savard

Re: Stealing a Great Idea from the 6600

<aj3e2j13ntqoofh22cienlntgrnkgrj488@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=37944&group=comp.arch#37944
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: quadib...@servername.invalid (John Savard)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 19:36:54 -0600
Organization: A noiseless patient Spider
Lines: 64
Message-ID: <aj3e2j13ntqoofh22cienlntgrnkgrj488@4ax.com>
References: <in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com> <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com> <hmod2jpvun417k1qjm04pvro5gvt3u21rg@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 23 Apr 2024 03:36:56 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7270acb597502ebace3edfa43b122dbb";
logging-data="1458531"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18G9tQl+ELHs+lmj6DZRjBjZKUEDc0WA5k="
Cancel-Lock: sha1:S+zBCodAEVarrg7HzywIZIX3BHo=
X-Newsreader: Forte Free Agent 3.3/32.846
 by: John Savard - Tue, 23 Apr 2024 01:36 UTC

On Mon, 22 Apr 2024 16:22:11 -0600, John Savard
<quadibloc@servername.invalid> wrote:

>On Mon, 22 Apr 2024 14:13:41 -0600, John Savard
><quadibloc@servername.invalid> wrote:

>>The first instruction slot:
>>
>>11111
>>00
>>(3 bits) length in instruction slots, from 2 to 7
>>(22 bits) rest of the first part of the instruction
>>
>>All remaining instruction slots:
>>
>>11111
>>(3 bits) position within instruction, from 2 to 7
>>(24 bits) rest of this part of the instruction

>The page has now been updated to reflect this modified design.

And I thought I was on to something.

The functionality - pseudo-immediates and VLIW features - was all the
same, but now everything was so much simpler. The only thing that
needed to be in a header, the three-bit field that reserved space for
pseudo-immediates, now had just three bits of overhead.

Everything else followed a normal instruction model, instead of a
complicated header.

But... if I use a header with 22 bits usable to turn instruction words
that have 22 bits available...

into instructions that are _longer_ than 32 bits...

well, guess what?

If I use half the opcode space for four-word instructions, then one
header with 22 bits available can add 7 bits to each of three
subsequent instructions.

However, 24 plus 7 is 31.

So I'm stuck at putting two instructions in three words even for a
modest extension of the instruction set...

never mind adding a whole bunch of bits for stuff like predication!

I can tease out a couple of extra bits, so that I have a 22-bit
starting word, but 26 bits in each following one, by replacing the
three-bit "position" field with a field that just contains 0 in every
instruction slot but the last one, indicated with a 1.

With 26 bits, to get 33 bits - all I need for a nice expansion of the
instruction set to its "full" form - I need to add seven bits to each
one, so that now does allow one starting word to prefix three
instructions.

Still not great, but adequate. And the first word doesn't really need
a length field either, it just needs to indicate it's the first one.
Which is how I had worked something like this before.
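
The bit budget worked through above can be checked with a few lines of Python. This is only a sketch of the arithmetic in the post: it assumes the 22 header payload bits are split evenly among the prefixed instructions, and the function name extended_width is hypothetical.

```python
def extended_width(slot_payload_bits, header_payload_bits, instructions):
    """Bits available per instruction when one header word is shared.

    Assumption: the header payload is divided evenly, leftover bits unused.
    """
    return slot_payload_bits + header_payload_bits // instructions


TARGET = 33  # width the post wants for the "full" instruction form

# Original layout: 3-bit position field leaves 24 payload bits per slot.
print(extended_width(24, 22, 3), "bits with a position field, 3 instructions")
# Same layout, but only two instructions share the header.
print(extended_width(24, 22, 2), "bits with a position field, 2 instructions")
# Revised layout: 1-bit "last slot" flag leaves 26 payload bits per slot.
print(extended_width(26, 22, 3), "bits with a last-slot flag, 3 instructions")
```

With a 24-bit payload, three instructions fall short of the 33-bit target while two exceed it, which is one way to read the "two instructions in three words" remark. The 26-bit payload reaches the target with three.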

John Savard

Re: Stealing a Great Idea from the 6600

<3fab253870efca80b8ea207faf3c9d29@www.novabbs.org>

https://www.novabbs.com/devel/article-flat.php?id=37945&group=comp.arch#37945
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchal...@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Tue, 23 Apr 2024 01:53:26 +0000
Organization: Rocksolid Light
Message-ID: <3fab253870efca80b8ea207faf3c9d29@www.novabbs.org>
References: <lge02j554ucc6h81n5q2ej0ue2icnnp7i5@4ax.com> <e2097beb24bf27eed0a92f14596bd59e@www.novabbs.org> <in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com> <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2063311"; mail-complaints-to="usenet@i2pn2.org";
posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
X-Rslight-Site: $2y$10$dUKWHzr.FKpuXKmClBk6KOw6.7KcIvw7jwkV/CPDT1yx41qQL3q/C
 by: MitchAlsup1 - Tue, 23 Apr 2024 01:53 UTC

John Savard wrote:

> On Sat, 20 Apr 2024 17:07:11 +0000, mitchalsup@aol.com (MitchAlsup1)
> wrote:
>>John Savard wrote:

>>> And, hey, I'm not the first guy to get sunk because of forgetting what
>>> lies under the tip of the iceberg that's above the water.
>>
>>> That also happened to the captain of the _Titanic_.
>>
>>Concer-tina-tanic !?!

> Oh, dear. This discussion has inspired me to rework the basic design
> of Concertina II _yet again_!

I suggest it is time for Concertina III.......

> The new design, not yet online, will have the following features:

> The code stream will continue to be divided into 256-bit blocks.

Why not a whole cache line ??

> However, block headers will be eliminated. Instead, this functionality
> will be subsumed into the instruction set.

> Case I:

> Indicating that from 1 to 7 32-bit instruction slots in a block are
> not used for instructions, but instead may contain pseudo-immediates
> will be achieved by:

> Placing a two-address register-to-register operate instruction in the
> first instruction slot in a block. These instructions will have a
> three-bit field which, if nonzero, indicates the amount of space
> reserved.

> To avoid waste, when such an instruction is present in any slot other
> than the first, that field will have the following function:

> If nonzero, it points to an instruction slot (slots 1 through 7, in
> the second through eighth positions) and a duplicate copy of the
> instruction in that slot will be placed in the instruction stream
> immediately following the instruction with that field.

> The following special conditions apply:

> If the instruction slot contains a pair of 16-bit instructions, only
> the first of those instructions is so inserted for execution.

> The instruction slot may not be one that is reserved for
> pseudo-immediates, except that it may be the _first_ such slot, in
> which case, the first 16 bits of that slot are taken as a 16-bit
> instruction, with the format indicated by the first bit (as opposed to
> the usual 17th bit) of that instruction slot's contents.

> So it's possible to reserve an odd multiple of 16 bits for
> pseudo-immediates, so as to avoid waste.

> Case II:

> Instructions longer than 32 bits are specified by being of the form:

> The first instruction slot:

> 11111
> 00
> (3 bits) length in instruction slots, from 2 to 7
> (22 bits) rest of the first part of the instruction

> All remaining instruction slots:

> 11111
> (3 bits) position within instruction, from 2 to 7
> (24 bits) rest of this part of the instruction

> This mechanism, however, will _also_ be used for VLIW functionality or
> prefix functionality which was formerly in block headers.

> In that case, the first instruction slot, and the remaining
> instruction slots, no longer need to be contiguous; instead, ordinary
> 32-bit instructions or pairs of 16-bit instructions can occur between
> the portions of the ensemble of prefixed instructions formed by this
> means.

> And there is a third improvement.

> When Case I above is in effect, the block in which space for
> pseudo-immediates is reserved will be stored in an internal register
> in the processor.

> Subsequent blocks can contain operate instructions with
> pseudo-immediate operands even if no space for pseudo-immediates is
> reserved in those blocks. In that case, the retained copy of the last
> block encountered in which pseudo-immediates were reserved shall be
> referenced instead.

> I think these changes will improve code density... or, at least, they
> will make it appear that no space is obviously forced to be wasted,
> even if no real improvement in code density results.

> John Savard

Re: Stealing a Great Idea from the 6600

<bh6e2jdjvdmd42b1mdj1747lf85g4u3fcv@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=37947&group=comp.arch#37947
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: quadib...@servername.invalid (John Savard)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 20:19:36 -0600
Organization: A noiseless patient Spider
Lines: 16
Message-ID: <bh6e2jdjvdmd42b1mdj1747lf85g4u3fcv@4ax.com>
References: <in312jlca131khq3vj0i24n6pb0hah2ur5@4ax.com> <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com> <3fab253870efca80b8ea207faf3c9d29@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 23 Apr 2024 04:19:37 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7270acb597502ebace3edfa43b122dbb";
logging-data="1474714"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/WOZ3AKpLBnd9yrDRK+Um+ZxguMEvUShU="
Cancel-Lock: sha1:fMf/A9KOI/WbdqYU2mmjZDIqJgw=
X-Newsreader: Forte Free Agent 3.3/32.846
 by: John Savard - Tue, 23 Apr 2024 02:19 UTC

On Tue, 23 Apr 2024 01:53:26 +0000, mitchalsup@aol.com (MitchAlsup1)
wrote:

>I suggest it is time for Concertina III.......

If the old Concertina II were worth keeping...

>Why not a whole cache line ??

That would be one way to allow the overhead of a block prefix to be
minimized.

But that starts to look like just having mode bits for an entire
program.

John Savard

Re: Stealing a Great Idea from the 6600

<rm6e2j5246uupmce18phlip53saamvl8rv@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=37948&group=comp.arch#37948
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: quadib...@servername.invalid (John Savard)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 20:22:12 -0600
Organization: A noiseless patient Spider
Lines: 24
Message-ID: <rm6e2j5246uupmce18phlip53saamvl8rv@4ax.com>
References: <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com> <hmod2jpvun417k1qjm04pvro5gvt3u21rg@4ax.com> <aj3e2j13ntqoofh22cienlntgrnkgrj488@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 23 Apr 2024 04:22:13 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7270acb597502ebace3edfa43b122dbb";
logging-data="1474714"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18CG73dk3X7TROyO/IU6jGP7uZCeA4H5Jg="
Cancel-Lock: sha1:lj6qy4YGRheyB2sDtHikJtNgjL0=
X-Newsreader: Forte Free Agent 3.3/32.846
 by: John Savard - Tue, 23 Apr 2024 02:22 UTC

On Mon, 22 Apr 2024 19:36:54 -0600, John Savard
<quadibloc@servername.invalid> wrote:

>I can tease out a couple of extra bits, so that I have a 22-bit
>starting word, but 26 bits in each following one, by replacing the
>three bit "position" field with a field that just contains 0 in every
>instruction slot but the last one, indicated with a 1.
>
>With 26 bits, to get 33 bits - all I need for a nice expansion of the
>instruction set to its "full" form - I need to add seven bits to each
>one, so that now does allow one starting word to prefix three
>instructions.
>
>Still not great, but adequate. And the first word doesn't really need
>a length field either, it just needs to indicate it's the first one.
>Which is how I had worked something like this before.

But fully half the opcode space is allocated to 16-bit instructions.
Even though that half doesn't really play nice with other things, it's
too tempting a target to ignore. But the price would be losing the
fully parallel nature of decoding.

John Savard

Re: Stealing a Great Idea from the 6600

<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=37949&group=comp.arch#37949
Path: i2pn2.org!.POSTED!not-for-mail
From: gneun...@comcast.net (George Neuner)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Mon, 22 Apr 2024 23:09:43 -0400
Organization: i2pn2 (i2pn.org)
Message-ID: <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
References: <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Info: i2pn2.org;
logging-data="2067482"; mail-complaints-to="usenet@i2pn2.org";
posting-account="h5eMH71iFfocGZucc+SnA0y5I+72/ecoTCcIjMd3Uww";
User-Agent: ForteAgent/8.00.32.1272
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: George Neuner - Tue, 23 Apr 2024 03:09 UTC

On Sun, 21 Apr 2024 00:43:21 +0000, mitchalsup@aol.com (MitchAlsup1)
wrote:

>Address arithmetic is ADD only and does not care about signs or
>overflow. There is no concept of a negative base register or a
>negative index register (or, for that matter, a negative displace-
>ment), overflow, underflow, carry, ...

Stack frame pointers often point to the middle of the frame and need
to access data using both positive and negative displacements.

Some GC schemes use negative displacements to access object headers.

Re: Stealing a Great Idea from the 6600

<sjme2j91bfb1pmkp8m7gq8m2843ko4ap6j@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=37951&group=comp.arch#37951
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: quadib...@servername.invalid (John Savard)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Tue, 23 Apr 2024 00:54:00 -0600
Organization: A noiseless patient Spider
Lines: 16
Message-ID: <sjme2j91bfb1pmkp8m7gq8m2843ko4ap6j@4ax.com>
References: <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com> <hmod2jpvun417k1qjm04pvro5gvt3u21rg@4ax.com> <aj3e2j13ntqoofh22cienlntgrnkgrj488@4ax.com> <rm6e2j5246uupmce18phlip53saamvl8rv@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 23 Apr 2024 08:54:00 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7270acb597502ebace3edfa43b122dbb";
logging-data="1573141"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+/OnJdwbiBaNioGB9DyTcbzVQNCw0K9ck="
Cancel-Lock: sha1:M9WFAMWARPvtxaLqeg3/MlVtt04=
X-Newsreader: Forte Free Agent 3.3/32.846
 by: John Savard - Tue, 23 Apr 2024 06:54 UTC

On Mon, 22 Apr 2024 20:22:12 -0600, John Savard
<quadibloc@servername.invalid> wrote:

>But fully half the opcode space is allocated to 16-bit instructions.
>Even though that half doesn't really play nice with other things, it's
>too tempting a target to ignore. But the price would be losing the
>fully parallel nature of decoding.

After heading out to buy groceries, my head cleared enough to discard
the various complicated and bizarre schemes I was considering to deal
with the issue, and instead to drastically reduce the overhead for the
instructions longer than 32 bits, now that this had become a major
concern due to also using this format for prefixed instructions as
well, in a simple and straightforward manner.

John Savard

Re: Stealing a Great Idea from the 6600

<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>

https://www.novabbs.com/devel/article-flat.php?id=37958&group=comp.arch#37958
Path: i2pn2.org!.POSTED!not-for-mail
From: mitchal...@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Tue, 23 Apr 2024 17:58:41 +0000
Organization: Rocksolid Light
Message-ID: <ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
References: <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2135183"; mail-complaints-to="usenet@i2pn2.org";
posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Site: $2y$10$tH69DT4UtLLefB/288Y21.5eLFkBaz8ZwM36I6o2lOHuZhjszeBm6
 by: MitchAlsup1 - Tue, 23 Apr 2024 17:58 UTC

George Neuner wrote:

> On Sun, 21 Apr 2024 00:43:21 +0000, mitchalsup@aol.com (MitchAlsup1)
> wrote:

>>Address arithmetic is ADD only and does not care about signs or
>>overflow. There is no concept of a negative base register or a
>>negative index register (or, for that matter, a negative displace-
>>ment), overflow, underflow, carry, ...

> Stack frame pointers often point to the middle of the frame and need
> to access data using both positive and negative displacements.

Yes, one accesses callee-saved registers with positive displacements
and local variables with negative displacements. One simply needs to know
where the former stops and the latter begins. ENTER and EXIT know this
by the register count and by the stack allocation size.

> Some GC schemes use negative displacements to access object headers.

Those are negative displacements not negative bases or indexes.
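
One way to picture the "ADD only" point is the generic sketch below. The displacement is sign-extended once, and everything after that is a plain wrap-around add with no sign or overflow handling. This is not taken from any particular ISA: the 64-bit address width, the 16-bit default displacement width, and the names sign_extend and effective_address are all assumptions for illustration.

```python
MASK64 = (1 << 64) - 1  # assumed 64-bit address space


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return ((value & ((1 << bits) - 1)) ^ sign) - sign


def effective_address(base, displacement, disp_bits=16):
    """base + displacement as a modular add; overflow simply wraps."""
    return (base + sign_extend(displacement, disp_bits)) & MASK64


frame_pointer = 0x1000
print(hex(effective_address(frame_pointer, 0x0010)))  # slot above the frame pointer
print(hex(effective_address(frame_pointer, 0xFFF8)))  # slot 8 bytes below it
```

The same adder serves both the frame-pointer case and the object-header case; only the encoded displacement differs.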

Re: Stealing a Great Idea from the 6600

<v09akf$1s6pf$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=37960&group=comp.arch#37960
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Tue, 23 Apr 2024 16:53:16 -0500
Organization: A noiseless patient Spider
Lines: 48
Message-ID: <v09akf$1s6pf$1@dont-email.me>
References: <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com>
<oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com>
<dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com>
<hmod2jpvun417k1qjm04pvro5gvt3u21rg@4ax.com>
<aj3e2j13ntqoofh22cienlntgrnkgrj488@4ax.com>
<rm6e2j5246uupmce18phlip53saamvl8rv@4ax.com>
<sjme2j91bfb1pmkp8m7gq8m2843ko4ap6j@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 23 Apr 2024 23:53:19 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="e4c9793258b9d913f28848bbc0503c1c";
logging-data="1973039"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ZPX9bNFUD7v4KYoOex7R0j4areGRhruY="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:rmii12vWhvGUgm+kw+VO/0KDOuI=
In-Reply-To: <sjme2j91bfb1pmkp8m7gq8m2843ko4ap6j@4ax.com>
Content-Language: en-US
 by: BGB - Tue, 23 Apr 2024 21:53 UTC

On 4/23/2024 1:54 AM, John Savard wrote:
> On Mon, 22 Apr 2024 20:22:12 -0600, John Savard
> <quadibloc@servername.invalid> wrote:
>
>> But fully half the opcode space is allocated to 16-bit instructions.
>> Even though that half doesn't really play nice with other things, it's
>> too tempting a target to ignore. But the price would be losing the
>> fully parallel nature of decoding.
>
> After heading out to buy groceries, my head cleared enough to discard
> the various complicated and bizarre schemes I was considering to deal
> with the issue, and instead to drastically reduce the overhead for the
> instructions longer than 32 bits, now that this had become a major
> concern due to also using this format for prefixed instructions as
> well, in a simple and straightforward manner.
>

You know, one could just be like, say:
xxxx-xxxx-xxxx-xxx0 //16-bit op
xxxx-xxxx-xxxx-xxxx xxxx-xxxx-xxxx-xx01 //32-bit op
xxxx-xxxx-xxxx-xxxx xxxx-xxxx-xxxx-x011 //32-bit op
xxxx-xxxx-xxxx-xxxx xxxx-xxxx-xxxx-0111 //32-bit op
xxxx-xxxx-xxxx-xxxx xxxx-xxxx-xxxx-1111 //jumbo prefix (64+)

And call it "good enough"...
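
A minimal Python sketch of the length decode this tagging implies. It assumes the tag bits sit in the low bits of the first 16-bit parcel fetched, which the listing above does not spell out; `insn_length` is a hypothetical helper name.

```python
def insn_length(first_parcel):
    """Return the instruction length in bits implied by the low tag bits.

    Assumption: the tag lives in the low bits of the first 16-bit parcel.
    """
    if (first_parcel & 0x1) == 0:
        return 16          # ...xxx0 : 16-bit op
    if (first_parcel & 0xF) != 0xF:
        return 32          # ...xx01, ...x011, ...0111 : 32-bit op
    return 64              # ...1111 : jumbo prefix, 64 bits or more
```

Only the low four bits of one parcel are examined, so the length is known without looking at the rest of the instruction.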

Then, say (6b registers):
zzzz-mmmm-nnnn-zzz0 //16-bit op (2R)
zzzz-tttt-ttss-ssss nnnn-nnpp-zzzz-xxx1 //32-bit op (3R)
iiii-iiii-iiss-ssss nnnn-nnpp-zzzz-xxx1 //32-bit op (3RI, Imm10)
iiii-iiii-iiii-iiii nnnn-nnpp-zzzz-xxx1 //32-bit op (2RI, Imm16)
iiii-iiii-iiii-iiii iiii-iipp-zzzz-xxx1 //32-bit op (Branch)

Or (5b registers):
zzzz-mmmm-nnnn-zzz0 //16-bit op (2R)
zzzz-zttt-ttzs-ssss nnnn-nzpp-zzzz-xxx1 //32-bit op (3R)
iiii-iiii-iiis-ssss nnnn-nzpp-zzzz-xxx1 //32-bit op (3RI, Imm11)
iiii-iiii-iiii-iiii nnnn-nzpp-zzzz-xxx1 //32-bit op (2RI, Imm16)
iiii-iiii-iiii-iiii iiii-iipp-zzzz-xxx1 //32-bit op (Branch)
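
A rough sketch of field extraction for the 6b-register 3R form above. It assumes the listing reads as one 32-bit word with the leftmost bit as bit 31 and the rightmost as bit 0; that bit numbering and the name `decode_3r` are assumptions, and the field letters (t, s, n, p) are simply those used in the listing.

```python
def decode_3r(word):
    """Split a 32-bit 3R op laid out as
    zzzz-tttt-ttss-ssss nnnn-nnpp-zzzz-xxx1 (bit 31 on the left, assumed).
    """
    if (word & 0x1) != 1:
        raise ValueError("low tag bit is 0: this is a 16-bit op")
    return {
        "t": (word >> 22) & 0x3F,   # bits 27..22
        "s": (word >> 16) & 0x3F,   # bits 21..16
        "n": (word >> 10) & 0x3F,   # bits 15..10
        "p": (word >> 8) & 0x3,     # bits 9..8
    }
```

Every register field sits at a fixed position, so all three can be extracted in parallel with the opcode (z) bits.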

....

> John Savard

Re: Stealing a Great Idea from the 6600

<480166a6c2a931428143397e100a6caf@www.novabbs.org>

https://www.novabbs.com/devel/article-flat.php?id=37964&group=comp.arch#37964

Path: i2pn2.org!.POSTED!not-for-mail
From: mitchal...@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Tue, 23 Apr 2024 22:55:50 +0000
Organization: Rocksolid Light
Message-ID: <480166a6c2a931428143397e100a6caf@www.novabbs.org>
References: <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <h8gd2jduqgk7i6decn0lj902pob7bud984@4ax.com> <hmod2jpvun417k1qjm04pvro5gvt3u21rg@4ax.com> <aj3e2j13ntqoofh22cienlntgrnkgrj488@4ax.com> <rm6e2j5246uupmce18phlip53saamvl8rv@4ax.com> <sjme2j91bfb1pmkp8m7gq8m2843ko4ap6j@4ax.com> <v09akf$1s6pf$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2157808"; mail-complaints-to="usenet@i2pn2.org";
posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$F3.wfvf.XJVHshaMMnQtQOidN7RlIzUmHgyRR4aBwRdHQ.nQMMv/K
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: MitchAlsup1 - Tue, 23 Apr 2024 22:55 UTC

BGB wrote:

> On 4/23/2024 1:54 AM, John Savard wrote:
>> On Mon, 22 Apr 2024 20:22:12 -0600, John Savard
>> <quadibloc@servername.invalid> wrote:
>>
>>> But fully half the opcode space is allocated to 16-bit instructions.
>>> Even though that half doesn't really play nice with other things, it's
>>> too tempting a target to ignore. But the price would be losing the
>>> fully parallel nature of decoding.
>>
>> After heading out to buy groceries, my head cleared enough to discard
>> the various complicated and bizarre schemes I was considering to deal
>> with the issue, and instead to drastically reduce the overhead for the
>> instructions longer than 32 bits, now that this had become a major
>> concern due to also using this format for prefixed instructions as
>> well, in a simple and straightforward manner.
>>

> You know, one could just be like, say:
> xxxx-xxxx-xxxx-xxx0 //16-bit op
> xxxx-xxxx-xxxx-xxxx xxxx-xxxx-xxxx-xx01 //32-bit op
> xxxx-xxxx-xxxx-xxxx xxxx-xxxx-xxxx-x011 //32-bit op
> xxxx-xxxx-xxxx-xxxx xxxx-xxxx-xxxx-0111 //32-bit op
> xxxx-xxxx-xxxx-xxxx xxxx-xxxx-xxxx-1111 //jumbo prefix (64+)

> And call it "good enough"...

> Then, say (6b registers):
> zzzz-mmmm-nnnn-zzz0 //16-bit op (2R)
> zzzz-tttt-ttss-ssss nnnn-nnpp-zzzz-xxx1 //32-bit op (3R)
> iiii-iiii-iiss-ssss nnnn-nnpp-zzzz-xxx1 //32-bit op (3RI, Imm10)
> iiii-iiii-iiii-iiii nnnn-nnpp-zzzz-xxx1 //32-bit op (2RI, Imm16)
> iiii-iiii-iiii-iiii iiii-iipp-zzzz-xxx1 //32-bit op (Branch)

> Or (5b registers):
> zzzz-mmmm-nnnn-zzz0 //16-bit op (2R)
> zzzz-zttt-ttzs-ssss nnnn-nzpp-zzzz-xxx1 //32-bit op (3R)
> iiii-iiii-iiis-ssss nnnn-nzpp-zzzz-xxx1 //32-bit op (3RI, Imm11)
> iiii-iiii-iiii-iiii nnnn-nzpp-zzzz-xxx1 //32-bit op (2RI, Imm16)
> iiii-iiii-iiii-iiii iiii-iipp-zzzz-xxx1 //32-bit op (Branch)

> ....

Punt on the 16-bit instructions::

000110 CONDI rrrrr PRED xxthen xxelse
000111 ddddd rrrrr SHF rwidth roffst
001001 ddddd rrrrr DscLd MemOp rrrrr
001010 ddddd rrrrr I12Sd 2-OPR rrrrr
001100 ddddd rrrrr I12 3OP rrrrr rrrrr
001101 ddddd rrrrr I12Sd 1-OPERA TIONx
01100 bitnum rrrrr 16-bit-displacement
011010 CONDI rrrrr 16-bit-displacement
011110 26-bit-displacementtttttttttttt
011111 26-bit-displacementtttttttttttt
100000
to
101110 ddddd rrrrr 16-bit-displacement
110000
to
111100 ddddd rrrrr 16-bit-immediateeee

>> John Savard

Re: Stealing a Great Idea from the 6600

<v0c9v5$2k063$8@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=37991&group=comp.arch#37991

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ldo...@nz.invalid (Lawrence D'Oliveiro)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Thu, 25 Apr 2024 01:00:21 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 11
Message-ID: <v0c9v5$2k063$8@dont-email.me>
References: <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org>
<1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com>
<oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com>
<dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<gul82jlmud2gglbf1siupn180r3f5o3qo5@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 25 Apr 2024 03:00:21 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="962cdf4b668ef58e236ab9c5f423d5d6";
logging-data="2752707"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+/oE4mZdZCBB2h7tYRtw9j"
User-Agent: Pan/0.155 (Kherson; fc5a80b8)
Cancel-Lock: sha1:IZQ8SZRmVFNStx+vi+K1MyFY8Dg=
 by: Lawrence D'Oliv - Thu, 25 Apr 2024 01:00 UTC

On Sat, 20 Apr 2024 18:06:22 -0600, John Savard wrote:

> Since there was only one set of arithmetic instructions, that meant that
> when you wrote code to operate on unsigned values, you had to remember
> that the normal names of the condition code values were oriented around
> signed arithmetic.

I thought architectures typically had separate condition codes for “carry”
versus “overflow”. That way, you didn’t need signed versus unsigned
versions of add, subtract and compare; it was just a matter of looking at
the right condition codes on the result.
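
A small Python sketch of that idea: one subtract produces the flags, and the signed or unsigned reading is chosen afterwards by which flags are tested. The 8-bit width, the "carry means borrow" convention (some machines invert it), and the helper names are assumptions made for illustration.

```python
def sub_flags(a, b, bits=8):
    """Compute a - b on `bits`-wide values and return the condition flags."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "C": a < b,                                  # unsigned borrow
        "V": bool((a ^ b) & (a ^ result) & sign),    # signed overflow
        "N": bool(result & sign),                    # sign of the result
        "Z": result == 0,
    }

def unsigned_less(flags):
    return flags["C"]

def signed_less(flags):
    return flags["N"] != flags["V"]
```

For example, comparing 0xFF with 0x01 gives "not less" when read as unsigned (255 vs 1) and "less" when read as signed (-1 vs 1), from the same subtract.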

Re: Stealing a Great Idea from the 6600

<80e0bf91545212a676011a9ccd0efa06@www.novabbs.org>

https://www.novabbs.com/devel/article-flat.php?id=37992&group=comp.arch#37992

Path: i2pn2.org!.POSTED!not-for-mail
From: mitchal...@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Thu, 25 Apr 2024 02:50:09 +0000
Organization: Rocksolid Light
Message-ID: <80e0bf91545212a676011a9ccd0efa06@www.novabbs.org>
References: <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org> <1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com> <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <gul82jlmud2gglbf1siupn180r3f5o3qo5@4ax.com> <v0c9v5$2k063$8@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2279800"; mail-complaints-to="usenet@i2pn2.org";
posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$0.pYOSzWvQPNUqZZd5MDP.k5SNi/iMxfXBwJNIAtr1bpr3KcuI8IC
X-Spam-Checker-Version: SpamAssassin 4.0.0
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
 by: MitchAlsup1 - Thu, 25 Apr 2024 02:50 UTC

Lawrence D'Oliveiro wrote:

> On Sat, 20 Apr 2024 18:06:22 -0600, John Savard wrote:

>> Since there was only one set of arithmetic instructions, that meant that
>> when you wrote code to operate on unsigned values, you had to remember
>> that the normal names of the condition code values were oriented around
>> signed arithmetic.

> I thought architectures typically had separate condition codes for “carry”
> versus “overflow”. That way, you didn’t need signed versus unsigned
> versions of add, subtract and compare; it was just a matter of looking at
> the right condition codes on the result.

Maybe now with 4-or-5-bit condition codes yes,
But the early machines (360) with 2-bit codes were already constricted.

Re: Stealing a Great Idea from the 6600

<v0cil3$2porv$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=37993&group=comp.arch#37993

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ldo...@nz.invalid (Lawrence D'Oliveiro)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Thu, 25 Apr 2024 03:28:36 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <v0cil3$2porv$2@dont-email.me>
References: <71acfecad198c4e9a9b14ffab7fc1cb5@www.novabbs.org>
<1s042jdli35gdo092v6uaupmrcmvo0i5vp@4ax.com>
<oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com>
<dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<gul82jlmud2gglbf1siupn180r3f5o3qo5@4ax.com> <v0c9v5$2k063$8@dont-email.me>
<80e0bf91545212a676011a9ccd0efa06@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 25 Apr 2024 05:28:36 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="962cdf4b668ef58e236ab9c5f423d5d6";
logging-data="2941823"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+vtvaZPN3cKP3Cl4Dtudsf"
User-Agent: Pan/0.155 (Kherson; fc5a80b8)
Cancel-Lock: sha1:THWaMyrUaIEGyo7LI++pcoGp81U=
 by: Lawrence D'Oliv - Thu, 25 Apr 2024 03:28 UTC

On Thu, 25 Apr 2024 02:50:09 +0000, MitchAlsup1 wrote:

> Lawrence D'Oliveiro wrote:
>
>> On Sat, 20 Apr 2024 18:06:22 -0600, John Savard wrote:
>
>>> Since there was only one set of arithmetic instructions, that meant
>>> that when you wrote code to operate on unsigned values, you had to
>>> remember that the normal names of the condition code values were
>>> oriented around signed arithmetic.
>
>> I thought architectures typically had separate condition codes for
>> “carry” versus “overflow”. That way, you didn’t need signed versus
>> unsigned versions of add, subtract and compare; it was just a matter of
>> looking at the right condition codes on the result.
>
> Maybe now with 4-or-5-bit condition codes yes,
> But the early machines (360) with 2-bit codes were already constricted.

The DEC PDP-6, from around 1964, same time as the System/360, had separate
carry and overflow flags.

Re: Stealing a Great Idea from the 6600

<6gpj2jt10mbjc6nbd38g87d44tacuokjnh@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=37995&group=comp.arch#37995

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: quadib...@servername.invalid (John Savard)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Wed, 24 Apr 2024 23:12:01 -0600
Organization: A noiseless patient Spider
Lines: 17
Message-ID: <6gpj2jt10mbjc6nbd38g87d44tacuokjnh@4ax.com>
References: <oj742jdvpl21il2s5a1ndsp3oidsnfjmr6@4ax.com> <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <gul82jlmud2gglbf1siupn180r3f5o3qo5@4ax.com> <v0c9v5$2k063$8@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 25 Apr 2024 07:12:02 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="7cb425b3212d9c4a2716be1f4617d0fe";
logging-data="2983842"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18qd4W0fVpqnPEHg2xuWLx+8erGpMOwiVw="
Cancel-Lock: sha1:RYGvBrRmLP5qZG9p+miktlcDvx0=
X-Newsreader: Forte Free Agent 3.3/32.846
 by: John Savard - Thu, 25 Apr 2024 05:12 UTC

On Thu, 25 Apr 2024 01:00:21 -0000 (UTC), Lawrence D'Oliveiro
<ldo@nz.invalid> wrote:
>On Sat, 20 Apr 2024 18:06:22 -0600, John Savard wrote:
>
>> Since there was only one set of arithmetic instructions, that meant that
>> when you wrote code to operate on unsigned values, you had to remember
>> that the normal names of the condition code values were oriented around
>> signed arithmetic.
>
>I thought architectures typically had separate condition codes for “carry”
>versus “overflow”. That way, you didn’t need signed versus unsigned
>versions of add, subtract and compare; it was just a matter of looking at
>the right condition codes on the result.

Yes; I thought that was the same as what I just said.

John Savard

Re: Stealing a Great Idea from the 6600

<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=38017&group=comp.arch#38017

Path: i2pn2.org!.POSTED!not-for-mail
From: gneun...@comcast.net (George Neuner)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Thu, 25 Apr 2024 17:01:55 -0400
Organization: i2pn2 (i2pn.org)
Message-ID: <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Info: i2pn2.org;
logging-data="2359219"; mail-complaints-to="usenet@i2pn2.org";
posting-account="h5eMH71iFfocGZucc+SnA0y5I+72/ecoTCcIjMd3Uww";
User-Agent: ForteAgent/8.00.32.1272
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: George Neuner - Thu, 25 Apr 2024 21:01 UTC

On Tue, 23 Apr 2024 17:58:41 +0000, mitchalsup@aol.com (MitchAlsup1)
wrote:

>George Neuner wrote:
>
>> On Sun, 21 Apr 2024 00:43:21 +0000, mitchalsup@aol.com (MitchAlsup1)
>> wrote:
>
>>>Address arithmetic is ADD only and does not care about signs or
>>>overflow. There is no concept of a negative base register or a
>>>negative index register (or, for that matter, a negative displace-
>>>ment), overflow, underflow, carry, ...
>
>> Stack frame pointers often point to the middle of the frame and need
>> to access data using both positive and negative displacements.
>
>Yes, one accesses callee saved registers with positive displacements
>and local variables with negative accesses. One simply needs to know
>where the former stops and the latter begins. ENTER and EXIT know this
>by the register count and by the stack allocation size.
>
>> Some GC schemes use negative displacements to access object headers.
>
>Those are negative displacements not negative bases or indexes.

I was reacting to your message (quoted fully above) which,
paraphrased, says "address arithmetic is add only and there is no
concept of a negative displacement".

In one sense you are correct: the result of the calculation has to be
considered as unsigned in the range 0..max_memory ... ie. there is no
concept of negative *address*.

However, the components being added to form the address, I believe are
a different matter.

I agree that negative base is meaningless.

However, negative index and negative displacement both do have
meaning. The inclusion of specialized index registers is debatable
[I'm in the GPR camp], but I do believe that index and displacement
*values* both always should be considered as signed.
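
A minimal sketch of how the two views fit together: the displacement field is sign-extended, and the address calculation is then a plain add with wraparound, so the result is always a non-negative address. The 16-bit displacement, 64-bit address width, and function names are assumptions for illustration only.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def effective_address(base, disp_field, disp_bits=16, addr_bits=64):
    """base + sign-extended displacement, wrapped to the address width."""
    mask = (1 << addr_bits) - 1
    return (base + sign_extend(disp_field, disp_bits)) & mask
```

Here `effective_address(0x1000, 0xFFF8)` yields 0xFF8: the field 0xFFF8 is read as -8, yet the adder only ever performs an addition.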

YMMV.

Re: Stealing a Great Idea from the 6600

<v0euek$3a2rc$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=38025&group=comp.arch#38025

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Thu, 25 Apr 2024 20:02:09 -0500
Organization: A noiseless patient Spider
Lines: 91
Message-ID: <v0euek$3a2rc$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 26 Apr 2024 03:02:12 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="0ea82f9b9e39b8d196087c6b4e96eff4";
logging-data="3476332"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+qUj5zYgujBW3n0B7Xsejy0abGdsZchIA="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:K7x3++6881/bwUnY0WiIqu6N4us=
Content-Language: en-US
In-Reply-To: <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com>
 by: BGB - Fri, 26 Apr 2024 01:02 UTC

On 4/25/2024 4:01 PM, George Neuner wrote:
> On Tue, 23 Apr 2024 17:58:41 +0000, mitchalsup@aol.com (MitchAlsup1)
> wrote:
>
>> George Neuner wrote:
>>
>>> On Sun, 21 Apr 2024 00:43:21 +0000, mitchalsup@aol.com (MitchAlsup1)
>>> wrote:
>>
>>>> Address arithmetic is ADD only and does not care about signs or
>>>> overflow. There is no concept of a negative base register or a
>>>> negative index register (or, for that matter, a negative displace-
>>>> ment), overflow, underflow, carry, ...
>>
>>> Stack frame pointers often point to the middle of the frame and need
>>> to access data using both positive and negative displacements.
>>
>> Yes, one accesses callee saved registers with positive displacements
>> and local variables with negative accesses. One simply needs to know
>> where the former stops and the latter begins. ENTER and EXIT know this
>> by the register count and by the stack allocation size.
>>
>>> Some GC schemes use negative displacements to access object headers.
>>
>> Those are negative displacements not negative bases or indexes.
>
> I was reacting to your message (quoted fully above) which,
> paraphrased, says "address arithmetic is add only and there is no
> concept of a negative displacement".
>
> In one sense you are correct: the result of the calculation has to be
> considered as unsigned in the range 0..max_memory ... ie. there is no
> concept of negative *address*.
>
> However, the components being added to form the address, I believe are
> a different matter.
>
> I agree that negative base is meaningless.
>
> However, negative index and negative displacement both do have
> meaning. The inclusion of specialized index registers is debatable
> [I'm in the GPR camp], but I do believe that index and displacement
> *values* both always should be considered as signed.
>

Agreed in the sense that negative displacements exist.

However, can note that positive displacements tend to be significantly
more common than negative ones. Whether or not it makes sense to have
negative displacements depends mostly on whether more than half of the
missed displacements are negative.

From what I can tell, the break-even point seems to be around:
~ 10 bits, scaled.
~ 13 bits, unscaled.

So, say, an ISA like RISC-V might have had a slightly better hit rate with
unsigned displacements than with signed displacements, but if one added
1 or 2 bits, signed would have still been a clear winner (or, with 1 or
2 fewer bits, unsigned a clear winner).
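
A sketch of how such a hit-rate comparison could be counted. The `sample` list is made up purely to exercise the code and is not a measurement from any program; `fits` and `hits` are hypothetical helper names.

```python
def fits(disp, bits, signed):
    """True if disp is encodable in a `bits`-wide displacement field."""
    if signed:
        return -(1 << (bits - 1)) <= disp < (1 << (bits - 1))
    return 0 <= disp < (1 << bits)

def hits(disps, bits, signed):
    """Count how many displacements fit the given encoding."""
    return sum(1 for d in disps if fits(d, bits, signed))

# Hypothetical, positively skewed displacements (illustration only).
sample = [0, 8, 16, 24, 64, 200, 300, 400, 700, -8]
```

With this invented sample, a 9-bit field hits 8 of 10 unsigned versus 7 of 10 signed, and the two tie at 10 bits; real numbers depend entirely on the compiler and the programs measured.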

I ended up going with signed displacements for XG2, but it was pretty
close to break-even in this case (when expanding from the 9-bit unsigned
displacements in Baseline).

Granted, all signed or all-unsigned might be better from an ISA design
consistency POV.

If one had 16-bit displacements, then unscaled displacements would make
sense; otherwise scaled displacements seem like a win (misaligned
displacements being much less common than aligned displacements).

But, admittedly, main reason I went with unscaled for GBR-rel and PC-rel
Load/Store, was because using scaled displacements here would have
required more relocation types (never mind if the hit rate for unscaled
9-bit displacements is "pretty weak").

Though, did end up later adding specialized Scaled GBR-Rel Load/Store
ops (to improve code density), so it might have been better in
retrospect had I instead just gone with the "keep it scaled and add more
reloc types to compensate" option.

....

> YMMV.

Re: Stealing a Great Idea from the 6600

<ff78aaa73101509100f09f190838a2a7@www.novabbs.org>

https://www.novabbs.com/devel/article-flat.php?id=38029&group=comp.arch#38029

Path: i2pn2.org!.POSTED!not-for-mail
From: mitchal...@aol.com (MitchAlsup1)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Fri, 26 Apr 2024 13:25:03 +0000
Organization: Rocksolid Light
Message-ID: <ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: i2pn2.org;
logging-data="2426555"; mail-complaints-to="usenet@i2pn2.org";
posting-account="65wTazMNTleAJDh/pRqmKE7ADni/0wesT78+pyiDW8A";
User-Agent: Rocksolid Light
X-Rslight-Site: $2y$10$UGwWA1CR/j3m8NkNGA7Dvu4IqbQt0n9e0UEoohchI.epdncq5r1L.
X-Rslight-Posting-User: ac58ceb75ea22753186dae54d967fed894c3dce8
X-Spam-Checker-Version: SpamAssassin 4.0.0
 by: MitchAlsup1 - Fri, 26 Apr 2024 13:25 UTC

BGB wrote:

> On 4/25/2024 4:01 PM, George Neuner wrote:
>> On Tue, 23 Apr 2024 17:58:41 +0000, mitchalsup@aol.com (MitchAlsup1)
>> wrote:
>>

> Agreed in the sense that negative displacements exist.

> However, can note that positive displacements tend to be significantly
> more common than negative ones. Whether or not it makes sense to have
> negative displacements depends mostly on whether more than half of the
> missed displacements are negative.

> From what I can tell, this seems to be:
> ~ 10 bits, scaled.
> ~ 13 bits, unscaled.

> So, say, an ISA like RISC-V might have had a slightly better hit rate with
> unsigned displacements than with signed displacements, but if one added
> 1 or 2 bits, signed would have still been a clear winner (or, with 1 or
> 2 fewer bits, unsigned a clear winner).

> I ended up going with signed displacements for XG2, but it was pretty
> close to break-even in this case (when expanding from the 9-bit unsigned
> displacements in Baseline).

> Granted, all signed or all-unsigned might be better from an ISA design
> consistency POV.

> If one had 16-bit displacements, then unscaled displacements would make
> sense; otherwise scaled displacements seem like a win (misaligned
> displacements being much less common than aligned displacements).

What we need is ~16-bit displacements where 82½%-91¼% are positive.

How does one use a frame pointer without negative displacements ??

[FP+disp] accesses callee save registers
[FP-disp] accesses local stack variables and descriptors

[SP+disp] accesses argument and result values

> But, admittedly, main reason I went with unscaled for GBR-rel and PC-rel
> Load/Store, was because using scaled displacements here would have
> required more relocation types (never mind if the hit rate for unscaled
> 9-bit displacements is "pretty weak").

> Though, did end up later adding specialized Scaled GBR-Rel Load/Store
> ops (to improve code density), so it might have been better in
> retrospect had I instead just gone with the "keep it scaled and add more
> reloc types to compensate" option.

> ....

>> YMMV.

Re: Stealing a Great Idea from the 6600

<2024Apr26.173457@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=38032&group=comp.arch#38032

Path: i2pn2.org!i2pn.org!news.hispagatos.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Fri, 26 Apr 2024 15:34:57 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 28
Message-ID: <2024Apr26.173457@mips.complang.tuwien.ac.at>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
Injection-Date: Fri, 26 Apr 2024 17:49:35 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="34665e79be177e2bef134484e850f3da";
logging-data="3972330"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19MTLIaEBb32mG8A+wUjXlh"
Cancel-Lock: sha1:+fdv0hkuBqo5R9W3iZta77FZNe8=
X-newsreader: xrn 10.11
 by: Anton Ertl - Fri, 26 Apr 2024 15:34 UTC

mitchalsup@aol.com (MitchAlsup1) writes:
>What we need is ~16-bit displacements where 82½%-91¼% are positive.

What are these funny numbers about?

Do you mean that you want number ranges like -11468..54067 (82.5%
positive) or -5734..59801 (91.25% positive)? Which one of those? And
why not, say -8192..57343 (87.5% positive)?
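
The arithmetic behind those ranges can be sketched as follows, assuming the share counts zero as positive and that the negative side is rounded down; `skewed_range` is a hypothetical name, and the share is passed as a string so that it is held exactly.

```python
import math
from fractions import Fraction

def skewed_range(positive_share, bits=16):
    """Return (lowest, highest) for a `bits`-wide displacement field in
    which roughly `positive_share` of the encodings are non-negative."""
    size = 1 << bits
    negatives = math.floor((1 - Fraction(positive_share)) * size)
    return -negatives, size - negatives - 1
```

`skewed_range("0.825")`, `skewed_range("0.9125")` and `skewed_range("0.875")` reproduce the three ranges listed above.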

>How does one use a frame pointer without negative displacements ??

You let it point to the lowest address you want to access. That moves
the problem to unwinding frame pointer chains where the unwinder does
not know the frame-specific difference between the frame pointer and
the pointer of the next frame.

An alternative is to have a frame-independent difference that leaves
enough room that, say 90% (or 99%, or whatever) of the frames don't
need negative offsets from that frame.

Likewise, if you have signed displacements, and are unhappy about the
skewed usage, you can let the frame pointer point at an offset from
the pointer to the next frame such that the usage is less skewed.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Stealing a Great Idea from the 6600

<v0gobh$3qnis$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=38033&group=comp.arch#38033

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
Date: Fri, 26 Apr 2024 12:30:24 -0500
Organization: A noiseless patient Spider
Lines: 132
Message-ID: <v0gobh$3qnis$1@dont-email.me>
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org>
<acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com>
<kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com>
<9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org>
<v017mg$3rcg9$1@dont-email.me>
<da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org>
<sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com>
<44fdd1209496c66ba18e425370a8b50d@www.novabbs.org>
<ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com>
<ec69999967361c286afdbe60bc2443ea@www.novabbs.org>
<dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me>
<ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 26 Apr 2024 19:30:26 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="0ea82f9b9e39b8d196087c6b4e96eff4";
logging-data="4021852"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/XZkCzJxPqyggoI97xCpLz5e66PeriR0s="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:K1BmjVQSzbJ4kJnuAORkPcb0KpQ=
Content-Language: en-US
In-Reply-To: <ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
 by: BGB - Fri, 26 Apr 2024 17:30 UTC

On 4/26/2024 8:25 AM, MitchAlsup1 wrote:
> BGB wrote:
>
>> On 4/25/2024 4:01 PM, George Neuner wrote:
>>> On Tue, 23 Apr 2024 17:58:41 +0000, mitchalsup@aol.com (MitchAlsup1)
>>> wrote:
>>>
>
>> Agreed in the sense that negative displacements exist.
>
>> However, can note that positive displacements tend to be significantly
>> more common than negative ones. Whether or not it makes sense to have
>> negative displacements depends mostly on whether more than half of
>> the missed displacements are negative.
>
>>  From what I can tell, this seems to be:
>>    ~ 10 bits, scaled.
>>    ~ 13 bits, unscaled.
>
>
>> So, say, an ISA like RISC-V might have had a slightly better hit rate with
>> unsigned displacements than with signed displacements, but if one
>> added 1 or 2 bits, signed would have still been a clear winner (or,
>> with 1 or 2 fewer bits, unsigned a clear winner).
>
>> I ended up going with signed displacements for XG2, but it was pretty
>> close to break-even in this case (when expanding from the 9-bit
>> unsigned displacements in Baseline).
>
>
>> Granted, all signed or all-unsigned might be better from an ISA design
>> consistency POV.
>
>
>> If one had 16-bit displacements, then unscaled displacements would
>> make sense; otherwise scaled displacements seem like a win (misaligned
>> displacements being much less common than aligned displacements).
>
> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>

I was seeing stats more like 99.8% positive, 0.2% negative.

There was enough of a bias that, below 10 bits, zero extending always
wins on the remaining (missed) cases. It is only on reaching 10 bits
that negative displacements make up 50% of the misses (the rest being
positive displacements larger than 512).
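
A rough sketch of how one might count this, in Python. The helper names
and the sample list are made up for illustration (this is not my
measured data); displacements here are in scaled units, and "hit" just
means the value fits in the field:

```python
def fits_signed(disp, bits):
    """True if disp fits a sign-extended field of the given width."""
    return -(1 << (bits - 1)) <= disp < (1 << (bits - 1))

def fits_unsigned(disp, bits):
    """True if disp fits a zero-extended field of the given width."""
    return 0 <= disp < (1 << bits)

def hit_rates(disps, bits):
    """Fraction of displacements encodable as (signed, unsigned)."""
    n = len(disps)
    signed = sum(fits_signed(d, bits) for d in disps) / n
    unsigned = sum(fits_unsigned(d, bits) for d in disps) / n
    return signed, unsigned

# Made-up sample, NOT measured data: mostly small positive values,
# one negative, a few in 512..1023, one too big for either form.
sample = [0, 1, 2, 8, 40, 100, 300, 600, 700, 900, -4, 2000]

# At 10 bits: signed covers 8 of 12, unsigned covers 10 of 12.
print(hit_rates(sample, 10))
```

Which form wins depends entirely on how the misses split between
negative values and large positive values in the real histogram.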

So, one can make a choice: -512..511, or 0..1023, ...

In XG2, I ended up with -512..511, for pros or cons (for some programs,
this choice is optimal, for others it is not).

Where, when scaled for QWORD, this is +/- 4K.

If one had a 16-bit displacement, it would be a choice between +/- 32K,
or (scaled) +/- 256K, or 0..512K, ...
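
The ranges above can be double-checked with a few lines of Python. The
function name `disp_range` is just for illustration, and a scale of 8
assumes QWORD scaling:

```python
def disp_range(bits, signed, scale=1):
    """Return (lowest, highest) byte offset reachable by the field."""
    if signed:
        lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    else:
        lo, hi = 0, (1 << bits) - 1
    return lo * scale, hi * scale

print(disp_range(10, signed=True, scale=8))    # (-4096, 4088)    ~ +/- 4K
print(disp_range(16, signed=True))             # (-32768, 32767)  ~ +/- 32K
print(disp_range(16, signed=True, scale=8))    # (-262144, 262136) ~ +/- 256K
print(disp_range(16, signed=False, scale=8))   # (0, 524280)      ~ 0..512K
```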

For the special purpose "LEA.Q (GBR, Disp16), Rn" instruction, I ended
up going unsigned, where for a lot of the programs I am dealing with,
this is big enough to cover ".data" and part of ".bss", generally used
for arrays which need the larger displacements (the compiler lays things
out so that most of the commonly used variables are closer to the start
of ".data", so can use smaller displacements).

Does implicitly require that all non-trivial global arrays have at least
64-bit alignment.
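
As a sketch of why the alignment matters: assuming the Disp16 field is
zero extended and scaled by 8 (my reading of the ".Q" form here, not
copied from the actual encoding spec), only 8-byte-aligned offsets
below 512K can be encoded. The function name is hypothetical:

```python
QWORD = 8  # assumed scale factor for the ".Q" form

def encode_gbr_disp16(byte_offset):
    """Return the unsigned 16-bit field for a GBR-relative byte offset,
    or None if the offset cannot be encoded (negative, misaligned,
    or out of range)."""
    if byte_offset < 0 or byte_offset % QWORD != 0:
        return None
    field = byte_offset // QWORD
    return field if field < (1 << 16) else None

print(encode_gbr_disp16(4096))     # aligned and in range
print(encode_gbr_disp16(12))       # misaligned, so None
print(encode_gbr_disp16(524288))   # just past the reach, so None
```

An array that is only 4-byte aligned could land on an offset like 12,
which is why the compiler has to force the alignment up.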

Note that seemingly both BGBCC and GCC have variations on this
optimization, though in GCC's case it requires special command-line
options ("-fdata-sections", etc).

> How does one use a frame pointer without negative displacements ??
>
> [FP+disp] accesses callee save registers
> [FP-disp] accesses local stack variables and descriptors
>
> [SP+disp] accesses argument and result values
>

In my case, all of these are [SP+Disp]; granted, there is no frame
pointer and stack frames are fixed-size in BGBCC.

This is typically with a frame layout like:
Argument/Spill space
-- Frame Top
Register Save
(Stack Canary)
Local arrays/structs
Local variables
Argument/Spill Space
-- Frame Bottom

Contrast with traditional x86 layout, which puts saved registers and
local variables near the frame-pointer, which points near the top of the
stack frame.
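
To illustrate why a fixed-size frame needs no negative displacements,
here is a small Python sketch that assigns [SP+Disp] offsets to the
regions of the layout above, bottom to top. The region sizes are
hypothetical, and this is not how BGBCC actually computes its layout:

```python
def frame_offsets(sizes):
    """Map each region name to its [SP+disp] byte offset.

    `sizes` lists (name, bytes) pairs from frame bottom to frame top.
    Returns the offset table and the total frame size.
    """
    offsets, disp = {}, 0
    for name, size in sizes:
        offsets[name] = disp
        disp += size
    return offsets, disp

# Hypothetical region sizes, listed from frame bottom (SP) upward.
layout = [
    ("arg_spill", 64),
    ("locals", 48),
    ("arrays_structs", 128),
    ("canary", 8),
    ("reg_save", 80),
]
offsets, frame_size = frame_offsets(layout)
print(offsets, frame_size)
```

Every offset comes out non-negative, and the frequently used locals sit
near SP where the small displacement forms can reach them.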

Though, in a majority of functions, the MOV.L and MOV.Q instructions
have a big enough displacement to cover the whole frame. The exceptions
are functions which have a lot of local arrays or similar. Overly large
local arrays are auto-folded to using heap allocation, but at present
this logic is based on the size of individual arrays rather than on the
total combined size of the stack frame.

Adding a frame pointer (with negative displacements) wouldn't make a big
difference in XG2 Mode, but would be more of an issue for (pure)
Baseline, where options are either to load the displacement into a
register, or use a jumbo prefix.

>> But, admittedly, main reason I went with unscaled for GBR-rel and
>> PC-rel Load/Store, was because using scaled displacements here would
>> have required more relocation types (nevermind if the hit rate for
>> unscaled 9-bit displacements is "pretty weak").
>
>> Though, did end up later adding specialized Scaled GBR-Rel Load/Store
>> ops (to improve code density), so it might have been better in
>> retrospect had I instead just went the "keep it scaled and add more
>> reloc types to compensate" option.
>
>
>> ....
>
>
>>> YMMV.

Re: Stealing a Great Idea from the 6600

<IQSWN.4$nQv.0@fx10.iad>

https://www.novabbs.com/devel/article-flat.php?id=38036&group=comp.arch#38036

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder9.news.weretis.net!border-3.nntp.ord.giganews.com!border-4.nntp.ord.giganews.com!nntp.giganews.com!news-out.netnews.com!s1-1.netnews.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx10.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Stealing a Great Idea from the 6600
References: <dd1866c4efb369b7b6cc499d718dc938@www.novabbs.org> <acq62j98dhmguil5ebce6lq4m9kkgt1fs2@4ax.com> <kkq62jppr53is4r70n151jl17bjd5kd6lv@4ax.com> <9d1fadaada2ec0683fc54688cce7cf27@www.novabbs.org> <v017mg$3rcg9$1@dont-email.me> <da6dc5fe28bb31b4c73d78ef1aac2ac5@www.novabbs.org> <sdl82jpkpf1t0ctr8sgqm5bvqqireg08j5@4ax.com> <44fdd1209496c66ba18e425370a8b50d@www.novabbs.org> <ks8e2j1kquqpcupcgh32es7nci33nlajid@4ax.com> <ec69999967361c286afdbe60bc2443ea@www.novabbs.org> <dtel2j5kipf6tj9cabgp7pqk8eei14eo1a@4ax.com> <v0euek$3a2rc$1@dont-email.me> <ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
In-Reply-To: <ff78aaa73101509100f09f190838a2a7@www.novabbs.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 27
Message-ID: <IQSWN.4$nQv.0@fx10.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Fri, 26 Apr 2024 18:59:52 UTC
Date: Fri, 26 Apr 2024 14:59:43 -0400
X-Received-Bytes: 2163
X-Original-Bytes: 2112
 by: EricP - Fri, 26 Apr 2024 18:59 UTC

MitchAlsup1 wrote:
> BGB wrote:
>
>> If one had 16-bit displacements, then unscaled displacements would
>> make sense; otherwise scaled displacements seem like a win (misaligned
>> displacements being much less common than aligned displacements).
>
> What we need is ~16-bit displacements where 82½%-91¼% are positive.
>
> How does one use a frame pointer without negative displacements ??
>
> [FP+disp] accesses callee save registers
> [FP-disp] accesses local stack variables and descriptors
>
> [SP+disp] accesses argument and result values

Sign-extended 16-bit offsets would cover almost all such access needs,
so I really don't see the need for funny business.

But if you really want a skewed range offset, it could use something like
excess-256 encoding, which zero extends the immediate and then subtracts
256 (or whatever) from it, to give offsets in the range -256..+65535-256
(that is, -256..+65279). So an immediate value of 0 equals an offset of -256.
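
A minimal Python sketch of that excess-256 idea, assuming a 16-bit
immediate field. The names `BIAS`, `encode_excess` and `decode_excess`
are just for illustration:

```python
BIAS = 256   # the "excess" amount; could be any chosen bias
BITS = 16    # assumed width of the immediate field

def encode_excess(offset):
    """Turn a byte offset into the unsigned immediate field."""
    field = offset + BIAS
    if not 0 <= field < (1 << BITS):
        raise ValueError("offset out of range for excess encoding")
    return field

def decode_excess(field):
    """Zero extend the immediate, then subtract the bias."""
    return (field & ((1 << BITS) - 1)) - BIAS

print(decode_excess(0))        # -256
print(decode_excess(65535))    # 65279
print(encode_excess(-8))       # 248
```

In hardware this costs a subtract of a constant on the displacement
path instead of a plain sign or zero extend.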
