Re: The old RISC-vs-CISC

<s7alsa$824$1@dont-email.me>

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC
Date: Mon, 10 May 2021 00:03:38 -0700
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <s7alsa$824$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aif0$13o3$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 07:03:38 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="fd55d588e88460585721e0e49e5ed356";
logging-data="8260"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/2hDUSq5GoUFQ4Hrw6sQwQ"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:jmiM6QTQWHe2UKOuo+gXKk3nE4k=
In-Reply-To: <s7aif0$13o3$1@gioia.aioe.org>
Content-Language: en-US
 by: Ivan Godard - Mon, 10 May 2021 07:03 UTC

On 5/9/2021 11:05 PM, Terje Mathisen wrote:
> MitchAlsup wrote:
>> On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
>>> On 5/9/2021 3:59 PM, MitchAlsup wrote:
>>>> I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'
>>> I always thought of it like "Lee".
>> <
>> It was funny to hear him say "lee-ah", but he was from Russia so I
>> gave him
>> a break.
>
> That is funny:
>
> 'Lea' is an old female name here in Norway (probably from the
> Norwegian-language bible), it is always pronounced as two emphasized
> syllables, i.e. close to the 'Lee'-'ah' suggestion above.
>
> I never even considered that it could be pronounced any other way!
>
> Terje
>

Exists in English too, though often spelled "Leah".
https://en.wikipedia.org/wiki/Lea_(given_name)
https://en.wikipedia.org/wiki/Leah_(given_name)

Re: The old RISC-vs-CISC

<s7amuj$6vd$2@newsreader4.netcologne.de>

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-6262-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC
Date: Mon, 10 May 2021 07:21:55 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <s7amuj$6vd$2@newsreader4.netcologne.de>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aif0$13o3$1@gioia.aioe.org>
Injection-Date: Mon, 10 May 2021 07:21:55 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-6262-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:6262:0:7285:c2ff:fe6c:992d";
logging-data="7149"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Mon, 10 May 2021 07:21 UTC

Terje Mathisen <terje.mathisen@tmsw.no> wrote:
> MitchAlsup wrote:
>> On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
>>> On 5/9/2021 3:59 PM, MitchAlsup wrote:
>>>> I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'
>>> I always thought of it like "Lee".
>> <
>> It was funny to hear him say "lee-ah", but he was from Russia so I gave him
>> a break.
>
> That is funny:
>
> 'Lea' is an old female name here in Norway (probably from the
> Norwegian-language bible),

It is a fairly common name in Germany.

I always thought it came from Latin, "lioness". Hmm...
You're right, it is also a biblical name.

>it is always pronounced as two emphasized
> syllables, i.e. close to the 'Lee'-'ah' suggestion above.

I would always pronounce the instruction like the name,
le:a .

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s7ao1i$l2p$1@dont-email.me>

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Mon, 10 May 2021 02:40:25 -0500
Organization: A noiseless patient Spider
Lines: 94
Message-ID: <s7ao1i$l2p$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aeoj$tdu$1@dont-email.me> <s7ak85$5dr$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 07:40:34 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8fd450ac5d006b88eb9e65bad5397861";
logging-data="21593"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1819lNjwfh14Wl9bAmlCU6/"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:xayjhIoa1SQF7LzciHvFxEUQIHU=
In-Reply-To: <s7ak85$5dr$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: BGB - Mon, 10 May 2021 07:40 UTC

On 5/10/2021 1:35 AM, Thomas Koenig wrote:
> BGB <cr88192@gmail.com> wrote:
>
>> Similarly, it won't affect any local branches unless by some absurd
>> chance a *single function* were to exceed 1MB in the ".text" section.
>
>> Excluding the possibility of excessively large procedurally generated
>> "switch()" blocks or similar (eg: 10k+ case labels), this is unlikely.
>
> One way I have encountered this is automatically generated formulas
> from computer algebra systems like Maple.
>
> Look at https://gcc.gnu.org/bugzilla/attachment.cgi?id=41459 for
> an example of such code. It isn't easy on compilers (but it does
> not have branches).
>

OK.

>> But, these cases don't break the ISA, merely the limits of the existing
>> branch encodings. I could, in theory, add wider encodings, but then would
>> need to deal with them in the pipeline (and making this part any wider
>> would be a big source of timing hassles).
>
> Code should be correct as first consideration, fast as a (distant)
> second.
>
> I assume you can jump to a register in your ISA, for function
> pointers.
>

Yeah, these are possible:
  JMP / BRA Rn   // Jump to register
  JSR / BSR Rn   // Call to register
  JT / BT Rn     // Branch True
  JF / BF Rn     // Branch False

Naming gets a little funky, and this is a place where my ISA listing and
assembler ended up in disagreement.

So, the assembler uses different mnemonics than the ISA listing (namely
JMP/JSR/JT/JF) mostly to avoid ambiguity over the (PC, Rn) vs (Rn) cases.

> If that is the case, you can reverse the test and optionally branch
> over an instruction sequence which loads the target address into
> a register via loading the PC and adding the offset to it (as
> determined by the assembler) and then jumping to that register.
>
> [...]
>

Well, or use a conditional branch to the register...

The issue is not "what can or can't be done" but rather whether or not
it can be done using a single instruction.
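
A rough sketch of the fallback Thomas describes, as an assembler might apply it when a conditional branch target is out of reach. Everything named here is an assumption for illustration: the +/-1MB short range, the LEA_PC placeholder for "load PC plus offset into a register", the scratch register, and the .skip label. It also ignores that the offset seen by the long sequence differs slightly from the one seen by the original branch.

```python
# Hypothetical mnemonics: BT/BF are short PC-relative conditional branches,
# JMP is a jump through a register, LEA_PC is a stand-in for a PC-relative
# address computation. None of these spellings come from the real assembler.
SHORT_RANGE = 1 << 20  # assumed reach of the short branch form, in bytes

INVERT = {"BT": "BF", "BF": "BT"}

def emit_cond_branch(op, disp, scratch="R0"):
    """Return the instruction list for a conditional branch to PC+disp."""
    if -SHORT_RANGE <= disp < SHORT_RANGE:
        # Target is reachable: one instruction, which is the case BGB cares about.
        return [f"{op} {disp:+d}"]
    # Out of range: invert the test and hop over a register-indirect jump.
    return [
        f"{INVERT[op]} .skip",
        f"LEA_PC {disp:+d}, {scratch}",
        f"JMP {scratch}",
        ".skip:",
    ]

print(emit_cond_branch("BT", 100))
print(emit_cond_branch("BT", 1 << 21))
```

With a conditional branch-to-register form (the JT/JF Rn ops listed above), the inversion and the skip label would not be needed, but the far case still costs more than one instruction.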

>> Though, it would technically be a bit simpler/cheaper to add a special
>> case to allow for a Jumbo-encoded "BRA (Abs48)" or similar in this case,
>> which wouldn't require any new logic on the EX side of things (can jump
>> anywhere... Just requires using base relocs...).
>
> That sounds even better. You have the long instructions, why not use
> them?
>

Yes, possible...

Did just go and add a "BRA Disp33s" encoding as a test...
But, sadly, directly using the AGU output for the branch (to sidestep
some address hackery) kinda blows out the timing constraints...

Like, it would be nicer to be like:
Well, we can have "48b+(Disp33s*2)" from the AGU, why not just use this
address as the branch destination?... But, alas, timing isn't super
favorable towards this idea...
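
For a sense of scale, here is a small sketch of the reach of a signed displacement that is scaled by 2, as in "Disp33s*2". The helper names are made up for illustration, and treating the 1MB limit mentioned earlier as a 20-bit displacement with the same scaling is my assumption, not something stated in the thread.

```python
def fits_signed(value, bits):
    """True if value fits in a signed two's-complement field of that width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))

def branch_reach(bits, scale=2):
    """(lowest, highest) byte offset reachable by a signed, scaled displacement."""
    return (-(1 << (bits - 1)) * scale, ((1 << (bits - 1)) - 1) * scale)

def can_encode(byte_offset, bits, scale=2):
    """True if byte_offset is aligned to scale and the scaled value fits."""
    return byte_offset % scale == 0 and fits_signed(byte_offset // scale, bits)

print(branch_reach(20))  # assumed short form: about +/-1MB
print(branch_reach(33))  # Disp33s*2: about +/-8GB
```

So a Disp33s branch covers any plausible single image, while an Abs48 form covers the whole 48-bit space at the cost of needing base relocations.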

An Abs48 encoding or similar wouldn't necessarily be subject to this
issue, since from the EX stages' POV, this case is functionally
equivalent to the "branch to register" case.

One possibility:
  FAjj-jjjj:           LDIZ Imm24u, R0   // existing op
  FBjj-jjjj:           LDIN Imm24n, R0   // existing op
  FEdd-dddd-FAdd-dddd: LDIZ Imm48u, R0
  FEdd-dddd-FBdd-dddd: LDIN Imm48n, R0
  FFdd-dddd-FAdd-dddd: BRA Abs48
  FFdd-dddd-FBdd-dddd: BSR Abs48

Which while not perfect (due to the implications of this encoding; and
inability to be predicated), is at least "not completely awful".

Re: FP8 (was Compact representation for common integer constants)

<2021May10.101544@mips.complang.tuwien.ac.at>

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
Date: Mon, 10 May 2021 08:15:44 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 39
Message-ID: <2021May10.101544@mips.complang.tuwien.ac.at>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me> <2021May9.101917@mips.complang.tuwien.ac.at> <s7abfl$1k5m$1@gal.iecc.com>
Injection-Info: reader02.eternal-september.org; posting-host="21d162dc1fd54433f69c9d73ceeec069";
logging-data="12668"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/T1bHAsZ5w/+pXVF3OShXX"
Cancel-Lock: sha1:TF4rvA/rQEclJmZ+nYpyw0kcl5A=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Mon, 10 May 2021 08:15 UTC

John Levine <johnl@taugh.com> writes:
>According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
>>As for fixed vs. floating point, I guess that is a cross-cutting
>>concern. Sure you can argue that, if you spend a lot of time on
>>low-level steps such as coding layout, spending some time on range
>>analysis is a minor change, so fixed point is acceptable.
>
>We forget how flaky and unreliable those early computers were. They
>used orders of magnitude more components than the most complex
>electronic systems ever had before. The Williams tubes used in the
>early 1950s just barely worked and needed endless tuning and fiddling.
>Even once they switched to core, tubes burned out, components went out
>of spec, solder joints cracked, and uptime was measured in hours if
>you were lucky, minutes if you weren't.

Heinz Zemanek told us that the relay computers had long uptimes,
while the tube computers failed every five minutes, and then had hours
of downtime for repair, but the argument was that the tube machine
computed more in the five minutes than the relay machine in
hours.

>>What held back floating point for a long time was the slowness and/or
>>high cost of FP hardware, but at least in general-purpose computers
>>that's a thing of the past.
>
>On small systems, I suppose. Floating point was standard or at least a
>widely provided option on large computers by the early 1960s.

But it still was much slower than integer arithmetic on most machines.
E.g., in "Writing Efficient Programs" (1982) Bentley originally used
FP for his traveling salesman example, but switched to integer
arithmetic as one of the optimization steps. When adapting the
example for my course in ~2000, I left that step out because the
expected speed gain on the Coppermine CPU was small.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: The old RISC-vs-CISC

<s7arl5$dl0$1@dont-email.me>

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC
Date: Mon, 10 May 2021 03:42:03 -0500
Organization: A noiseless patient Spider
Lines: 41
Message-ID: <s7arl5$dl0$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aif0$13o3$1@gioia.aioe.org> <s7alsa$824$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 08:42:14 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8fd450ac5d006b88eb9e65bad5397861";
logging-data="13984"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19fTrcn1kDOR/D0kTPh8mQi"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:hZSBf+NkI/HVAPlGyWGiJ9hXo0o=
In-Reply-To: <s7alsa$824$1@dont-email.me>
Content-Language: en-US
 by: BGB - Mon, 10 May 2021 08:42 UTC

On 5/10/2021 2:03 AM, Ivan Godard wrote:
> On 5/9/2021 11:05 PM, Terje Mathisen wrote:
>> MitchAlsup wrote:
>>> On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
>>>> On 5/9/2021 3:59 PM, MitchAlsup wrote:
>>>>> I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'
>>>> I always thought of it like "Lee".
>>> <
>>> It was funny to hear him say "lee-ah", but he was from Russia so I
>>> gave him
>>> a break.
>>
>> That is funny:
>>
>> 'Lea' is an old female name here in Norway (probably from the
>> Norwegian-language bible), it is always pronounced as two emphasized
>> syllables, i.e. close to the 'Lee'-'ah' suggestion above.
>>
>> I never even considered that it could be pronounced any other way!
>>
>> Terje
>>
>
> Exists in English too, though often spelled "Leah".
> https://en.wikipedia.org/wiki/Lea_(given_name)
> https://en.wikipedia.org/wiki/Leah_(given_name)

Though, it is usually pronounced as a single syllable with a diphthong,
rather than two syllables.

Depending on accent, the distinction between "Leah" and "Lee" may be
lost (in much the same way as "pin" vs "pen", or "cot" vs "caught", ...).

Or, while most people (myself included) say "dog" with a single vowel,
many of the locals shift it to a diphthong "daug" / "dauwg", or with
other words like "ball" ("bauwl"), and with their endless obsession with
a certain sport, if one is around them, one hears a whole lot about
"fuutbauwl"...

Granted, I did not originate in the area I am currently living...

Re: FP8 (was Compact representation for common integer constants)

<GqbmI.582827$%W6.8780@fx44.iad>

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!tr1.eu1.usenetexpress.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx44.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me> <2021May9.101917@mips.complang.tuwien.ac.at> <s7abfl$1k5m$1@gal.iecc.com> <2021May10.101544@mips.complang.tuwien.ac.at>
In-Reply-To: <2021May10.101544@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 63
Message-ID: <GqbmI.582827$%W6.8780@fx44.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Mon, 10 May 2021 14:28:54 UTC
Date: Mon, 10 May 2021 10:28:40 -0400
X-Received-Bytes: 3481
X-Received-Body-CRC: 1582211329
 by: EricP - Mon, 10 May 2021 14:28 UTC

Anton Ertl wrote:
> John Levine <johnl@taugh.com> writes:
>> According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
>>> As for fixed vs. floating point, I guess that is a cross-cutting
>>> concern. Sure you can argue that, if you spend a lot of time on
>>> low-level steps such as coding layout, spending some time on range
>>> analysis is a minor change, so fixed point is acceptable.
>> We forget how flaky and unreliable those early computers were. They
>> used orders of magnitude more components than the most complex
>> electronic systems ever had before. The Williams tubes used in the
>> early 1950s just barely worked and needed endless tuning and fiddling.
>> Even once they switched to core, tubes burned out, components went out
>> of spec, solder joints cracked, and uptime was measured in hours if
>> you were lucky, minutes if you weren't.
>
> Heinz Zemanek told us that the relay computers had long uptimes,
> while the tube computers failed every five minutes, and then had hours
> of downtime for repair, but the argument was that the tube machine
> computed more in the five minutes than the relay machine in
> hours.

Modern floating point was first used by Konrad Zuse for his computers,
Z1 (circa 1935), to Z4 (circa 1945).

Z3 circa 1941 was built with 2,600 relays, had a 22-bit word length
and a clock frequency of about 5–10 Hz.
https://en.wikipedia.org/wiki/Z3_(computer)

Z4 was the first commercial unit.
https://en.wikipedia.org/wiki/Z4_(computer)

(in German)
Die Z3 von Konrad Zuse im Deutschen Museum
https://www.youtube.com/watch?v=aUXnhVrT4CI

A Z3 was rebuilt in 2010
[paywalled]
https://ieeexplore.ieee.org/document/1498716

(in German)
Horst Zuse‘s Z3 Part 1: Demonstration
https://www.youtube.com/watch?v=WwBsot-yqgc

(in German)
https://www.youtube.com/watch?v=_YR5HhWlOgg

>>> What held back floating point for a long time was the slowness and/or
>>> high cost of FP hardware, but at least in general-purpose computers
>>> that's a thing of the past.
>> On small systems, I suppose. Floating point was standard or at least a
>> widely provided option on large computers by the early 1960s.
>
> But it still was much slower than integer arithmetic on most machines.
> E.g., in "Writing Efficient Programs" (1982) Bentley originally used
> FP for his traveling salesman example, but switched to integer
> arithmetic as one of the optimization steps. When adapting the
> example for my course in ~2000, I left that step out because the
> expected speed gain on the Coppermine CPU was small.
>
> - anton

Re: Compact representation for common integer constants

<9f36daff-8b8f-4550-80ad-2f75dd98f319n@googlegroups.com>

X-Received: by 2002:a05:620a:4a:: with SMTP id t10mr23712963qkt.249.1620663765622; Mon, 10 May 2021 09:22:45 -0700 (PDT)
X-Received: by 2002:aca:b387:: with SMTP id c129mr25631593oif.30.1620663765379; Mon, 10 May 2021 09:22:45 -0700 (PDT)
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!tr3.eu1.usenetexpress.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 10 May 2021 09:22:45 -0700 (PDT)
In-Reply-To: <s7a8u7$mui$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <s6udkp$hs5$1@dont-email.me> <6a45a966-9d86-40ed-9b16-67766956d46fn@googlegroups.com> <s74akj$siq$1@dont-email.me> <f94d31bd-ae99-4d0a-84d3-d16e9ba71c6fn@googlegroups.com> <s74muh$vqf$1@dont-email.me> <s789v4$rv6$1@newsreader4.netcologne.de> <s7a8u7$mui$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9f36daff-8b8f-4550-80ad-2f75dd98f319n@googlegroups.com>
Subject: Re: Compact representation for common integer constants
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 10 May 2021 16:22:45 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 105
 by: MitchAlsup - Mon, 10 May 2021 16:22 UTC

On Sunday, May 9, 2021 at 10:22:49 PM UTC-5, BGB wrote:
> On 5/9/2021 4:28 AM, Thomas Koenig wrote:
> > BGB <cr8...@gmail.com> schrieb:
> >
> >> IMUL (Lane 1)
> >> 32*32 -> 64
> >
> > Do you have two instructions (one for signed and one for unsigned)
> > or three (one for the lower half, one for signed high, one for
> > unsigned high)? The latter version could save you some ALU
> > complexity and some latency in the (probably common) case where
> > only a 32*32 multiplication is needed, at the cost of added
> > instructions for the 32*32-> 64 bit case.
> >
> There are actually more cases:
> MULS: 32*32->32, Result is sign-extended from low 32 bits
IMUL
> MULU: 32*32->32, Result is zero-extended from low 32 bits
UMUL
> DMULS: 32*32->64, 64-bit signed result
CARRY; IMUL
> DMULU: 32*32->64, 64-bit unsigned result
CARRY; UMUL
>
> The former give the typical behaviors one expects in C, the latter gives
> the widened results.
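
The four multiply variants listed above can be modelled in a few lines. This is only a sketch of the semantics as described in the post, assuming 64-bit registers held as non-negative Python integers; the function names simply mirror the mnemonics and the helper `sign_extend32` is made up for illustration.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend32(value):
    """Treat the low 32 bits of value as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & (1 << 31) else value


def muls(a, b):
    """32*32->32, result sign-extended from the low 32 bits to 64 bits."""
    return sign_extend32(a * b) & MASK64


def mulu(a, b):
    """32*32->32, result zero-extended from the low 32 bits."""
    return (a * b) & MASK32


def dmuls(a, b):
    """32*32->64, signed widening multiply of the low 32 bits of each input."""
    return (sign_extend32(a) * sign_extend32(b)) & MASK64


def dmulu(a, b):
    """32*32->64, unsigned widening multiply of the low 32 bits of each input."""
    return ((a & MASK32) * (b & MASK32)) & MASK64
```

With inputs 0xFFFFFFFF and 2, the narrow forms differ only in how the upper 32 bits are filled, while the widening forms differ in whether 0xFFFFFFFF is read as -1 or as 4294967295.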
>
> These exist as 3R forms, so:
> DMULU R4, R5, R7 // R7 = R4 * R5
<
All mine are 2-operand 1-result
>
>
> Originally, there were also multiply ops which multiplied two inputs and
> then stored a pair of results in R0 and R1, more like the original
> SuperH multiply ops, but I dropped these for various reasons.
<
Consumes way more OpCode space than is useful
>
> There are cases where DMACS or DMACU instructions could be useful:
> DMACU R4, R5, R7 // R7 = R4 * R5 + R7
<
IMAC and UMAC
>
> But, I don't currently have these.
>
>
> Eg (64-bit signed multiply):
> SHADQ R4, -32, R6 | SHADQ R5, -32, R7 //1c
> DMULS R4, R7, R16 //1c
> DMACS R5, R6, R16 //3c (2c penalty)
> SHADQ R16, 32, R2 //3c (2c penalty)
> DMACU R4, R5, R2 //1c
> RTS
>
> Though, while fewer instructions than the current form, the above
> construction would still be pretty bad in terms of interlock penalties.
>
> SHADQ R4, -32, R6 | SHADQ R5, -32, R7 //1c
> DMULS R5, R6, R16 //1c
> DMULS R4, R7, R17 //1c
> DMULU R4, R5, R18 //1c
> ADD R16, R17, R19 //2c (1c penalty, DMULS R17)
> SHADQ R19, 32, R2 //2c (1c penalty, ADD R19)
> ADD R18, R2 //1c
> RTS
>
> Both cases would have approximately the same clock-cycle count (assuming
> both cases have a 3-cycle latency).
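
The second (three-multiply) sequence above can be checked with a small model. This is a sketch under assumptions: registers are 64-bit values held as non-negative Python integers, SHADQ with a count of -32 is read as a right shift by 32, and DMULS/DMULU only look at the low 32 bits of their inputs. The variable names follow the registers in the listing.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend32(value):
    """Treat the low 32 bits of value as a two's-complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & (1 << 31) else value


def dmuls(a, b):
    """Assumed DMULS: signed 32*32->64 on the low halves of the inputs."""
    return (sign_extend32(a) * sign_extend32(b)) & MASK64


def dmulu(a, b):
    """Assumed DMULU: unsigned 32*32->64 on the low halves of the inputs."""
    return ((a & MASK32) * (b & MASK32)) & MASK64


def mul64(r4, r5):
    """Low 64 bits of r4 * r5, built from three widening 32-bit multiplies."""
    r6 = r4 >> 32                  # SHADQ R4, -32, R6 : high half of R4
    r7 = r5 >> 32                  # SHADQ R5, -32, R7 : high half of R5
    r16 = dmuls(r5, r6)            # low(R5) * high(R4)
    r17 = dmuls(r4, r7)            # low(R4) * high(R5)
    r18 = dmulu(r4, r5)            # low(R4) * low(R5), unsigned
    r19 = (r16 + r17) & MASK64     # sum of the cross terms
    r2 = (r19 << 32) & MASK64      # cross terms land in the upper half
    return (r2 + r18) & MASK64
```

The signedness of the cross-term multiplies does not matter for the final result, because anything above bit 31 of each cross term is shifted out of the 64-bit register.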
<
Which is why I used CARRY; xMUL
>
> ( Where recently, I have gotten around to modifying things such that the
> multiplier is now fully pipelined... )
>
>
>
> Otherwise, my time recently has mostly been being consumed by debugging...
<
Sherlock we know you well...........
>
> Then trying to see if I can get stuff to pass timing at 75MHz again
> (hopefully without wrecking stuff quite as bad this time). This
> sub-effort also revealed a few bugs (*), though there are still some
> bugs I have yet to resolve...
>
> *: Eg, after boosting the core to 75MHz while leaving the MMIO bus at
> 50MHz, stuff was breaking in simulation due to the L2 Ringbus <->
> MMIO-Bus bridge not waiting for the MMIO-Bus to return to a READY state
> before returning back to an idle state.
>
> It was then possible for the response to travel the rings and get back
> to the L1, which then allows execution to continue, with the CPU core
> then issuing another MMIO request, which then travels along the rings
> back to the MMIO bridge, in less time than it took for the 'OK -> READY'
> transition to happen on the MMIO bus...
>
> The way the bridge was designed, it would then try to initiate a
> request, see that the MMIO-Bus state was 'OK', and use whatever result
> was present (losing the request or returning garbage).
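
The race described above can be reproduced with a toy cycle-stepped model. Everything here is hypothetical: the state names, the cycle counts, and the `simulate` function are invented for illustration and are not taken from the actual BJX2 bridge or MMIO bus. The only point is that a bridge which returns to idle without waiting for READY can pick up a stale 'OK' result.

```python
def simulate(addresses, wait_for_ready, busy_cycles=2, hold_cycles=3):
    """Return the data the bridge hands back for each queued MMIO read."""
    memory = {0x10: "A", 0x20: "B"}           # made-up device contents
    bus_state, bus_addr, bus_data, timer = "READY", None, None, 0
    bridge_state, request = "IDLE", None
    pending, results = list(addresses), []

    for _ in range(1000):                      # hard cap so the loop always ends
        # Bridge: samples the bus state left by the previous cycle.
        if bridge_state == "IDLE":
            if not pending:
                break
            request = pending.pop(0)           # start driving a new request
            bridge_state = "WAIT_OK"
        elif bridge_state == "WAIT_OK":
            if bus_state == "OK":
                results.append(bus_data)       # may be stale in the buggy case
                request = None
                bridge_state = "WAIT_READY" if wait_for_ready else "IDLE"
        elif bridge_state == "WAIT_READY" and bus_state == "READY":
            bridge_state = "IDLE"

        # Bus: only accepts a new request while READY.
        if bus_state == "READY":
            if request is not None:
                bus_addr, bus_state, timer = request, "BUSY", busy_cycles
        else:
            timer -= 1
            if timer == 0 and bus_state == "BUSY":
                bus_data, bus_state, timer = memory[bus_addr], "OK", hold_cycles
            elif timer == 0:
                bus_state = "READY"
    return results
```

In this model, `simulate([0x10, 0x20], wait_for_ready=False)` loses the second request and returns the first result twice, while `wait_for_ready=True` returns both values. Shortening `hold_cycles` to 1 hides the bug, which is consistent with the problem only showing up once the core could come back faster than the 'OK -> READY' transition.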
>
> This may have been happening at 50MHz as well, and could have possibly
> been leading to some of the bugs I had seen.
>
>
> Or such...

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<c731f39e-357f-4b8b-a2ee-c3b77610e8c5n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16586&group=comp.arch#16586
X-Received: by 2002:a37:a546:: with SMTP id o67mr23862385qke.160.1620664212678;
Mon, 10 May 2021 09:30:12 -0700 (PDT)
X-Received: by 2002:a05:6830:40a4:: with SMTP id x36mr18767081ott.342.1620664212452;
Mon, 10 May 2021 09:30:12 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 10 May 2021 09:30:12 -0700 (PDT)
In-Reply-To: <s7aeoj$tdu$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me> <6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me> <77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me> <0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me> <f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me> <55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aeoj$tdu$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <c731f39e-357f-4b8b-a2ee-c3b77610e8c5n@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 10 May 2021 16:30:12 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Mon, 10 May 2021 16:30 UTC

On Monday, May 10, 2021 at 12:02:13 AM UTC-5, BGB wrote:
> On 5/9/2021 6:08 PM, MitchAlsup wrote:
> > On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
> >> On 5/9/2021 3:59 PM, MitchAlsup wrote:
> >
> >>> <
> >>> How many branches are farther than 1/8 GB away with code compiled
> >>> from high level languages ??
> >>> <
> >> The normal direct branch op in BJX2 has a 20-bit displacement, so can
> >> reach +/- 1MB.
> >>
> >> A Jumbo+LEA can do +/- 2GB, so:
> >> LEA.B (PC, Disp33s), R3
> >> JMP R3
> >>
> >> But, this is also possible (+/- 32MB):
> >> MOV #Imm25s, R0
> >> BRA R0
> >>
> > My 66000 has 16-bit actual word displacement (18-bit byte address) for
> > conditional branches and 26-bit unconditional branches (and calls) (28-bit
> > effective range).
> >
> > But JMP instructions can have 32-bit or 64-bit immediates so you predicate
> > over the JMP with an inverted condition for the predicate.
> All these cases are 20 bits in my case.
<
But you are not fully supporting a 64-bit address space. Not even attempting.
>
> I had a few times considered eliminating the fixed BT/BF instructions in
> favor of using the BRA?T and BRA?F encodings instead (semantically
> equivalent), but haven't done so yet (partly inertia, partly I don't
> like breaking binary compatibility unless there is a good reason).
>
> Though, the main reason to do this would be to reclaim ~ 21 bits of
> encoding space...
> >>
> >> In general, these don't really come up at present because, of my test
> >> programs, all of them have < 1MB in the ".text" section.
> > <
> > In practice, few (<1%) are branches of those dimensions.
> > <
> In my current test programs, it currently rests at 0%, since any larger
> branches tend to be indirect calls through a function pointer or similar
> (*2).
<
Which is exactly why I could allow them to be more expensive--so little use.
>
> *2: Though, one of these programs has ~ 1.2 MB of ".text" on an
> x86-64 build (for both native Win64 and WSL), and 760K as BJX2 (in speed
> optimized mode).
>
>
>
> Similarly, it won't affect any local branches unless by some absurd
> chance a *single function* were to exceed 1MB in the ".text" section.
>
> Excluding the possibility of excessively large procedurally generated
> "switch()" blocks or similar (eg: 10k+ case labels), this is unlikely.
>
The person debugging a 10K case label switch statement gets what [s]he deserves.
>
> But, these cases don't break the ISA, merely the limits of the existing
> branch encodings. I could, in theory, add wider encodings, but then would
> need to deal with them in the pipeline (and making this part any wider
> would be a big source of timing hassles).
>
>
> Well, it is either this, or allow a "jumbo branch" encoding where, say:
> Can branch pretty much anywhere in the address space;
> Completely ignored by the branch predictor;
> Takes ~ 11 or so clock cycles to perform said branch...
>
>
> In this case, the branch would likely be routed through the ALU (as a
> 64-bit integer ADD), and then the ADD result is used as a branch
> destination.
>
> Where, say, the branch instruction is actually decoded as a hacked
> 64-bit ADD instruction, and then in EX2 or EX3, some logic is like "Hey,
> we just added something to PC!" and invokes the register-indirect branch
> mechanism using the ADD result.
>
> But, a lot of this would be mostly because, as-is, there is basically no
> good way for the existing PC-relative branch mechanisms to handle a
> displacement this large...
>
>
> Though, it would technically be a bit simpler/cheaper to add a special
> case to allow for a Jumbo-encoded "BRA (Abs48)" or similar in this case,
> which wouldn't require any new logic on the EX side of things (can jump
> anywhere... Just requires using base relocs...).
> >>>>> Compound addressing within structures;
> >>>>> Eg: Accessing an element within an array within a structure.
> >>>>> ...
> >>>>>
> >>>> MRISC32 also has LEA (called "LDEA" - LoaD Effective Address). It
> >>>> essentially uses the output from the AGU as the result (bypassing
> >>>> the load-from-memory stage), and has a few different use cases.
> >>>>
> >>> I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'
> >> I always thought of it like "Lee".
> > <
> > It was funny to hear him say "lee-ah", but he was from Russia so I gave him
> > a break.
> OK.
>
> I vary between phonetic mappings, and letter-based mappings.
>
> If I try to say things IRL, sometimes there is a slight delay if my mind
> isn't sure how to map a mess of letters and numbers over to a spoken
> form. As can be noted, I mostly think in visual images and text with
> spoken language more as a 2nd class citizen.
<
Which brings to mind: how is IRL pronounced differently than URL ?
>
>
> Admittedly, this is also partly why my naming conventions tend to be
> based more on patterns or visual aesthetic rather than whether or not
> things can be easily spoken.
<
My OpCode spellings hark back more to IBM 360 and DEC PDP-11 than
to 68K, or x86-64.
> >>>>
> >>>> Apart from preloading memory addresses into registers, it can be used
> >>>> for simple arithmetic of the form A + (B << N) (where N is 0, 1, 2 or 3),
> >>>> which can be used for implementing x * 3 and x * 5, for instance.
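
As a quick illustration of the shift-and-add trick, here is a minimal model of an LEA-style result. The function `lea` is a stand-in for the AGU output, not the MRISC32 encoding, and the operand order is my own choice.

```python
def lea(base, index, shift):
    """Model of an AGU result: base + (index << shift), with shift in 0..3."""
    if shift not in (0, 1, 2, 3):
        raise ValueError("shift must be 0, 1, 2 or 3")
    return base + (index << shift)


def times3(x):
    """x * 3 as x + (x << 1)."""
    return lea(x, x, 1)


def times5(x):
    """x * 5 as x + (x << 2)."""
    return lea(x, x, 2)


def times9(x):
    """x * 9 as x + (x << 3)."""
    return lea(x, x, 3)
```

Each of these is a single address-generation operation, so it avoids the multiplier entirely.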
> >>>>
> >>> Better still is to make integer multiply 3-cycles.
> >>>>
> >>>> Another very useful use case is to load vector registers with a
> >>>> "stride". In vector mode the AGU will generate addresses of the
> >>>> form:
> >>>>
> >>>> (1) addr[k] = Rb + Im * k (Immediate form stride)
> >>>> (2) addr[k] = Rb + (Ri << Sc) * k (Register form stride)
> >>>> (3) addr[k] = Rb + (Vi[k] << Sc) (Gather/scatter)
> >>>>
> >>> Both stride and gather forms fall out "for free" in VVM. {So, I am
> >>> agreeing with you that LEA is a valuable instruction.}
> >>>>
> >>>> Forms (1) and (2) construct addresses with a constant offset
> >>>> between each address (I call these "stride based" load/store).
> >>> <
> >>> Don't forget the stride-0 form in GPUs where every thread wants to
> >>> bang on the auto-update memory reference location.
> >>>>
> >>>> The LDEA instruction can thus be used for loading a series of
> >>>> numbers into a vector register, e.g. like this:
> >>>>
> >>>> LDEA V1, Z, #1 ; V1 = {0, 1, 2, 3, 4, 5, ...}
> >>>>
> >>>> LDI S1, #7
> >>>> LDEA V1, S1, #3 ; V1 = {7, 10, 13, 16, 19, ...}
> >>>>
> >>>> This is a common operation in vectorized code, and on classic SIMD
> >>>> ISA:s you usually load a predefined constant (e.g. from memory) in this
> >>>> case.
> >>>>
> >>>> Since the MRISC32 ISA allows each implementation to define the vector
> >>>> register size, using predefined vector constants is not a good solution
> >>>> since the vector register size is not known at compile time.
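
The three address forms and the two LDEA examples can be modelled as list generators. This is a sketch only: the function names are invented, the vector length is passed in explicitly because it is implementation-defined, and the gather form is read as Rb + (Vi[k] << Sc).

```python
def stride_immediate(rb, im, length):
    """Form (1): addr[k] = Rb + Im * k."""
    return [rb + im * k for k in range(length)]


def stride_register(rb, ri, sc, length):
    """Form (2): addr[k] = Rb + (Ri << Sc) * k."""
    return [rb + (ri << sc) * k for k in range(length)]


def gather(rb, vi, sc):
    """Form (3): addr[k] = Rb + (Vi[k] << Sc), one address per index element."""
    return [rb + (index << sc) for index in vi]
```

With a base of 0 and a stride of 1, `stride_immediate(0, 1, n)` gives 0, 1, 2, ... as in the first LDEA example, and `stride_immediate(7, 3, n)` gives 7, 10, 13, ... as in the second, for whatever `n` the implementation's vector length happens to be.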
> >>>>
> >>>> /Marcus

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<7c39686c-0929-4e1f-8463-33aa8e31f648n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16587&group=comp.arch#16587
X-Received: by 2002:a37:8246:: with SMTP id e67mr23749024qkd.410.1620664566789;
Mon, 10 May 2021 09:36:06 -0700 (PDT)
X-Received: by 2002:a9d:2de1:: with SMTP id g88mr23037719otb.5.1620664566532;
Mon, 10 May 2021 09:36:06 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 10 May 2021 09:36:06 -0700 (PDT)
In-Reply-To: <s7ak85$5dr$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me> <6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me> <77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me> <0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me> <f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me> <55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aeoj$tdu$1@dont-email.me> <s7ak85$5dr$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <7c39686c-0929-4e1f-8463-33aa8e31f648n@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 10 May 2021 16:36:06 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Mon, 10 May 2021 16:36 UTC

On Monday, May 10, 2021 at 1:35:51 AM UTC-5, Thomas Koenig wrote:
> BGB <cr8...@gmail.com> schrieb:
> > Similarly, it won't affect any local branches unless by some absurd
> > chance a *single function* were to exceed 1MB in the ".text" section.
>
> > Excluding the possibility of excessively large procedurally generated
> > "switch()" blocks or similar (eg: 10k+ case labels), this is unlikely.
> One way I have encountered this is automatically generated formulas
> from computer algebra systems like Maple.
>
> Look at https://gcc.gnu.org/bugzilla/attachment.cgi?id=41459 for
> an example of such code. It isn't easy on compilers (but it does
> not have branches).
> > But, these cases don't break the ISA, merely the limits of the existing
> > branch encodings. I could, in-theory add wider encodings, but then would
> > need to deal with them in the pipeline (and making this part any wider
> > would be a big source of timing hassles).
> Code should be correct as first consideration, fast as a (distant)
> second.
>
> I assume you can jump to a register in your ISA, for function
> pointers.
<
The proper word is "through" as in: you can jump through a register.
"Indirectly through" is also correct.
>
> If that is the case, you can reverse the test and optionally branch
> over an instruction sequence which loads the target address into
> a register via loading the PC and adding the offset to it (as
> determined by the assembler) and then jumping to that register.
<
Predicate over the not-to-be-taken branch.
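
A rough sketch of how an assembler might apply the "reverse the test and branch over" idea when a conditional target is out of range. The mnemonics, the `tmp` register, and the 20-bit default are placeholders for illustration (the 20-bit figure is the BJX2 displacement width mentioned earlier in the thread), and PC adjustments between the emitted instructions are ignored.

```python
INVERT = {"eq": "ne", "ne": "eq", "lt": "ge", "ge": "lt"}


def fits_signed(value, bits):
    """True if value is representable as a signed two's-complement field."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def emit_cond_branch(cond, offset, short_bits=20):
    """Return pseudo-assembly lines for 'branch to pc+offset if cond holds'."""
    if fits_signed(offset, short_bits):
        return [f"b{cond} pc{offset:+d}"]      # short form reaches the target
    return [
        f"b{INVERT[cond]} skip",               # inverted test hops over the far jump
        f"lea tmp, pc{offset:+d}",             # build the far target in a register
        "jmp tmp",                             # jump through that register
        "skip:",
    ]
```

On an ISA with predication, the inverted short branch would be replaced by a predicate covering the far jump, but the selection logic stays the same.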
>
> [...]
> > Though, it would technically be a bit simpler/cheaper to add a special
> > case to allow for a Jumbo-encoded "BRA (Abs48)" or similar in this case,
> > which wouldn't require any new logic on the EX side of things (can jump
> > anywhere... Just requires using base relocs...).
> That sounds even better. You have the long instructions, why not use
> them?

Re: The old RISC-vs-CISC

<jwv1raetu9z.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=16588&group=comp.arch#16588
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC
Date: Mon, 10 May 2021 12:43:12 -0400
Organization: A noiseless patient Spider
Lines: 6
Message-ID: <jwv1raetu9z.fsf-monnier+comp.arch@gnu.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me>
<jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aeoj$tdu$1@dont-email.me>
<c731f39e-357f-4b8b-a2ee-c3b77610e8c5n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="98d28c00892a96af5ca08fba50e87736";
logging-data="28221"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18FzFoqTnnI8ngCqQzcSs5L"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:uNwOROmkS9v93dXZTcpal9afRR8=
sha1:ILGwCFWMwWmHxx4cEJRTkp9D04U=
 by: Stefan Monnier - Mon, 10 May 2021 16:43 UTC

> Which brings to mind: how is IRL pronounced differently than URL ?

I thought the question would be: how do you spell "we are ell"?

Stefan

Re: Compact representation for common integer constants

<s7bo65$9gq$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16589&group=comp.arch#16589
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Mon, 10 May 2021 09:49:11 -0700
Organization: A noiseless patient Spider
Lines: 109
Message-ID: <s7bo65$9gq$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s6udkp$hs5$1@dont-email.me>
<6a45a966-9d86-40ed-9b16-67766956d46fn@googlegroups.com>
<s74akj$siq$1@dont-email.me>
<f94d31bd-ae99-4d0a-84d3-d16e9ba71c6fn@googlegroups.com>
<s74muh$vqf$1@dont-email.me> <s789v4$rv6$1@newsreader4.netcologne.de>
<s7a8u7$mui$1@dont-email.me>
<9f36daff-8b8f-4550-80ad-2f75dd98f319n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 16:49:09 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="fd55d588e88460585721e0e49e5ed356";
logging-data="9754"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18t64GLUavBB5yjSshvqfmV"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:CgebddPU/h76Vs451sOPhEYgWxs=
In-Reply-To: <9f36daff-8b8f-4550-80ad-2f75dd98f319n@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Mon, 10 May 2021 16:49 UTC

On 5/10/2021 9:22 AM, MitchAlsup wrote:
> On Sunday, May 9, 2021 at 10:22:49 PM UTC-5, BGB wrote:
>> On 5/9/2021 4:28 AM, Thomas Koenig wrote:
>>> BGB <cr8...@gmail.com> schrieb:
>>>
>>>> IMUL (Lane 1)
>>>> 32*32 -> 64
>>>
>>> Do you have two instructions (one for signed and one for unsigned)
>>> or three (one for the lower half, one for signed high, one for
>>> unsigned high)? The latter version could save you some ALU
>>> complexity and some latency in the (probably common) case where
>>> only a 32*32 multiplication is needed, at the cost of added
>>> instructions for the 32*32-> 64 bit case.
>>>
>> There are actually more cases:
>> MULS: 32*32->32, Result is sign-extended from low 32 bits
> IMUL
>> MULU: 32*32->32, Result is zero-extended from low 32 bits
> UMUL
>> DMULS: 32*32->64, 64-bit signed result
> CARRY; IMUL
>> DMULU: 32*32->64, 64-bit unsigned result
> CARRY; UMUL
>>
>> The former give the typical behaviors one expects in C, the latter gives
>> the widened results.
>>
>> These exist as 3R forms, so:
>> DMULU R4, R5, R7 // R7 = R4 * R5
> <
> All mine are 2-operand 1-result
>>
>>
>> Originally, there were also multiply ops which multiplied two inputs and
>> then stored a pair of results in R0 and R1, more like the original
>> SuperH multiply ops, but I dropped these for various reasons.
> <
> Consumes way more OpCode space than is useful
>>
>> There are cases where DMACS or DMACU instructions could be useful:
>> DMACU R4, R5, R7 // R7 = R4 * R5 + R7
> <
> IMAC and UMAC
>>
>> But, I don't currently have these.
>>
>>
>> Eg (64-bit signed multiply):
>> SHADQ R4, -32, R6 | SHADQ R5, -32, R7 //1c
>> DMULS R4, R7, R16 //1c
>> DMACS R5, R6, R16 //3c (2c penalty)
>> SHADQ R16, 32, R2 //3c (2c penalty)
>> DMACU R4, R5, R2 //1c
>> RTS
>>
>> Though, while fewer instructions than the current form, the above
>> construction would still be pretty bad in terms of interlock penalties.
>>
>> SHADQ R4, -32, R6 | SHADQ R5, -32, R7 //1c
>> DMULS R5, R6, R16 //1c
>> DMULS R4, R7, R17 //1c
>> DMULU R4, R5, R18 //1c
>> ADD R16, R17, R19 //2c (1c penalty, DMULS R17)
>> SHADQ R19, 32, R2 //2c (1c penalty, ADD R19)
>> ADD R18, R2 //1c
>> RTS
>>
>> Both cases would have approximately the same clock-cycle count (assuming
>> both cases have a 3-cycle latency).
> <
> Which is why I used CARRY; xMUL
>>
>> ( Where recently, I have gotten around to modifying things such that the
>> multiplier is now fully pipelined... )
>>
>>
>>
>> Otherwise, my time recently has mostly been being consumed by debugging...
> <
> Sherlock we know you well...........
>>
>> Then trying to see if I can get stuff to pass timing at 75MHz again
>> (hopefully without wrecking stuff quite as bad this time). This
>> sub-effort also revealed a few bugs (*), though there are still some
>> bugs I have yet to resolve...
>>
>> *: Eg, after boosting the core to 75MHz while leaving the MMIO bus at
>> 50MHz, stuff was breaking in simulation due to the L2 Ringbus <->
>> MMIO-Bus bridge not waiting for the MMIO-Bus to return to a READY state
>> before returning back to an idle state.
>>
>> It was then possible for the response to travel the rings and get back
>> to the L1, which then allows execution to continue, with the CPU core
>> then issuing another MMIO request, which then travels along the rings
>> back to the MMIO bridge, in less time than it took for the 'OK -> READY'
>> transition to happen on the MMIO bus...
>>
>> The way the bridge was designed, it would then try to initiate a
>> request, see that the MMIO-Bus state was 'OK', and use whatever result
>> was present (losing the request or returning garbage).
>>
>> This may have been happening at 50MHz as well, and could have possibly
>> been leading to some of the bugs I had seen.
>>
>>
>> Or such...

Do you have any saturating multiplies? Why or why not?

Re: FP8 (was Compact representation for common integer constants)

<s7brr7$1da5$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=16590&group=comp.arch#16590
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.cmpublishers.com!adore2!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
Date: Mon, 10 May 2021 17:51:35 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <s7brr7$1da5$1@gal.iecc.com>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <2021May9.101917@mips.complang.tuwien.ac.at> <s7abfl$1k5m$1@gal.iecc.com> <2021May10.101544@mips.complang.tuwien.ac.at>
Injection-Date: Mon, 10 May 2021 17:51:35 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="46405"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <2021May9.101917@mips.complang.tuwien.ac.at> <s7abfl$1k5m$1@gal.iecc.com> <2021May10.101544@mips.complang.tuwien.ac.at>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Mon, 10 May 2021 17:51 UTC

According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
>>On small systems, I suppose. Floating point was standard or at least a
>>widely provided option on large computers by the early 1960s.
>
>But it still was much slower than integer arithmetic on most machines.

I happen to have a 360/67 manual here. It and the similar /65 were
workhorse mainframes in the early 1970s.

Memory to register integer add took 1.4us, floating took 2.43 short, 2.45 double
Integer multiply was 4.8us, 4.4 short (faster than integer), 7.6 double
Integer divide was 8.7us, float 7.3, double 14.10

The short float was faster than integer because the number of bits in
the product or dividend was less.
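
For a quick feel of the ratios, the 360/67 figures quoted above can be divided out. The table just restates the numbers from this post; the dictionary layout and the `slowdown` name are mine.

```python
# Instruction times in microseconds, as quoted above for the 360/67.
TIMES_US = {
    "add": {"integer": 1.4, "short": 2.43, "double": 2.45},
    "multiply": {"integer": 4.8, "short": 4.4, "double": 7.6},
    "divide": {"integer": 8.7, "short": 7.3, "double": 14.10},
}


def slowdown(op, fmt):
    """Float time divided by integer time; values below 1.0 mean float is faster."""
    return TIMES_US[op][fmt] / TIMES_US[op]["integer"]


# Ratios rounded to two decimals, keyed by operation and float format.
RATIOS = {
    op: {fmt: round(slowdown(op, fmt), 2) for fmt in ("short", "double")}
    for op in TIMES_US
}
```

Short float comes out at roughly 1.74x the integer time for add, but about 0.92x for multiply and 0.84x for divide, so on these figures float is well under a factor of two slower and sometimes faster.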

I have manuals for the 386 and 486, where float arithmetic was also
about half the speed of fixed, even though integer arithmetic was
32x32 and float was all extended with 64 fraction bits. Doesn't seem
much slower to me.

If you can write your code using the same number of fixed instructions
as floats, sure, it'll be faster, but if you need to add extra code for
explicit scaling, I doubt it'll really be faster.

On the other hand, if you had a 286 or 386 with no 287 or 387 and
were simulating floating point in software, *that* was slow.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s7bt9m$iim$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16591&group=comp.arch#16591
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Mon, 10 May 2021 20:16:22 +0200
Organization: A noiseless patient Spider
Lines: 58
Message-ID: <s7bt9m$iim$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 18:16:22 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="2ad4a2a59ec60664e3ca2d347c0ed230";
logging-data="19030"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19a5o5SMfkY1aA97LozQHdvy51R1slBuvY="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.8.1
Cancel-Lock: sha1:fcdfHdxzooGvza0Sv9eeLoOBpvE=
In-Reply-To: <55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
Content-Language: en-US
 by: Marcus - Mon, 10 May 2021 18:16 UTC

On 2021-05-10, MitchAlsup wrote:
> On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
>> On 5/9/2021 3:59 PM, MitchAlsup wrote:
>
>>> <
>>> How many branches are farther than 1/8 GB away with code compiled
>>> from high level languages ??
>>> <
>> The normal direct branch op in BJX2 has a 20-bit displacement, so can
>> reach +/- 1MB.
>>
>> A Jumbo+LEA can do +/- 2GB, so:
>> LEA.B (PC, Disp33s), R3
>> JMP R3
>>
>> But, this is also possible (+/- 32MB):
>> MOV #Imm25s, R0
>> BRA R0
>>
> My 66000 has 16-bit actual word displacement (18-bit byte address) for
> conditional branches and 26-bit unconditional branches (and calls) (28-bit
> effective range).
>
> But JMP instructions can have 32-bit or 64-bit immediates so you predicate
> over the JMP with an inverted condition for the predicate.

MRISC32 has 18-bit word displacement (+/-0.5MB) for conditional
branches, and 21-bit word displacement (+/-4MB) for unconditional jumps
and calls.
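
The reach figures quoted in this subthread all follow from the same arithmetic: a signed displacement of N bits, counted in units of some size, reaches about 2^(N-1) units in either direction. A small sketch, where `branch_reach` is a made-up helper and the 2-byte unit for BJX2 is my assumption (it is what makes 20 bits come out at 1MB):

```python
KIB = 1 << 10
MIB = 1 << 20


def branch_reach(disp_bits, unit_bytes):
    """Approximate reach in bytes (each direction) of a signed displacement."""
    return (1 << (disp_bits - 1)) * unit_bytes


MRISC32_CONDITIONAL = branch_reach(18, 4)     # 18-bit word displacement
MRISC32_UNCONDITIONAL = branch_reach(21, 4)   # 21-bit word displacement
MY66000_CONDITIONAL = branch_reach(16, 4)     # 16-bit word displacement
MY66000_UNCONDITIONAL = branch_reach(26, 4)   # 26-bit word displacement
BJX2_DIRECT = branch_reach(20, 2)             # assumes 16-bit instruction units
```

The forward reach is one unit short of the backward reach, as with any two's-complement field. The 26-bit word displacement comes out at 128MB, which is the 1/8 GB mentioned earlier.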

Since unconditional jump targets are register-relative (where PC is one
possible register - for PC-relative unconditional branches), it is
possible to extend the range with one additional instruction, e.g:

ADDPCHI LR, #foo@pchi
JL LR, #foo+4@pclo

I would expect that a reasonably advanced microarchitecture will fuse
those two instructions into a single instruction with a full-width
displacement (it's a very distinct 64-bit sequence where only the
immediate fields are variable), while a dumb implementation (like my
current CPU) will take two cycles to complete the jump.
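
A generic sketch of splitting a PC-relative offset into an upper and a lower part for such a two-instruction pair. The 11-bit low field and the signed treatment of the low part are assumptions made for illustration; they are not taken from the MRISC32 relocation definitions, and the "+4" adjustment in the listing above is not modelled.

```python
LO_BITS = 11  # hypothetical width of the low immediate, not from the MRISC32 spec


def split_offset(offset, lo_bits=LO_BITS):
    """Split offset into (hi, lo) with hi a multiple of 2**lo_bits and lo signed."""
    half = 1 << (lo_bits - 1)
    hi = ((offset + half) >> lo_bits) << lo_bits   # round to the nearest multiple
    lo = offset - hi                               # always in [-half, half)
    return hi, lo


def far_jump_target(pc, target):
    """Address reached by 'add hi to PC' followed by 'jump to register + lo'."""
    hi, lo = split_offset(target - pc)
    link = pc + hi          # first instruction: upper part added to PC
    return link + lo        # second instruction: register-relative jump
```

The rounding step matters when the low field is sign-extended: without it, a low part with its top bit set would be subtracted rather than added, and the pair would land 2**LO_BITS bytes short.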

In fact the GCC back-end always emits the two-instruction sequence for
calls (and tail calls), and relies on the linker to relax it to a
single instruction when possible (though I have not implemented linker
relaxation yet).

>>
>> In general, these don't really come up at present because, of my test
>> programs, all of them have < 1MB in the ".text" section.
> <
> In practice, few (<1%) are branches of those dimensions.
> <

That's what I was betting on.

/Marcus

Re: Compact representation for common integer constants

<s7btv6$o52$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16592&group=comp.arch#16592
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Mon, 10 May 2021 20:27:49 +0200
Organization: A noiseless patient Spider
Lines: 121
Message-ID: <s7btv6$o52$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s6udkp$hs5$1@dont-email.me>
<6a45a966-9d86-40ed-9b16-67766956d46fn@googlegroups.com>
<s74akj$siq$1@dont-email.me>
<f94d31bd-ae99-4d0a-84d3-d16e9ba71c6fn@googlegroups.com>
<s74muh$vqf$1@dont-email.me> <s789v4$rv6$1@newsreader4.netcologne.de>
<s7a8u7$mui$1@dont-email.me>
<9f36daff-8b8f-4550-80ad-2f75dd98f319n@googlegroups.com>
<s7bo65$9gq$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 18:27:50 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="2ad4a2a59ec60664e3ca2d347c0ed230";
logging-data="24738"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+7tOLd8Cko074mhRyxPnjyxvQiybWbFKE="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.8.1
Cancel-Lock: sha1:mz30a/VARmWJMKvrKHZtBTm3sbI=
In-Reply-To: <s7bo65$9gq$1@dont-email.me>
Content-Language: en-US
 by: Marcus - Mon, 10 May 2021 18:27 UTC

On 2021-05-10, Ivan Godard wrote:
> On 5/10/2021 9:22 AM, MitchAlsup wrote:
>> On Sunday, May 9, 2021 at 10:22:49 PM UTC-5, BGB wrote:
>>> On 5/9/2021 4:28 AM, Thomas Koenig wrote:
>>>> BGB <cr8...@gmail.com> schrieb:
>>>>
>>>>> IMUL (Lane 1)
>>>>> 32*32 -> 64
>>>>
>>>> Do you have two instructions (one for signed and one for unsigned)
>>>> or three (one for the lower half, one for signed high, one for
>>>> unsigned high)? The latter version could save you some ALU
>>>> complexity and some latency in the (probably common) case where
>>>> only a 32*32 multiplication is needed, at the cost of added
>>>> instructions for the 32*32-> 64 bit case.
>>>>
>>> There are actually more cases:
>>> MULS: 32*32->32, Result is sign-extended from low 32 bits
>> IMUL
>>> MULU: 32*32->32, Result is zero-extended from low 32 bits
>> UMUL
>>> DMULS: 32*32->64, 64-bit signed result
>> CARRY; IMUL
>>> DMULU: 32*32->64, 64-bit unsigned result
>> CARRY; UMUL
>>>
>>> The former give the typical behaviors one expects in C, the latter gives
>>> the widened results.
>>>
>>> These exist as 3R forms, so:
>>> DMULU R4, R5, R7 // R7 = R4 * R5
>> <
>> All mine are 2-operand 1-result
>>>
>>>
>>> Originally, there were also multiply ops which multiplied two inputs and
>>> then stored a pair of results in R0 and R1, more like the original
>>> SuperH multiply ops, but I dropped these for various reasons.
>> <
>> Consumes way more OpCode space than is useful
>>>
>>> There are cases where DMACS or DMACU instructions could be useful:
>>> DMACU R4, R5, R7 // R7 = R4 * R5 + R7
>> <
>> IMAC and UMAC
>>>
>>> But, I don't currently have these.
>>>
>>>
>>> Eg (64-bit signed multiply):
>>> SHADQ R4, -32, R6 | SHADQ R5, -32, R7 //1c
>>> DMULS R4, R7, R16 //1c
>>> DMACS R5, R6, R16 //3c (2c penalty)
>>> SHADQ R16, 32, R2 //3c (2c penalty)
>>> DMACU R4, R5, R2 //1c
>>> RTS
>>>
>>> Though, while fewer instructions than the current form, the above
>>> construction would still be pretty bad in terms of interlock penalties.
>>>
>>> SHADQ R4, -32, R6 | SHADQ R5, -32, R7 //1c
>>> DMULS R5, R6, R16 //1c
>>> DMULS R4, R7, R17 //1c
>>> DMULU R4, R5, R18 //1c
>>> ADD R16, R17, R19 //2c (1c penalty, DMULS R17)
>>> SHADQ R19, 32, R2 //2c (1c penalty, ADD R19)
>>> ADD R18, R2 //1c
>>> RTS
>>>
>>> Both cases would have approximately the same clock-cycle count (assuming
>>> both cases have a 3-cycle latency).
>> <
>> Which is why I used CARRY; xMUL
>>>
>>> ( Where recently, I have gotten around to modifying things such that the
>>> multiplier is now fully pipelined... )
>>>
>>>
>>>
>>> Otherwise, my time recently has mostly been being consumed by
>>> debugging...
>> <
>> Sherlock we know you well...........
>>>
>>> Then trying to see if I can get stuff to pass timing at 75MHz again
>>> (hopefully without wrecking stuff quite as bad this time). This
>>> sub-effort also revealed a few bugs (*), though there are still some
>>> bugs I have yet to resolve...
>>>
>>> *: Eg, after boosting the core to 75MHz while leaving the MMIO bus at
>>> 50MHz, stuff was breaking in simulation due to the L2 Ringbus <->
>>> MMIO-Bus bridge not waiting for the MMIO-Bus to return to a READY state
>>> before returning back to an idle state.
>>>
>>> It was then possible for the response to travel the rings and get back
>>> to the L1, which then allows execution to continue, with the CPU core
>>> then issuing another MMIO request, which then travels along the rings
>>> back to the MMIO bridge, in less time than it took for the 'OK -> READY'
>>> transition to happen on the MMIO bus...
>>>
>>> The way the bridge was designed, it would then try to initiate a
>>> request, see that the MMIO-Bus state was 'OK', and use whatever result
>>> was present (losing the request or returning garbage).
>>>
>>> This may have been happening at 50MHz as well, and could have possibly
>>> been leading to some of the bugs I had seen.
>>>
>>>
>>> Or such...
>
> Do you have any saturating multiplies? Why or why not?

MRISC32 has MULQ: Multiply fixed-point Q numbers with saturation (the
saturation is trivial since the only case that needs to be handled is
-1.0 x -1.0 = 0.999..). There's also MULQR (same but with rounding).

...mostly because MRISC32 tries to support DSP:ish things, and because
the hardware cost is negligible (it's a simple addition to the existing
integer multiplier).
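
To illustrate why a single saturating case is enough, here is a minimal Python sketch of a saturating fixed-point multiply. It is not the MRISC32 definition of MULQ: the name `mulq15`, the choice of the Q15 format, and truncation (rather than the rounding that MULQR does) are assumptions made for the example.

```python
Q15_MIN = -0x8000   # represents -1.0
Q15_MAX = 0x7FFF    # largest Q15 value, just below +1.0

def mulq15(a: int, b: int) -> int:
    """Multiply two Q15 values (signed 16-bit integers) with saturation."""
    assert Q15_MIN <= a <= Q15_MAX and Q15_MIN <= b <= Q15_MAX
    product = (a * b) >> 15      # Q30 -> Q15, truncating toward minus infinity
    # The only product that does not fit in Q15 is (-1.0) * (-1.0) = +1.0,
    # so clamping the upper bound is all the saturation that is needed.
    return min(product, Q15_MAX)

print(mulq15(0x4000, 0x4000))     # 0.5 * 0.5
print(mulq15(-0x8000, -0x8000))   # -1.0 * -1.0, saturates
```

The most negative product, -1.0 times the largest positive value, still fits, which is why no lower clamp appears.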

/Marcus

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<62b81886-bf04-4a4c-a89c-aacedba9a1f5n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16593&group=comp.arch#16593

X-Received: by 2002:ad4:55ab:: with SMTP id f11mr25068477qvx.49.1620671718945;
Mon, 10 May 2021 11:35:18 -0700 (PDT)
X-Received: by 2002:a4a:48c2:: with SMTP id p185mr20124688ooa.73.1620671718690;
Mon, 10 May 2021 11:35:18 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 10 May 2021 11:35:18 -0700 (PDT)
In-Reply-To: <s7bt9m$iim$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me> <6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me> <77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me> <0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me> <f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me> <55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7bt9m$iim$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <62b81886-bf04-4a4c-a89c-aacedba9a1f5n@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 10 May 2021 18:35:18 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Mon, 10 May 2021 18:35 UTC

On Monday, May 10, 2021 at 1:16:25 PM UTC-5, Marcus wrote:
> On 2021-05-10, MitchAlsup wrote:
> > On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
> >> On 5/9/2021 3:59 PM, MitchAlsup wrote:
> >
> >>> <
> >>> How many branches are farther than 1/8 GB away with code compiled
> >>> from high level languages ??
> >>> <
> >> The normal direct branch op in BJX2 has a 20-bit displacement, so can
> >> reach +/- 1MB.
> >>
> >> A Jumbo+LEA can do +/- 2GB, so:
> >> LEA.B (PC, Disp33s), R3
> >> JMP R3
> >>
> >> But, this is also possible (+/- 32MB):
> >> MOV #Imm25s, R0
> >> BRA R0
> >>
> > My 66000 has 16-bit actual word displacement (18-bit byte address) for
> > conditional branches and 26-bit unconditional branches (and calls) (28-bit
> > effective range).
> >
> > But JMP instructions can have 32-bit or 64-bit immediates so you predicate
> > over the JMP with an inverted condition for the predicate.
> MRISC32 has 18-bit word displacement (+/-0.5MB) for conditional
> branches, and 21-bit word displacement (+/-4MB) for unconditional jumps
> and calls.
>
> Since unconditional jump targets are register-relative (where PC is one
> possible register - for PC-relative unconditional branches), it is
> possible to extend the range with one additional instruction, e.g:
>
> ADDPCHI LR, #foo@pchi
> JL LR, #foo+4@pclo
>
> I would expect that a reasonably advanced microarchitecture will fuse
> those two instructions into a single instruction with a full-width
> displacement (it's a very distinct 64-bit sequence where only the
> immediate fields are variable), while a dumb implementation (like my
> current CPU) will take two cycles to complete the jump.
>
> In fact the GCC back-end always emits the two-instruction sequence for
> calls (and tail calls), and relies on the linker to relax it to a
> single instruction when possible (though I have not implemented linker
> relaxation yet).
<
This is what we did in Mc 88K, the compiler produced the "anywhere"
sequence, and the linker converted it back to "restricted but fits" code
more than 99% of the time.
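
A minimal Python sketch of that relaxation decision, for illustration only. It is not the Mc 88K linker or the MRISC32 toolchain: the function names are hypothetical, the 4-byte instruction size is assumed, and the default 21-bit word displacement is simply the MRISC32 figure quoted above.

```python
WORD = 4  # bytes per instruction (assumed fixed 32-bit encoding)

def fits_short_form(pc: int, target: int, disp_bits: int = 21) -> bool:
    """True if a signed PC-relative word displacement of disp_bits reaches target."""
    delta = target - pc
    if delta % WORD != 0:
        return False                      # misaligned target cannot be encoded
    words = delta // WORD
    limit = 1 << (disp_bits - 1)
    return -limit <= words < limit

def relax(call_sites, disp_bits: int = 21):
    """Classify each (pc, target) pair as 'short' (1 insn) or 'long' (2 insns)."""
    return ["short" if fits_short_form(pc, tgt, disp_bits) else "long"
            for pc, tgt in call_sites]

sites = [(0x0100_0000, 0x0100_2000),            # nearby call
         (0x0100_0000, 0x0100_0000 + (4 << 20))]  # just past the +4MB limit
print(relax(sites))
```

A real linker would have to iterate, since shrinking one sequence moves every later address and can bring further targets into range.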
> >>
> >> In general, these don't really come up at present because, of my test
> >> programs, all of them have < 1MB in the ".text" section.
> > <
> > In practice, few (<1%) are branches of those dimensions.
> > <
> That's what I was betting on.
>
> /Marcus

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<f475d3b6-dc42-4063-9d3f-4f38e3de4e37n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16594&group=comp.arch#16594

X-Received: by 2002:a37:7906:: with SMTP id u6mr24604718qkc.225.1620680023535;
Mon, 10 May 2021 13:53:43 -0700 (PDT)
X-Received: by 2002:aca:b387:: with SMTP id c129mr830677oif.30.1620680023257;
Mon, 10 May 2021 13:53:43 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 10 May 2021 13:53:43 -0700 (PDT)
In-Reply-To: <62b81886-bf04-4a4c-a89c-aacedba9a1f5n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2607:fea8:1ddd:1500:64b5:d96c:ed93:912e;
posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 2607:fea8:1ddd:1500:64b5:d96c:ed93:912e
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me> <6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me> <77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me> <0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me> <f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me> <55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7bt9m$iim$1@dont-email.me> <62b81886-bf04-4a4c-a89c-aacedba9a1f5n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f475d3b6-dc42-4063-9d3f-4f38e3de4e37n@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Mon, 10 May 2021 20:53:43 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: robf...@gmail.com - Mon, 10 May 2021 20:53 UTC

ANY1 supports a 21 bit branch displacement for branching +/-1MB. It may also store the return address in the target register allowing conditional branch to subroutine.
Operation for BEQ:
    Rt = IP + 8
    If (Ra = Rb)
        If (Rc = 63)
            IP = IP + Displacement
        Else
            IP = Rc + Displacement
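
The same operation as a small executable model in Python, as a sketch only. Several details are assumptions not stated in the pseudocode: a 64-entry register file (inferred from Rc = 63), fall-through to IP + 8 when the branch is not taken, source operands read before Rt is written, and Rt written whether or not the branch is taken (which is what the listing order suggests). The `Cpu` class and `beq` function are hypothetical names.

```python
from dataclasses import dataclass, field

@dataclass
class Cpu:
    ip: int = 0
    regs: list = field(default_factory=lambda: [0] * 64)

def beq(cpu: Cpu, rt: int, ra: int, rb: int, rc: int, disp: int) -> None:
    """Branch-if-equal that also records a return address in Rt."""
    # Read all source operands first so that Rt may alias Ra, Rb or Rc.
    a, b = cpu.regs[ra], cpu.regs[rb]
    base = cpu.ip if rc == 63 else cpu.regs[rc]   # Rc = 63 selects IP-relative
    next_ip = cpu.ip + 8
    cpu.regs[rt] = next_ip                        # Rt = IP + 8
    cpu.ip = base + disp if a == b else next_ip   # assumed fall-through

cpu = Cpu(ip=0x100)
cpu.regs[1] = cpu.regs[2] = 5
beq(cpu, rt=10, ra=1, rb=2, rc=63, disp=0x40)
print(hex(cpu.ip), hex(cpu.regs[10]))
```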

There really only needs to be a small number of return address registers. So in a couple of designs I have trimmed the target register field down to two bits, allowing three return address registers, and then extended the constant field of the JAL instruction by three extra bits. It makes quite a difference being able to branch +/-16MB, for instance, instead of +/-2MB. 2MB is not enough for some code.
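
The arithmetic behind those ranges can be checked with a short Python sketch. The helper `branch_reach_bytes` is a hypothetical name, and the 22-bit JAL field is an inference from the 2MB figure, not something stated above; the other rows use displacement widths and scaling quoted elsewhere in this thread.

```python
MB = 1 << 20

def branch_reach_bytes(disp_bits: int, unit_bytes: int = 1) -> int:
    """Approximate +/- reach of a signed displacement counted in unit_bytes."""
    return (1 << (disp_bits - 1)) * unit_bytes

cases = [
    ("ANY1 branch, 21-bit byte disp",        21, 1),
    ("JAL, assumed 22-bit byte disp",        22, 1),
    ("JAL with three extra bits",            25, 1),
    ("BJX2 branch, 20-bit disp, 2-byte unit", 20, 2),
    ("MRISC32 jump, 21-bit word disp",       21, 4),
    ("My 66000 jump, 26-bit word disp",      26, 4),
]
for name, bits, unit in cases:
    print(f"{name}: +/-{branch_reach_bytes(bits, unit) // MB} MB")
```

Each extra displacement bit doubles the reach, so the three bits freed from the register field account for the factor of eight.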

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<e1d470b1-b2d7-43df-af69-e6f6c9aff210n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16595&group=comp.arch#16595

X-Received: by 2002:a05:620a:4a:: with SMTP id t10mr24940424qkt.249.1620681071314;
Mon, 10 May 2021 14:11:11 -0700 (PDT)
X-Received: by 2002:a05:6830:1605:: with SMTP id g5mr22498720otr.22.1620681071095;
Mon, 10 May 2021 14:11:11 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 10 May 2021 14:11:10 -0700 (PDT)
In-Reply-To: <f475d3b6-dc42-4063-9d3f-4f38e3de4e37n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me> <6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me> <77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me> <0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me> <f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me> <55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7bt9m$iim$1@dont-email.me> <62b81886-bf04-4a4c-a89c-aacedba9a1f5n@googlegroups.com>
<f475d3b6-dc42-4063-9d3f-4f38e3de4e37n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e1d470b1-b2d7-43df-af69-e6f6c9aff210n@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 10 May 2021 21:11:11 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
 by: MitchAlsup - Mon, 10 May 2021 21:11 UTC

On Monday, May 10, 2021 at 3:53:45 PM UTC-5, robf...@gmail.com wrote:
> ANY1 supports a 21 bit branch displacement for branching +/-1MB. It may also store the return address in the target register allowing conditional branch to subroutine.
<
How often do you find conditional branching to a subroutine to be effective ?

{I remember using this a lot in 8085 assembly coding, but when writing in C I seldom find
it profitable, because all the arguments have to be in the proper registers before the
condition is tested in order to use the conditionality of the call.}

Also, do you have a conditional return, or does the epilogue of the subroutine "get in the way" ??
<
> Operation for BEQ:
>     Rt = IP + 8
>     If (Ra = Rb)
>         If (Rc = 63)
>             IP = IP + Displacement
>         Else
>             IP = Rc + Displacement
>
> There really needs to be only a small number of return address registers. So in a couple of designs I have trimmed the target register down to two bits allowing three return address registers, and then extended the constant field of the JAL instruction by three extra bits. It makes quite a difference being able to branch +-16MB for instance instead of 2MB. 2MB is not enough for some code.
<
I got ±128MB from my ISA.

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s7ccdn$28b$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16600&group=comp.arch#16600

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Mon, 10 May 2021 15:34:32 -0700
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <s7ccdn$28b$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7bt9m$iim$1@dont-email.me>
<62b81886-bf04-4a4c-a89c-aacedba9a1f5n@googlegroups.com>
<f475d3b6-dc42-4063-9d3f-4f38e3de4e37n@googlegroups.com>
<e1d470b1-b2d7-43df-af69-e6f6c9aff210n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 10 May 2021 22:34:31 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8d7b9135d963dcff083c5b1cfb1a474a";
logging-data="2315"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/62PBgWQRlWUD7uX/sV2CV"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:fB1UryAys8ny5tcLeZpEKavulAM=
In-Reply-To: <e1d470b1-b2d7-43df-af69-e6f6c9aff210n@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Mon, 10 May 2021 22:34 UTC

On 5/10/2021 2:11 PM, MitchAlsup wrote:
> On Monday, May 10, 2021 at 3:53:45 PM UTC-5, robf...@gmail.com wrote:
>> ANY1 supports a 21 bit branch displacement for branching +/-1MB. It may also store the return address in the target register allowing conditional branch to subroutine.
> <
> How often do you find conditional branching to a subroutine to be effective ?

Very frequent in Mill code, due to predication of calls when EBBs are
folded together to let both then and else run interleaved. Probably rare
otherwise. Consequently asking "how often do you see?" is not very
informative unless also asking whether the compiler takes advantage of
the conditional.

> {I remember using this a lot in 8085 assembly coding, but when writing in C I seldom find
> it profitable because all the arguments had to be in the proper registers before the condition
> to use the conditionality of the call.
>
> Also, do you have a conditional return, or does the epilogue of the subroutine "get in the way" ??

Down with epilogues!

> <
>> Operation for BEQ:
>>     Rt = IP + 8
>>     If (Ra = Rb)
>>         If (Rc = 63)
>>             IP = IP + Displacement
>>         Else
>>             IP = Rc + Displacement
>>
>> There really needs to be only a small number of return address registers. So in a couple of designs I have trimmed the target register down to two bits allowing three return address registers, and then extended the constant field of the JAL instruction by three extra bits. It makes quite a difference being able to branch +-16MB for instance instead of 2MB. 2MB is not enough for some code.
> <
> I got ±128MB from my ISA.
>

Re: The old RISC-vs-CISC

<jwvy2cmqk02.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=16601&group=comp.arch#16601

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC
Date: Mon, 10 May 2021 18:54:44 -0400
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <jwvy2cmqk02.fsf-monnier+comp.arch@gnu.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me>
<jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7bt9m$iim$1@dont-email.me>
<62b81886-bf04-4a4c-a89c-aacedba9a1f5n@googlegroups.com>
<f475d3b6-dc42-4063-9d3f-4f38e3de4e37n@googlegroups.com>
<e1d470b1-b2d7-43df-af69-e6f6c9aff210n@googlegroups.com>
<s7ccdn$28b$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="6fcdc590312139b22c872e589820db70";
logging-data="9952"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1//CAu5iyNwAZlyxqarklTp"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:sdoU0M8w+8ux4tZe/VqCt0taVJw=
sha1:Gtm+zVVUOs/M7PZJ4MLw/Hp9dYw=
 by: Stefan Monnier - Mon, 10 May 2021 22:54 UTC

Ivan Godard [2021-05-10 15:34:32] wrote:
> On 5/10/2021 2:11 PM, MitchAlsup wrote:
>> On Monday, May 10, 2021 at 3:53:45 PM UTC-5, robf...@gmail.com wrote:
>>> ANY1 supports a 21 bit branch displacement for branching +/-1MB. It may
>>> also store the return address in the target register allowing conditional
>>> branch to subroutine.
>> <
>> How often do you find conditional branching to a subroutine to be effective ?
>
> Very frequent in Mill code, due to predication of calls when EBBs are folded
> together to let both then and else run interleaved. Probably rare
> otherwise. Consequently asking "how often do you see?" is not very
> informative unless also asking whether the compiler takes advantage of
> the conditional.

I guess it's also because Mill's "branch to subroutine" is really a full
function call (including argument passing), which solves:

>> {I remember using this a lot in 8085 assembly coding, but when
>> writing in C I seldom find it profitable because all the arguments
>> had to be in the proper registers before the condition to use the
>> conditionality of the call.

Stefan

Re: FP8 (was Compact representation for common integer constants)

<ygnzgx2fb2b.fsf@y.z>

https://www.novabbs.com/devel/article-flat.php?id=16602&group=comp.arch#16602

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!fdcspool6.netnews.com!news-out.netnews.com!news.alt.net!fdc2.netnews.com!peer04.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx16.iad.POSTED!not-for-mail
From: x...@y.z (Josh Vanderhoof)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me>
<s75pe2$shi$1@dont-email.me> <s76as3$i7j$1@gal.iecc.com>
<jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org>
<s787j3$99m$1@dont-email.me>
<jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org>
<jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org>
<48015cc8-9326-427a-9fdd-36ed1e12939an@googlegroups.com>
<s7ai2u$ul4$1@gioia.aioe.org>
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.2 (gnu/linux)
Reply-To: Josh Vanderhoof <jlv@mxsimulator.com>
Message-ID: <ygnzgx2fb2b.fsf@y.z>
Cancel-Lock: sha1:WxHLwaObmLElUj/jO0SvKKVbuzA=
MIME-Version: 1.0
Content-Type: text/plain
Lines: 27
X-Complaints-To: https://www.astraweb.com/aup
NNTP-Posting-Date: Mon, 10 May 2021 23:01:17 UTC
Date: Mon, 10 May 2021 19:01:16 -0400
X-Received-Bytes: 2376
 by: Josh Vanderhoof - Mon, 10 May 2021 23:01 UTC

Terje Mathisen <terje.mathisen@tmsw.no> writes:

> MitchAlsup wrote:
>> On Sunday, May 9, 2021 at 1:45:06 PM UTC-5, Stefan Monnier wrote:
>>> Stefan Monnier [2021-05-09 14:35:28] wrote:
>>>>> Yes that makes sense. Specifically for FP8, the increased dynamic range
>>>>> is by far the most important trait. I've even seen examples of using
>>>>> 1:5:2 in DNN:s (deep neural networks) where dynamic range is often more
>>>>> important than precision.
>>>> Indeed 1:5:2 is arguably more useful than 1:4:3.
>>> An alternative of course is to use a log-base representation. So your
>>> 8bit data is split 1:7 between a sign and an exponent, with no mantissa
>>> at all and then you choose your dynamic range by picking the base.
>>> Multiplication is easy but addition takes more effort.
>> <
>> Really ?!?!?
>> Both are implemented with table look ups with concatenated values as the
>> address.
>
> Agreed, in a HW implementation you don't even need a full 64K (8+8
> bits), since the sign logic can go on in parallel with the table
> access.
>
> I.e. 14-bit tables for each operation is sufficient, so that is about
> 14 KB x 4 = 56 KB of rom space?

Don't you only need half the table for the commutative ops?
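
A minimal Python sketch of the half-table idea, for illustration. It assumes 7-bit operand magnitudes (sign handled separately, as described above) and uses a stand-in commutative operation, since the actual log-domain add is not specified here; `tri_index` and `build_table` are hypothetical names.

```python
N = 128  # 7-bit operand magnitudes

def tri_index(a: int, b: int) -> int:
    """Index into a triangular table that stores only entries with a <= b."""
    if a > b:
        a, b = b, a              # commutativity: op(a, b) == op(b, a)
    return b * (b + 1) // 2 + a

def build_table(op):
    """Tabulate op over the lower triangle only."""
    table = [0] * (N * (N + 1) // 2)
    for b in range(N):
        for a in range(b + 1):
            table[tri_index(a, b)] = op(a, b)
    return table

def stand_in(a, b):
    return (a + b) & 0x7F        # placeholder commutative op, not a real FP8 add

table = build_table(stand_in)
print(len(table), "entries instead of", N * N)
```

The table shrinks to N(N+1)/2 entries, slightly more than half, because the diagonal is kept. In hardware the operand swap needs a comparator and the triangular index is no longer a plain concatenation of the inputs, so the saving in ROM may be partly offset by the extra addressing logic.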

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s7cigi$gv8$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16603&group=comp.arch#16603

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Mon, 10 May 2021 19:17:16 -0500
Organization: A noiseless patient Spider
Lines: 151
Message-ID: <s7cigi$gv8$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aeoj$tdu$1@dont-email.me> <s7ak85$5dr$1@newsreader4.netcologne.de>
<s7ao1i$l2p$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 11 May 2021 00:18:26 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="588833796df27a08939633a5f2cccbe9";
logging-data="17384"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18A5Bv5Dl5wfQsbgF33PBZC"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:7ylt77TWmNsY8N8TRULuug7C7Og=
In-Reply-To: <s7ao1i$l2p$1@dont-email.me>
Content-Language: en-US
 by: BGB - Tue, 11 May 2021 00:17 UTC

On 5/10/2021 2:40 AM, BGB wrote:
> On 5/10/2021 1:35 AM, Thomas Koenig wrote:
>> BGB <cr88192@gmail.com> schrieb:
>>
>>> Similarly, it won't affect any local branches unless by some absurd
>>> chance a *single function* were to exceed 1MB in the ".text" section.
>>
>>> Excluding the possibility of excessively large procedurally generated
>>> "switch()" blocks or similar (eg: 10k+ case labels), this is unlikely.
>>
>> One way I have encountered this is automatically generated formulas
>> from computer algebra systems like Maple.
>>
>> Look at https://gcc.gnu.org/bugzilla/attachment.cgi?id=41459 for
>> an example of such code.  It isn't easy on compilers (but it does
>> not have branches).
>>
>
> OK.
>
>
>>> But, these cases don't break the ISA, merely the limits of the existing
>>> branch encodings. I could, in-theory add wider encodings, but then would
>>> need to deal with them in the pipeline (and making this part any wider
>>> would be a big source of timing hassles).
>>
>> Code should be correct as first consideration, fast as a (distant)
>> second.
>>
>> I assume you can jump to a register in your ISA, for function
>> pointers.
>>
>
> Yeah, these are possible:
>  JMP / BRA Rn  // Jump to register
>  JSR / BSR Rn  // Call to register
>  JT / BT Rn    // Branch True
>  JF / BF Rn    // Branch False
>
> Naming gets a little funky, and this is a place where my ISA listing and
> assembler ended up in disagreement.
>
> So, the assembler uses different mnemonics than the ISA listing (namely
> JMP/JSR/JT/JF) mostly to avoid ambiguity over the (PC, Rn) vs (Rn) cases.
>
>
>> If that is the case, you can reverse the test and optionally branch
>> over an instruction sequence which loads the target address into
>> a register via loading the PC and adding the offset to it (as
>> determined by the assembler) and then jumping to that register.
>>
>> [...]
>>
>
> Well, or use a conditional branch to the register...
>
> The issue is not "what can or can't be done" but rather whether or not
> it can be done using a single instruction.
>
>
>>> Though, it would technically be a bit simpler/cheaper to add a special
>>> case to allow for a Jumbo-encoded "BRA (Abs48)" or similar in this case,
>>> which wouldn't require any new logic on the EX side of things (can jump
>>> anywhere... Just requires using base relocs...).
>>
>> That sounds even better.  You have the long instructions, why not use
>> them?
>>
>
> Yes, possible...
>
> Did just go and add a "BRA Disp33s" encoding as a test...
> But, sadly, directly using the AGU output for the branch (to sidestep
> some address hackery) kinda blows out the timing constraints...
>
> Like, it would be nicer to be like:
> Well, we can have "48b+(Disp33s*2)" from the AGU, why not just use this
> address as the branch destination?... But, alas, timing isn't super
> favorable towards this idea...
>

I hacked on it some, and "made it work", mostly by having a new
carry-select adder which implements "(Base48+(Disp33s<<1))".
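
As a behavioural illustration (not the actual Verilog), a carry-select adder for this calculation can be modelled in Python roughly as follows. The 12-bit block size and the function names are assumptions made for the sketch; only the "(Base48+(Disp33s<<1))" expression comes from the design.

```python
MASK48 = (1 << 48) - 1

def carry_select_add(x: int, y: int, width: int = 48, block: int = 12) -> int:
    """Add x + y modulo 2**width using carry-select blocks of `block` bits."""
    mask = (1 << block) - 1
    result, carry = 0, 0
    for shift in range(0, width, block):
        xa = (x >> shift) & mask
        ya = (y >> shift) & mask
        sum0 = xa + ya            # speculative sum assuming carry-in = 0
        sum1 = xa + ya + 1        # speculative sum assuming carry-in = 1
        chosen = sum1 if carry else sum0   # late select by the real carry
        result |= (chosen & mask) << shift
        carry = chosen >> block
    return result & ((1 << width) - 1)

def branch_target(base48: int, disp33s: int) -> int:
    """(Base48 + (Disp33s << 1)) wrapped to 48 bits; disp33s is a signed int."""
    disp = (disp33s << 1) & MASK48     # two's-complement sign extension to 48 bits
    return carry_select_add(base48 & MASK48, disp)

print(hex(branch_target(0x1000, -4)))
```

In hardware both speculative sums for every block are computed in parallel and only the select ripples between blocks, which is where the timing benefit over a plain ripple adder comes from.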

I have also added logic to the branch predictor so that it uses a carry
bit to detect out-of-range cases (and will disable itself from
predicting the branch if it will go out-of-range).
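
One way such a carry check can work, sketched in Python. This is an assumption about the general technique rather than a description of the actual predictor: the 24-bit adder width and the exact rule are hypothetical. The idea is that a narrow adder only produces the low bits of the target, and the carry out of it tells whether the upper bits of PC would have changed.

```python
def predictor_target(pc: int, disp: int, low_bits: int = 24):
    """Return (target, ok); ok is False when the narrow adder cannot be trusted."""
    mask = (1 << low_bits) - 1
    half = 1 << (low_bits - 1)
    if not -half <= disp < half:
        return None, False            # displacement itself too wide for the adder
    raw = (pc & mask) + (disp & mask) # narrow add on two's-complement low bits
    carry = raw >> low_bits
    negative = 1 if disp < 0 else 0
    # Forward branch: a carry means the upper PC bits would increment.
    # Backward branch: a missing carry means the upper PC bits would decrement.
    ok = carry == negative
    if not ok:
        return None, False            # predictor declines; EX stage resolves it
    return (pc & ~mask) | (raw & mask), True

print(predictor_target(0x12000010, 0x20))
print(predictor_target(0x12FFFFF0, 0x20))
```

Whenever `ok` is true the returned target equals `pc + disp`; otherwise the predictor simply stays out of the way and the wider address calculation later in the pipeline supplies the real target.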

The EX1 stage then uses a "slightly larger" address calculation for the
branch target.

This effectively eliminates most of the
modular-addressed-branch issues.

Now, theoretically, branches can be +/- 8GB.

While I was at it, I effectively widened the AGU by a few bits, so that
it can also now handle 33-bit displacements (as opposed to falling on
its face if it goes outside of +/- 4GB).

The tweaked AGU logic should now be able to handle +/- 32GB for QWORD
operations.

Still seems to pass timing at 50MHz, so probably OK.

>
> An Abs48 encoding or similar wouldn't necessarily be subject to this
> issue, since from the EX stages' POV, this case is functionally
> equivalent to the "branch to register" case.
>
> One possibility:
>  FAjj-jjjj: LDIZ Imm24u, R0  //existing op
>  FBjj-jjjj: LDIN Imm24n, R0  //existing op
>  FEdd-dddd-FAdd-dddd: LDIZ Imm48u, R0
>  FEdd-dddd-FBdd-dddd: LDIN Imm48n, R0
>  FFdd-dddd-FAdd-dddd: BRA Abs48
>  FFdd-dddd-FBdd-dddd: BSR Abs48
>
> Which while not perfect (due to the implications of this encoding; and
> inability to be predicated), is at least "not completely awful".

These operations now exist...

* FEjj_jjjj_F202_C4jj BRA (PC, disp32u)
* FEjj_jjjj_F202_C5jj BRA (PC, disp32n)
* FFjj_jjjj_FAjj_jjjj BRA #abs48

* FEjj_jjjj_F202_CCjj BSR (PC, disp32u)
* FEjj_jjjj_F202_CDjj BSR (PC, disp32n)
* FFjj_jjjj_FBjj_jjjj BSR #abs48
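
To make the Abs48 pattern concrete, here is a hedged Python sketch of packing a 48-bit address into the two 24-bit "jj" payloads of the `FFjj_jjjj_FAjj_jjjj` form. Which word holds the high bits, and the treatment of each half as a single 32-bit value, are assumptions made for illustration; the real bit layout may differ. The function names are hypothetical.

```python
def encode_bra_abs48(addr: int):
    """Split a 48-bit address into (FF|hi24, FA|lo24); high-half-first is assumed."""
    assert 0 <= addr < (1 << 48)
    hi, lo = addr >> 24, addr & 0xFFFFFF
    return (0xFF << 24) | hi, (0xFA << 24) | lo

def decode_bra_abs48(w0: int, w1: int) -> int:
    """Inverse of encode_bra_abs48 under the same assumed layout."""
    assert w0 >> 24 == 0xFF and w1 >> 24 == 0xFA
    return ((w0 & 0xFFFFFF) << 24) | (w1 & 0xFFFFFF)

w0, w1 = encode_bra_abs48(0x0000_1234_5678)
print(f"{w0:08X}_{w1:08X}")
```

The BSR form would differ only in the second prefix byte (FB instead of FA) under the same assumptions.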

The Disp33s encodings are "kinda a hack", I had an instruction in Imm10
space which only used 8 bits of the operand, so, YOLO...

As noted, the much larger relative branch encodings would likely be:
* FEjj_jjjj_F0jj_Cjjj BRA (PC, disp44s)
* FEjj_jjjj_F0jj_Djjj BSR (PC, disp44s)

While these are less hacky, there still currently isn't a good way to
utilize such a large displacement.

In any case, while it isn't exactly inconceivable that ".text" can
exceed 1MB, I don't really expect it will exceed 8GB "anytime soon"...

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<004e6091-5505-4963-96ca-81e9a06a707dn@googlegroups.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=16604&group=comp.arch#16604

Newsgroups: comp.arch
Date: Mon, 10 May 2021 17:26:26 -0700 (PDT)
In-Reply-To: <s7cigi$gv8$1@dont-email.me>
Message-ID: <004e6091-5505-4963-96ca-81e9a06a707dn@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: MitchAl...@aol.com (MitchAlsup)
 by: MitchAlsup - Tue, 11 May 2021 00:26 UTC

On Monday, May 10, 2021 at 7:18:28 PM UTC-5, BGB wrote:
> On 5/10/2021 2:40 AM, BGB wrote:
> > On 5/10/2021 1:35 AM, Thomas Koenig wrote:
> > Yes, possible...
> >
> > Did just go and add a "BRA Disp33s" encoding as a test...
> > But, sadly, directly using the AGU output for the branch (to sidestep
> > some address hackery) kinda blows out the timing constraints...
> >
> > Like, it would be nicer to be like:
> > Well, we can have "48b+(Disp33s*2)" from the AGU, why not just use this
> > address as the branch destination?... But, alas, timing isn't super
> > favorable towards this idea...
> >
> I hacked on it some, and "made it work", mostly by having a new
> carry-select adder which implements "(Base48+(Disp33s<<1))".
>
> I have also added logic to the branch predictor so that it uses a carry
> bit to detect out-of-range cases (and will disable itself from
> predicting the branch if it will go out-of-range).
<
In most cases, the prediction shows up at the same moment as the instruction(s),
so if you have to perform the branch target address calculation and look at its
HoB (high-order bit), you are already too late to use the prediction in the
current fetch cycle.

Adders (8-64-bits) generally have the following properties:: smallest delay
through adder = 5 gates, greatest delay = 11 gates (which can be circuit
designed down to 8 gates, but not in a synthesized form).
>
> The EX1 stage then uses a "slightly larger" address calculation for the
> branch target.
>
> This effectively eliminates most of the
> modular-addressed-branch issues.
>
> Now, theoretically, branches can be +/- 8GB.
<
<alt>0177</alt> = ±
>
>
> While I was at it, I effectively widened the AGU by a few bits, so that
> it can also now handle 33-bit displacements (as opposed to falling on
> its face if it goes outside of +/- 4GB).
>
> The tweaked AGU logic should now be able to handle +/- 32GB for QWORD
> operations.
>
>
> Still seems to pass timing at 50MHz, so probably OK.
> >
> > An Abs48 encoding or similar wouldn't necessarily be subject to this
> > issue, since from the EX stages' POV, this case is functionally
> > equivalent to the "branch to register" case.
> >
> > One possibility:
> > FAjj-jjjj: LDIZ Imm24u, R0 //existing op
> > FBjj-jjjj: LDIN Imm24n, R0 //existing op
> > FEdd-dddd-FAdd-dddd: LDIZ Imm48u, R0
> > FEdd-dddd-FBdd-dddd: LDIN Imm48n, R0
> > FFdd-dddd-FAdd-dddd: BRA Abs48
> > FFdd-dddd-FBdd-dddd: BSR Abs48
> >
> > Which while not perfect (due to the implications of this encoding; and
> > inability to be predicated), is at least "not completely awful".
> These operations now exist...
>
> * FEjj_jjjj_F202_C4jj BRA (PC, disp32u)
> * FEjj_jjjj_F202_C5jj BRA (PC, disp32n)
> * FFjj_jjjj_FAjj_jjjj BRA #abs48
>
> * FEjj_jjjj_F202_CCjj BSR (PC, disp32u)
> * FEjj_jjjj_F202_CDjj BSR (PC, disp32n)
> * FFjj_jjjj_FBjj_jjjj BSR #abs48
>
>
> The Disp33s encodings are "kinda a hack", I had an instruction in Imm10
> space which only used 8 bits of the operand, so, YOLO...
>
> As noted, the much larger relative branch encodings would likely be:
> * FEjj_jjjj_F0jj_Cjjj BRA (PC, disp44s)
> * FEjj_jjjj_F0jj_Djjj BSR (PC, disp44s)
>
>
> While these are less hacky, there still currently isn't a good way to
> utilize such a large displacement.
>
>
>
> In any case, while it isn't exactly inconceivable that ".text" can
> exceed 1MB, I don't really expect it will exceed 8GB "anytime soon"...

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<472dd71a-0149-41b7-96db-da9012476b8an@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=16605&group=comp.arch#16605

Newsgroups: comp.arch
Date: Mon, 10 May 2021 19:00:35 -0700 (PDT)
In-Reply-To: <004e6091-5505-4963-96ca-81e9a06a707dn@googlegroups.com>
Message-ID: <472dd71a-0149-41b7-96db-da9012476b8an@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: robfi...@gmail.com (robf...@gmail.com)
 by: robf...@gmail.com - Tue, 11 May 2021 02:00 UTC

> How often do you find conditional branching to a subroutine to be effective ?

Not expecting the conditional branch to subroutine to be very effective. There just happened to be an empty target register field left over in the instruction, so I thought why not make it do some work since the cost is low? But an unconditional branch to subroutine is also available by making the branch condition always true. There is already an unconditional jump and link instruction which can do relative calls, but it forces the address alignment to 16B, whereas the branch to subroutine only needs an 8B alignment.

>Also, do you have a conditional return, or does the epilogue of the subroutine "get in the way" ??

The branch instruction may also be used to perform a conditional return by using the return address register in the target calculation.

> I got ±128MB from my ISA.

For ANY1 the unconditional JAL has a whopping 44-bit target constant field. That’s ±8TB addressing. Not sure that such a range is very valuable, but it falls out of using a 64-bit instruction format.
Now that I think about it, it might be more valuable to trim back some of the constant bits, and use them to add onto the return address for the call. That would allow encoding inline constant parameters to functions and allow variable parameter lists. The compiler could calculate where the return must be and adjust the constant accordingly.
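
To make the trade-off concrete, here is a small C sketch of the arithmetic. It assumes byte-granular targets, an 8-byte (64-bit) call instruction, and a purely hypothetical split of the 44-bit field. The function names and the 36-bit example are illustrative only and are not part of any actual ANY1 encoding.

```c
#include <stdint.h>

/* Reach in bytes (one direction) of a signed, byte-granular target
   field that is `target_bits` wide. */
static uint64_t jal_reach_bytes(unsigned target_bits)
{
    return (uint64_t)1 << (target_bits - 1);
}

/* Return address when the call is followed by `inline_bytes` of inline
   constant parameters that the callee must skip over on return.
   `insn_bytes` is the size of the call instruction itself. */
static uint64_t return_address(uint64_t call_pc, unsigned insn_bytes,
                               unsigned inline_bytes)
{
    return call_pc + insn_bytes + inline_bytes;
}
```

Under these assumptions the full 44-bit field reaches ±8TB, and giving up 8 bits for a return-address offset would still leave ±32GB. A call at 0x1000 followed by 24 bytes of inline parameters would return to 0x1020.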

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s7ctcp$ip0$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=16606&group=comp.arch#16606

From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Mon, 10 May 2021 22:22:59 -0500
Organization: A noiseless patient Spider
Lines: 143
Message-ID: <s7ctcp$ip0$1@dont-email.me>
In-Reply-To: <004e6091-5505-4963-96ca-81e9a06a707dn@googlegroups.com>
 by: BGB - Tue, 11 May 2021 03:22 UTC

On 5/10/2021 7:26 PM, MitchAlsup wrote:
> On Monday, May 10, 2021 at 7:18:28 PM UTC-5, BGB wrote:
>> On 5/10/2021 2:40 AM, BGB wrote:
>>> On 5/10/2021 1:35 AM, Thomas Koenig wrote:
>>> Yes, possible...
>>>
>>> Did just go and add a "BRA Disp33s" encoding as a test...
>>> But, sadly, directly using the AGU output for the branch (to sidestep
>>> some address hackery) kinda blows out the timing constraints...
>>>
>>> Like, it would be nicer to be like:
>>> Well, we can have "48b+(Disp33s*2)" from the AGU, why not just use this
>>> address as the branch destination?... But, alas, timing isn't super
>>> favorable towards this idea...
>>>
>> I hacked on it some, and "made it work", mostly by having a new
>> carry-select adder which implements "(Base48+(Disp33s<<1))".
>>
>> I have also added logic to the branch predictor so that it uses a carry
>> bit to detect out-of-range cases (and will disable itself from
>> predicting the branch if it will go out-of-range).
> <
> In most cases, the prediction shows up at the same moment as the instruction(s),
> so if you have to perform the branch target address calculation and look at its
> HoB (high-order bit), you are already too late to use the prediction in the
> current fetch cycle.
>
> Adders (8-64-bits) generally have the following properties:: smallest delay
> through adder = 5 gates, greatest delay = 11 gates (which can be circuit
> designed down to 8 gates, but not in a synthesized form).

It is calculated during ID1, in parallel with the main instruction decoder.
It then calculates the fetch address for the next cycle.

This means a correctly predicted branch still has a latency of 2 cycles,
since the new fetch location doesn't start to take effect until the
branch has reached the ID2 stage ( PF IF ID1 ID2 EX1 EX2 EX3 WB ).

Either way, timing tends to be pretty tight here, which is part of
why Mod-4GB branch-addressing was a thing.

But, looking at the carry bit does have a useful property:
* It means I can now actually make the adder narrower without breaking
stuff.

So, instead of being 32b+24b with the address wrapping at the 4GB mark,
I could do a 24-bit adder, then be like "OK, the high-order bit carried,
ignore this branch!"
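
A minimal C model of that idea, as a sketch only: it assumes a 24-bit predictor adder, a signed displacement counted in 16-bit units, and a target computed relative to the PC value passed in. The names (bp_target, bp_result, BP_ADDER_BITS) are invented here, and the real logic may differ in all of these details.

```c
#include <stdbool.h>
#include <stdint.h>

#define BP_ADDER_BITS 24u  /* assumed width of the narrow predictor adder */

typedef struct {
    bool     predict;  /* false: leave this branch to be handled in EX1 */
    uint64_t target;   /* only meaningful when predict is true */
} bp_result;

/* `disp` is a signed displacement in 16-bit units (an assumption). */
static bp_result bp_target(uint64_t pc, int32_t disp)
{
    const int64_t low_mask = ((int64_t)1 << BP_ADDER_BITS) - 1;
    int64_t low = (int64_t)(pc & (uint64_t)low_mask);
    int64_t sum = low + (int64_t)disp * 2;
    bp_result r = { false, 0 };

    /* A carry or borrow out of the narrow adder means the real target
       has different high-order PC bits, so do not predict it. */
    if (sum < 0 || sum > low_mask)
        return r;

    r.predict = true;
    r.target = (pc & ~(uint64_t)low_mask) | (uint64_t)sum;
    return r;
}
```

The point of the sketch is that the high-order PC bits are only copied, never added, so the adder stays narrow and any branch that would need them changed simply falls through to the slower path.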

Ironically, it is possible it could be made to handle the Abs48 branches
if needed, since no adder would be needed in this case.

Non-predicted branches don't occur until EX1.

Can't go that much narrower though with the use of 20-bit branch
displacements. Likewise, the Disp33 branches are ignored by the branch
predictor (and are handled in EX1 along with register-indirect branches).

Branches generated from EX1 don't actually start taking effect until EX2.

EX1: Submit branch target address to branch-initiator mechanism;
EX2: Set pipeline flush bits, submit new address to L1 I$ (PF).
(This part feeds back through the branch-predictor).
EX3: ... Branch is now underway ...

If I did a "long branch" via the ALU:
EX1: ALU does its thing, 1;
EX2: ALU does its thing, 2;
EX3: Submit ALU result to branch-initiator.

Though, long-branch would likely need to also invoke the initiator in
EX1 in order to flush the pipeline (and again in EX3).

>>
>> The EX1 stage then uses a "slightly larger" address calculation for the
>> branch target.
>>
>> This effectively eliminates most of the
>> modular-addressed-branch issues.
>>
>> Now, theoretically, branches can be +/- 8GB.
> <
> <alt>0177</alt> = ±

Tries it, ... ±
Hmm...

>>
>>
>> While I was at it, I effectively widened the AGU by a few bits, so that
>> it can also now handle 33-bit displacements (as opposed to falling on
>> its face if it goes outside of +/- 4GB).
>>
>> The tweaked AGU logic should now be able to handle +/- 32GB for QWORD
>> operations.
>>
>>
>> Still seems to pass timing at 50MHz, so probably OK.
>>>
>>> An Abs48 encoding or similar wouldn't necessarily be subject to this
>>> issue, since from the EX stages' POV, this case is functionally
>>> equivalent to the "branch to register" case.
>>>
>>> One possibility:
>>> FAjj-jjjj: LDIZ Imm24u, R0 //existing op
>>> FBjj-jjjj: LDIN Imm24n, R0 //existing op
>>> FEdd-dddd-FAdd-dddd: LDIZ Imm48u, R0
>>> FEdd-dddd-FBdd-dddd: LDIN Imm48n, R0
>>> FFdd-dddd-FAdd-dddd: BRA Abs48
>>> FFdd-dddd-FBdd-dddd: BSR Abs48
>>>
>>> Which while not perfect (due to the implications of this encoding; and
>>> inability to be predicated), is at least "not completely awful".
>> These operations now exist...
>>
>> * FEjj_jjjj_F202_C4jj BRA (PC, disp32u)
>> * FEjj_jjjj_F202_C5jj BRA (PC, disp32n)
>> * FFjj_jjjj_FAjj_jjjj BRA #abs48
>>
>> * FEjj_jjjj_F202_CCjj BSR (PC, disp32u)
>> * FEjj_jjjj_F202_CDjj BSR (PC, disp32n)
>> * FFjj_jjjj_FBjj_jjjj BSR #abs48
>>
>>
>> The Disp33s encodings are "kinda a hack", I had an instruction in Imm10
>> space which only used 8 bits of the operand, so, YOLO...
>>
>> As noted, the much larger relative branch encodings would likely be:
>> * FEjj_jjjj_F0jj_Cjjj BRA (PC, disp44s)
>> * FEjj_jjjj_F0jj_Djjj BSR (PC, disp44s)
>>
>>
>> While these are less hacky, there still currently isn't a good way to
>> utilize such a large displacement.
>>
>>
>>
>> In any case, while it isn't exactly inconceivable that ".text" can
>> exceed 1MB, I don't really expect it will exceed 8GB "anytime soon"...

Re: The old RISC-vs-CISC

<s7decb$1c7j$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=16607&group=comp.arch#16607

From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC
Date: Tue, 11 May 2021 10:14:03 +0200
Organization: Aioe.org NNTP Server
Lines: 32
Message-ID: <s7decb$1c7j$1@gioia.aioe.org>
 by: Terje Mathisen - Tue, 11 May 2021 08:14 UTC

Ivan Godard wrote:
> On 5/10/2021 2:11 PM, MitchAlsup wrote:
>> On Monday, May 10, 2021 at 3:53:45 PM UTC-5, robf...@gmail.com wrote:
>>> ANY1 supports a 21 bit branch displacement for branching +/-1MB. It
>>> may also store the return address in the target register allowing
>>> conditional branch to subroutine.
>> <
>> How often do you find conditional branching to a subroutine to be
>> effective ?
>
> Very frequent in Mill code, due to predication of calls when EBBs are
> folded together to let both then and else run interleaved. Probably rare
>  otherwise. Consequently asking "how often do you see?" is not very
> informative unless also asking whether the compiler takes advantage of
> the conditional.

A conditional function call would be very useful whenever you write any
code that processes a stream of data, using a local buffer which you have
to refill whenever you reach the low-water mark.

I.e.

if (amount < low_limit) fill_buffer();

This happens a lot in most stream/file library code, as well as when
processing bitstreams for a codec.
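
For illustration, a self-contained C sketch of that pattern follows. The struct layout, the buffer size, and the low-water mark are all made up for the example. Only the "if (amount < low_limit) fill_buffer()" shape comes from the text above, and it is exactly the kind of rarely-taken call that a conditional call instruction could cover.

```c
#include <stddef.h>
#include <string.h>

enum { BUF_SIZE = 16, LOW_LIMIT = 4 };  /* illustrative sizes only */

typedef struct {
    const unsigned char *src;    /* backing data standing in for a file */
    size_t src_len, src_pos;
    unsigned char buf[BUF_SIZE]; /* local buffer */
    size_t head, amount;         /* read position and bytes available */
} stream;

/* Slide unread bytes to the front and top the buffer up from the source. */
static void fill_buffer(stream *s)
{
    memmove(s->buf, s->buf + s->head, s->amount);
    s->head = 0;
    size_t room = BUF_SIZE - s->amount;
    size_t left = s->src_len - s->src_pos;
    size_t n = room < left ? room : left;
    memcpy(s->buf + s->amount, s->src + s->src_pos, n);
    s->src_pos += n;
    s->amount += n;
}

/* Returns the next byte, or -1 once the stream is exhausted. */
static int next_byte(stream *s)
{
    if (s->amount < LOW_LIMIT) fill_buffer(s);  /* the conditional call */
    if (s->amount == 0) return -1;
    s->amount--;
    return s->buf[s->head++];
}
```

In the common case the refill is skipped, so the hot path is one compare plus a not-taken call.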

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
