
Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s78dnf$hpi$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16554&group=comp.arch#16554
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Sun, 9 May 2021 12:32:15 +0200
Organization: A noiseless patient Spider
Lines: 169
Message-ID: <s78dnf$hpi$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 9 May 2021 10:32:15 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a8137cbbf03d8e7974c63a761c828a0b";
logging-data="18226"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+n2bMQ1X4N8vmMrTACUYMbQeV87tLsBIk="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.8.1
Cancel-Lock: sha1:mCduoaZ0vXBIIWfQAkGPcAtXsYc=
In-Reply-To: <s74rp7$laf$1@dont-email.me>
Content-Language: en-US
 by: Marcus - Sun, 9 May 2021 10:32 UTC

On 2021-05-08, BGB wrote:
> On 5/7/2021 3:40 PM, MitchAlsup wrote:
>> On Friday, May 7, 2021 at 3:20:56 PM UTC-5, BGB wrote:
>>> On 5/7/2021 8:15 AM, Stefan Monnier wrote:
>>>>> It is possible that it could be generalized, but then the ISA would
>>>>> be less RISC style than it is already...
>>>>>
>>>>> Then again, I did see recently that someone had listed my ISA
>>>>> somewhere, but then classified it as a CISC; I don't entirely agree,
>>>>> but alas...
>>>>>
>>>>> Then again, many people's definitions of "RISC" exclude largish
>>>>> instruction-sets with variable-length instruction encodings, so alas.
>>>>>
>>>>> But, taken at face value, then one would also need to exclude
>>>>> Thumb2 and similar from the RISC category.
>>>>
>>>> I think it's better not to worry about how other people label your ISA.
>>>>
>>>> The manichean RISC-vs-CISC labeling is stupid anyway: the design space
>>>> has many more than 2 spots.
>>>>
>>>> Also I think the "classic RISC" is the result of specific conditions at
>>>> that point in time which resulted in a particular sweet spot in the
>>>> design space.
>>>>
>>>> But design constraints are quite different now, so applying the same
>>>> "quantitative approach" that led to MIPS/SPARC/... will now result in
>>>> something quite different. One could argue that such new ISAs
>>>> could be called RISC-2020 ;-)
>>>>
>>> Yeah.
>>>
>>> Classic RISC:
>>> Fixed size instructions
>>> Typically only a single addressing mode (Reg, Disp)
>>> Typically only supporting aligned memory access
>>> Aims for fixed 1 instruction per cycle
>> <
>> Mc 88K had
>> fixed sized instructions
>> [Rbase+IMM16] and [Rbase+Rindex<<scale] address modes
>> Aligned memory has been shown to be defective
>> aimed at 1ipc but we did engineer a 6-wide OoO version
>>
>> My 66000
>> has fixed sized instruction specifiers with 1 or 2 constants.
>> [Rbase+IMM16] and [Rbase+Rindex<<scale] and
>> [Rbase+Rindex<<scale+disp[32,64]] address modes
>> Misaligned memory model
>> Aimed at low burden for LBIO implementations and low burden for GBOoO
>> implementations
>> Inherently parallel
>> Never needs a NoOp
>
>
> BJX2 allows misaligned access, and also does not need NOPs.
>
> Misaligned access may degrade performance in some cases though, doesn't
> work with certain instructions, ...
>
> If one triggers an interlock on a memory load or similar, there will be
> a partial pipeline stall, and it will behave as-if a NOP were present.
> Triggering an interlock on various other instructions may also trigger
> this behavior.
>
> Interlocks currently are in 2nd place (behind cache misses) for wasting
> clock-cycles (improving the efficiency of memory has, in effect, also
> increased the proportion of clock-cycles wasted on pipeline interlock
> stalls).
>
> I had more recently added logic to my C compiler's "wexifier" to shuffle
> instructions around to try to reduce interlock penalties. At the moment,
> it is disabled by default as the wexifier seems to be buggy in some
> as-of-yet undetermined way (seems to be producing bundles which don't
> work correctly on the FPGA implementation, but I can't seem to isolate
> the behavior well enough to determine the cause).
>
>
> BJX2:
>   Rb + Imm<<Sc
>   Rb + Ri<<Sc
>

MRISC32:
Rb + Imm
Rb + Ri<<Sc

> Rb: R2..R15, R18..R31
> Ri: R0, R2..R14, R18..R31
>
> R0/R1: Used to encode special-cases
> R15 (SP): Only usable as a base register
> R16/R17: Reserved (for the possibility of more special cases)
>
> Sc: Typically hard-wired based on the element type
>
> Base registers encoded via special cases:
>   PC, GBR, TBR
> Index registers encoded via special cases:
>   R0, Sc=0 (Allows access to misaligned struct members)
>
> However, since pretty much all this is handled in the decoder, I don't
> really feel it classifies as distinct address modes (and, as far as the
> AGU is concerned (Rb+Ri<<Sc) is the only mode).
>
> Generally, (Rb) and (Rb,0) are treated as equivalent.
>
>
>
> Given the lack of autoincrement modes or similar, both PC and PC&(~3)
> addressing, ... BJX2 in practice actually has fewer distinct addressing
> modes than SuperH.
>
> The SuperH ISA also had a few edge cases where multiple memory accesses
> would be performed by a single instruction (and other violations of
> Load/Store), but was generally classified as a RISC.
>
>
> One thing BJX2 does have, that many RISCs lack, is a LEA instruction.
> LEA generally works by assuming that a zero-extended store doesn't
> make sense, so those encodings are interpreted as a LEA.
>
> This was mostly because LEA helps in various edge cases, such as for:
>   Composing function pointers;
>   Long-distance branches;
>   Compound addressing within structures;
>     Eg: Accessing an element within an array within a structure.
>   ...
>

MRISC32 also has LEA (called "LDEA" - LoaD Effective Address). It
essentially uses the output from the AGU as the result (bypassing
the load-from-memory stage), and has a few different use cases.

Apart from preloading memory addresses into registers, it can be used
for simple arithmetic of the form A + (B << N) (where N is 0, 1, 2 or 3),
which can be used for implementing x * 3 and x * 5, for instance.
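
As a small illustration of the arithmetic use case, the Python sketch below models the computation A + (B << N). The function name `ldea` is only a stand-in for the instruction, and the 32-bit wraparound mask is an assumption about register width rather than something stated in this post.

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit scalar registers


def ldea(a: int, b: int, n: int) -> int:
    """Model of the AGU result a + (b << n), with the scale n in 0..3."""
    if n not in (0, 1, 2, 3):
        raise ValueError("scale must be 0, 1, 2 or 3")
    return (a + (b << n)) & MASK32


def times3(x: int) -> int:
    """x * 3 computed as x + (x << 1)."""
    return ldea(x, x, 1)


def times5(x: int) -> int:
    """x * 5 computed as x + (x << 2)."""
    return ldea(x, x, 2)
```

With these definitions, `times3(7)` gives 21 and `times5(7)` gives 35; the same pattern with N = 3 would give x * 9.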

Another very useful use case is to load vector registers with a
"stride". In vector mode the AGU will generate addresses of the
form:

(1) addr[k] = Rb + Im * k (Immediate form stride)
(2) addr[k] = Rb + (Ri << Sc) * k (Register form stride)
(3) addr[k] = Rb + (Vi[k] << Sc) (Gather/scatter)

Forms (1) and (2) construct addresses with a constant offset
between each address (I call these "stride based" load/store).

The LDEA instruction can thus be used for loading a series of
numbers into a vector register, e.g. like this:

LDEA V1, Z, #1 ; V1 = {0, 1, 2, 3, 4, 5, ...}

LDI S1, #7
LDEA V1, S1, #3 ; V1 = {7, 10, 13, 16, 19, ...}
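
The three address forms and the two LDEA examples above can be modelled in a few lines of Python. This is a sketch only: the function names and the explicit vector-length parameter `vl` are invented for illustration, and the gather form assumes the shift applies to the index element before the base is added.

```python
from typing import List


def stride_imm(rb: int, im: int, vl: int) -> List[int]:
    """Form (1): addr[k] = Rb + Im * k."""
    return [rb + im * k for k in range(vl)]


def stride_reg(rb: int, ri: int, sc: int, vl: int) -> List[int]:
    """Form (2): addr[k] = Rb + (Ri << Sc) * k."""
    return [rb + (ri << sc) * k for k in range(vl)]


def gather(rb: int, vi: List[int], sc: int) -> List[int]:
    """Form (3): addr[k] = Rb + (Vi[k] << Sc), one address per index element."""
    return [rb + (v << sc) for v in vi]


# LDEA V1, Z, #1 with a hypothetical vector length of 6 elements
iota = stride_imm(0, 1, 6)       # [0, 1, 2, 3, 4, 5]

# LDI S1, #7 followed by LDEA V1, S1, #3, hypothetical length of 5
ramp = stride_imm(7, 3, 5)       # [7, 10, 13, 16, 19]
```

Because the results are computed from the base and stride at run time, nothing in the sketch depends on knowing `vl` in advance.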

This is a common operation in vectorized code, and on classic SIMD
ISAs you usually load a predefined constant (e.g. from memory) in this
case.

Because the MRISC32 ISA lets each implementation define the vector
register size, that size is not known at compile time, so predefined
vector constants are not a good solution.

/Marcus

Re: scaling, was FP8 (was Compact representation for common integer constants)

<s790a3$9hl$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16555&group=comp.arch#16555
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: scaling, was FP8 (was Compact representation for common integer
constants)
Date: Sun, 9 May 2021 08:49:21 -0700
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <s790a3$9hl$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s75pe2$shi$1@dont-email.me> <s76as3$i7j$1@gal.iecc.com>
<s76ehd$65p$1@dont-email.me> <s76o7e$jpn$1@gal.iecc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 9 May 2021 15:49:23 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="35d30453c63c5422c2674501f2b77c19";
logging-data="9781"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19yVcWHS5OqCD+6uGTJ4Qu1fj7OufuMQbE="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.1
Cancel-Lock: sha1:7RRTk6QeKrJ6ERhTsH+9EBX/FcU=
In-Reply-To: <s76o7e$jpn$1@gal.iecc.com>
Content-Language: en-US
 by: Stephen Fuld - Sun, 9 May 2021 15:49 UTC

On 5/8/2021 12:19 PM, John Levine wrote:
> According to Stephen Fuld <sfuld@alumni.cmu.edu.invalid>:
>>> The huge advantage is that it takes care of scaling automatically so
>>> the programmer doesn't have to worry about it in every operation.
>
>> While all of that is true, there were other alternatives. COBOL
>> supported, and still does, automatic scaling of fixed point numbers. I
>> don't know if other languages support this.
>
> COBOL gives you fixed scaling, e.g. PIC 9999V999 has four digits before
> the implicit decimal point and three after. When you do arithmetic
> it'll align the decimal point, but you don't get automatic scaling
> unless you tell the compiler that the data is COMP-1 or COMP-2 so
> it uses the internal floating point representation.

Again, true, of course. I guess I thought the "automatic scaling" is
included in the "dynamic" part of "dynamic range" that Marcus mentioned
earlier. But of course YMMV.
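
To make the distinction concrete, here is a minimal Python sketch of the fixed scaling John describes for something like PIC 9999V999: each value carries a fixed number of digits after an implied decimal point, and arithmetic aligns the points first. The class and field names are invented for illustration, and this is not a description of how any COBOL compiler is implemented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fixed:
    digits: int  # stored integer, e.g. 1234567 for 1234.567
    scale: int   # digits after the implied decimal point, fixed per field

    def rescale(self, scale: int) -> "Fixed":
        """Widen to a larger scale by appending zero digits."""
        if scale < self.scale:
            raise ValueError("narrowing would lose digits")
        return Fixed(self.digits * 10 ** (scale - self.scale), scale)

    def __add__(self, other: "Fixed") -> "Fixed":
        # Align the implied decimal points, then add as plain integers.
        s = max(self.scale, other.scale)
        return Fixed(self.rescale(s).digits + other.rescale(s).digits, s)

    def __str__(self) -> str:
        sign = "-" if self.digits < 0 else ""
        whole, frac = divmod(abs(self.digits), 10 ** self.scale)
        if self.scale == 0:
            return f"{sign}{whole}"
        return f"{sign}{whole}.{frac:0{self.scale}d}"


a = Fixed(1234567, 3)   # 1234.567, like PIC 9999V999
b = Fixed(125, 1)       # 12.5, like PIC 99V9
total = a + b           # aligned to three decimals: 1247.067
```

The scale is a property of the declaration, not of the value, which is why the range never adapts the way a floating-point exponent does.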

> Scaling crisis of the day:
>
> The NASDAQ stock exchange's computers represent prices as 32 bit
> unsigned integers, with units being 1/100 of a cent or 1/10000 of a
> dollar. Since the largest 32 bit integer is 4294967295 the highest
> price it can represent is $429,496.7295. The price of Berkshire
> Hathaway reached $437,131 this week. Oops. They say they'll have a
> fix later this month.

Wow! Interesting. I wonder what the fix will be?
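
The arithmetic in John's note is easy to check. The Python sketch below uses only the figures quoted above (32-bit unsigned storage, units of 1/10000 of a dollar); the constant and function names are invented for illustration, and it says nothing about how the exchange's systems are actually written.

```python
UINT32_MAX = 2**32 - 1        # 4294967295
UNITS_PER_DOLLAR = 10_000     # one unit is 1/100 of a cent


def max_price() -> tuple:
    """Largest representable price as (whole dollars, ten-thousandths)."""
    return divmod(UINT32_MAX, UNITS_PER_DOLLAR)


def fits(price_dollars: int) -> bool:
    """True if a whole-dollar price fits in the 32-bit unsigned field."""
    return price_dollars * UNITS_PER_DOLLAR <= UINT32_MAX


dollars, frac = max_price()   # (429496, 7295), i.e. $429,496.7295
```

Here `fits(437131)` is false, since 437131 * 10000 = 4371310000 exceeds 4294967295, while `fits(5000)` is true with plenty of headroom.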

> The next highest price is about $5000, and Berkshire's CEO Warren
> Buffett has said for decades that he'll never split the shares like
> everyone else does.

Thanks, John. I didn't know that.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: FP8 (was Compact representation for common integer constants)

<jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=16556&group=comp.arch#16556
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
Date: Sun, 09 May 2021 14:35:28 -0400
Organization: A noiseless patient Spider
Lines: 9
Message-ID: <jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me>
<s75pe2$shi$1@dont-email.me> <s76as3$i7j$1@gal.iecc.com>
<jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org>
<s787j3$99m$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="d2593f1a1309ef90aa56d72be9b44c83";
logging-data="17380"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19GHgeuj+li+Q/yRyhyNTO2"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:3Yg1HOCDneQ2GHoPzm36vNCs1rE=
sha1:NcVOPaqlwVMD8UdUckskRegmyJY=
 by: Stefan Monnier - Sun, 9 May 2021 18:35 UTC

> Yes that makes sense. Specifically for FP8, the increased dynamic range
> is by far the most important trait. I've even seen examples of using
> 1:5:2 in DNNs (deep neural networks) where dynamic range is often more
> important than precision.

Indeed 1:5:2 is arguably more useful than 1:4:3.
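
For a rough sense of the trade-off, the Python sketch below computes the range of each sign:exponent:mantissa split. It assumes IEEE-754-style conventions (exponent bias of 2^(e-1) - 1, top exponent code reserved for infinities and NaNs, subnormals supported); the posts here do not specify those details, so treat the numbers as illustrative. The function name is invented.

```python
from fractions import Fraction


def fp_range(exp_bits: int, man_bits: int):
    """Return (largest finite, smallest normal, smallest subnormal)."""
    bias = 2 ** (exp_bits - 1) - 1
    max_exp = (2 ** exp_bits - 2) - bias   # top exponent code is reserved
    min_exp = 1 - bias                     # exponent code 0 means subnormal
    largest = (2 - Fraction(1, 2 ** man_bits)) * Fraction(2) ** max_exp
    smallest_normal = Fraction(2) ** min_exp
    smallest_subnormal = smallest_normal / 2 ** man_bits
    return largest, smallest_normal, smallest_subnormal


range_152 = fp_range(5, 2)   # 1:5:2
range_143 = fp_range(4, 3)   # 1:4:3
```

Under these assumptions 1:5:2 reaches 57344 at the top and 2^-16 at the bottom, while 1:4:3 reaches only 240 and 2^-9, in exchange for one extra bit of precision.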

Stefan

Re: FP8 (was Compact representation for common integer constants)

<jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=16557&group=comp.arch#16557
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
Date: Sun, 09 May 2021 14:45:03 -0400
Organization: A noiseless patient Spider
Lines: 20
Message-ID: <jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me>
<s75pe2$shi$1@dont-email.me> <s76as3$i7j$1@gal.iecc.com>
<jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org>
<s787j3$99m$1@dont-email.me>
<jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="d2593f1a1309ef90aa56d72be9b44c83";
logging-data="4310"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19VeI/Iv1SKc28cFvM9phfd"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:PZkwTeL8MDxfGm4AhFyigqm3GC4=
sha1:hJHUQLLkAq3egsVMCLNN/LHZwOI=
 by: Stefan Monnier - Sun, 9 May 2021 18:45 UTC

Stefan Monnier [2021-05-09 14:35:28] wrote:
>> Yes that makes sense. Specifically for FP8, the increased dynamic range
>> is by far the most important trait. I've even seen examples of using
>> 1:5:2 in DNNs (deep neural networks) where dynamic range is often more
>> important than precision.
> Indeed 1:5:2 is arguably more useful than 1:4:3.

An alternative of course is to use a log-base representation. So your
8-bit data is split 1:7 between a sign and an exponent, with no mantissa
at all, and then you choose your dynamic range by picking the base.
Multiplication is easy, but addition takes more effort.
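
A minimal Python sketch of such a 1:7 format follows. The exponent bias of 64, the round-to-nearest encoding, the saturation at the ends of the range, and the absence of a zero encoding are all assumptions made for the example; none of them come from the post. Addition here simply goes through the linear domain, which is one way to see why it costs more than multiplication.

```python
import math

EXP_BIAS = 64  # assumption: 7-bit exponent field stored with a bias of 64


def lns_encode(x: float, base: float) -> int:
    """Encode x as sign bit plus the nearest integer power of base."""
    if x == 0:
        raise ValueError("zero has no logarithm; a real format needs a special code")
    sign = 0x80 if x < 0 else 0
    e = round(math.log(abs(x), base)) + EXP_BIAS
    return sign | max(0, min(127, e))      # saturate to the 7-bit field


def lns_decode(code: int, base: float) -> float:
    sign = -1.0 if code & 0x80 else 1.0
    return sign * base ** ((code & 0x7F) - EXP_BIAS)


def lns_mul(a: int, b: int) -> int:
    """Multiply: XOR the signs and add the exponents (one integer add)."""
    e = (a & 0x7F) + (b & 0x7F) - EXP_BIAS
    return ((a ^ b) & 0x80) | max(0, min(127, e))


def lns_add(a: int, b: int, base: float) -> int:
    """Add: decode both operands, add linearly, then re-encode."""
    return lns_encode(lns_decode(a, base) + lns_decode(b, base), base)
```

With base 2, `lns_mul(lns_encode(8.0, 2), lns_encode(-4.0, 2))` decodes to -32.0. A larger base widens the range and coarsens the steps between representable values.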

But if those FP8 are only used for transport and always converted to
FP16 or FP32 before actual computations, then the ease of
addition/multiplication is irrelevant and the only worry is how to
convert to/from that (and converting from FP8 is easy since it's small
enough to use a lookup table).
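
The lookup-table idea can be sketched as follows: with only 256 possible codes, every value can be decoded once and stored. The bit layout below assumes an IEEE-754-style 1:5:2 format (bias 15, top exponent for infinity and NaN, subnormals); that layout and the function names are assumptions for illustration, not something fixed by this thread.

```python
import math


def decode_fp8(code: int, exp_bits: int = 5, man_bits: int = 2) -> float:
    """Decode one 8-bit code; exp_bits + man_bits must equal 7."""
    sign = -1.0 if (code >> 7) & 1 else 1.0
    exp = (code >> man_bits) & ((1 << exp_bits) - 1)
    man = code & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    if exp == (1 << exp_bits) - 1:             # reserved: infinity or NaN
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:                               # zero and subnormals
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)


# Built once; afterwards conversion is a single indexed load.
FP8_TABLE = [decode_fp8(c) for c in range(256)]


def fp8_to_float(code: int) -> float:
    return FP8_TABLE[code & 0xFF]
```

Converting in the other direction (wider float to FP8) needs rounding and range checks, so it does not reduce to a small table in the same way.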

Stefan

Re: scaling, was FP8 (was Compact representation for common integer constants)

<8a28bfa8-8bb0-41ce-9191-ea6f196172edn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16558&group=comp.arch#16558
X-Received: by 2002:aed:2042:: with SMTP id 60mr19955789qta.340.1620593407853;
Sun, 09 May 2021 13:50:07 -0700 (PDT)
X-Received: by 2002:a05:6830:a:: with SMTP id c10mr8573785otp.114.1620593407595;
Sun, 09 May 2021 13:50:07 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 13:50:07 -0700 (PDT)
In-Reply-To: <s790a3$9hl$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s75pe2$shi$1@dont-email.me> <s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me>
<s76o7e$jpn$1@gal.iecc.com> <s790a3$9hl$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8a28bfa8-8bb0-41ce-9191-ea6f196172edn@googlegroups.com>
Subject: Re: scaling, was FP8 (was Compact representation for common integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 09 May 2021 20:50:07 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Sun, 9 May 2021 20:50 UTC

On Sunday, May 9, 2021 at 1:21:44 PM UTC-5, Stephen Fuld wrote:
> On 5/8/2021 12:19 PM, John Levine wrote:
> > According to Stephen Fuld <sf...@alumni.cmu.edu.invalid>:
> >>> The huge advantage is that it takes care of scaling automatically so
> >>> the programmer doesn't have to worry about it in every operation.
> >
> >> While all of that is true, there were other alternatives. COBOL
> >> supported, and still does, automatic scaling of fixed point numbers. I
> >> don't know if other languages support this.
> >
> > COBOL gives you fixed scaling, e.g. PIC 9999V999 has four digits before
> > the implicit decimal point and three after. When you do arithmetic
> > it'll align the decimal point, but you don't get automatic scaling
> > unless you tell the compiler that the data is COMP-1 or COMP-2 so
> > it uses the internal floating point representation.
> Again, true, of course. I guess I thought the "automatic scaling" is
> included in the "dynamic" part of "dynamic range" that Marcus mentioned
> earlier. But of course YMMV.
> > Scaling crisis of the day:
> >
> > The NASDAQ stock exchange's computers represent prices as 32 bit
> > unsigned integers, with units being 1/100 of a cent or 1/10000 of a
> > dollar. Since the largest 32 bit integer is 4294967295 the highest
> > price it can represent is $429,496.7295. The price of Berkshire
> > Hathaway reached $437,131 this week. Oops. They say they'll have a
> > fix later this month.
<
> Wow! Interesting. I wonder what the fix will be?
<
The only way to get BH to split their stock would be the threat of delisting.
<
> > The next highest price is about $5000, and Berkshire's CEO Warren
> > Buffett has said for decades that he'll never split the shares like
> > everyone else does.
<
Unless someone forces his hand.
<
> Thanks, John. I didn't know that.
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)

Re: FP8 (was Compact representation for common integer constants)

<30744e66-bb03-482d-8982-8508f8d966a3n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16559&group=comp.arch#16559
X-Received: by 2002:ac8:5fd5:: with SMTP id k21mr16378949qta.231.1620593583292;
Sun, 09 May 2021 13:53:03 -0700 (PDT)
X-Received: by 2002:a9d:664c:: with SMTP id q12mr18763236otm.76.1620593583092;
Sun, 09 May 2021 13:53:03 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 13:53:02 -0700 (PDT)
In-Reply-To: <2021May9.101917@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me> <s75pe2$shi$1@dont-email.me>
<s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me> <2021May9.101917@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <30744e66-bb03-482d-8982-8508f8d966a3n@googlegroups.com>
Subject: Re: FP8 (was Compact representation for common integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 09 May 2021 20:53:03 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Sun, 9 May 2021 20:53 UTC

On Sunday, May 9, 2021 at 1:20:57 PM UTC-5, Anton Ertl wrote:
> Stephen Fuld <sf...@alumni.cmu.edu.invalid> writes:
> >On 5/8/2021 8:31 AM, John Levine wrote:
> >> When people were designing early computers in the 1940s, they knew
> >> about floating point but didn't include it because they thought that
> >> the programmer could easily keep track of the scale so the extra
> >> hardware complexity wasn't worth it. This was true if the programmer
> >> was John von Neumann, not so true otherwise, so FP hardware showed up
> >> on the 704 in the early 1950s.
> >
> >While all of that is true, there were other alternatives. COBOL
> >supported, and still does, automatic scaling of fixed point numbers. I
> >don't know if other languages support this.
> >
> >Of course, the time period you were discussing was before high level
> >languages, and writing the code in assembler was probably "a bridge too
> >far" for most programmers at the time.
<
> Not really sure what you mean with the latter. In the early 1950s,
> writing programs included steps that most of us don't know about these
> days, such as layout of instructions on the rotating memory for
> performance, where each instruction included the address of the next
> one. What we see as machine code since the 1960s does not have to
> deal with these complications, assembly language is even higher level,
> Fortran I higher level yet.
<
Before that we had "patch panels", where the program was expressed in WIREs.
>
> As for fixed vs. floating point, I guess that is a cross-cutting
> concern. Sure you can argue that, if you spend a lot of time on
> low-level steps such as coding layout, spending some time on range
> analysis is a minor change, so fixed point is acceptable.
>
> OTOH, if you have a nice numerical routine for a certain fixed-point
> number range and notice that you need the same computation, but for a
> different numeric range, do you want to repeat all the low-level work?
> Ok, maybe you can change just a few immediate values and leave the
> code as-is otherwise, but if the code includes optimizations based on
> the knowledge of the values, that's not possible. In such a scenario,
> floating point offers advantages, just as it does now.
<
My how lazy we have become ! Even with high quality FP, numerical
analysis is still mandatory--except that it is seldom performed !?!
>
> What held back floating point for a long time was the slowness and/or
> high cost of FP hardware, but at least in general-purpose computers
> that's a thing of the past.
<
Since the mid-1980s.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
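
For readers who have not had to "keep track of the scale" by hand, here is a minimal sketch of what that bookkeeping looks like. It assumes a simple binary fixed-point convention (value = raw / 2**frac_bits); the helper names are hypothetical and are not taken from any poster's code.

```python
# Hypothetical fixed-point helpers: a value is stored as an integer `raw`
# together with a programmer-tracked count of fractional bits.

def to_fixed(x, frac_bits):
    """Encode a real number as a scaled integer."""
    return round(x * (1 << frac_bits))

def from_fixed(raw, frac_bits):
    """Decode a scaled integer back to a real number."""
    return raw / (1 << frac_bits)

def fixed_mul(a, a_frac, b, b_frac, out_frac):
    """Multiply two fixed-point values and rescale to out_frac bits.

    The product of the raw integers carries a_frac + b_frac fractional
    bits, so the programmer has to shift it to the desired output scale.
    """
    shift = a_frac + b_frac - out_frac
    prod = a * b
    return prod >> shift if shift >= 0 else prod << -shift

a = to_fixed(1.5, 8)       # 1.5 with 8 fractional bits
b = to_fixed(2.25, 4)      # 2.25 with 4 fractional bits
p = fixed_mul(a, 8, b, 4, 8)
print(from_fixed(p, 8))    # 3.375
```

Every operation needs this kind of scale argument, and changing the numeric range means revisiting each one, which is the cost floating point removes.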

Re: FP8 (was Compact representation for common integer constants)

<2e9b61fd-7ac0-4b74-a8d9-ca03e9359455n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16560&group=comp.arch#16560
X-Received: by 2002:a37:6c1:: with SMTP id 184mr19956124qkg.294.1620593626522;
Sun, 09 May 2021 13:53:46 -0700 (PDT)
X-Received: by 2002:a05:6830:40a4:: with SMTP id x36mr15107863ott.342.1620593626329;
Sun, 09 May 2021 13:53:46 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 13:53:46 -0700 (PDT)
In-Reply-To: <s787j3$99m$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me> <s75pe2$shi$1@dont-email.me>
<s76as3$i7j$1@gal.iecc.com> <jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org> <s787j3$99m$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2e9b61fd-7ac0-4b74-a8d9-ca03e9359455n@googlegroups.com>
Subject: Re: FP8 (was Compact representation for common integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 09 May 2021 20:53:46 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Sun, 9 May 2021 20:53 UTC

On Sunday, May 9, 2021 at 1:21:34 PM UTC-5, Marcus wrote:
> On 2021-05-08, Stefan Monnier wrote:
> >>>>> Good to know - I didn't think of that. I mostly went with the
> >>>>> configuration that made the most sense: The main advantage (IMO) of
> >>>>> floating-point numbers compared to fixed point numbers is the increased
> >>>>> dynamic range,
> >>>> Isn't that the *only* reason to do floating point?
> >>> I can think of a few more advantages, such as the normalized
> >>> representation that simplifies the implementation of many algorithms ...
> >> The huge advantage is that it takes care of scaling automatically so
> >> the programmer doesn't have to worry about it in every operation.
> >
> > While this is mostly true in general, the comments above were made in
> > the context of an 8bit floating point format, where the exponent's range
> > is sufficiently limited that it is still something the programmer very
> > much has to worry about.
> >
> Yes that makes sense. Specifically for FP8, the increased dynamic range
> is by far the most important trait. I've even seen examples of using
> 1:5:2 in DNNs (deep neural networks) where dynamic range is often more
> important than precision.
<
Useful precision is often 1-bit (yes or no)
>
> >
> > Stefan
> >
>
> /Marcus
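
To put numbers on the 1:5:2 versus 1:4:3 trade-off, here is a rough sketch. It assumes an IEEE-754-style layout (bias of 2**(exp_bits-1) - 1, subnormals, top exponent code reserved for Inf/NaN). The thread does not specify these details, so treat them as assumptions; other FP8 definitions give different limits.

```python
def fp8_range(exp_bits, man_bits):
    """Return (largest finite value, smallest positive subnormal).

    Assumes an IEEE-754-like encoding; this is an illustrative model,
    not a definition taken from the thread.
    """
    bias = (1 << (exp_bits - 1)) - 1
    max_exp = (1 << exp_bits) - 2 - bias     # top exponent code reserved
    max_normal = (2 - 2.0 ** -man_bits) * 2.0 ** max_exp
    min_subnormal = 2.0 ** (1 - bias - man_bits)
    return max_normal, min_subnormal

for e, m in ((5, 2), (4, 3)):
    hi, lo = fp8_range(e, m)
    print(f"1:{e}:{m}  max={hi}  min={lo}  ratio={hi / lo:.0f}")
```

Under these assumptions 1:5:2 spans about 2**-16 to 57344, while 1:4:3 spans about 2**-9 to 240, in exchange for one more bit of precision.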

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16561&group=comp.arch#16561
X-Received: by 2002:ac8:6c1:: with SMTP id j1mr8667110qth.350.1620593982533;
Sun, 09 May 2021 13:59:42 -0700 (PDT)
X-Received: by 2002:aca:30cf:: with SMTP id w198mr18784759oiw.175.1620593982337;
Sun, 09 May 2021 13:59:42 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 13:59:42 -0700 (PDT)
In-Reply-To: <s78dnf$hpi$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me> <6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me> <77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me> <0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 09 May 2021 20:59:42 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Sun, 9 May 2021 20:59 UTC

On Sunday, May 9, 2021 at 1:21:42 PM UTC-5, Marcus wrote:
> On 2021-05-08, BGB wrote:
> > On 5/7/2021 3:40 PM, MitchAlsup wrote:
> >> On Friday, May 7, 2021 at 3:20:56 PM UTC-5, BGB wrote:
> >>> On 5/7/2021 8:15 AM, Stefan Monnier wrote:
> >>>>> It is possible that it could be generalized, but then the ISA would
> >>>>> be less
> >>>>> RISC style than it is already...
> >>>>>
> >>>>> Then again, I did see recently that someone had listed my ISA
> >>>>> somewhere, but
> >>>>> then classified it as a CISC; I don't entirely agree, but alas...
> >>>>>
> >>>>> Then again, many peoples' definitions of "RISC" exclude largish
> >>>>> instruction-sets with variable-length instruction encodings, so alas.
> >>>>>
> >>>>> But, taken at face value, then one would also need to exclude
> >>>>> Thumb2 and
> >>>>> similar from the RISC category.
> >>>>
> >>>> I think it's better not to worry about how other people label your ISA.
> >>>>
> >>>> The manichean RISC-vs-CISC labeling is stupid anyway: the design space
> >>>> has many more than 2 spots.
> >>>>
> >>>> Also I think the "classic RISC" is the result of specific conditions at
> >>>> that point in time which resulted in a particular sweet spot in the
> >>>> design space.
> >>>>
> >>>> But design constraints are quite different now, so applying the same
> >>>> "quantitative approach" that led to MIPS/SPARC/... will now result in
> >>>> something quite different. One could argue that such new ISAs
> >>>> could be called RISC-2020 ;-)
> >>>>
> >>> Yeah.
> >>>
> >>> Classic RISC:
> >>> Fixed size instructions
> >>> Typically only a single addressing mode (Reg, Disp)
> >>> Typically only supporting aligned memory access
> >>> Aims for fixed 1 instruction per cycle
> >> <
> >> Mc 88K had
> >> fixed sized instructions
> >> [Rbase+IMM16] and [Rbase+Rindex<<scale] address modes
> >> Aligned memory has been shown to be defective
> >> aimed at 1ipc but we did engineer a 6-wide OoO version
> >>
> >> My 66000
> >> has fixed sized instruction specifiers with 1 or 2 constants.
> >> [Rbase+IMM16] and [Rbase+Rindex<<scale] and
> >> [Rbase+Rindex<<scale+disp[32,64]] address modes
> >> Misaligned memory model
> >> Aimed at low burden for LBIO implementations and low burden for GBOoO
> >> implementations
> >> Inherently parallel
> >> Never needs a NoOp
> >
> >
> > BJX2 allows misaligned access, and also does not need NOPs.
> >
> > Misaligned access may degrade performance in some cases though, doesn't
> > work with certain instructions, ...
> >
> > If one triggers an interlock on a memory load or similar, there will be
> > a partial pipeline stall, and it will behave as-if a NOP were present.
> > Triggering an interlock on various other instructions may also trigger
> > this behavior.
> >
> > Interlocks currently are in 2nd place (behind cache misses) for wasting
> > clock-cycles (improving the efficiency of memory has, in effect, also
> > increased the proportion of clock-cycles wasted on pipeline interlock
> > stalls).
> >
> > I had more recently added logic to my C compiler's "wexifier" to shuffle
> > instructions around to try to reduce interlock penalties. At the moment,
> > it is disabled by default as the wexifier seems to be buggy in some
> > as-of-yet undetermined way (seems to be producing bundles which don't
> > work correctly on the FPGA implementation, but I can't seem to isolate
> > the behavior well enough to determine the cause).
> >
> >
> > BJX2:
> > Rb + Imm<<Sc
> > Rb + Ri<<Sc
> >
> MRISC32:
> Rb + Imm
> Rb + Ri<<Sc
>
> > Rb: R2..R15, R18..R31
> > Ri: R0, R2..R14, R18..R31
> >
> > R0/R1: Used to encode special-cases
> > R15 (SP): Only usable as a base register
> > R16/R17: Reserved (for the possibility of more special cases)
> >
> > Sc: Typically hard-wired based on the element type
> >
> > Base registers encoded via special cases:
> > PC, GBR, TBR
> > Index registers encoded via special cases:
> > R0, Sc=0 (Allows access to misaligned struct members)
> >
> > However, since pretty much all this is handled in the decoder, I don't
> > really feel it classifies as distinct address modes (and, as far as the
> > AGU is concerned (Rb+Ri<<Sc) is the only mode).
> >
> > Generally, (Rb) and (Rb,0) are treated as equivalent.
> >
> >
> >
> > Given the lack of autoincrement modes or similar, both PC and PC&(~3)
> > addressing, ... BJX2 in practice actually has fewer distinct addressing
> > modes than SuperH.
> >
> > The SuperH ISA also had a few edge cases where multiple memory accesses
> > would be performed by a single instruction (and other violations of
> > Load/Store), but was generally classified as a RISC.
> >
> >
> > One thing BJX2 does have, that many RISCs lack, is a LEA instruction.
> > LEA generally takes the form of assuming the Zero-Extended Store doesn't
> > make sense, so these cases are interpreted as a LEA.
> >
> > This was mostly because LEA helps in various edge cases, such as for:
> > Composing function pointers;
> > Long-distance branches;
<
How many branches are farther than 1/8 GB away with code compiled
from high level languages ??
<
> > Compound addressing within structures;
> > Eg: Accessing an element within an array within a structure.
> > ...
> >
> MRISC32 also has LEA (called "LDEA" - LoaD Effective Address). It
> essentially uses the output from the AGU as the result (bypassing
> the load-from-memory stage), and has a few different use cases.
>
I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'
>
> Apart from preloading memory addresses into registers, it can be used
> for simple arithmetic of the form A + (B << N) (where N is 0, 1, 2 or 3),
> which can be used for implementing x * 3 and x * 5, for instance.
>
Better still is to make integer multiply 3-cycles.
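
The shift-and-add trick mentioned above can be sketched as follows. The function `lea` is a hypothetical software model of an AGU computing a + (b << n); it ignores register width and wraparound.

```python
def lea(a, b, n):
    """Model of an address-generation add: a + (b << n), with n in 0..3."""
    if not 0 <= n <= 3:
        raise ValueError("scale must be 0, 1, 2 or 3")
    return a + (b << n)

def times3(x):
    return lea(x, x, 1)    # x + 2x

def times5(x):
    return lea(x, x, 2)    # x + 4x

def times9(x):
    return lea(x, x, 3)    # x + 8x

print(times3(7), times5(7), times9(7))   # 21 35 63
```

Whether this beats a fast integer multiplier depends on the implementation, as the reply above points out.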
>
> Another very useful use case is to load vector registers with a
> "stride". In vector mode the AGU will generate addresses of the
> form:
>
> (1) addr[k] = Rb + Im * k (Immediate form stride)
> (2) addr[k] = Rb + (Ri << Sc) * k (Register form stride)
> (3) addr[k] = Rb + Vi[k] << Sc (Gather/scatter)
>
Both stride and gather forms fall out "for free" in VVM. {So, I am
agreeing with you that LEA is a valuable instruction.}
>
> Forms (1) and (2) construct addresses with a constant offset
> between each address (I call these "stride based" load/store).
<
Don't forget the stride-0 form in GPUs where every thread wants to
bang on the auto-update memory reference location.
>
> The LDEA instruction can thus be used for loading a series of
> numbers into a vector register, e.g. like this:
>
> LDEA V1, Z, #1 ; V1 = {0, 1, 2, 3, 4, 5, ...}
>
> LDI S1, #7
> LDEA V1, S1, #3 ; V1 = {7, 10, 13, 16, 19, ...}
>
> This is a common operation in vectorized code, and on classic SIMD
> ISAs you usually load a predefined constant (e.g. from memory) in this
> case.
>
> Since the MRISC32 ISA allows each implementation to define the vector
> register size, using predefined vector constants is not a good solution
> since the vector register size is not known at compile time.
>
> /Marcus
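
The three vector address forms quoted above can be modeled in a few lines. This is only a sketch: the function names and the explicit vector length `vl` are hypothetical, and form (3) is read as Rb + (Vi[k] << Sc).

```python
def stride_immediate(rb, im, vl):
    """Form (1): addr[k] = Rb + Im * k."""
    return [rb + im * k for k in range(vl)]

def stride_register(rb, ri, sc, vl):
    """Form (2): addr[k] = Rb + (Ri << Sc) * k."""
    return [rb + (ri << sc) * k for k in range(vl)]

def gather(rb, vi, sc):
    """Form (3): addr[k] = Rb + (Vi[k] << Sc)."""
    return [rb + (v << sc) for v in vi]

# Mirrors the two LDEA examples in the quoted post.
print(stride_immediate(0, 1, 6))   # [0, 1, 2, 3, 4, 5]
print(stride_immediate(7, 3, 5))   # [7, 10, 13, 16, 19]
```

Because the sequence is generated from the vector length at run time, no per-size constant table is needed.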

Re: FP8 (was Compact representation for common integer constants)

<48015cc8-9326-427a-9fdd-36ed1e12939an@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16562&group=comp.arch#16562
X-Received: by 2002:a37:a851:: with SMTP id r78mr17121107qke.95.1620594078152;
Sun, 09 May 2021 14:01:18 -0700 (PDT)
X-Received: by 2002:a4a:e715:: with SMTP id y21mr16395268oou.54.1620594077953;
Sun, 09 May 2021 14:01:17 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 14:01:17 -0700 (PDT)
In-Reply-To: <jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me> <s75pe2$shi$1@dont-email.me>
<s76as3$i7j$1@gal.iecc.com> <jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org>
<s787j3$99m$1@dont-email.me> <jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org> <jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <48015cc8-9326-427a-9fdd-36ed1e12939an@googlegroups.com>
Subject: Re: FP8 (was Compact representation for common integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 09 May 2021 21:01:18 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Sun, 9 May 2021 21:01 UTC

On Sunday, May 9, 2021 at 1:45:06 PM UTC-5, Stefan Monnier wrote:
> Stefan Monnier [2021-05-09 14:35:28] wrote:
> >> Yes that makes sense. Specifically for FP8, the increased dynamic range
> >> is by far the most important trait. I've even seen examples of using
> >> 1:5:2 in DNNs (deep neural networks) where dynamic range is often more
> >> important than precision.
> > Indeed 1:5:2 is arguably more useful than 1:4:3.
> An alternative of course is to use a log-base representation. So your
> 8bit data is split 1:7 between a sign and an exponent, with no mantissa
> at all and then you choose your dynamic range by picking the base.
> Multiplication is easy but addition takes more effort.
<
Really ?!?!?
Both are implemented with table look ups with concatenated values as the
address.
>
> But if those FP8 are only used for transport and always converted to
> FP16 or FP32 before actual computations, then the ease of
> addition/multiplication is irrelevant and the only worry is how to
> convert to/from that (and converting from FP8 is easy since it's small
> enough to use a lookup table).
>
>
> Stefan
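
The "small enough to use a lookup table" point can be illustrated with a 256-entry table. The decoder below assumes an IEEE-754-style 1:5:2 layout (bias 15, subnormals, Inf/NaN at the top exponent code). That layout is an assumption for illustration only, and `decode_fp8` is a hypothetical helper.

```python
import math

def decode_fp8(byte, exp_bits=5, man_bits=2):
    """Decode one FP8 byte under an assumed IEEE-like encoding."""
    sign = -1.0 if byte >> (exp_bits + man_bits) else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    bias = (1 << (exp_bits - 1)) - 1
    if exp == (1 << exp_bits) - 1:               # top code: Inf or NaN
        return sign * math.inf if man == 0 else math.nan
    if exp == 0:                                 # zero and subnormals
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

# One entry per possible byte: conversion from FP8 becomes a single index.
FP8_TO_FLOAT = [decode_fp8(b) for b in range(256)]

print(FP8_TO_FLOAT[0x3C], FP8_TO_FLOAT[0xC0], FP8_TO_FLOAT[0x7B])
```

Converting in the other direction (FP32 to FP8) needs rounding logic rather than a plain table, since the input space is too large to enumerate this way.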

Re: scaling, was FP8 (was Compact representation for common integer constants)

<jwv8s4nwr4o.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=16563&group=comp.arch#16563
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: scaling, was FP8 (was Compact representation for common integer constants)
Date: Sun, 09 May 2021 17:10:37 -0400
Organization: A noiseless patient Spider
Lines: 7
Message-ID: <jwv8s4nwr4o.fsf-monnier+comp.arch@gnu.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s75pe2$shi$1@dont-email.me> <s76as3$i7j$1@gal.iecc.com>
<s76ehd$65p$1@dont-email.me> <s76o7e$jpn$1@gal.iecc.com>
<s790a3$9hl$1@dont-email.me>
<8a28bfa8-8bb0-41ce-9191-ea6f196172edn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="d2593f1a1309ef90aa56d72be9b44c83";
logging-data="18290"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1//YPxGIX2jLfY5Y48EyKhN"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:kl9MwPQrkCk8tq2ueMAUYFK+Oss=
sha1:Cj1mRd6Y73vYCaZxytgA5oTdyIA=
 by: Stefan Monnier - Sun, 9 May 2021 21:10 UTC

>> Wow! Interesting. I wonder what the fix will be?
> The only way to get BH to split their stock would be the threat of delisting.

They could also threaten to let it wrap around ;-)

Stefan

Re: FP8 (was Compact representation for common integer constants)

<jwv35uvwr24.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=16564&group=comp.arch#16564
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
Date: Sun, 09 May 2021 17:16:14 -0400
Organization: A noiseless patient Spider
Lines: 15
Message-ID: <jwv35uvwr24.fsf-monnier+comp.arch@gnu.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me>
<s75pe2$shi$1@dont-email.me> <s76as3$i7j$1@gal.iecc.com>
<jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org>
<s787j3$99m$1@dont-email.me>
<jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org>
<jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org>
<48015cc8-9326-427a-9fdd-36ed1e12939an@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="d2593f1a1309ef90aa56d72be9b44c83";
logging-data="18290"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ArCXXbUCbfngFNUzQmOsO"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:qBo2t35xe/5kCxMBQIVkfl/Lezg=
sha1:9dq4KdZYLpHefsNBVbqxrxP6Mmc=
 by: Stefan Monnier - Sun, 9 May 2021 21:16 UTC

>> An alternative of course is to use a log-base representation. So your
>> 8bit data is split 1:7 between a sign and an exponent, with no mantissa
>> at all and then you choose your dynamic range by picking the base.
>> Multiplication is easy but addition takes more effort.
> Really ?!?!?
> Both are implemented with table look ups with concatenated values as the
> address.

Really? A 64k entry table seems rather expensive for such
a functionality. Then again, if the output is FP8, maybe it can be
implemented as just a big mess of auto-generated and+or+not since many
inputs map to the same output.

Stefan
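
As a rough sketch of the two positions in this exchange: in a sign-plus-exponent (logarithmic) format, multiplication is an exponent add, while addition can be done by a table addressed by the two concatenated operands, which is where the 64K entries come from. The base, the exponent offset, and all function names below are hypothetical choices; the toy format has no zero, Inf or NaN.

```python
import math

BASE = 2 ** 0.25   # hypothetical base; picking it sets the dynamic range
OFFSET = 64        # hypothetical bias for the 7-bit exponent field

def decode(code):
    """Value of an 8-bit code: sign bit plus 7-bit biased exponent."""
    sign = -1.0 if code & 0x80 else 1.0
    return sign * BASE ** ((code & 0x7F) - OFFSET)

def encode(x):
    """Nearest representable code for a nonzero x (saturating)."""
    sign = 0x80 if x < 0 else 0
    e = round(math.log(abs(x), BASE)) + OFFSET
    return sign | max(0, min(127, e))

def lns_mul(a, b):
    """Multiplication: XOR the signs, add the exponents (saturating)."""
    sign = (a ^ b) & 0x80
    e = (a & 0x7F) + (b & 0x7F) - OFFSET
    return sign | max(0, min(127, e))

def build_add_table():
    """Addition table addressed by the two operands concatenated."""
    table = bytearray(1 << 16)
    for a in range(256):
        for b in range(256):
            s = decode(a) + decode(b)
            # Exact cancellation has no encoding here; use the smallest magnitude.
            table[(a << 8) | b] = encode(s) if s != 0.0 else 0
    return table

ADD_TABLE = build_add_table()

def lns_add(a, b):
    return ADD_TABLE[(a << 8) | b]

two = encode(2.0)
print(len(ADD_TABLE), decode(lns_mul(two, two)), decode(lns_add(two, two)))
```

Since many address pairs map to the same output byte, the table is highly redundant, which is consistent with the suggestion that it could be reduced to generated logic.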

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s79khu$u4v$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16565&group=comp.arch#16565
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Sun, 9 May 2021 16:34:44 -0500
Organization: A noiseless patient Spider
Lines: 205
Message-ID: <s79khu$u4v$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 9 May 2021 21:34:54 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="9d9dda36f9f81168c65a7c138483aecf";
logging-data="30879"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18tkLkrPazDTCZ20FsrUMao"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:7MJkM4ehgQv22vSN+SbLnKgLwI0=
In-Reply-To: <f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
Content-Language: en-US
 by: BGB - Sun, 9 May 2021 21:34 UTC

On 5/9/2021 3:59 PM, MitchAlsup wrote:
> On Sunday, May 9, 2021 at 1:21:42 PM UTC-5, Marcus wrote:
>> On 2021-05-08, BGB wrote:
>>> On 5/7/2021 3:40 PM, MitchAlsup wrote:
>>>> On Friday, May 7, 2021 at 3:20:56 PM UTC-5, BGB wrote:
>>>>> On 5/7/2021 8:15 AM, Stefan Monnier wrote:
>>>>>>> It is possible that it could be generalized, but then the ISA would
>>>>>>> be less
>>>>>>> RISC style than it is already...
>>>>>>>
>>>>>>> Then again, I did see recently that someone had listed my ISA
>>>>>>> somewhere, but
>>>>>>> then classified it as a CISC; I don't entirely agree, but alas...
>>>>>>>
>>>>>>> Then again, many peoples' definitions of "RISC" exclude largish
>>>>>>> instruction-sets with variable-length instruction encodings, so alas.
>>>>>>>
>>>>>>> But, taken at face value, then one would also need to exclude
>>>>>>> Thumb2 and
>>>>>>> similar from the RISC category.
>>>>>>
>>>>>> I think it's better not to worry about how other people label your ISA.
>>>>>>
>>>>>> The manichean RISC-vs-CISC labeling is stupid anyway: the design space
>>>>>> has many more than 2 spots.
>>>>>>
>>>>>> Also I think the "classic RISC" is the result of specific conditions at
>>>>>> that point in time which resulted in a particular sweet spot in the
>>>>>> design space.
>>>>>>
>>>>>> But design constraints are quite different now, so applying the same
>>>>>> "quantitative approach" that led to MIPS/SPARC/... will now result in
>>>>>> something quite different. One could argue that such new ISAs
>>>>>> could be called RISC-2020 ;-)
>>>>>>
>>>>> Yeah.
>>>>>
>>>>> Classic RISC:
>>>>> Fixed size instructions
>>>>> Typically only a single addressing mode (Reg, Disp)
>>>>> Typically only supporting aligned memory access
>>>>> Aims for fixed 1 instruction per cycle
>>>> <
>>>> Mc 88K had
>>>> fixed sized instructions
>>>> [Rbase+IMM16] and [Rbase+Rindex<<scale] address modes
>>>> Aligned memory has been shown to be defective
>>>> aimed at 1ipc but we did engineer a 6-wide OoO version
>>>>
>>>> My 66000
>>>> has fixed sized instruction specifiers with 1 or 2 constants.
>>>> [Rbase+IMM16] and [Rbase+Rindex<<scale] and
>>>> [Rbase+Rindex<<scale+disp[32,64]] address modes
>>>> Misaligned memory model
>>>> Aimed at low burden for LBIO implementations and low burden for GBOoO
>>>> implementations
>>>> Inherently parallel
>>>> Never needs a NoOp
>>>
>>>
>>> BJX2 allows misaligned access, and also does not need NOPs.
>>>
>>> Misaligned access may degrade performance in some cases though, doesn't
>>> work with certain instructions, ...
>>>
>>> If one triggers an interlock on a memory load or similar, there will be
>>> a partial pipeline stall, and it will behave as-if a NOP were present.
>>> Triggering an interlock on various other instructions may also trigger
>>> this behavior.
>>>
>>> Interlocks currently are in 2nd place (behind cache misses) for wasting
>>> clock-cycles (improving the efficiency of memory has, in effect, also
>>> increased the proportion of clock-cycles wasted on pipeline interlock
>>> stalls).
>>>
>>> I had more recently added logic to my C compiler's "wexifier" to shuffle
>>> instructions around to try to reduce interlock penalties. At the moment,
>>> it is disabled by default as the wexifier seems to be buggy in some
>>> as-of-yet undetermined way (seems to be producing bundles which don't
>>> work correctly on the FPGA implementation, but I can't seem to isolate
>>> the behavior well enough to determine the cause).
>>>
>>>
>>> BJX2:
>>> Rb + Imm<<Sc
>>> Rb + Ri<<Sc
>>>
>> MRISC32:
>> Rb + Imm
>> Rb + Ri<<Sc
>>
>>> Rb: R2..R15, R18..R31
>>> Ri: R0, R2..R14, R18..R31
>>>
>>> R0/R1: Used to encode special-cases
>>> R15 (SP): Only usable as a base register
>>> R16/R17: Reserved (for the possibility of more special cases)
>>>
>>> Sc: Typically hard-wired based on the element type
>>>
>>> Base registers encoded via special cases:
>>> PC, GBR, TBR
>>> Index registers encoded via special cases:
>>> R0, Sc=0 (Allows access to misaligned struct members)
>>>
>>> However, since pretty much all this is handled in the decoder, I don't
>>> really feel it classifies as distinct address modes (and, as far as the
>>> AGU is concerned (Rb+Ri<<Sc) is the only mode).
>>>
>>> Generally, (Rb) and (Rb,0) are treated as equivalent.
>>>
>>>
>>>
>>> Given the lack of autoincrement modes or similar, both PC and PC&(~3)
>>> addressing, ... BJX2 in practice actually has fewer distinct addressing
>>> modes than SuperH.
>>>
>>> The SuperH ISA also had a few edge cases where multiple memory accesses
>>> would be performed by a single instruction (and other violations of
>>> Load/Store), but was generally classified as a RISC.
>>>
>>>
>>> One thing BJX2 does have, that many RISCs lack, is a LEA instruction.
>>> LEA generally takes the form of assuming the Zero-Extended Store doesn't
>>> make sense, so these cases are interpreted as a LEA.
>>>
>>> This was mostly because LEA helps in various edge cases, such as for:
>>> Composing function pointers;
>>> Long-distance branches;
> <
> How many branches are farther than 1/8 GB away with code compiled
> from high level languages ??
> <

The normal direct branch op in BJX2 has a 20-bit displacement, so can
reach +/- 1MB.

A Jumbo+LEA can do +/- 2GB, so:
LEA.B (PC, Disp33s), R3
JMP R3

But, this is also possible (+/- 32MB):
MOV #Imm25s, R0
BRA R0

In general, these don't really come up at present because all of my test
programs have < 1MB in the ".text" section.

>>> Compound addressing within structures;
>>> Eg: Accessing an element within an array within a structure.
>>> ...
>>>
>> MRISC32 also has LEA (called "LDEA" - LoaD Effective Address). It
>> essentially uses the output from the AGU as the result (bypassing
>> the load-from-memory stage), and has a few different use cases.
>>
> I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'

I always thought of it like "Lee".

>>
>> Apart from preloading memory addresses into registers, it can be used
>> for simple arithmetic of the form A + (B << N) (where N is 0, 1, 2 or 3),
>> which can be used for implementing x * 3 and x * 5, for instance.
>>
> Better still is to make integer multiply 3-cycles.
>>
>> Another very useful use case is to load vector registers with a
>> "stride". In vector mode the AGU will generate addresses of the
>> form:
>>
>> (1) addr[k] = Rb + Im * k (Immediate form stride)
>> (2) addr[k] = Rb + (Ri << Sc) * k (Register form stride)
>> (3) addr[k] = Rb + Vi[k] << Sc (Gather/scatter)
>>
> Both stride and gather forms fall out "for free" in VVM. {So, I am
> agreeing with you that LEA is a valuable instruction.}
>>
>> Forms (1) and (2) construct addresses with a constant offset
>> between each address (I call these "stride based" load/store).
> <
> Don't forget the stride-0 form in GPUs where every thread wants to
> bang on the auto-update memory reference location.
>>
>> The LDEA instruction can thus be used for loading a series of
>> numbers into a vector register, e.g. like this:
>>
>> LDEA V1, Z, #1 ; V1 = {0, 1, 2, 3, 4, 5, ...}
>>
>> LDI S1, #7
>> LDEA V1, S1, #3 ; V1 = {7, 10, 13, 16, 19, ...}
>>
>> This is a common operation in vectorized code, and on classic SIMD
>> ISAs you usually load a predefined constant (e.g. from memory) in this
>> case.
>>
>> Since the MRISC32 ISA allows each implementation to define the vector
>> register size, using predefined vector constants is not a good solution
>> since the vector register size is not known at compile time.
>>
>> /Marcus


Re: FP8 (was Compact representation for common integer constants)

<8fe5e110-d4d0-4271-9de1-853d7156d194n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16566&group=comp.arch#16566
X-Received: by 2002:a05:622a:1493:: with SMTP id t19mr19674401qtx.147.1620601389778;
Sun, 09 May 2021 16:03:09 -0700 (PDT)
X-Received: by 2002:a05:6830:90b:: with SMTP id v11mr17783719ott.110.1620601389448;
Sun, 09 May 2021 16:03:09 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 16:03:09 -0700 (PDT)
In-Reply-To: <jwv35uvwr24.fsf-monnier+comp.arch@gnu.org>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me> <s75pe2$shi$1@dont-email.me>
<s76as3$i7j$1@gal.iecc.com> <jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org>
<s787j3$99m$1@dont-email.me> <jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org>
<jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org> <48015cc8-9326-427a-9fdd-36ed1e12939an@googlegroups.com>
<jwv35uvwr24.fsf-monnier+comp.arch@gnu.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8fe5e110-d4d0-4271-9de1-853d7156d194n@googlegroups.com>
Subject: Re: FP8 (was Compact representation for common integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 09 May 2021 23:03:09 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Sun, 9 May 2021 23:03 UTC

On Sunday, May 9, 2021 at 4:16:17 PM UTC-5, Stefan Monnier wrote:
> >> An alternative of course is to use a log-base representation. So your
> >> 8bit data is split 1:7 between a sign and an exponent, with no mantissa
> >> at all and then you choose your dynamic range by picking the base.
> >> Multiplication is easy but addition takes more effort.
> > Really ?!?!?
> > Both are implemented with table look ups with concatenated values as the
> > address.
> Really? A 64k entry table seems rather expensive for such
> a functionality. Then again, if the output is FP8, maybe it can be
> implemented as just a big mess of auto-generated and+or+not since many
> inputs map to the same output.
<
IBM 1620 "Can't add, doesn't even try" laid the groundwork where if the
operands are small enough it becomes cheaper to just read out the correct
result than to build the logic to calculate the result.
>
>
> Stefan

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16567&group=comp.arch#16567
X-Received: by 2002:a37:7745:: with SMTP id s66mr19852622qkc.18.1620601713806;
Sun, 09 May 2021 16:08:33 -0700 (PDT)
X-Received: by 2002:a05:6830:1605:: with SMTP id g5mr18204381otr.22.1620601713606;
Sun, 09 May 2021 16:08:33 -0700 (PDT)
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsfeed.xs4all.nl!newsfeed9.news.xs4all.nl!feeder1.cambriumusenet.nl!feed.tweak.nl!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 16:08:33 -0700 (PDT)
In-Reply-To: <s79khu$u4v$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=104.59.204.55; posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 104.59.204.55
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me> <6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me> <77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me> <0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me> <f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 09 May 2021 23:08:33 +0000
Content-Type: text/plain; charset="UTF-8"
 by: MitchAlsup - Sun, 9 May 2021 23:08 UTC

On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
> On 5/9/2021 3:59 PM, MitchAlsup wrote:

> > <
> > How many branches are farther than 1/8 GB away with code compiled
> > from high level languages ??
> > <
> The normal direct branch op in BJX2 has a 20-bit displacement, so can
> reach +/- 1MB.
>
> A Jumbo+LEA can do +/- 2GB, so:
> LEA.B (PC, Disp33s), R3
> JMP R3
>
> But, this is also possible (+/- 32MB):
> MOV #Imm25s, R0
> BRA R0
>
My 66000 has 16-bit actual word displacement (18-bit byte address) for
conditional branches and 26-bit unconditional branches (and calls) (28-bit
effective range).
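
The ranges quoted here follow from the field width and the scaling unit. A minimal sketch, assuming a signed displacement scaled by the instruction granule (4 bytes for the word displacements above; the 2-byte unit for the BJX2 20-bit case is inferred from the stated +/- 1MB reach and is an assumption, as is the helper name branch_reach_bytes):

```python
def branch_reach_bytes(disp_bits, unit_bytes):
    """Approximate +/- reach of a signed displacement of disp_bits bits,
    where each step of the displacement covers unit_bytes bytes."""
    return (1 << (disp_bits - 1)) * unit_bytes


KB = 1024
MB = 1024 * KB

# 16-bit word displacement, 4-byte words: 18-bit byte address, +/- 128 KB.
cond_reach = branch_reach_bytes(16, 4)
# 26-bit word displacement, 4-byte words: 28-bit effective range, +/- 128 MB.
uncond_reach = branch_reach_bytes(26, 4)
# 20-bit displacement with an assumed 2-byte unit: +/- 1 MB.
bjx2_reach = branch_reach_bytes(20, 2)
```

Under these assumptions the 26-bit form reaches +/- 128 MB, which is the "1/8 GB" figure in the question above.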

But JMP instructions can have 32-bit or 64-bit immediates so you predicate
over the JMP with an inverted condition for the predicate.
>
> In general, these don't really come up at present because, of my test
> programs, all of them have < 1MB in the ".text" section.
<
In practice, few branches (<1%) are of those dimensions.
<
> >>> Compound addressing within structures;
> >>> Eg: Accessing an element within an array within a structure.
> >>> ...
> >>>
> >> MRISC32 also has LEA (called "LDEA" - LoaD Effective Address). It
> >> essentially uses the output from the AGU as the result (bypassing
> >> the load-from-memory stage), and has a few different use cases.
> >>
> > I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'
> I always thought of it like "Lee".
<
It was funny to hear him say "lee-ah", but he was from Russia so I gave him
a break.
> >>
> >> Apart from preloading memory addresses into registers, it can be used
> >> for simple arithmetic on the form A + B << N (where N is 0, 1, 2 or 3),
> >> which can be used for implementing x * 3 and x * 5, for instance.
> >>
> > Better still is to make integer multiply 3-cycles.
> >>
> >> Another very useful use case is to load vector registers with a
> >> "stride". In vector mode the AGU will generate addresses on the
> >> form:
> >>
> >> (1) addr[k] = Rb + Im * k (Immediate form stride)
> >> (2) addr[k] = Rb + (Ri << Sc) * k (Register form stride)
> >> (3) addr[k] = Rb + Vi[k] << Sc (Gather/scatter)
> >>
> > Both stride and gather forms fall out "for free" in VVM. {So, I am
> > agreeing with you that LEA is a valuable instruction.}
> >>
> >> Forms (1) and (2) construct addresses with a constant offset
> >> between each address (I call these "stride based" load/store).
> > <
> > Don't forget the stride-0 form in GPUs where every thread wants to
> > bang on the auto-update memory reference location.
> >>
> >> The LDEA instruction can thus be used for loading a series of
> >> numbers into a vector register, e.g. like this:
> >>
> >> LDEA V1, Z, #1 ; V1 = {0, 1, 2, 3, 4, 5, ...}
> >>
> >> LDI S1, #7
> >> LDEA V1, S1, #3 ; V1 = {7, 10, 13, 16, 19, ...}
> >>
> >> This is a common operation in vectorized code, and on classic SIMD
> >> ISA:s you usually load a predefined constant (e.g. from memory) in this
> >> case.
> >>
> >> Since the MRISC32 ISA allows each implementation to define the vector
> >> register size, using predefined vector constants is not a good solution
> >> since the vector register size is not known at compile time.
> >>
> >> /Marcus
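
The three address forms quoted above can be sketched as follows. This is only an illustration of the arithmetic, not of the MRISC32 hardware; the function names are hypothetical, and form (3) is assumed to group as Rb + (Vi[k] << Sc).

```python
def stride_imm(rb, im, n):
    """Form (1): addr[k] = Rb + Im * k."""
    return [rb + im * k for k in range(n)]


def stride_reg(rb, ri, sc, n):
    """Form (2): addr[k] = Rb + (Ri << Sc) * k."""
    return [rb + (ri << sc) * k for k in range(n)]


def gather(rb, vi, sc):
    """Form (3), assuming the grouping Rb + (Vi[k] << Sc)."""
    return [rb + (v << sc) for v in vi]


# The two LDEA examples from the quoted post, for a 6- and 5-element vector:
iota = stride_imm(0, 1, 6)     # base Z (zero), stride 1
ramp = stride_imm(7, 3, 5)     # base S1 = 7, stride 3
```

The element count n stands in for the implementation-defined vector length, which is why the same call works for any register size.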

Re: scaling, was FP8 (was Compact representation for common integer constants)

<s7a81t$1019$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=16568&group=comp.arch#16568

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!feeder1.feed.usenet.farm!feed.usenet.farm!tr1.eu1.usenetexpress.com!feeder.usenetexpress.com!tr3.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: scaling, was FP8 (was Compact representation for common integer constants)
Date: Mon, 10 May 2021 03:07:41 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <s7a81t$1019$1@gal.iecc.com>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <s790a3$9hl$1@dont-email.me> <8a28bfa8-8bb0-41ce-9191-ea6f196172edn@googlegroups.com> <jwv8s4nwr4o.fsf-monnier+comp.arch@gnu.org>
Injection-Date: Mon, 10 May 2021 03:07:41 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="32809"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <s790a3$9hl$1@dont-email.me> <8a28bfa8-8bb0-41ce-9191-ea6f196172edn@googlegroups.com> <jwv8s4nwr4o.fsf-monnier+comp.arch@gnu.org>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
Lines: 26
 by: John Levine - Mon, 10 May 2021 03:07 UTC

According to Stefan Monnier <monnier@iro.umontreal.ca>:
>>> Wow! Interesting. I wonder what the fix will be?
>> The only way to get BH to split their stock would be the threat of delisting.

BH is listed on the NYSE which uses a different internal
representation. This is solely the NASDAQ's problem.

Buffett is 90 years old and has named a successor. They also have
class B shares which are worth 1/1500 of a class A share but only have
1/10000 of a class A vote. The obvious thing to do once Buffett is no
longer on the scene is to convert everything to class B so the price
is in the $300 range and all shares have the same vote.

>They could also threaten to let it wrap around ;-)

That had occurred to me.

Several decades ago I recall a somewhat similar issue when BH was the
first listed stock whose price exceeded $10,000. At that point the
problem was stock price tables were printed in newspapers, and the
typesetting software a lot of papers used wasn't expecting five digit
prices. They fixed that somehow, too.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Compact representation for common integer constants

<s7a8u7$mui$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16569&group=comp.arch#16569

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Compact representation for common integer constants
Date: Sun, 9 May 2021 22:22:38 -0500
Organization: A noiseless patient Spider
Lines: 92
Message-ID: <s7a8u7$mui$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s6udkp$hs5$1@dont-email.me>
<6a45a966-9d86-40ed-9b16-67766956d46fn@googlegroups.com>
<s74akj$siq$1@dont-email.me>
<f94d31bd-ae99-4d0a-84d3-d16e9ba71c6fn@googlegroups.com>
<s74muh$vqf$1@dont-email.me> <s789v4$rv6$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 03:22:47 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8fd450ac5d006b88eb9e65bad5397861";
logging-data="23506"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX192RVZvCNFAG/qKNaI7BWLm"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:PwUPJU5C+UH7XxuzSOqASd3Uuzo=
In-Reply-To: <s789v4$rv6$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: BGB - Mon, 10 May 2021 03:22 UTC

On 5/9/2021 4:28 AM, Thomas Koenig wrote:
> BGB <cr88192@gmail.com> schrieb:
>
>> IMUL (Lane 1)
>> 32*32 -> 64
>
> Do you have two instructions (one for signed and one for unsigned)
> or three (one for the lower half, one for signed high, one for
> unsigned high)? The latter version could save you some ALU
> complexity and some latency in the (probably common) case where
> only a 32*32 multiplication is needed, at the cost of added
> instructions for the 32*32-> 64 bit case.
>

There are actually more cases:
MULS: 32*32->32, Result is sign-extended from low 32 bits
MULU: 32*32->32, Result is zero-extended from low 32 bits
DMULS: 32*32->64, 64-bit signed result
DMULU: 32*32->64, 64-bit unsigned result

The former two give the typical behaviors one expects in C; the latter two
give the widened results.

These exist as 3R forms, so:
DMULU R4, R5, R7 // R7 = R4 * R5

Originally, there were also multiply ops which multiplied two inputs and
then stored a pair of results in R0 and R1, more like the original
SuperH multiply ops, but I dropped these for various reasons.

There are cases where DMACS or DMACU instructions could be useful:
DMACU R4, R5, R7 // R7 = R4 * R5 + R7

But, I don't currently have these.

Eg (64-bit signed multiply):
SHADQ R4, -32, R6 | SHADQ R5, -32, R7 //1c
DMULS R4, R7, R16 //1c
DMACS R5, R6, R16 //3c (2c penalty)
SHADQ R16, 32, R2 //3c (2c penalty)
DMACU R4, R5, R2 //1c
RTS

Though, while fewer instructions than the current form, the above
construction would still be pretty bad in terms of interlock penalties.

SHADQ R4, -32, R6 | SHADQ R5, -32, R7 //1c
DMULS R5, R6, R16 //1c
DMULS R4, R7, R17 //1c
DMULU R4, R5, R18 //1c
ADD R16, R17, R19 //2c (1c penalty, DMULS R17)
SHADQ R19, 32, R2 //2c (1c penalty, ADD R19)
ADD R18, R2 //1c
RTS

Both cases would have approximately the same clock-cycle count (assuming
both cases have a 3-cycle latency).
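
For reference, the decomposition both sequences rely on can be sketched in a few lines. This is a minimal model of the arithmetic only (the name mul64_low is hypothetical), keeping just the low 64 bits of the product as a C-style 64-bit multiply would:

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def mul64_low(a, b):
    """Low 64 bits of a*b built from 32x32->64 partial products."""
    a &= MASK64
    b &= MASK64
    a_lo, a_hi = a & MASK32, a >> 32
    b_lo, b_hi = b & MASK32, b >> 32
    # Two cross terms; only their low 32 bits survive the shift by 32.
    cross = (a_hi * b_lo + a_lo * b_hi) & MASK64
    # The low*low term must be the full unsigned 64-bit product.
    return ((cross << 32) + a_lo * b_lo) & MASK64
```

Because the cross terms are shifted left by 32 and then truncated, their signedness does not change the low 64 bits; only the low*low product needs the unsigned widening form. The hi*hi term drops out entirely.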

( Where recently, I have gotten around to modifying things such that the
multiplier is now fully pipelined... )

Otherwise, my time recently has mostly been consumed by debugging...

Then trying to see if I can get stuff to pass timing at 75MHz again
(hopefully without wrecking stuff quite as badly this time). This
sub-effort also revealed a few bugs (*), though there are still some
bugs I have yet to resolve...

*: Eg, after boosting the core to 75MHz while leaving the MMIO bus at
50MHz, stuff was breaking in simulation due to the L2 Ringbus <->
MMIO-Bus bridge not waiting for the MMIO-Bus to return to a READY state
before returning back to an idle state.

It was then possible for the response to travel the rings and get back
to the L1, which then allows execution to continue, with the CPU core
then issuing another MMIO request, which then travels along the rings
back to the MMIO bridge, in less time than it took for the 'OK -> READY'
transition to happen on the MMIO bus...

The way the bridge was designed, it would then try to initiate a
request, see that the MMIO-Bus state was 'OK', and use whatever result
was present (losing the request or returning garbage).
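
A toy software model of the handshake, to make the failure mode concrete. This is not the actual Verilog; the READY and OK state names come from the description above, while the BUSY state, the one-state-per-tick timing, and the names MmioBus and bridge_access are assumptions for the sketch.

```python
class MmioBus:
    """Toy MMIO bus: READY -> BUSY -> OK -> READY, one state per tick."""

    def __init__(self):
        self.state = "READY"
        self.data = None
        self._pending = None

    def request(self, value):
        # Only a READY bus actually accepts a new request.
        if self.state == "READY":
            self._pending = value
            self.state = "BUSY"
            return True
        return False

    def tick(self):
        if self.state == "BUSY":
            self.data = self._pending
            self.state = "OK"
        elif self.state == "OK":
            self.state = "READY"


def bridge_access(bus, value, wait_for_ready):
    """One bridge access; returns the data handed back to the ring."""
    bus.request(value)
    while bus.state != "OK":   # a stale OK is mistaken for completion
        bus.tick()
    result = bus.data
    if wait_for_ready:         # hold off going idle until the bus is READY
        while bus.state != "READY":
            bus.tick()
    return result


buggy = MmioBus()
buggy_results = [bridge_access(buggy, v, False) for v in (1, 2)]
fixed = MmioBus()
fixed_results = [bridge_access(fixed, v, True) for v in (1, 2)]
```

In the buggy run, the second request arrives while the bus is still in OK, so it is dropped and the first result is returned again; with the wait in place, each request gets its own response.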

This may have been happening at 50MHz as well, and could have possibly
been leading to some of the bugs I had seen.

Or such...

Re: FP8 (was Compact representation for common integer constants)

<s7abfl$1k5m$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=16570&group=comp.arch#16570

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.cmpublishers.com!adore2!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
Date: Mon, 10 May 2021 04:06:13 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <s7abfl$1k5m$1@gal.iecc.com>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me> <2021May9.101917@mips.complang.tuwien.ac.at>
Injection-Date: Mon, 10 May 2021 04:06:13 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="53430"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com> <s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me> <2021May9.101917@mips.complang.tuwien.ac.at>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Mon, 10 May 2021 04:06 UTC

According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
>As for fixed vs. floating point, I guess that is a cross-cutting
>concern. Sure you can argue that, if you spend a lot of time on
>low-level steps such as coding layout, spending some time on range
>analysis is a minor change, so fixed point is acceptable.

We forget how flaky and unreliable those early computers were. They
used orders of magnitude more components than the most complex
electronic systems ever had before. The Williams tubes used in the
early 1950s just barely worked and needed endless tuning and fiddling.
Even once they switched to core, tubes burned out, components went out
of spec, solder joints cracked, and uptime was measured in hours if
you were lucky, minutes if you weren't.

So a compelling reason to leave out floating point was that all the extra
logic made the computer even less reliable. It's impressive that they
risked it on the 704.

I have read that the practical limit on the length
of a FORTRAN program on the 704 was set by the fact that the compiler
had to compile it in one run between hardware failures.

>What held back floating point for a long time was the slowness and/or
>high cost of FP hardware, but at least in general-purpose computers
>that's a thing of the past.

On small systems, I suppose. Floating point was standard or at least a
widely provided option on large computers by the early 1960s.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: FP8 (was Compact representation for common integer constants)

<s7acud$tj0$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16571&group=comp.arch#16571

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
Date: Sun, 9 May 2021 21:31:09 -0700
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <s7acud$tj0$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me>
<2021May9.101917@mips.complang.tuwien.ac.at> <s7abfl$1k5m$1@gal.iecc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 04:31:09 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="fd55d588e88460585721e0e49e5ed356";
logging-data="30304"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1++VPOicCVYyvq/4azXruBy"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:eN2f+VcezLd+89J+2AD1PC8UMBk=
In-Reply-To: <s7abfl$1k5m$1@gal.iecc.com>
Content-Language: en-US
 by: Ivan Godard - Mon, 10 May 2021 04:31 UTC

On 5/9/2021 9:06 PM, John Levine wrote:
> According to Anton Ertl <anton@mips.complang.tuwien.ac.at>:
>> As for fixed vs. floating point, I guess that is a cross-cutting
>> concern. Sure you can argue that, if you spend a lot of time on
>> low-level steps such as coding layout, spending some time on range
>> analysis is a minor change, so fixed point is acceptable.
>
> We forget how flaky and unreliable those early computers were. They
> used orders of magnitude more components than the most complex
> electronic systems ever had before. The Williams tubes used in the
> early 1950s just barely worked and needed endless tuning and fiddling.
> Even once they switched to core, tubes burned out, components went out
> of spec, solder joints cracked, and uptime was measured in hours if
> you were lucky, minutes if you weren't.
>
> So a compelling reason to leave out floating point was that all the extra
> logic made the computer even less reliable. It's impressive that they
> risked it on the 704.
>
> I have read that the practical limit on the length
> of a FORTRAN program on the 704 was set by the fact that the compiler
> had to compile it in one run between hardware failures.
>
>> What held back floating point for a long time was the slowness and/or
>> high cost of FP hardware, but at least in general-purpose computers
>> that's a thing of the past.
>
> On small systems, I suppose. Floating point was standard or at least a
> widely provided option on large computers by the early 1960s.
>

Leaving out functionality is not the only solution to flaky hardware.
B6500 system 101 had a MTTF of 8 minutes after Jake Vigil dumped a cup
of coffee into mem mod zero. So when system 102 came up, with MTTF of
four hours, all the software team (all 23 of us for a new OS, five
compilers, and all the ancillaries) went over there. Which meant that
you couldn't get any time on it.

So I stayed on 101. And for me, and only me, it would stay up for half
an hour, which was enough to boot and get a compile through. Why me? I
remain convinced that it was because after every successful compile I
would pat the console D&D lights and thank it.

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s7aeoj$tdu$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16572&group=comp.arch#16572

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Mon, 10 May 2021 00:02:01 -0500
Organization: A noiseless patient Spider
Lines: 164
Message-ID: <s7aeoj$tdu$1@dont-email.me>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 May 2021 05:02:11 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="8fd450ac5d006b88eb9e65bad5397861";
logging-data="30142"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+LNAGtGkZ248rc1P9AWh/4"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.0
Cancel-Lock: sha1:f9SRBzcOCfM5cpMB6uy2DW2gzQc=
In-Reply-To: <55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
Content-Language: en-US
 by: BGB - Mon, 10 May 2021 05:02 UTC

On 5/9/2021 6:08 PM, MitchAlsup wrote:
> On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
>> On 5/9/2021 3:59 PM, MitchAlsup wrote:
>
>>> <
>>> How many branches are farther than 1/8 GB away with code compiled
>>> from high level languages ??
>>> <
>> The normal direct branch op in BJX2 has a 20-bit displacement, so can
>> reach +/- 1MB.
>>
>> A Jumbo+LEA can do +/- 2GB, so:
>> LEA.B (PC, Disp33s), R3
>> JMP R3
>>
>> But, this is also possible (+/- 32MB):
>> MOV #Imm25s, R0
>> BRA R0
>>
> My 66000 has 16-bit actual word displacement (18-bit byte address) for
> conditional branches and 26-bit unconditional branches (and calls) (28-bit
> effective range).
>
> But JMP instructions can have 32-bit or 64-bit immediates so you predicate
> over the JMP with an inverted condition for the predicate.

All these cases are 20 bits in my case.

I had a few times considered eliminating the fixed BT/BF instructions in
favor of using the BRA?T and BRA?F encodings instead (semantically
equivalent), but haven't done so yet (partly inertia, partly I don't
like breaking binary compatibility unless there is a good reason).

Though, the main reason to do this would be to reclaim ~ 21 bits of
encoding space...

>>
>> In general, these don't really come up at present because, of my test
>> programs, all of them have < 1MB in the ".text" section.
> <
> In practice, few branches (<1%) are of those dimensions.
> <

In my current test programs, it rests at 0%, since any larger
branches tend to be indirect calls through a function pointer or similar
(*2).

*2: Though, one of these programs has ~ 1.2 MB of ".text" on an
x86-64 build (for both native Win64 and WSL), and 760K as BJX2 (in speed
optimized mode).

Similarly, it won't affect any local branches unless by some absurd
chance a *single function* were to exceed 1MB in the ".text" section.

Excluding the possibility of excessively large procedurally generated
"switch()" blocks or similar (eg: 10k+ case labels), this is unlikely.

But, these cases don't break the ISA, merely the limits of the existing
branch encodings. I could, in theory, add wider encodings, but then would
need to deal with them in the pipeline (and making this part any wider
would be a big source of timing hassles).

Well, it is either this, or allow a "jumbo branch" encoding where, say:
Can branch pretty much anywhere in the address space;
Completely ignored by the branch predictor;
Takes ~ 11 or so clock cycles to perform said branch...

In this case, the branch would likely be routed through the ALU (as a
64-bit integer ADD), and then the ADD result is used as a branch
destination.

Where, say, the branch instruction is actually decoded as a hacked
64-bit ADD instruction, and then in EX2 or EX3, some logic is like "Hey,
we just added something to PC!" and invokes the register-indirect branch
mechanism using the ADD result.

But, a lot of this would be mostly because, as-is, there is basically no
good way for the existing PC-relative branch mechanisms to handle a
displacement this large...

Though, it would technically be a bit simpler/cheaper to add a special
case to allow for a Jumbo-encoded "BRA (Abs48)" or similar in this case,
which wouldn't require any new logic on the EX side of things (can jump
anywhere... Just requires using base relocs...).

>>>>> Compound addressing within structures;
>>>>> Eg: Accessing an element within an array within a structure.
>>>>> ...
>>>>>
>>>> MRISC32 also has LEA (called "LDEA" - LoaD Effective Address). It
>>>> essentially uses the output from the AGU as the result (bypassing
>>>> the load-from-memory stage), and has a few different use cases.
>>>>
>>> I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'
>> I always thought of it like "Lee".
> <
> It was funny to hear him say "lee-ah", but he was from Russia so I gave him
> a break.

OK.

I vary between phonetic mappings, and letter-based mappings.

If I try to say things IRL, sometimes there is a slight delay if my mind
isn't sure how to map a mess of letters and numbers over to a spoken
form. As can be noted, I mostly think in visual images and text with
spoken language more as a 2nd class citizen.

Admittedly, this is also partly why my naming conventions tend to be
based more on patterns or visual aesthetic rather than whether or not
things can be easily spoken.

>>>>
>>>> Apart from preloading memory addresses into registers, it can be used
>>>> for simple arithmetic on the form A + B << N (where N is 0, 1, 2 or 3),
>>>> which can be used for implementing x * 3 and x * 5, for instance.
>>>>
>>> Better still is to make integer multiply 3-cycles.
>>>>
>>>> Another very useful use case is to load vector registers with a
>>>> "stride". In vector mode the AGU will generate addresses on the
>>>> form:
>>>>
>>>> (1) addr[k] = Rb + Im * k (Immediate form stride)
>>>> (2) addr[k] = Rb + (Ri << Sc) * k (Register form stride)
>>>> (3) addr[k] = Rb + Vi[k] << Sc (Gather/scatter)
>>>>
>>> Both stride and gather forms fall out "for free" in VVM. {So, I am
>>> agreeing with you that LEA is a valuable instruction.}
>>>>
>>>> Forms (1) and (2) construct addresses with a constant offset
>>>> between each address (I call these "stride based" load/store).
>>> <
>>> Don't forget the stride-0 form in GPUs where every thread wants to
>>> bang on the auto-update memory reference location.
>>>>
>>>> The LDEA instruction can thus be used for loading a series of
>>>> numbers into a vector register, e.g. like this:
>>>>
>>>> LDEA V1, Z, #1 ; V1 = {0, 1, 2, 3, 4, 5, ...}
>>>>
>>>> LDI S1, #7
>>>> LDEA V1, S1, #3 ; V1 = {7, 10, 13, 16, 19, ...}
>>>>
>>>> This is a common operation in vectorized code, and on classic SIMD
>>>> ISA:s you usually load a predefined constant (e.g. from memory) in this
>>>> case.
>>>>
>>>> Since the MRISC32 ISA allows each implementation to define the vector
>>>> register size, using predefined vector constants is not a good solution
>>>> since the vector register size is not known at compile time.
>>>>
>>>> /Marcus

Re: FP8 (was Compact representation for common integer constants)

<s7ai2u$ul4$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16573&group=comp.arch#16573

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!/FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: FP8 (was Compact representation for common integer constants)
Date: Mon, 10 May 2021 07:58:54 +0200
Organization: Aioe.org NNTP Server
Lines: 37
Message-ID: <s7ai2u$ul4$1@gioia.aioe.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me>
<s75pe2$shi$1@dont-email.me> <s76as3$i7j$1@gal.iecc.com>
<jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org> <s787j3$99m$1@dont-email.me>
<jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org>
<jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org>
<48015cc8-9326-427a-9fdd-36ed1e12939an@googlegroups.com>
NNTP-Posting-Host: /FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Mon, 10 May 2021 05:58 UTC

MitchAlsup wrote:
> On Sunday, May 9, 2021 at 1:45:06 PM UTC-5, Stefan Monnier wrote:
>> Stefan Monnier [2021-05-09 14:35:28] wrote:
>>>> Yes that makes sense. Specifically for FP8, the increased dynamic range
>>>> is by far the most important trait. I've even seen examples of using
>>>> 1:5:2 in DNN:s (deep neural networks) where dynamic range is often more
>>>> important than precision.
>>> Indeed 1:5:2 is arguably more useful than 1:4:3.
>> An alternative of course is to use a log-base representation. So your
>> 8bit data is split 1:7 between a sign and an exponent, with no mantissa
>> at all and then you choose your dynamic range by picking the base.
>> Multiplication is easy but addition takes more effort.
> <
> Really ?!?!?
> Both are implemented with table look ups with concatenated values as the
> address.

Agreed, in a HW implementation you don't even need a full 64K (8+8
bits), since the sign logic can go on in parallel with the table access.

I.e. 14-bit tables for each operation are sufficient, so that is about 14
KB x 4 = 56 KB of ROM space?
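
A minimal sketch of one such table, for the sign-free log representation discussed above: two 7-bit exponent codes are concatenated into a 14-bit address, and each entry is a 7-bit result, which is where 2^14 x 7 bits = 14 KB comes from. The base, the bias, and all names here (BASE, BIAS, ADD_TABLE, lns_add_same_sign, lns_mul) are assumptions chosen for illustration, and only the same-sign add case is shown.

```python
import math

BASE = 2 ** 0.25   # hypothetical base; picking it sets the dynamic range
BIAS = 64          # hypothetical exponent bias


def decode(e):
    """Magnitude represented by 7-bit exponent code e."""
    return BASE ** (e - BIAS)


def encode(x):
    """Nearest 7-bit exponent code for a positive magnitude x, clamped."""
    e = round(math.log(x, BASE)) + BIAS
    return max(0, min(127, e))


# 2^14-entry table: address = two 7-bit exponent codes concatenated.
ADD_TABLE = [encode(decode(addr >> 7) + decode(addr & 0x7F))
             for addr in range(1 << 14)]


def lns_add_same_sign(ea, eb):
    """Add two same-sign magnitudes; the sign is handled outside the table."""
    return ADD_TABLE[(ea << 7) | eb]


def lns_mul(ea, eb):
    """Multiply magnitudes by adding exponents, saturating to 7 bits."""
    return max(0, min(127, ea + eb - BIAS))
```

Multiplication here is just a saturating add, so in this particular sketch only addition needs the table; a mixed-sign add would need a second (subtract) table addressed the same way.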

If you for some really strange reason want to implement programmable
rounding for those 8x8->8 bit operations, then I would fake it with
8-bit (+ sign) outputs from the lookup tables and use the trailing bit
for rounding up/down/even, but having nearest_or_even built in to the
tables seems more reasonable.

OTOH, for AI training it seems like most implementations are using
higher precision accumulators, i.e. FP16 or FP32?

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: FP8 (was Compact representation for common integer constants)

<35f9ebde-3152-4770-bd57-a8b88a82d61dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16574&group=comp.arch#16574

Newsgroups: comp.arch
X-Received: by 2002:a05:622a:40f:: with SMTP id n15mr3193194qtx.10.1620626563197;
Sun, 09 May 2021 23:02:43 -0700 (PDT)
X-Received: by 2002:aca:6286:: with SMTP id w128mr16916883oib.119.1620626562966;
Sun, 09 May 2021 23:02:42 -0700 (PDT)
Path: i2pn2.org!i2pn.org!paganini.bofh.team!usenet.pasdenom.info!usenet-fr.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 23:02:42 -0700 (PDT)
In-Reply-To: <30744e66-bb03-482d-8982-8508f8d966a3n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:f8e3:d700:99bc:5119:7692:9e3e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:f8e3:d700:99bc:5119:7692:9e3e
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me> <s75pe2$shi$1@dont-email.me>
<s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me> <2021May9.101917@mips.complang.tuwien.ac.at>
<30744e66-bb03-482d-8982-8508f8d966a3n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <35f9ebde-3152-4770-bd57-a8b88a82d61dn@googlegroups.com>
Subject: Re: FP8 (was Compact representation for common integer constants)
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 10 May 2021 06:02:43 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Quadibloc - Mon, 10 May 2021 06:02 UTC

On Sunday, May 9, 2021 at 2:53:04 PM UTC-6, MitchAlsup wrote:

> My how lazy we have become ! Even with high quality FP, numerical
> analysis is still mandatory--except that it is seldom performed !?!

Numerical analysis is indeed necessary even when using floating-point,
but of course that doesn't mean that floating point is never really
needed!

John Savard

Re: The old RISC-vs-CISC

<s7aif0$13o3$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16575&group=comp.arch#16575

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!/FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC
Date: Mon, 10 May 2021 08:05:20 +0200
Organization: Aioe.org NNTP Server
Lines: 22
Message-ID: <s7aif0$13o3$1@gioia.aioe.org>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
NNTP-Posting-Host: /FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Mon, 10 May 2021 06:05 UTC

MitchAlsup wrote:
> On Sunday, May 9, 2021 at 4:34:56 PM UTC-5, BGB wrote:
>> On 5/9/2021 3:59 PM, MitchAlsup wrote:
>>> I know one compiler writer that would enunciate "LEA" as 'Lee'-'ah'
>> I always thought of it like "Lee".
> <
> It was funny to hear him say "lee-ah", but he was from Russia so I gave him
> a break.

That is funny:

'Lea' is an old female name here in Norway (probably from the
Norwegian-language Bible); it is always pronounced as two emphasized
syllables, i.e. close to the 'Lee'-'ah' suggestion above.

I never even considered that it could be pronounced any other way!

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: FP8 (was Compact representation for common integer constants)

<7adbd682-81b7-4cca-a1ec-76e93dd23410n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16576&group=comp.arch#16576

Newsgroups: comp.arch
X-Received: by 2002:a05:6214:2125:: with SMTP id r5mr21789664qvc.28.1620626743852;
Sun, 09 May 2021 23:05:43 -0700 (PDT)
X-Received: by 2002:a05:6830:a:: with SMTP id c10mr10082062otp.114.1620626743646;
Sun, 09 May 2021 23:05:43 -0700 (PDT)
Path: i2pn2.org!i2pn.org!paganini.bofh.team!usenet.pasdenom.info!usenet-fr.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 23:05:43 -0700 (PDT)
In-Reply-To: <8fe5e110-d4d0-4271-9de1-853d7156d194n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:f8e3:d700:99bc:5119:7692:9e3e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:f8e3:d700:99bc:5119:7692:9e3e
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s73rcr$83m$1@dont-email.me> <s73u8m$tnj$1@dont-email.me> <s75pe2$shi$1@dont-email.me>
<s76as3$i7j$1@gal.iecc.com> <jwv7dk93zcy.fsf-monnier+comp.arch@gnu.org>
<s787j3$99m$1@dont-email.me> <jwv1rafyd1g.fsf-monnier+comp.arch@gnu.org>
<jwvv97rwy3y.fsf-monnier+comp.arch@gnu.org> <48015cc8-9326-427a-9fdd-36ed1e12939an@googlegroups.com>
<jwv35uvwr24.fsf-monnier+comp.arch@gnu.org> <8fe5e110-d4d0-4271-9de1-853d7156d194n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <7adbd682-81b7-4cca-a1ec-76e93dd23410n@googlegroups.com>
Subject: Re: FP8 (was Compact representation for common integer constants)
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 10 May 2021 06:05:43 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Quadibloc - Mon, 10 May 2021 06:05 UTC

On Sunday, May 9, 2021 at 5:03:11 PM UTC-6, MitchAlsup wrote:

> IBM 1620 "Can't add, doesn't even try" laid the groundwork where if the
> operands are small enough it becomes cheaper to just read out the correct
> result than to build the logic to calculate the result.

And that reminds me... of the FOCUS Number System, where numbers were
represented by logarithms and addition was done by looking up
log((a+b)/a) as a function of log(b/a).
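
A minimal sketch of that style of addition, in Python, for positive values
only. The fixed-point format (FRAC_BITS), the table size, and the names
encode, decode and lns_add are assumptions made for illustration; none of
them is taken from the FOCUS design itself.

```python
import math

# Hypothetical format: a value x > 0 is stored as round(log2(x) * SCALE).
FRAC_BITS = 4
SCALE = 1 << FRAC_BITS
MAX_DIFF = 8 * SCALE  # past this gap the correction term has rounded to zero

# ADD_TABLE[d] holds log2(1 + b/a) = log2((a+b)/a), scaled, where
# d = log2(a) - log2(b) >= 0 in scaled units, i.e. log2(b/a) = -d / SCALE.
ADD_TABLE = [
    round(math.log2(1 + 2 ** (-d / SCALE)) * SCALE)
    for d in range(MAX_DIFF + 1)
]


def encode(x):
    """Convert a positive number to its scaled base-2 logarithm."""
    return round(math.log2(x) * SCALE)


def decode(l):
    """Convert a scaled base-2 logarithm back to an ordinary number."""
    return 2 ** (l / SCALE)


def lns_add(la, lb):
    """Add two encoded values using only a subtract, a lookup and an add."""
    if la < lb:
        la, lb = lb, la          # make la the larger operand
    d = la - lb                  # scaled log2(a/b), never negative
    if d > MAX_DIFF:
        return la                # b is too small to change the result
    return la + ADD_TABLE[d]     # log(a) + log((a+b)/a) = log(a+b)
```

With only four fractional bits the results are coarse, but the table stays
tiny, which is the same trade-off as reading a sum out of a table instead of
building an adder.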

John Savard

Re: FP8 (was Compact representation for common integer constants)

<41cb95a5-13ed-46c9-a884-667cd8a5d001n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=16577&group=comp.arch#16577

Newsgroups: comp.arch
X-Received: by 2002:a37:a287:: with SMTP id l129mr11492494qke.481.1620627038805;
Sun, 09 May 2021 23:10:38 -0700 (PDT)
X-Received: by 2002:aca:b387:: with SMTP id c129mr23792985oif.30.1620627038637;
Sun, 09 May 2021 23:10:38 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 9 May 2021 23:10:38 -0700 (PDT)
In-Reply-To: <s7abfl$1k5m$1@gal.iecc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:f8e3:d700:99bc:5119:7692:9e3e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:f8e3:d700:99bc:5119:7692:9e3e
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s76as3$i7j$1@gal.iecc.com> <s76ehd$65p$1@dont-email.me> <2021May9.101917@mips.complang.tuwien.ac.at>
<s7abfl$1k5m$1@gal.iecc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <41cb95a5-13ed-46c9-a884-667cd8a5d001n@googlegroups.com>
Subject: Re: FP8 (was Compact representation for common integer constants)
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 10 May 2021 06:10:38 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Quadibloc - Mon, 10 May 2021 06:10 UTC

On Sunday, May 9, 2021 at 10:06:15 PM UTC-6, John Levine wrote:

> Even once they switched to core, tubes burned out, components went out
> of spec, solder joints cracked, and uptime was measured in hours if
> you were lucky, minutes if you weren't.

> So a compelling reason to leave out floating point was that all the extra
> logic made the computer even less reliable. It's impressive that they
> risked it on the 704.

> I have read that the practical limit on the length
> of a FORTRAN program on the 704 was set by the fact that the compiler
> had to compile it in one run between hardware failures.

That may be. But aside from the 704 using core memory, which didn't
have the pattern sensitivity issues of Williams tubes, IBM put a lot
of effort into making it reliable. One thing was that the tubes
were run at a lower-than-normal voltage, since they were being used
for digital amplification in a computer with thousands of them, instead
of in a radio with five of them.

John Savard

Re: The old RISC-vs-CISC (was: Compact representation for common integer constants)

<s7ak85$5dr$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=16578&group=comp.arch#16578

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-6262-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: The old RISC-vs-CISC (was: Compact representation for common
integer constants)
Date: Mon, 10 May 2021 06:35:49 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <s7ak85$5dr$1@newsreader4.netcologne.de>
References: <44003c05-8b05-4e0e-acb8-bb252be14d26n@googlegroups.com>
<s713uv$707$1@gal.iecc.com> <s719lp$bg$1@dont-email.me>
<6ebdf17e-3188-44d2-b946-3c2e9e104672n@googlegroups.com>
<s71imq$3b4$1@dont-email.me>
<77cd652a-a3c4-48a9-a088-58fe96562dc7n@googlegroups.com>
<s72mv0$qai$1@dont-email.me> <jwv5yzuae2l.fsf-monnier+comp.arch@gnu.org>
<s747f6$5ri$1@dont-email.me>
<0a355bf9-7067-4076-9365-de1c63061df1n@googlegroups.com>
<s74rp7$laf$1@dont-email.me> <s78dnf$hpi$1@dont-email.me>
<f656541b-ffc7-407e-8f5f-c7ca93477f98n@googlegroups.com>
<s79khu$u4v$1@dont-email.me>
<55ccad4d-7603-415e-9e37-c6db84901e33n@googlegroups.com>
<s7aeoj$tdu$1@dont-email.me>
Injection-Date: Mon, 10 May 2021 06:35:49 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-6262-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:6262:0:7285:c2ff:fe6c:992d";
logging-data="5563"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Mon, 10 May 2021 06:35 UTC

BGB <cr88192@gmail.com> schrieb:

> Similarly, it won't affect any local branches unless by some absurd
> chance a *single function* were to exceed 1MB in the ".text" section.

> Excluding the possibility of excessively large procedurally generated
> "switch()" blocks or similar (eg: 10k+ case labels), this is unlikely.

One way I have encountered this is automatically generated formulas
from computer algebra systems like Maple.

Look at https://gcc.gnu.org/bugzilla/attachment.cgi?id=41459 for
an example of such code. It isn't easy on compilers (but it does
not have branches).

> But, these cases don't break the ISA, merely the limits of the existing
> branch encodings. I could, in theory, add wider encodings, but then would
> need to deal with them in the pipeline (and making this part any wider
> would be a big source of timing hassles).

Code should be correct as first consideration, fast as a (distant)
second.

I assume you can jump to a register in your ISA, for function
pointers.

If that is the case, you can reverse the test and optionally branch
over an instruction sequence which loads the target address into
a register via loading the PC and adding the offset to it (as
determined by the assembler) and then jumping to that register.
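
A rough sketch of how an assembler might apply that rewrite, in Python.
The mnemonics, the 4-byte instruction size, the +/-1 MB reach and the
scratch register are all hypothetical and not taken from BGB's ISA.

```python
# All encodings below are made up for illustration.
INSN_BYTES = 4                 # assumed fixed instruction size
BRANCH_RANGE = 1 << 20         # assumed reach of a conditional branch, in bytes

# Each condition paired with its logical opposite.
INVERT = {"BEQ": "BNE", "BNE": "BEQ", "BLT": "BGE", "BGE": "BLT"}


def emit_cond_branch(op, pc, target, scratch="R1"):
    """Return assembly lines for a conditional branch at address pc."""
    disp = target - pc
    if -BRANCH_RANGE <= disp < BRANCH_RANGE:
        return [f"{op} {disp:+d}"]          # target is reachable directly

    # Out of range: reverse the test so the common short branch skips the
    # long sequence, then build the target address in a register.
    # Assumes "MOV PC" yields the address of the MOV instruction itself,
    # which sits one instruction after the reversed branch.
    offset = target - (pc + INSN_BYTES)
    return [
        f"{INVERT[op]} .skip",              # reversed test
        f"MOV PC, {scratch}",               # load the PC
        f"ADD #{offset}, {scratch}",        # add the assembler-computed offset
        f"JMP {scratch}",                   # jump through the register
        ".skip:",
    ]
```

For example, emit_cond_branch("BEQ", 0x1000, 0x1040) stays a single
instruction, while a target 2 MB away expands to the five-line sequence.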

[...]

> Though, it would technically be a bit simpler/cheaper to add a special
> case to allow for a Jumbo-encoded "BRA (Abs48)" or similar in this case,
> which wouldn't require any new logic on the EX side of things (can jump
> anywhere... Just requires using base relocs...).

That sounds even better. You have the long instructions, why not use
them?
