Re: RISC-V vs. Aarch64

<UT%HJ.360$N31.161@fx45.iad>

https://www.novabbs.com/devel/article-flat.php?id=23118&group=comp.arch#23118
 by: EricP - Tue, 25 Jan 2022 23:40 UTC

Ivan Godard wrote:
> On 1/25/2022 7:48 AM, EricP wrote:
>> Stephen Fuld wrote:
>>> On 1/22/2022 3:42 AM, Terje Mathisen wrote:
>>>> Stephen Fuld wrote:
>>>
>>>>> Anyway, if one had this instruction, the main loop in the code
>>>>> above could be something like
>>>>>
>>>>>
>>>>> loop:
>>>>> LDUB R10,[R1+R9]
>>>>> CARRY R6,IO
>>>>> LBF R12,R10,R2 ; I am not sure about R2; it should be the start of the packed buffer.
>>>>> STD R12,[R3+R9<<3]
>>>>> ADD R9,R9,#1
>>>>> CMP R11,R9,R4
>>>>> BLT R11,loop
>>>>
>>>> That is really quite nice.
>>>
>>> Thank you!
>>>
>>>
>>>>>
>>>>> For a savings of about 10 instructions in the I cache, but fewer in
>>>>> execution (but still significant) depending upon how often the
>>>>> instructions under the predicate are executed.
>>>>>
>>>>>
>>>>> Anyway, of course, I invite comments, criticisms, etc. One obvious
>>>>> drawback is that this only addresses the "decompression" side.
>>>>> While I briefly considered a "Store Bit Field", I discarded it as
>>>>> it seemed too complex, and presumably would be used less frequently,
>>>>> as compression/coding happens less frequently than
>>>>> decompression/decoding.
>>>>
>>>> Encoding is almost always far easier than decoding, since there are
>>>> zero surprises when encoding. Yes, for codecs/compression it can be
>>>> a _lot_ of work to figure out a near-optimal encoding, but the
>>>> actual conversion of the selected option into a bit stream is easy.
>>>
>>> Good point, that I hadn't thought of. More reason why the
>>> hypothetical "Store Bit Field" isn't needed.
>>
>> High-bandwidth compression happens more these days with video
>> conferencing.
>> Maybe it's a chicken & egg thing - we do more because we can do more.
>>
>> Unless Bit Field Insert requires significantly more hardware than
>> BF Extract I don't see why one would leave it out.
>>
>> BFEXT requires a data and BF specifier source regs + 1 dest reg,
>> so fits in the standard RISC 2R 1W port model.
>>
>> BFINS requires struct source, data source, and BF specifier regs, + dest reg,
>> so 3R 1W ports. Also, the instruction would require 4 register specifiers
>> unless you allow the struct source and dest reg to be the same specifier.
>> That bothers some, but if you have FMA then you probably have already
>> crossed that Rubicon. I would rather have the functionality than stick
>> to a somewhat arbitrary design philosophy.
>>
>> There are a bunch of bit field manipulation instructions beyond those,
>> useful for multimedia codec encode/decode, signal processing, crypto:
>> butterfly, reverse butterfly, permute, mix.
>>
>> Then there is the whole area of double-wide shifts and bit fields
>> to facilitate bit stream processing.
>>
>>
>
> You don't offer dynamic BFINS/EXT? Dynamic needs two more registers,
> unless you separately build a descriptor like Mitch.

Similar to Mitch, I'd have a bit field specifier as two 1-byte items,
the width and start pos, either in lower 16 bits of a register,
or a 16-bit immed constant.

That allows both to fit in 32-bit instructions.

BFEXT rd_data, rs1_struct, rs2_wpfield
BFEXT rd_data, rs1_struct, <#width,#pos>

BFINS rsd_struct, rs1_data, rs2_wpfield
BFINS rsd_struct, rs1_data, <#width,#pos>

Note for insert the rsd_struct is both source and dest register.

Future long format instructions could separate out the 5 insert registers
but that requires a 48-bit instruction, and I didn't want to bite off
the wide decoder right off the bat as that affects the whole fetch logic.

I'm still debating double-wide instruction formats.
Insert/extract could be very long format with 5-8 register specifiers.
Alternatively it could be two instructions operating on high and low parts,
like MULL and MULH.
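
As an illustration only, the packed specifier could be modelled in C roughly as follows. This is a sketch, not the actual design: the byte order (width in bits 15..8, start position in bits 7..0), the function names, and the restriction to pos in 0..63 with pos + width <= 64 are all assumptions, and only a zero-extending extract is shown.

```c
#include <stdint.h>

/* Hypothetical packing: width in bits 15..8, start position in bits 7..0. */
static inline uint16_t bf_spec(unsigned width, unsigned pos) {
    return (uint16_t)(((width & 0xFFu) << 8) | (pos & 0xFFu));
}

/* Mask with the low `width` bits set; handles width == 64 without UB. */
static inline uint64_t bf_mask(unsigned width) {
    return width >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
}

/* Extract, zero-extended: reads struct + specifier, writes one result. */
static inline uint64_t bf_extract_u(uint64_t strct, uint16_t spec) {
    unsigned width = spec >> 8, pos = spec & 0xFFu;   /* assumes pos <= 63 */
    return (strct >> pos) & bf_mask(width);
}

/* Insert: reads struct, data and specifier; the result replaces struct,
   mirroring the "rsd_struct is both source and dest" form. */
static inline uint64_t bf_insert(uint64_t strct, uint64_t data, uint16_t spec) {
    unsigned width = spec >> 8, pos = spec & 0xFFu;   /* assumes pos <= 63 */
    uint64_t m = bf_mask(width) << pos;
    return (strct & ~m) | ((data << pos) & m);
}
```

The same 16-bit value works whether it comes from the low bits of a register or from an immediate, which is what lets both forms share one 32-bit encoding.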

Re: RISC-V vs. Aarch64

<6ndIJ.10280$tW.9922@fx39.iad>

https://www.novabbs.com/devel/article-flat.php?id=23126&group=comp.arch#23126
 by: EricP - Wed, 26 Jan 2022 15:01 UTC

EricP wrote:
> Ivan Godard wrote:
>>
>> You don't offer dynamic BFINS/EXT? Dynamic needs two more registers,
>> unless you separately build a descriptor like Mitch.
>
> Similar to Mitch, I'd have a bit field specifier as two 1-byte items,
> the width and start pos, either in lower 16 bits of a register,
> or a 16-bit immed constant.
>
> That allows both to fit in 32-bit instructions.
>
> BFEXT rd_data, rs1_struct, rs2_wpfield
> BFEXT rd_data, rs1_struct, <#width,#pos>

Oh, and BFEXT sign-extends the result.
There is also BFEXTZ to extract and zero-extend the result.

> BFINS rsd_struct, rs1_data, rs2_wpfield
> BFINS rsd_struct, rs1_data, <#width,#pos>
>
> Note for insert the rsd_struct is both source and dest register.
>
> Future long format instructions could separate out the 5 insert registers
> but that requires a 48-bit instruction, and I didn't want to bite off
> the wide decoder right off the bat as that affects the whole fetch logic.
>
> I'm still debating double-wide instruction formats.
> Insert/extract could be very long format with 5-8 register specifiers.
> Alternatively it could be two instructions operating on high and low parts,
> like MULL and MULH.

The wide bit field instructions extract and insert a 1..64 bit
value in a 128-bit two register container.
These essentially take all the branchy code dealing with register
straddles and sign/zero extend and make it straight-line code.

My concern is to not embed in the ISA a requirement for a large
number of read or write ports. For example, on an FPGA which has
1R 1W register banks, every extra read port requires a duplicate
register bank, and a write port requires an extra write cycle.
So keeping the ISA to requiring 3R 1W ports is a goal.

For wide extract, the low part is extracted to position 0.
The high part, if any, is extracted and positioned at some
offset in its output data. Then the low and high are OR'd.
There is also some fiddling about to get the sign extension right
depending on whether it lands in the low or high part.

BWEXTL rd_datal, rs1_structl, rs2_wpfield
BWEXTH rd_datah, rs1_structh, rs2_wpfield
OR rsd_datah, rs_datal

So a wide extract takes 3 instructions and only needs 2R 1W ports.
There are also variants with an immediate <#width,#pos> and with zero extension.

Wide insert modifies the high and low structs separately as needed
and needs 3R 1W ports like BFINS.

BWINSL rsd_structl, rs1_data, rs2_wpfield
BWINSH rsd_structh, rs1_data, rs2_wpfield

Again, the struct register is both source and dest but that
allows the 3 register instruction to fit into the 32-bit format.
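
A rough C model of the three-step wide extract described above, zero-extending variant only (the sign-extension fiddling is left out). The function names are hypothetical, width and pos are passed separately instead of as a packed specifier, and pos is assumed to be 0..63 with width 1..64. The `if` only models what the hardware would do without a branch.

```c
#include <stdint.h>

/* Mask with the low n bits set; handles n == 64 without UB. */
static inline uint64_t low_mask(unsigned n) {
    return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Low step: the part of the field held in the low register, moved to bit 0. */
static inline uint64_t bwext_lo_u(uint64_t struct_lo, unsigned width, unsigned pos) {
    return (struct_lo >> pos) & low_mask(width);
}

/* High step: the part (if any) that straddles into the high register,
   positioned just above the bits contributed by the low step. */
static inline uint64_t bwext_hi_u(uint64_t struct_hi, unsigned width, unsigned pos) {
    unsigned end = pos + width;        /* one past the top bit of the field */
    if (end <= 64)
        return 0;                      /* field lies entirely in the low register */
    unsigned hi_bits = end - 64;       /* 1..63 bits come from the high register */
    return (struct_hi & low_mask(hi_bits)) << (64 - pos);
}

/* The full sequence: low step, high step, then OR. */
static inline uint64_t wide_extract_u(uint64_t struct_lo, uint64_t struct_hi,
                                      unsigned width, unsigned pos) {
    return bwext_lo_u(struct_lo, width, pos) | bwext_hi_u(struct_hi, width, pos);
}
```

Each step reads one struct register plus the specifier and writes one result, which is where the 2R 1W figure comes from.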

Re: RISC-V vs. Aarch64

<19eIJ.10298$tW.7327@fx39.iad>

https://www.novabbs.com/devel/article-flat.php?id=23128&group=comp.arch#23128
 by: EricP - Wed, 26 Jan 2022 15:54 UTC

MitchAlsup wrote:
> On Tuesday, January 25, 2022 at 1:07:11 PM UTC-6, EricP wrote:
>>
>> A New Basis for Shifters in General-Purpose Processors for
>> Existing and Advanced Bit Manipulations, 2008
>> https://www.researchgate.net/publication/220332176_A_New_Basis_for_Shifters_in_General-Purpose_Processors_for_Existing_and_Advanced_Bit_Manipulations
>>
> <
> My 66000 ISA encodings:
> <
>> describes the following bit field operations:
>> - rotate right & left
> CARRY Rs,{I}
> SL/SR Rd,Rs,off
>> - shift right & left with zero or sign fill
> SL/SR Rd,Rs,off
>> - bit field extract & insert
> SL Rd,Rs,<len,off>
> INS Rd,Rb,<len,off>
>> - mix select right or left subwords
> MUX Rd,Rs1,Rs2,mask // bit-level multiplex between s1 and s2 based on S3
>> - butterfly and inverse butterfly
> BITR Rd,Rs,<len,off>
>> - parallel extract and insert (scatter, gather)
>> - popcount
> POP Rd,Rs
>> - bit dot product
> ???

It was described in a different paper
"Advanced Bit Manipulation Instruction Set Architecture" 2006 as:

The DOTPROD instruction ANDs its two inputs, r2 and r3,
and then computes the parity of the result.
DOTPROD replaces a sequence of
- AND of the two inputs,
- POPCNT of the result,
- AND to isolate the least significant bit.

Example Usage: Error correcting coding.
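
For reference, the three-step sequence that DOTPROD replaces can be written in portable C as below. This is only a sketch of the semantics quoted above; the function names are made up here, and the popcount loop stands in for a POPCNT instruction.

```c
#include <stdint.h>

/* Portable popcount: clears the lowest set bit on each iteration. */
static inline unsigned popcount64(uint64_t x) {
    unsigned n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

/* Bit dot product over GF(2): AND the inputs, count the ones,
   keep the least significant bit (the parity). */
static inline unsigned bit_dotprod(uint64_t a, uint64_t b) {
    return popcount64(a & b) & 1u;
}
```

In a linear error-correcting code, each syndrome bit is one such dot product of a parity-check row with the received word.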

>> - bit matrix multiply
> BMM Rd,Rb,[Rbm]
>> I would add:
>> - find first/last bit set/clear
> FF1 Rd,Rs // can be from the left or from the right.
> FF1 Rd,~Rs
> SET = INS Rd,#~0,<len,off>
> CLR = INS Rd,#0,<len,off>
> <
> Looks like I got most of them.

Re: RISC-V vs. Aarch64

<351f25a1-87de-4576-a16f-70623e410fc5n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23130&group=comp.arch#23130
 by: MitchAlsup - Wed, 26 Jan 2022 17:49 UTC

On Wednesday, January 26, 2022 at 9:54:42 AM UTC-6, EricP wrote:
> MitchAlsup wrote:
> > On Tuesday, January 25, 2022 at 1:07:11 PM UTC-6, EricP wrote:
> >>
> >> A New Basis for Shifters in General-Purpose Processors for
> >> Existing and Advanced Bit Manipulations, 2008
> >> https://www.researchgate.net/publication/220332176_A_New_Basis_for_Shifters_in_General-Purpose_Processors_for_Existing_and_Advanced_Bit_Manipulations
> >>
> > <
> > My 66000 ISA encodings:
> > <
> >> describes the following bit field operations:
> >> - rotate right & left
> > CARRY Rs,{I}
> > SL/SR Rd,Rs,off
> >> - shift right & left with zero or sign fill
> > SL/SR Rd,Rs,off
> >> - bit field extract & insert
> > SL Rd,Rs,<len,off>
> > INS Rd,Rb,<len,off>
> >> - mix select right or left subwords
> > MUX Rd,Rs1,Rs2,mask // bit-level multiplex between s1 and s2 based on S3
> >> - butterfly and inverse butterfly
> > BITR Rd,Rs,<len,off>
> >> - parallel extract and insert (scatter, gather)
> >> - popcount
> > POP Rd,Rs
> >> - bit dot product
> > ???
> It was described in a different paper
> "Advanced Bit Manipulation Instruction Set Architecture" 2006 as:
>
> The DOTPROD instruction ANDs its two inputs, r2 and r3,
> and then computes the parity of the result.
> DOTPROD replaces a sequence of
> - AND of the two inputs,
> - POPCNT of the result,
> - AND to isolate the least significant bit.
<
Thanks.
>
> Example Usage: Error correcting coding.
> >> - bit matrix multiply
> > BMM Rd,Rb,[Rbm]
> >> I would add:
> >> - find first/last bit set/clear
> > FF1 Rd,Rs // can be from the left or from the right.
> > FF1 Rd,~Rs
> > SET = INS Rd,#~0,<len,off>
> > CLR = INS Rd,#0,<len,off>
> > <
> > Looks like I got most of them.

Re: RISC-V vs. Aarch64

<sss374$fvv$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23132&group=comp.arch#23132
 by: Stephen Fuld - Wed, 26 Jan 2022 18:17 UTC

On 1/25/2022 11:06 AM, EricP wrote:
> Stephen Fuld wrote:
>> On 1/25/2022 7:48 AM, EricP wrote:
>>
>>> There are a bunch of bit field manipulation instructions beyond those,
>>> useful for multimedia codec encode/decode, signal processing, crypto:
>>> butterfly, reverse butterfly, permute, mix.
>>
>> Can you tell us what they are?
>
> I don't personally need such instructions as I rarely operate on
> bit fields and when I do it is not performance critical,
> but I note that others feel they do need them. One example paper:
>
> A New Basis for Shifters in General-Purpose Processors for
> Existing and Advanced Bit Manipulations, 2008
> https://www.researchgate.net/publication/220332176_A_New_Basis_for_Shifters_in_General-Purpose_Processors_for_Existing_and_Advanced_Bit_Manipulations
>
>
> describes the following bit field operations:
> - rotate right & left
> - shift right & left with zero or sign fill
> - bit field extract & insert
> - mix select right or left subwords
> - butterfly and inverse butterfly
> - parallel extract and insert (scatter, gather)
> - popcount
> - bit dot product
> - bit matrix multiply
>
> I would add:
> - find first/last bit set/clear
>
> potential usage in:
>
> - dna sequencing and sequence compression, alignment
> - crypto
> - error correction
> - bit stream signal processing, multiplexing
>

Thank you for the link. As Mitch said, his design includes almost all
of these, and many, if not most, have been included in machines going
back to the mainframes of the 1960s. As for the details of the circuitry
to implement them, which the paper talks about, I am not competent to comment.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: RISC-V vs. Aarch64

<sss46e$mbt$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23133&group=comp.arch#23133
 by: Stephen Fuld - Wed, 26 Jan 2022 18:33 UTC

On 1/26/2022 7:01 AM, EricP wrote:
> EricP wrote:
>> Ivan Godard wrote:
>>>
>>> You don't offer dynamic BFINS/EXT? Dynamic needs two more registers,
>>> unless you separately build a descriptor like Mitch.
>>
>> Similar to Mitch, I'd have a bit field specifier as two 1-byte items,
>> the width and start pos, either in lower 16 bits of a register,
>> or a 16-bit immed constant.
>>
>> That allows both to fit in 32-bit instructions.
>>
>>   BFEXT rd_data, rs1_struct, rs2_wpfield
>>   BFEXT rd_data, rs1_struct, <#width,#pos>
>
> Oh, and BFEXT sign-extends the result.
> There is also BFEXTZ to extract and zero-extend the result.
>
>>   BFINS rsd_struct, rs1_data, rs2_wpfield
>>   BFINS rsd_struct, rs1_data, <#width,#pos>
>>
>> Note for insert the rsd_struct is both source and dest register.
>>
>> Future long format instructions could separate out the 5 insert registers
>> but that requires a 48-bit instruction, and I didn't want to bite off
>> the wide decoder right off the bat as that affects the whole fetch logic.
>>
>> I'm still debating double-wide instruction formats.
>> Insert/extract could be very long format with 5-8 register specifiers.
>> Alternatively it could be two instructions operating on high and low
>> parts,
>> like MULL and MULH.
>
> The wide bit field instructions extract and insert a 1..64 bit
> value in a 128-bit two register container.
> These essentially take all the branchy code dealing with register
> straddles and sign/zero extend and make it straight-line code.

I don't think so for successive operations, like decompressing the next
character from a compressed string. At least as I understand what you
are saying, on the first operation, you load the first two words and
extract some bits. But now you don't know if the next field to extract
is entirely contained in the words you have loaded, or spans into the
third word. Depending on this, you either do or don't need to load the
third word. If you say you always load the "starting word" and the next
one, you are sometimes doing extra loads of data that is already in a
register.
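
To make the concern concrete, here is a small C sketch of a reader that keeps a two-word window over the packed stream. Nothing here comes from either proposal: the `BitReader` type, the function names, the LSB-first bit order, and reading zeros past the end of the buffer are all assumptions for illustration. The extract itself is straight-line, but the window refill depends on the running bit position.

```c
#include <stddef.h>
#include <stdint.h>

typedef struct {
    const uint64_t *words;  /* packed input, LSB-first bit order (assumed) */
    size_t nwords;
    size_t next;            /* index of the next word to load */
    uint64_t lo, hi;        /* two-register window, hi:lo */
    unsigned pos;           /* bit offset of the next field within lo, 0..63 */
} BitReader;

/* Load the next word, or 0 once the input is exhausted. */
static uint64_t br_load(BitReader *r) {
    return r->next < r->nwords ? r->words[r->next++] : 0;
}

static void br_init(BitReader *r, const uint64_t *words, size_t nwords) {
    r->words = words;
    r->nwords = nwords;
    r->next = 0;
    r->pos = 0;
    r->lo = br_load(r);
    r->hi = br_load(r);
}

/* Read the next `width` bits (1..64), zero-extended. */
static uint64_t br_read(BitReader *r, unsigned width) {
    uint64_t mask = width >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << width) - 1;
    uint64_t v = r->lo >> r->pos;
    if (r->pos != 0)
        v |= r->hi << (64 - r->pos);   /* bits straddling into the high word */
    v &= mask;

    r->pos += width;
    if (r->pos >= 64) {                /* data-dependent: slide window, load one word */
        r->pos -= 64;
        r->lo = r->hi;
        r->hi = br_load(r);
    }
    return v;
}
```

The `if (r->pos >= 64)` refill is the part the wide extract does not remove: either it stays as a branch (or a conditional load), or both words are reloaded on every field.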

>
> My concern is to not embed in the ISA a requirement for a large
> number of read or write ports. For example, on an FPGA which has
> 1R 1W register banks, every extra read port requires a duplicate
> register bank, and a write port requires an extra write cycle.
> So keeping the ISA to requiring 3R 1W ports is a goal.

Understood. For your restrictions, my proposal would not work. :-( It
was "tailored" to capabilities already provided in the My 66000
architecture.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: RISC-V vs. Aarch64

<sss6jj$8cu$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23134&group=comp.arch#23134

From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Wed, 26 Jan 2022 11:14:59 -0800
In-Reply-To: <351f25a1-87de-4576-a16f-70623e410fc5n@googlegroups.com>
 by: Stephen Fuld - Wed, 26 Jan 2022 19:14 UTC

On 1/26/2022 9:49 AM, MitchAlsup wrote:
> On Wednesday, January 26, 2022 at 9:54:42 AM UTC-6, EricP wrote:
>> MitchAlsup wrote:
>>> On Tuesday, January 25, 2022 at 1:07:11 PM UTC-6, EricP wrote:
>>>>
>>>> A New Basis for Shifters in General-Purpose Processors for
>>>> Existing and Advanced Bit Manipulations, 2008
>>>> https://www.researchgate.net/publication/220332176_A_New_Basis_for_Shifters_in_General-Purpose_Processors_for_Existing_and_Advanced_Bit_Manipulations
>>>>
>>> <
>>> My 66000 ISA encodings:

snip

>>>> - butterfly and inverse butterfly
>>> BITR Rd,Rs,<len,off>

I think this is the first time you have mentioned this instruction. Can
you give a description of what it does?

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: RISC-V vs. Aarch64

<576a3b47-8f69-427a-9c27-078b1ad3f049n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23136&group=comp.arch#23136

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Wed, 26 Jan 2022 11:59:24 -0800 (PST)
In-Reply-To: <sss46e$mbt$1@dont-email.me>
 by: MitchAlsup - Wed, 26 Jan 2022 19:59 UTC

On Wednesday, January 26, 2022 at 12:33:53 PM UTC-6, Stephen Fuld wrote:
> On 1/26/2022 7:01 AM, EricP wrote:
> > EricP wrote:
> >> Ivan Godard wrote:
> >>>
> >>> You don't offer dynamic BFINS/EXT? Dynamic needs two more registers,
> >>> unless you separately build a descriptor like Mitch.
> >>
> >> Similar to Mitch, I'd have a bit field specifier as two 1-byte items,
> >> the width and start pos, either in lower 16 bits of a register,
> >> or a 16-bit immed constant.
> >>
> >> That allows both to fit in 32-bit instructions.
> >>
> >> BFEXT rd_data, rs1_struct, rs2_wpfield
> >> BFEXT rd_data, rs1_struct, <#width,#pos>
> >
snip
> > The wide bit field instructions extract and insert a 1..64 bit
> > value in a 128-bit two register container.
> > These essentially take all the branchy code dealing with register
> > straddles and sign/zero extension and make it straight-line code.
<
> I don't think so for successive operations, like decompressing the next
> character from a compressed string. At least as I understand what you
> are saying, on the first operation, you load the first two words and
> extract some bits. But now you don't know if the next field to extract
> is entirely contained in the words you have loaded, or spans into the
> third word. Depending on this, you either do or don't need to load the
> third word. If you say you always load the "starting word" and the next
> one, you are sometimes doing extra loads of data that is already in a
> register.
> >
> > My concern is to not embed in the ISA a requirement for a large
> > number of read or write ports. For example, on an FPGA which has
> > 1R 1W register banks, every extra read port requires a duplicate
> > register bank, and a write port requires an extra write cycle.
> > So keeping the ISA to requiring 3R 1W ports is a goal.
<
> Understood. For your restrictions, my proposal would not work. :-( It
> was "tailored" to capabilities already provided in the My 66000
> architecture.
<
The only functionality of the My 66000 ISA that requires more than 3R1W
is when CARRY is attached to an instruction. CARRY supplies another
read and another write (up to 4R2W).

Re: RISC-V vs. Aarch64

<sss9fc$tmh$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23137&group=comp.arch#23137

From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Wed, 26 Jan 2022 12:03:56 -0800
In-Reply-To: <576a3b47-8f69-427a-9c27-078b1ad3f049n@googlegroups.com>
 by: Stephen Fuld - Wed, 26 Jan 2022 20:03 UTC

On 1/26/2022 11:59 AM, MitchAlsup wrote:
> On Wednesday, January 26, 2022 at 12:33:53 PM UTC-6, Stephen Fuld wrote:
>> On 1/26/2022 7:01 AM, EricP wrote:
snip
>>>
>>> My concern is to not embed in the ISA a requirement for a large
>>> number of read or write ports. For example, on an FPGA which has
>>> 1R 1W register banks, every extra read port requires a duplicate
>>> register bank, and a write port requires an extra write cycle.
>>> So keeping the ISA to requiring 3R 1W ports is a goal.
> <
>> Understood. For your restrictions, my proposal would not work. :-( It
>> was "tailored" to capabilities already provided in the My 66000
>> architecture.
> <
> The only functionality of My 66000 ISA that requires more than 3R1W
> is when CARRY is attached to an instruction. Carry supplies another
> Read and another write (up to 4R2W)

Yes. The availability of the extra ports that CARRY provides is
integral to my proposal.
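
A back-of-the-envelope Python sketch of the port arithmetic in the quoted text. It assumes, as stated there, that the FPGA register file is built from 1R1W banks, that each read port is provided by a duplicate bank, and that each write port costs one write cycle. The function name and the shape of its result are made up for illustration.

```python
def regfile_cost(read_ports, write_ports):
    """Return (bank_copies, write_cycles) for a register file built from 1R1W banks."""
    if read_ports < 1 or write_ports < 1:
        raise ValueError("need at least one read port and one write port")
    bank_copies = read_ports     # one duplicate bank per read port
    write_cycles = write_ports   # writes are serialized, one per cycle
    return bank_copies, write_cycles

print(regfile_cost(3, 1))  # the 3R1W baseline
print(regfile_cost(4, 2))  # with CARRY attached: up to 4R2W
```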

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: RISC-V vs. Aarch64

<30ddf3ce-61a1-45f6-8fae-5e4dd676544en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23140&group=comp.arch#23140

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Wed, 26 Jan 2022 12:34:35 -0800 (PST)
In-Reply-To: <sss6jj$8cu$1@dont-email.me>
 by: MitchAlsup - Wed, 26 Jan 2022 20:34 UTC

On Wednesday, January 26, 2022 at 1:15:02 PM UTC-6, Stephen Fuld wrote:
> On 1/26/2022 9:49 AM, MitchAlsup wrote:
> >>> My 66000 ISA encodings:
> snip
> >>>> - butterfly and inverse butterfly
> >>> BITR Rd,Rs,<len,off>
> I think this is the first time you have mentioned this instruction. Can
> you give a description of what it does?
<
What the document says: "Containers of size ≡ width are reversed into an
intermediate operand, and then this intermediate operand is shifted right
(SR) by the offset specification."
<
Len is restricted to powers of 2.
<
So reversing all of the bytes in a register would be:
BITR Rd,Rs,<8,0>
Reversing the higher-order 5 bytes in a register would be:
BITR Rd,Rs,<8,40>
The 5 high-order bytes (little-endian) from Rs go into the low-order
bytes of Rd in big-endian order.
<
To reverse all of the bits in a register:
BITR Rd,Rs,<1,0>
To reverse all of the 4-bit fields in a register:
BITR Rd,Rs,<4,0>
To reverse the first three (3) 16-bit fields:
BITR Rd,Rs,<16,16>
<
I don't remember why I restricted the len field to be power of 2.
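<
One possible reading of the description above, as a small Python model rather than the My 66000 definition: split the 64-bit source into `len`-bit containers, reverse the order of the containers, then shift the intermediate right by `off`. The function name `bitr`, the 64-bit register width, and the use of a logical (zero-filling) shift are assumptions.

```python
MASK64 = (1 << 64) - 1

def bitr(rs, length, off):
    """Reverse the order of `length`-bit containers in a 64-bit value, then shift right by `off`."""
    if length not in (1, 2, 4, 8, 16, 32, 64) or not 0 <= off < 64:
        raise ValueError("length must be a power of 2 up to 64, off in 0..63")
    rs &= MASK64
    field = (1 << length) - 1
    count = 64 // length
    intermediate = 0
    for i in range(count):
        container = (rs >> (i * length)) & field
        # Container i moves to the mirrored slot count-1-i.
        intermediate |= container << ((count - 1 - i) * length)
    return intermediate >> off

x = 0x0807060504030201
print(hex(bitr(x, 8, 0)))    # byte swap
print(hex(bitr(x, 1, 0)))    # full bit reversal
print(hex(bitr(x, 4, 0)))    # 4-bit fields reversed
print(hex(bitr(x, 16, 16)))  # three 16-bit fields remain after the shift
```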

Re: RISC-V vs. Aarch64

<sssc70$joi$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23143&group=comp.arch#23143

From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Wed, 26 Jan 2022 12:50:37 -0800
In-Reply-To: <30ddf3ce-61a1-45f6-8fae-5e4dd676544en@googlegroups.com>
 by: Stephen Fuld - Wed, 26 Jan 2022 20:50 UTC

On 1/26/2022 12:34 PM, MitchAlsup wrote:
> On Wednesday, January 26, 2022 at 1:15:02 PM UTC-6, Stephen Fuld wrote:
>> On 1/26/2022 9:49 AM, MitchAlsup wrote:
>>>>> My 66000 ISA encodings:
>> snip
>>>>>> - butterfly and inverse butterfly
>>>>> BITR Rd,Rs,<len,off>
>> I think this is the first time you have mentioned this instruction. Can
>> you give a description of what it does?
> <
> What the document says: "Containers of size ≡ width are reversed into an
> intermediate operand, and then this intermediate operand is shifted right
> (SR) by the offset specification."
> <
> Len is restricted to powers of 2.
> <
> So reversing all of the bytes in a register would be::
> BITR Rd,Rs,<8,0>
> Reversing the higher order 5 bytes in a register would be::
> BITR Rd,Rs,<8,40>
> The 5 HoBytes in LE from Rs go into the LoBytes of Rd in BE.
> <
> To reverse all of the bits in a register:
> BITR Rd,Rs,<1,0>
> To reverse all of the 4-bit fields in a register
> BITR Rd,Rs,<4,0>
> To reverse the first three (3) 16-bit fields
> BITR Rd,Rs,<16,16>
> <
> I don't remember why I restricted the len field to be power of 2.

Thanks. It must have been added after you sent me the then-latest document.

Provides a lot of interesting capabilities, and I am sure there are
important uses, but not for anything I have done.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: RISC-V vs. Aarch64

<3ce00857-aa5d-42c2-8d2b-1adba72efc4en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23144&group=comp.arch#23144

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Wed, 26 Jan 2022 12:56:07 -0800 (PST)
In-Reply-To: <sssc70$joi$1@dont-email.me>
 by: MitchAlsup - Wed, 26 Jan 2022 20:56 UTC

On Wednesday, January 26, 2022 at 2:50:43 PM UTC-6, Stephen Fuld wrote:
snip
> Thanks. It must have been added after you sent me the then latest document.
>
> Provides a lot of interesting capabilities, and I am sure there are
> important uses, but not for anything I have done.
<
Brian (compiler guy) asked for it to deal with endian-flipping code sequences
from (I think) a network controller.

Re: RISC-V vs. Aarch64

<eUiIJ.15874$mS1.4038@fx10.iad>

https://www.novabbs.com/devel/article-flat.php?id=23146&group=comp.arch#23146

From: ThatWoul...@thevillage.com (EricP)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Wed, 26 Jan 2022 16:17:31 -0500
In-Reply-To: <30ddf3ce-61a1-45f6-8fae-5e4dd676544en@googlegroups.com>
 by: EricP - Wed, 26 Jan 2022 21:17 UTC

MitchAlsup wrote:
> On Wednesday, January 26, 2022 at 1:15:02 PM UTC-6, Stephen Fuld wrote:
>> On 1/26/2022 9:49 AM, MitchAlsup wrote:
>>>>> My 66000 ISA encodings:
>> snip
>>>>>> - butterfly and inverse butterfly
>>>>> BITR Rd,Rs,<len,off>
>> I think this is the first time you have mentioned this instruction. Can
>> you give a description of what it does?
> <
> What the document says: "Containers of size ≡ width are reversed into an
> intermediate operand, and then this intermediate operand is shifted right
> (SR) by the offset specification."
> <
> Len is restricted to powers of 2.
> <
> So reversing all of the bytes in a register would be::
> BITR Rd,Rs,<8,0>
> Reversing the higher order 5 bytes in a register would be::
> BITR Rd,Rs,<8,40>
> The 5 HoBytes in LE from Rs go into the LoBytes of Rd in BE.

Shouldn't this be BITR Rd,Rs,<8,24> to reverse all the bytes and then
right shift bit offset 24 into offset 0, keeping just the HO 5 bytes?
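
The two offsets can be compared with a small Python model. This is a sketch of one reading of the quoted description (reverse the order of the 8-bit containers, then do a logical shift right), not the My 66000 definition. The name `bitr8` and the 64-bit register width are assumptions.

```python
def bitr8(rs, off):
    """Byte-reverse a 64-bit value, then shift right by `off` bits (zero fill)."""
    reversed_bytes = int.from_bytes((rs & (2**64 - 1)).to_bytes(8, "little"), "big")
    return reversed_bytes >> off

x = 0x0807060504030201            # byte 0 is 0x01, byte 7 is 0x08
for off in (24, 40):
    result = bitr8(x, off)
    kept = (64 - off) // 8        # bytes that survive the shift
    print(f"off={off}: {kept} bytes kept, result={result:#x}")
```

Under these assumptions an offset of 24 leaves five bytes in the result and an offset of 40 leaves three. In this model the surviving bytes are the ones that sat at the low-order end of Rs, so whether that matches the intended "higher order 5 bytes" depends on details the thread does not spell out.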

Re: RISC-V vs. Aarch64

<318bdc76-6c94-4f75-87d1-a80aaf6d94b6n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23148&group=comp.arch#23148

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Wed, 26 Jan 2022 13:42:03 -0800 (PST)
In-Reply-To: <eUiIJ.15874$mS1.4038@fx10.iad>
 by: MitchAlsup - Wed, 26 Jan 2022 21:42 UTC

On Wednesday, January 26, 2022 at 3:18:06 PM UTC-6, EricP wrote:
snip
> > Reversing the higher order 5 bytes in a register would be::
> > BITR Rd,Rs,<8,40>
> > The 5 HoBytes in LE from Rs go into the LoBytes of Rd in BE.
<
> Shouldn't this be BITR Rd,Rs,<8,24> to reverse all the bytes and then
> right shift bit offset 24 into offset 0, keeping just the HO 5 bytes?
<
you are probably correct.
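
For readers following along, the quoted sentence can be modelled in a few lines of Python. This is only a sketch of a literal reading of that one sentence, not the My 66000 definition: the function name `bitr`, the 64-bit register width, and the sample value are assumptions made for illustration.

```python
def bitr(rs, length, offset, width=64):
    """Reverse the order of `length`-bit containers in a `width`-bit value,
    then shift the intermediate right by `offset` bits.

    Hypothetical model of the quoted description; names are illustrative.
    """
    # "Len is restricted to powers of 2" and must tile the register.
    if length <= 0 or length & (length - 1) or width % length:
        raise ValueError("length must be a power of 2 that divides width")
    if not 0 <= offset < width:
        raise ValueError("offset out of range")
    mask = (1 << length) - 1
    intermediate = 0
    for i in range(width // length):
        container = (rs >> (i * length)) & mask       # i-th container from the low end
        intermediate = (intermediate << length) | container
    return intermediate >> offset


sample = 0x1122334455667788
print(hex(bitr(sample, 8, 0)))    # 0x8877665544332211
print(hex(bitr(sample, 8, 24)))   # 0x8877665544
print(hex(bitr(sample, 8, 40)))   # 0x887766
```

Under this reading, <8,0> byte-swaps the whole register, and a nonzero offset drops the low-order end of the reversed intermediate, that is, the bytes that came from the high-order end of Rs. The real instruction may well differ from this model.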

Re: fixing Spectre (was: The type of Mill's belt's slots)

<8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23178&group=comp.arch#23178

Newsgroups: comp.arch
X-Received: by 2002:ad4:5be8:: with SMTP id k8mr5039859qvc.118.1643311770431;
Thu, 27 Jan 2022 11:29:30 -0800 (PST)
X-Received: by 2002:a05:6830:1db8:: with SMTP id z24mr2833466oti.282.1643311770076;
Thu, 27 Jan 2022 11:29:30 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 27 Jan 2022 11:29:29 -0800 (PST)
In-Reply-To: <2022Jan24.133212@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=64.26.99.248; posting-account=6JNn0QoAAAD-Scrkl0ClrfutZTkrOS9S
NNTP-Posting-Host: 64.26.99.248
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <ssfreu$ck$1@dont-email.me>
<666a6ae5-0bac-421e-81c6-8e3d752192e0n@googlegroups.com> <2022Jan22.144235@mips.complang.tuwien.ac.at>
<effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com> <2022Jan23.002646@mips.complang.tuwien.ac.at>
<221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com> <b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com>
<a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com> <2022Jan24.133212@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>
Subject: Re: fixing Spectre (was: The type of Mill's belt's slots)
From: paaroncl...@gmail.com (Paul A. Clayton)
Injection-Date: Thu, 27 Jan 2022 19:29:30 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 65
 by: Paul A. Clayton - Thu, 27 Jan 2022 19:29 UTC

On Monday, January 24, 2022 at 8:45:56 AM UTC-5, Anton Ertl wrote:
[snip]
> Yet when we look at the performance of 2-wide in-order vs. 2-wide OoO
> machines, we see that the OoO machines are roughly twice as fast; only
> a small part of this can be explained with OoO papering over cache
> misses:
>
> - Intel Atom 330, 1.6GHz, 512K L2 Zotac ION A, Debian 9 64bit 2.368
> - AMD E-450 1650MHz (Lenovo Thinkpad X121e), Ubuntu 11.10 64-bit 1.216
> - Celeron J1900 (Silvermont) 2416MHz (Shuttle XS35V4) Ubuntu16.10 1.052
>
> - Odroid N2 (1896MHz Cortex A53) Ubuntu 18.04 2.488
> - Odroid N2 (1800MHz Cortex A73) Ubuntu 18.04 1.224
>
> The Atom 330 and Cortex-A53 are in-order, while the AMD E-450,
> Silvermont, and Cortex-A73 are OoO.

As has been mentioned before, hiding L1 latency (reducing the
penalty for larger L1 caches) is also a significant benefit of
OoO. With wire delay increasing more than switching delay (and
cache latency being dominated by wire delay) in more recent
manufacturing processes, this factor likely becomes more
significant. High FO4 cycle time and skewed pipelines would
probably reduce this. (It might be interesting for academic
purposes to graph cycle time, cache size and latency, and other
factors relative to performance across multiple past processes.)

(Skewed pipelines and especially second-chance pipelines where
an operation can issue in two different cycles depending on
operand availability seem to fuzz the boundary between I-O
and OoO. Forwarding already presents register renaming and in
the case of out-of-order completion handles RAW hazards.)

(While an unconditional load can be hoisted by a compiler, most
ISAs lack the ability to indicate that a load is speculative
much less that it is dependent on a particular condition — this
latter feature might allow branch prediction to defer unlikely
to be used loads, more closely emulating OoO. [Even the Mill
could use such since memory-access scheduling is somewhat
decoupled. Deferring an unlikely load could reduce energy and
potentially improve performance when resources are overcommitted
relative to the best case such as no cache bank conflicts.] I
suspect such an indication is not worth the encoding cost, but
the concept seems interesting.)

OoO also has the side effect that wider execution than decode
makes more sense. A 2-wide decode in-order design could
presumably benefit a little from 3- or 4-wide issue even if
no instruction cracking occurs, but such would also presumably
make the design less area and power efficient.

I *suspect* the LaTeX benchmark also has much more control flow
complexity and pointer chasing than many workloads. This would
presumably give OoO some advantage over I-O.

The goals of (and design resources for) the in-order designs
are likely different than for the OoO designs; a simple
comparison based on actual products is unlikely to accurately
reflect the best-case differences.

While the I-O designs seem to be more constrained than earlier
(when a 4-wide in-order design seemed reasonable), I doubt
every processor will be an OoO design any time soon. (However,
I am very biased to favor diversity both from a desire for
interesting designs and from a bias toward "over-optimization".)

Re: fixing Spectre (was: The type of Mill's belt's slots)

<2022Jan30.180831@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=23226&group=comp.arch#23226

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: fixing Spectre (was: The type of Mill's belt's slots)
Date: Sun, 30 Jan 2022 17:08:31 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 62
Message-ID: <2022Jan30.180831@mips.complang.tuwien.ac.at>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <2022Jan22.144235@mips.complang.tuwien.ac.at> <effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com> <2022Jan23.002646@mips.complang.tuwien.ac.at> <221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com> <b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com> <a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com> <2022Jan24.133212@mips.complang.tuwien.ac.at> <8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="78eab69f564667b097362b4e4e48024d";
logging-data="4317"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/eQGFJrYNasiGlED0K9Y41"
Cancel-Lock: sha1:PjPUEr7R8edsAF6nCCa0KIGlcnM=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Sun, 30 Jan 2022 17:08 UTC

"Paul A. Clayton" <paaronclayton@gmail.com> writes:
>As has been mentioned before, hiding L1 latency (reducing the
>penalty for larger L1 caches) is also a significant benefit of
>OoO.

Well, "hiding" may be the wrong word. In any case it supports the
idea that OoO is better at extracting instruction-level parallelism
out of a program than compiler scheduling.

>(While an unconditional load can be hoisted by a compiler, most
>ISAs lack the ability to indicate that a load is speculative
>much less that it is dependent on a particular condition — this
>latter feature might allow branch prediction to defer unlikely
>to be used loads, more closely emulating OoO.

That's easy to indicate for in-order: Just put the load where it would
naively be compiled, rather than trying to pull it upwards.

It's much harder to deal with loads that are likely to be used: then
you often want to pull it upwards, but IIRC even IA-64 has limited
possibilities for that: It has predicated instructions, but they have
a data dependency on the condition (and the instruction waits until
the condition is available); it has prefetches, but they don't help
with the final L1 latency; and it has the ALAT, but it only helps to
pull the load up across stores, not across branches.
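
A toy model may help show why a compiler targeting an in-order core wants to pull a likely-used load upwards in the first place. The sketch below assumes a single-issue pipeline that stalls at issue until all source operands are ready; the latencies, register names, and the two instruction sequences are made up for illustration and do not describe any real core. It covers only reordering inside a straight-line block, not the harder case of moving a load across a branch.

```python
LATENCY = {"add": 1, "load": 3}  # hypothetical cycle counts


def in_order_cycles(program, latency=LATENCY):
    """Cycle at which the last result is available on a single-issue,
    in-order machine that stalls until an instruction's sources are ready."""
    ready = {}        # register name -> cycle its value becomes available
    next_issue = 0    # earliest cycle the next instruction may issue
    for op, dst, srcs in program:
        # Inputs never written in this block are treated as ready at cycle 0.
        issue = max([next_issue] + [ready.get(s, 0) for s in srcs])
        ready[dst] = issue + latency[op]
        next_issue = issue + 1   # younger instructions cannot pass this one
    return max(ready.values())


# Load placed where it would naively be compiled: right before its use.
naive = [
    ("add", "a", ["x", "y"]),
    ("add", "b", ["a", "z"]),
    ("load", "c", ["p"]),
    ("add", "d", ["c", "b"]),
]

# Same work with the load hoisted above the independent adds.
hoisted = [
    ("load", "c", ["p"]),
    ("add", "a", ["x", "y"]),
    ("add", "b", ["a", "z"]),
    ("add", "d", ["c", "b"]),
]

print(in_order_cycles(naive), in_order_cycles(hoisted))  # 6 4
```

In the naive order the final add waits out the full load latency; in the hoisted order that latency overlaps with the independent adds. An OoO core finds the same overlap at run time without the compiler's help.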

>I *suspect* the LaTeX benchmark also has much more control flow
>complexity and pointer chasing than many workloads. This would
>presumably give OoO some advantage over I-O.

Yes, it's not matrix multiplication. However, I think that there are
many applications with similar amounts of control flow and pointer
chasing as our LaTeX benchmark.

>The goals of (and design resources for) the in-order designs
>are likely different than for the OoO designs;

Not in the case of Bonnell and Silvermont. Both are designed as
low-power, low-cost cores, and both had the resources of Intel at
their disposal.

>While the I-O designs seem to be more constrained than earlier
>(when a 4-wide in-order design seemed reasonable), I doubt
>every processor will be an OoO design any time soon.

I certainly don't expect an OoO replacement for Cortex-M0 or
Cortex-M4, but certainly in AMD64 space in-order has died out.

And it's completely unclear to me why ARM is still designing in-order
A64 cores; they seem to offer no real advantage in J/instruction nor
performance per area over the low-power OoO cores from ARM. What they
offer is more cores/area; maybe the bragging rights of offering 4
E-cores rather than 2 that are twice as fast are the reason. Or maybe
the actual usage of their in-order cores is not reflected in the
performance results I have seen on anandtech. In any case, Apple uses
OoO cores as E-cores.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: fixing Spectre (was: The type of Mill's belt's slots)

<05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23246&group=comp.arch#23246

Newsgroups: comp.arch
X-Received: by 2002:ac8:59c3:: with SMTP id f3mr15881930qtf.307.1643653332507;
Mon, 31 Jan 2022 10:22:12 -0800 (PST)
X-Received: by 2002:a4a:8343:: with SMTP id q3mr10656576oog.89.1643653332259;
Mon, 31 Jan 2022 10:22:12 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 31 Jan 2022 10:22:12 -0800 (PST)
In-Reply-To: <2022Jan30.180831@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=64.26.97.60; posting-account=6JNn0QoAAAD-Scrkl0ClrfutZTkrOS9S
NNTP-Posting-Host: 64.26.97.60
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <2022Jan22.144235@mips.complang.tuwien.ac.at>
<effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com> <2022Jan23.002646@mips.complang.tuwien.ac.at>
<221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com> <b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com>
<a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com> <2022Jan24.133212@mips.complang.tuwien.ac.at>
<8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com> <2022Jan30.180831@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>
Subject: Re: fixing Spectre (was: The type of Mill's belt's slots)
From: paaroncl...@gmail.com (Paul A. Clayton)
Injection-Date: Mon, 31 Jan 2022 18:22:12 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 102
 by: Paul A. Clayton - Mon, 31 Jan 2022 18:22 UTC

On Sunday, January 30, 2022 at 1:17:54 PM UTC-5, Anton Ertl wrote:
> "Paul A. Clayton" <paaron...@gmail.com> writes:
>>As has been mentioned before, hiding L1 latency (reducing the
>>penalty for larger L1 caches) is also a significant benefit of
>>OoO.
>
> Well, "hiding" may be the wrong word. In any case it supports the
> idea that OoO is better at extracting instruction-level parallelism
> out of a program than compiler scheduling.

For most ISAs the compiler is very limited in how much it can
schedule operations. While Itanium could hoist loads before
guarding conditions, each load used a register name — even with
128 GPRs this may be a constraint. Use of speculative load values
was possible on Itanium, providing a limited form of software
branch prediction with a software misprediction fallback path.
(Given that static prediction will typically be less accurate than
dynamic prediction, having less information available, this seems
less than ideal.)

I do not know of any implementation of "informing loads", where
a hit/miss condition could be tested and an alternate path taken
to cover miss latency. This is another technique that would allow
software to provide parallelism to an in-order implementation.
(Small threads could also be used for supplying useful work
while waiting for data return on a high-latency load.)

Compiler-assumed resource constraints (which may be current
but not fundamental) presumably also limit scheduling.

>>(While an unconditional load can be hoisted by a compiler, most
>>ISAs lack the ability to indicate that a load is speculative
>>much less that it is dependent on a particular condition — this
>>latter feature might allow branch prediction to defer unlikely
>>to be used loads, more closely emulating OoO.
>
> That's easy to indicate for in-order: Just put the load where it would
> naively be compiled, rather than trying to pull it upwards.

Did you mean "for out-of-order"? Even with out-of-order, hoisting
a load can be useful.

> It's much harder to deal with loads that are likely to be used: then
> you often want to pull it upwards, but IIRC even IA-64 has limited
> possibilities for that: It has predicated instructions, but they have
> a data dependency on the condition (and the instruction waits until
> the condition is available); it has prefetches, but they don't help
> with the final L1 latency; and it has the ALAT, but it only helps to
> pull the load up across stores, not across branches.

I agree that using a speculative value is more difficult. Branch
resolution and pointer chasing seem to be critical uses; I suspect
even an OoO implementation might benefit from more and more
timely information.

> >I *suspect* the LaTeX benchmark also has much more control flow
> >complexity and pointer chasing than many workloads. This would
> >presumably give OoO some advantage over I-O.
>
> Yes, it's not matrix multiplication. However, I think that there are
> many applications with similar amounts of control flow and pointer
> chasing as our LaTeX benchmark.

I agree that it is probably a more useful "general purpose" benchmark
and certainly useful even if seen as an important corner case.

> >The goals of (and design resources for) the in-order designs
> >are likely different than for the OoO designs;
>
> Not in the case of Bonnell and Silvermont. Both are designed as
> low-power, low-cost cores, and both had the resources of Intel at
> their disposal.

I am not convinced of the equivalence of skill/effort. x86-64 is also
not register rich.

> >While the I-O designs seem to be more constrained than earlier
> >(when a 4-wide in-order design seemed reasonable), I doubt
> >every processor will be an OoO design any time soon.
> I certainly don't expect an OoO replacement for Cortex-M0 or
> Cortex-M4, but certainly in AMD64 space in-order has died out.
>
> And it's completely unclear to me why ARM is still designing in-order
> A64 cores; they seem to offer no real advantage in J/instruction nor
> performance per area over the low-power OoO cores from ARM. What they
> offer is more cores/area; maybe the bragging rights of offering 4
> E-cores rather than 2 that are twice as fast are the reason. Or maybe
> the actual usage of their in-order cores is not reflected in the
> performance results I have seen on anandtech. In any case, Apple uses
> OoO cores as E-cores.

I suspect there are valid use cases as well as marketing factors.
(I want to think more on this, but I need to get to work. [Also my dislike
for Google Groups interface is increasing. Composing a post is much
slower — I suspect background javascript is to blame.])
>
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: fixing Spectre (was: The type of Mill's belt's slots)

<st9bep$po1$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23247&group=comp.arch#23247

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: fixing Spectre (was: The type of Mill's belt's slots)
Date: Mon, 31 Jan 2022 10:57:27 -0800
Organization: A noiseless patient Spider
Lines: 10
Message-ID: <st9bep$po1$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<2022Jan22.144235@mips.complang.tuwien.ac.at>
<effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com>
<2022Jan23.002646@mips.complang.tuwien.ac.at>
<221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com>
<b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com>
<a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com>
<2022Jan24.133212@mips.complang.tuwien.ac.at>
<8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>
<2022Jan30.180831@mips.complang.tuwien.ac.at>
<05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 31 Jan 2022 18:57:29 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="04ecd58dfc8c48517a8d585ba8b80522";
logging-data="26369"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18w4ec3Q8NJE0nQ6/0+Dfq9"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:3U5eu6hu03bQiAJJU8uw5dK/R0U=
In-Reply-To: <05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Mon, 31 Jan 2022 18:57 UTC

On 1/31/2022 10:22 AM, Paul A. Clayton wrote:

<snip>

> (I want to think more on this, but I need to get to work. [Also my dislike
> for Google Groups interface is increasing. Composing a post is much
> slower — I suspect background javascript is to blame.])

Others report similar. I use Thunderbird via news.eternal-september.org
and have no trouble.

Re: fixing Spectre (was: The type of Mill's belt's slots)

<st9s0t$m4$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23248&group=comp.arch#23248

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: fixing Spectre (was: The type of Mill's belt's slots)
Date: Mon, 31 Jan 2022 15:40:11 -0800
Organization: A noiseless patient Spider
Lines: 25
Message-ID: <st9s0t$m4$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<2022Jan22.144235@mips.complang.tuwien.ac.at>
<effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com>
<2022Jan23.002646@mips.complang.tuwien.ac.at>
<221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com>
<b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com>
<a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com>
<2022Jan24.133212@mips.complang.tuwien.ac.at>
<8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>
<2022Jan30.180831@mips.complang.tuwien.ac.at>
<05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>
<st9bep$po1$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 31 Jan 2022 23:40:13 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="9971b7608a3ab3de624efa6c6f366f2d";
logging-data="708"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/o5NM+IhS2fvNx8xs8xZ0yWDYjPWYu3jc="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.1
Cancel-Lock: sha1:OHgFvfSzG5A4M/DOmWWz8OkreyM=
In-Reply-To: <st9bep$po1$1@dont-email.me>
Content-Language: en-US
 by: Stephen Fuld - Mon, 31 Jan 2022 23:40 UTC

On 1/31/2022 10:57 AM, Ivan Godard wrote:
> On 1/31/2022 10:22 AM, Paul A. Clayton wrote:
>
> <snip>
>
>> (I want to think more on this, but I need to get to work. [Also my
>> dislike
>> for Google Groups interface is increasing. Composing a post is much
>> slower — I suspect background javascript is to blame.])
>
> Others report similar. I use Thunderbird via news.eternal-september.org
> and have no trouble.

I use the same setup. It is all free (you have to give
eternal-september an e-mail address to receive your password) and works
quite well. I do use some add-ons to Tbird to make the display more to
my liking. The one downside versus Google Groups is there seems not
to be any way (at least I haven't found one) to search newsgroup
messages content.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: fixing Spectre (was: The type of Mill's belt's slots)

<stbnt0$5m3$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23249&group=comp.arch#23249

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: fixing Spectre (was: The type of Mill's belt's slots)
Date: Tue, 1 Feb 2022 10:42:07 -0600
Organization: A noiseless patient Spider
Lines: 15
Message-ID: <stbnt0$5m3$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<2022Jan22.144235@mips.complang.tuwien.ac.at>
<effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com>
<2022Jan23.002646@mips.complang.tuwien.ac.at>
<221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com>
<b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com>
<a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com>
<2022Jan24.133212@mips.complang.tuwien.ac.at>
<8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>
<2022Jan30.180831@mips.complang.tuwien.ac.at>
<05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>
<st9bep$po1$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Tue, 1 Feb 2022 16:42:08 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="7043ce0e0a583772f3976907a0cd1437";
logging-data="5827"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+IXemM5sz9BBkbe6IvBcr6"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.1
Cancel-Lock: sha1:EBzcsAwyq10NmepeQN9rd/a/4IY=
In-Reply-To: <st9bep$po1$1@dont-email.me>
Content-Language: en-US
 by: BGB - Tue, 1 Feb 2022 16:42 UTC

On 1/31/2022 12:57 PM, Ivan Godard wrote:
> On 1/31/2022 10:22 AM, Paul A. Clayton wrote:
>
> <snip>
>
>> (I want to think more on this, but I need to get to work. [Also my
>> dislike
>> for Google Groups interface is increasing. Composing a post is much
>> slower — I suspect background javascript is to blame.])
>
> Others report similar. I use Thunderbird via news.eternal-september.org
> and have no trouble.

Same basic setup, though Thunderbird periodically crashes which is kinda
annoying...

Re: fixing Spectre

<stdlvd$v3q$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23251&group=comp.arch#23251

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: spamj...@blueyonder.co.uk (Tom Gardner)
Newsgroups: comp.arch
Subject: Re: fixing Spectre
Date: Wed, 2 Feb 2022 10:21:32 +0000
Organization: A noiseless patient Spider
Lines: 23
Message-ID: <stdlvd$v3q$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<2022Jan22.144235@mips.complang.tuwien.ac.at>
<effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com>
<2022Jan23.002646@mips.complang.tuwien.ac.at>
<221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com>
<b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com>
<a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com>
<2022Jan24.133212@mips.complang.tuwien.ac.at>
<8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>
<2022Jan30.180831@mips.complang.tuwien.ac.at>
<05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>
<st9bep$po1$1@dont-email.me> <stbnt0$5m3$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 2 Feb 2022 10:21:33 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a4be91917f4b5c443b70f77735883c4f";
logging-data="31866"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18pUwezpKfhmD6eBOQFUPPH"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
Firefox/52.0 SeaMonkey/2.49.4
Cancel-Lock: sha1:M5I6/j3+FYzZvcacu797sk5v1kg=
In-Reply-To: <stbnt0$5m3$1@dont-email.me>
 by: Tom Gardner - Wed, 2 Feb 2022 10:21 UTC

On 01/02/22 16:42, BGB wrote:
> On 1/31/2022 12:57 PM, Ivan Godard wrote:
>> On 1/31/2022 10:22 AM, Paul A. Clayton wrote:
>>
>> <snip>
>>
>>> (I want to think more on this, but I need to get to work. [Also my dislike
>>> for Google Groups interface is increasing. Composing a post is much
>>> slower — I suspect background javascript is to blame.])
>>
>> Others report similar. I use Thunderbird via news.eternal-september.org and
>> have no trouble.
>
> Same basic setup, though Thunderbird periodically crashes which is kinda
> annoying...

A decade or so ago my Thunderbird started to work very slowly
with email directories that had an excessive number of entries.
(Now up to 20k, ahem :) )

I switched to SeaMonkey and the problem has not reappeared.
Since SeaMonkey and Thunderbird are so similar, the switch
is trivial and can be reversed.

Re: fixing Spectre

<stdnjm$eqs$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=23252&group=comp.arch#23252

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!aioe.org!To5nvU/sTaigmVbgRJ05pQ.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: fixing Spectre
Date: Wed, 2 Feb 2022 11:49:34 +0100
Organization: Aioe.org NNTP Server
Message-ID: <stdnjm$eqs$1@gioia.aioe.org>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<2022Jan22.144235@mips.complang.tuwien.ac.at>
<effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com>
<2022Jan23.002646@mips.complang.tuwien.ac.at>
<221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com>
<b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com>
<a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com>
<2022Jan24.133212@mips.complang.tuwien.ac.at>
<8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>
<2022Jan30.180831@mips.complang.tuwien.ac.at>
<05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>
<st9bep$po1$1@dont-email.me> <stbnt0$5m3$1@dont-email.me>
<stdlvd$v3q$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="15196"; posting-host="To5nvU/sTaigmVbgRJ05pQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10.2
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Wed, 2 Feb 2022 10:49 UTC

Tom Gardner wrote:
> On 01/02/22 16:42, BGB wrote:
>> On 1/31/2022 12:57 PM, Ivan Godard wrote:
>>> On 1/31/2022 10:22 AM, Paul A. Clayton wrote:
>>>
>>> <snip>
>>>
>>>> (I want to think more on this, but I need to get to work. [Also my
>>>> dislike
>>>> for Google Groups interface is increasing. Composing a post is much
>>>> slower — I suspect background javascript is to blame.])
>>>
>>> Others report similar. I use Thunderbird via
>>> news.eternal-september.org and have no trouble.
>>
>> Same basic setup, though Thunderbird periodically crashes which is
>> kinda annoying...
>
> A decade or so ago my Thunderbird started to work very slowly
> with email directories that had an excessive number of entries.
> (Now up to 20k, ahem :) )
>
> I switched to SeaMonkey and the problem has not reappeared.
> Since SeaMonkey and Thunderbird are so similar, the switch
> is trivial and can be reversed.

Interesting!

I have been using SeaMonkey forever, i.e. since before it changed its
name from Mozilla to SeaMonkey, and I have never had any serious issues
with my news reading.

I did run my own local news server for a few years but I've been using
news.aioe.org both before and after that period.

eternal-september has been my backup option which I've never had to
actually use.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: fixing Spectre

<stdslp$hnm$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23253&group=comp.arch#23253

Newsgroups: comp.arch
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: spamj...@blueyonder.co.uk (Tom Gardner)
Newsgroups: comp.arch
Subject: Re: fixing Spectre
Date: Wed, 2 Feb 2022 12:15:53 +0000
Organization: A noiseless patient Spider
Lines: 44
Message-ID: <stdslp$hnm$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<2022Jan22.144235@mips.complang.tuwien.ac.at>
<effb9090-c7c1-4e64-9a71-5b4b5d444d4en@googlegroups.com>
<2022Jan23.002646@mips.complang.tuwien.ac.at>
<221708c9-f4be-4421-a680-93a3e3599bd1n@googlegroups.com>
<b7b60e9f-e043-4265-90bc-5e83a496ccc6n@googlegroups.com>
<a2c3f8cc-526b-40d0-9937-e54bb3fb833cn@googlegroups.com>
<2022Jan24.133212@mips.complang.tuwien.ac.at>
<8ec2646b-bf55-4dfa-a9eb-50b1b37da4acn@googlegroups.com>
<2022Jan30.180831@mips.complang.tuwien.ac.at>
<05d0aa45-af05-4163-a26c-51a579dd8ccen@googlegroups.com>
<st9bep$po1$1@dont-email.me> <stbnt0$5m3$1@dont-email.me>
<stdlvd$v3q$1@dont-email.me> <stdnjm$eqs$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 2 Feb 2022 12:15:53 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="a4be91917f4b5c443b70f77735883c4f";
logging-data="18166"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+YrPfK1QZPh7pB+C7MXohC"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
Firefox/52.0 SeaMonkey/2.49.4
Cancel-Lock: sha1:lyaTcGtSj0loLiy8/qmNktVfyeU=
In-Reply-To: <stdnjm$eqs$1@gioia.aioe.org>
 by: Tom Gardner - Wed, 2 Feb 2022 12:15 UTC

On 02/02/22 10:49, Terje Mathisen wrote:
> Tom Gardner wrote:
>> On 01/02/22 16:42, BGB wrote:
>>> On 1/31/2022 12:57 PM, Ivan Godard wrote:
>>>> On 1/31/2022 10:22 AM, Paul A. Clayton wrote:
>>>>
>>>> <snip>
>>>>
>>>>> (I want to think more on this, but I need to get to work. [Also my dislike
>>>>> for Google Groups interface is increasing. Composing a post is much
>>>>> slower — I suspect background javascript is to blame.])
>>>>
>>>> Others report similar. I use Thunderbird via news.eternal-september.org and
>>>> have no trouble.
>>>
>>> Same basic setup, though Thunderbird periodically crashes which is kinda
>>> annoying...
>>
>> A decade or so ago my Thunderbird started to work very slowly
>> with email directories that had an excessive number of entries.
>> (Now up to 20k, ahem :) )
>>
>> I switched to SeaMonkey and the problem has not reappeared.
>> Since SeaMonkey and Thunderbird are so similar, the switch
>> is trivial and can be reversed.
>
> Interesting!
>
> I have been using SeaMonkey forever, i.e. since before it changed its name from
> Mozilla to SeaMonkey, and I have never had any serious issues with my news reading.

My problem was with the email side. I don't retain sufficient
usenet headers for any problem to become apparent.

> I did run my own local news server for a few years but I've been using
> news.aioe.org both before and after that period.
>
> eternal-september has been my backup option which I've never had to actually use.

I switched to eternal september when VirginMedia (LibertyGlobal now,
IIRC) stopped having a newsnet feed. It is adequate.

I might try news.aioe.org

Re: RISC-V vs. Aarch64

<86lexnfdf3.fsf@linuxsc.com>

https://www.novabbs.com/devel/article-flat.php?id=23984&group=comp.arch#23984

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: tr.17...@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sun, 06 Mar 2022 04:55:44 -0800
Organization: A noiseless patient Spider
Lines: 100
Message-ID: <86lexnfdf3.fsf@linuxsc.com>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me> <59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me> <sqmsqq$14kp$1@gioia.aioe.org> <sqmthh$2ea$1@dont-email.me> <86lezsvy8v.fsf@linuxsc.com> <sr8t0k$mt5$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: reader02.eternal-september.org; posting-host="13246d5653fd8faf4d64a45856686609";
logging-data="904"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Yjy/Hwu0d41ZezyOtgJsgyFBwfVDdMhE="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:OM0LYH82j7GOtY8Qrs1C3CrV9wU=
sha1:eEUc1Nrz2EM1/6L6kK3zvkUQ2CU=
 by: Tim Rentsch - Sun, 6 Mar 2022 12:55 UTC

Marcus <m.delete@this.bitsnbites.eu> writes:

> On 2022-01-07, Tim Rentsch wrote:
>
>> Marcus <m.delete@this.bitsnbites.eu> writes:
>>
>>> On 2021-12-31, Terje Mathisen wrote:
>>>
>>>> Marcus wrote:
>>>>
>>>>> On 2021-12-30, EricP wrote:
>>>>>
>>>>>> C,C++ and a bunch of languages explicitly define booleans as 0
>>>>>> or 1 so this definition won't be optimal for those languages.
>>>>>> VAX Fortran used 0,-1 for LOGICAL but I don't know if that was
>>>>>> defined by the language or implementation dependent.
>>>>
>>>> -1 is better than 1, it can be used as a mask.
>>>>
>>>>> As a software developer I'm painfully aware of this. I decided
>>>>> not to care too much about it, though. Really, most software
>>>>> that relies on this property of C should be frowned upon. E.g.,
>>>>> expressions like:
>>>>>
>>>>> a = b + (c == d);
>>>>>
>>>>> ...aren't really good programming practice.
>>>>
>>>> No! Please tell me it ain't so!
>>>>
>>>> I use that type of construct all over the place when writing
>>>> branchless code/doing table lookups etc.
>>>
>>> I think that you'll find that the following code produces the
>>> exact same result:
>>>
>>> int a;
>>> if (c == d) {
>>> a = b + 1;
>>> } else {
>>> a = b;
>>> }
>>>
>>> It too is completely branchless.
>>
>> Being branchless is not the high order bit here.
>
> ?? I was responding to a note about relying on comparison results
> having the value 0 or 1 for the purpose of producing branchless
> code.

That isn't what he said. He said he (often) uses such constructs
when writing branchless code. He did not say he uses them only
because they produce branchless code. Producing branchless code is
a benefit, but not the only benefit, or even the most important
benefit.
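
To make the two forms under discussion concrete, here is a minimal sketch. The function names are hypothetical and not from the thread; the bodies are the quoted expression and the quoted if/else, wrapped so they can be compared directly. It relies only on the C rule that == yields an int that is 0 or 1.

```c
#include <assert.h>

/* Arithmetic form: in C, (c == d) has type int and value 0 or 1. */
int add_if_equal_arith(int b, int c, int d)
{
    return b + (c == d);
}

/* Explicit if/else form, as in the quoted alternative. */
int add_if_equal_branch(int b, int c, int d)
{
    int a;
    if (c == d) {
        a = b + 1;
    } else {
        a = b;
    }
    return a;
}

/* Exercise both forms on a small grid of inputs; aborts on any mismatch. */
void check_forms_agree(void)
{
    for (int c = -2; c <= 2; c++)
        for (int d = -2; d <= 2; d++)
            assert(add_if_equal_arith(10, c, d) ==
                   add_if_equal_branch(10, c, d));
}
```

Calling check_forms_agree() from a test program confirms the two forms compute the same value on that grid; whether the generated machine code is also identical depends on the compiler and target.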

>>> My main gripe with the former version is the implicit type
>>> conversion (boolean to integer), and that I don't like to see
>>> logical operands and arithmetic operands mixed in the same
>>> expression.
>>
>> Apparently you are thinking of some other language, not C.
>> The results of comparison operators such as == have type int,
>> not some boolean type. And that is not just an accident.
>
> True. I usually dwell in C++ land, and there I'm pretty sure that
> comparison operators return bool, not int.
>
> Well, there's one more thing... In C, integer zero is "falsy", and
> *anything else* is "truthy". But only integer 1 is the valid
> value of "true". E.g. consider:
>
> #define FALSE 0
> #define TRUE 42
>
> int c = (foo > bar);
> if (c) // OK
> if (c == 1) // OK
> if (c == TRUE) // BAD
>
> int c = (foo > bar) ? TRUE : FALSE;
> if (c) // OK
> if (c == 1) // BAD
> if (c == TRUE) // OK
>
> int c = foo - bar; // c is truthy if foo != bar
> if (c) // OK
> if (c == TRUE) // BAD
> if (c == 1) // BAD
>
> So, in general I do not like code that makes assumptions about the
> integer value of a comparison result. I would much rather see explicit
> if-then-else statements, and have the compiler reduce it to the
> most optimal form for the target machine (they do that these
> days). But maybe I'm just too influenced by C++ philosophies.

I think I understand your reaction. The last paragraph relates to
your questions in another posting; my response to that posting
may return to your comments here.
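
For readers following the quoted 0/1 versus "truthy" distinction, a small sketch may help. TRUE defined as 42 is the hypothetical from the quoted example; the helper names are assumptions introduced only for illustration.

```c
#include <assert.h>   /* used by the checks in the usage note */

#define FALSE 0
#define TRUE 42   /* deliberately odd value, as in the quoted example */

/* A comparison yields exactly 0 or 1, never TRUE (42). */
int cmp_result(int foo, int bar) { return foo > bar; }

/* A difference is truthy whenever foo != bar, but may be any nonzero value. */
int diff_result(int foo, int bar) { return foo - bar; }

/* !! collapses any truthy int to 1, so the result is safe to compare or add. */
int normalize(int v) { return !!v; }
```

With these, assert(cmp_result(5, 3) == 1) holds while cmp_result(5, 3) == TRUE does not, and normalize(diff_result(5, 3)) gives 1 even though the raw difference is 2. Only values produced by the comparison and logical operators are guaranteed to be 0 or 1; other "truthy" ints need normalizing first.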

Re: RISC-V vs. Aarch64

<86h78bfbcf.fsf@linuxsc.com>

https://www.novabbs.com/devel/article-flat.php?id=23985&group=comp.arch#23985

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: tr.17...@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sun, 06 Mar 2022 05:40:32 -0800
Organization: A noiseless patient Spider
Lines: 76
Message-ID: <86h78bfbcf.fsf@linuxsc.com>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me> <59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me> <sqmsqq$14kp$1@gioia.aioe.org> <86pmp4vyqa.fsf@linuxsc.com> <sr8tvv$t2j$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: reader02.eternal-september.org; posting-host="13246d5653fd8faf4d64a45856686609";
logging-data="21730"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18vCyXp8yy6zWYWVHz+bSjIZcuDybjUjq4="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:Pos8ifUyDU0inlsCxsexBeuFzgs=
sha1:0CejBaGrcx8j4fgNGj5G/ivt6DU=
 by: Tim Rentsch - Sun, 6 Mar 2022 13:40 UTC

Marcus <m.delete@this.bitsnbites.eu> writes:

> On 2022-01-07, Tim Rentsch wrote:
>
>> Terje Mathisen <terje.mathisen@tmsw.no> writes:
>>
>>> Marcus wrote:
>>>
>>>> On 2021-12-30, EricP wrote:
>>>>
>>>>> C,C++ and a bunch of languages explicitly define booleans as 0 or 1
>>>>> so this definition won't be optimal for those languages.
>>>>> VAX Fortran used 0,-1 for LOGICAL but I don't know if that
>>>>> was defined by the language or implementation dependent.
>>>
>>> -1 is better than 1, it can be used as a mask.
>>>
>>>> As a software developer I'm painfully aware of this. I decided not to
>>>> care too much about it, though. Really, most software that relies on
>>>> this property of C should be frowned upon. E.g. expressions like:
>>>>
>>>> a = b + (c == d);
>>>>
>>>> ...aren't really good programming practice.
>>>
>>> No! Please tell me it ain't so!
>>>
>>> I use that type of construct [frequently ...]
>>
>> Totally with you on this. People who don't see the benefit of
>> using 0/1 values instead of if/else in cases like this are stuck
>> in an antiquated way of thinking.
>
> Care to elaborate? What context are we talking about here? C?
> Machine code? Microarchitecture?

Programming languages. I don't mean to include low-level
languages that are basically just glorified assembly languages.
Let's say at the level of C or higher.

> Have a look at https://godbolt.org/z/K7ndn57T1 and feel free to
> compare what machine code you get on different architectures for
> the different C code snippets (spoiler alert: with modern
> compilers there will typically be no difference at all).

I take it as given that compilers of today (or even ten years ago)
can and do generate excellent code in such cases. For the most part
I'm not interested in what machine code is produced, except in those
rare situations where performance is critical to what the program is
doing (and in almost all code written, it isn't).

> My point is that you should strive for writing (high level) code
> that is as clear as possible w.r.t. the intent of the program
> (e.g. avoid error prone constructs). [...]

We agree on the principle. The question is which constructions are
clearer. In another posting in this general thread you say this:

> So, in general I do not like code that makes assumptions about the
> integer value of a comparison result. I would much rather see explicit
> if-then-else statements,

My reaction is just the opposite. Insisting on if-then-else is
firmly stuck in the von Neumann imperative mindset. Using "logical
values" as integers is one of the great advances in programming
languages. APL illustrates the power of if-less programming. By
contrast, if/else is usually harder to understand and more likely to
be a source of error. When I'm looking at source code, I usually
find that the more if()s there are, the more likely the code is to
be buggy. It is the if/else, von Neumann imperative style of
programming that was (in part) what I meant above by "antiquated
way of thinking".
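
As a rough illustration of the if-less style in C (a sketch only; the function names are hypothetical and this is not a claim about how APL does it), comparison results can be summed directly, and a 0/1 value can be turned into the all-ones mask mentioned earlier in the thread:

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stddef.h>

/* Count elements equal to key: each comparison contributes 0 or 1. */
size_t count_equal(const int *v, size_t n, int key)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        count += (v[i] == key);
    return count;
}

/* Return x when cond is nonzero, else y, via an all-ones/all-zeros mask. */
unsigned select_masked(int cond, unsigned x, unsigned y)
{
    /* 0u - 1u wraps to UINT_MAX (all bits set); 0u - 0u stays 0. */
    unsigned mask = 0u - (unsigned)(cond != 0);
    return (x & mask) | (y & ~mask);
}
```

For example, assert(count_equal((int[]){1, 2, 2, 3, 2}, 5, 2) == 3) holds, and select_masked(1, 0xAAu, 0x55u) returns 0xAAu. Neither function contains an if; the loop bound is the only conditional left.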

Please don't take any of these comments personally. I am trying to
express a general point of view; I don't mean to focus on any of
your views in particular. My apologies if it came across otherwise.
