novaBBS - comp.arch - Re: RISC-V vs. Aarch64

Re: MRISC32 vectorization (was: RISC-V vs. Aarch64)

<sqngb0$m52$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=22662&group=comp.arch#22662

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: MRISC32 vectorization (was: RISC-V vs. Aarch64)
Date: Fri, 31 Dec 2021 17:57:52 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sqngb0$m52$1@newsreader4.netcologne.de>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmtil$9jq$1@newsreader4.netcologne.de>
<sqmttb$4m9$1@dont-email.me>
Injection-Date: Fri, 31 Dec 2021 17:57:52 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:eb03:0:7285:c2ff:fe6c:992d";
logging-data="22690"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Fri, 31 Dec 2021 17:57 UTC

Marcus <m.delete@this.bitsnbites.eu> wrote:
> On 2021-12-31, Thomas Koenig wrote:
>> Marcus <m.delete@this.bitsnbites.eu> wrote:
>>
>>> As a matter of fact, you can now try out MRISC32 on Compiler Explorer
>>> (support was added about a week ago): https://godbolt.org/z/z9sK3MYeM
>>
>> I could not help but start experimenting with that :-)
>>
>> I see no vectorization for
>>
>> void foo(int * const restrict a, int * const restrict b, int * restrict c, int n)
>> {
>>   for (int i=0; i<n; i++)
>>     c[i] = a[i]+b[i];
>> }
>>
>> and -fopt-info-vec-missed tells me, at https://godbolt.org/z/vPM3KTrnz ,
>> <source>:3:20: missed: couldn't vectorize loop
>> <source>:3:20: missed: not vectorized: unsupported data-type
>>
>> so it seems some more delving into gcc's secret internals
>> will be required...
>>
>
>
> Yup! Vectorization is not at all implemented (the compiler doesn't even
> know how to use the vector registers), so it's one of the more
> interesting improvements to be done to the MRISC32 GCC machine
> description.
>
> Since I'm not a compiler guy (at all), I'm pretty happy that I've gotten
> this far.

And pretty impressive it is, too. I've never even gotten close to
the machine description part of gcc.

> Auto-vectorization is not on the near term roadmap,
> unfortunately (things like better C++ support are higher up on the prio
> list). Also, the vector part of the ISA is not really finalized yet.
>
> Inline assembler is the solution for MRISC32 vector code. E.g:
>
> https://github.com/mbitsnbites/mc1-doom/blob/master/src/r_draw.c#L89

It might be possible to upgrade this to a __builtin; that way, it would
at least be possible to leave the register allocation part to gcc.

Hmm... the only thing that comes to mind as a precedent for targeting
size-agnostic vectors is the SVE feature of aarch64, so it might
be possible to pick up something from that. However, this is a _big_
ISA, and I am not sure that poking around there will lead to anything
useful in a reasonable amount of time.

Re: RISC-V vs. Aarch64

<sqnj3g$fh1$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=22663&group=comp.arch#22663

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Fri, 31 Dec 2021 19:45:04 +0100
Organization: A noiseless patient Spider
Lines: 60
Message-ID: <sqnj3g$fh1$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sqmsqq$14kp$1@gioia.aioe.org>
<VSFzJ.136700$7D4.47834@fx37.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 31 Dec 2021 18:45:05 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="fec78306e793eadb5c409dad2d5049b8";
logging-data="15905"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX194GEp9wbrVntJWRHVKeqBb3LFpAcxiqVk="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Cancel-Lock: sha1:s5jv4pgwnB8XRltyhBn1PpwguTg=
In-Reply-To: <VSFzJ.136700$7D4.47834@fx37.iad>
Content-Language: en-US

by: Marcus - Fri, 31 Dec 2021 18:45 UTC

On 2021-12-31, EricP wrote:
> Terje Mathisen wrote:
>> Marcus wrote:
>>> On 2021-12-30, EricP wrote:
>>>> C,C++ and a bunch of languages explicitly define booleans as 0 or 1
>>>> so this definition won't be optimal for those languages.
>>>> VAX Fortran used 0,-1 for LOGICAL but I don't know if that
>>>> was defined by the language or implementation dependent.
>>
>> -1 is better than 1, it can be used as a mask.
>
> One or the other usage has to do a negate.
> The question is which usage, boolean or mask, is more common?
> In general it is boolean so it is mask user that should do the negate.
>
> There is a lot of C code that expects that a = b < c; produces 0 or 1.
> So if CMP produces a -1 then a C compiler would always generate a NEG too.
>
> For boolean expressions in IF statements the NEG is unnecessary
> if the branch tests zero/non-zero and so it might be optimized away.
> But that requires the compiler code gen knowing that boolean expressions
> in IF statements are not quite the same as other boolean expressions
> (and we give a special dispensation to & and | operators on booleans).
> A similar rationale applies to CMOV conditionals.
>
> It seems easier to have the mask user pay for the NEG.

MRISC32 is a vector-first ISA. Producing masks is the natural choice
when working with vectors.

I was mostly happy to discover that it worked pretty much equally well
for scalars & branches.

So far it seems that using -1 (mask) instead of +1 gives me roughly the
same instruction count in most situations as for architectures that use
a flags register:

* Cond. branch: S[cc] + BS ~= CMP + B[cc]
* Cond. select: S[cc] + SEL ~= CMP + CSEL[cc]/MOV[cc]
* Cond. to 0/1: S[cc] + NEG ~= CMP + CSET[cc]/MOV[cc]

OTOH it's a very good fit when dealing with vectors and even packed
data (SIMD). E.g. the following sequence does bytewise selection for
each vector element:

sltu.b v1, v2, v3
sel v1, v4, v5

SLTU.B does "set if less than (unsigned)" for each byte of a 32-bit
word, and since the instruction operands are vector registers, it will
perform the operation for each vector element (each element is 32 bits
wide).

SEL does "bitwise select", i.e. a = (b & a) | (c & ~a)
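
As a rough scalar model of the two instructions above, the following C sketch emulates them on a single 32-bit element. The function names simply mirror the mnemonics, the operand order of sel() follows the formula just given, and mask_to_bool() stands in for the "S[cc] + NEG" case from the list further up. None of this is taken from the MRISC32 toolchain; it is an illustration only.

```c
#include <stdint.h>

/* Per-byte unsigned "set if less than" on one 32-bit word:
   each byte of the result is 0xFF where x's byte < y's byte, else 0x00. */
static uint32_t sltu_b(uint32_t x, uint32_t y) {
    uint32_t mask = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t xb = (x >> (8 * i)) & 0xFFu;
        uint32_t yb = (y >> (8 * i)) & 0xFFu;
        if (xb < yb)
            mask |= 0xFFu << (8 * i);
    }
    return mask;
}

/* Bitwise select: a = (b & a) | (c & ~a), with a holding the mask. */
static uint32_t sel(uint32_t a, uint32_t b, uint32_t c) {
    return (b & a) | (c & ~a);
}

/* Turning an all-ones/zero mask into a C-style 1/0 costs one negation. */
static uint32_t mask_to_bool(uint32_t mask) {
    return 0u - mask;
}
```

With x = 0x01FF0310 and y = 0x02000311, only the lowest and highest bytes of x are smaller, so sltu_b() gives 0xFF0000FF, and sel() with that mask picks those two bytes from its second operand and the middle two from its third.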

One of the key design principles of the MRISC32 ISA is that the same
instructions are used for scalar and for vector operands, with the same
semantics.

/Marcus

Re: RISC-V vs. Aarch64

<2021Dec31.203710@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=22666&group=comp.arch#22666

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Fri, 31 Dec 2021 19:37:10 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 85
Message-ID: <2021Dec31.203710@mips.complang.tuwien.ac.at>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me> <59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me> <sqmsqq$14kp$1@gioia.aioe.org> <VSFzJ.136700$7D4.47834@fx37.iad>
Injection-Info: reader02.eternal-september.org; posting-host="0378d83b25f1312aaa05e6b22cb0be46";
logging-data="12454"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+vkRJ3rk4lwHXrVEzCvn4v"
Cancel-Lock: sha1:LJ96OHQcxS8F174puB4wXNdr52U=
X-newsreader: xrn 10.00-beta-3

by: Anton Ertl - Fri, 31 Dec 2021 19:37 UTC

EricP <ThatWouldBeTelling@thevillage.com> writes:
>Terje Mathisen wrote:
>> Marcus wrote:
>>> On 2021-12-30, EricP wrote:
>>>> C,C++ and a bunch of languages explicitly define booleans as 0 or 1
>>>> so this definition won't be optimal for those languages.
>>>> VAX Fortran used 0,-1 for LOGICAL but I don't know if that
>>>> was defined by the language or implementation dependent.
>>
>> -1 is better than 1, it can be used as a mask.
>
>One or the other usage has to do a negate.

Not necessarily, there are architectures that can do both.

E.g., on Aarch64, the Forth word < (which produces 0 or -1) looks as
follows:

55555843A8: ldr x0, [x25,#0x8]! #load second stack item
55555843AC: add x26, x26, #0x8 #update instruction pointer
55555843B0: subs xzr, x0, x27 #compare
55555843B4: csinv x27, xzr, xzr, ge #and reify the ge condition as 0/-1
55555843B8: ldur x1, [x26,#-0x8] #dispatch next primitive
55555843BC: br x1 #dispatch next primitive

>The question is which usage, boolean or mask, is more common?

You can represent a boolean true as -1.

The issue is not "boolean", but C.

>There is a lot of C code that expects that a = b < c; produces 0 or 1.

Not just that, the language guarantees it.
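
To make the guarantee concrete, here is a small C sketch (the function names are made up for illustration). A relational expression in C has type int and the value 0 or 1, so code may add comparison results directly, while a Forth-style 0/-1 flag has to be requested explicitly:

```c
#include <stdint.h>

/* C flag: the result of a < b is an int that is exactly 0 or 1. */
static int lt_flag(int64_t a, int64_t b) {
    return a < b;
}

/* Forth-style flag: 0 or -1 (all bits set), obtained by negating the C flag. */
static int64_t lt_forth(int64_t a, int64_t b) {
    return -(int64_t)(a < b);
}

/* Typical code relying on the 0/1 guarantee: count elements below a limit
   by summing comparison results. */
static int count_below(const int *v, int n, int limit) {
    int count = 0;
    for (int i = 0; i < n; i++)
        count += v[i] < limit;
    return count;
}
```

On a machine whose compare yields -1, count_below() is the kind of code where the flag really has to be reified as 0/1 (or the addition turned into a subtraction).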

>So if CMP produces a -1 then a C compiler would always generate a NEG too.

Only if the flag needs to be reified (not that common in C, with
branching operators like && and ||).

>For boolean expressions in IF statements the NEG is unnecessary
>if the branch tests zero/non-zero and so it might be optimized away.
>But that requires the compiler code gen knowing that boolean expressions
>in IF statements are not quite the same as other boolean expressions

This is easy to implement in a compiler. E.g., in tree-parsing
instruction selection you would have stuff like:

grammar rule              # generated code
root: Condbranch(cond)    # branch on the comparison result
cond: Lt(reg,reg)         # generate a comparison
reg:  cond                # reify the cond as 0/1

and as long as the conds are consumed only by condbranches, you get no
reification and no negation.
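
A toy version of these three rules, hand-coded in C rather than generated by a tree-parser generator, might look like the sketch below. Everything in it is an assumption made for illustration: the node kinds, the pseudo-mnemonics "cmp", "set01", "branch" and "ret", and the extra Return consumer that forces a reg. The point is only that the reifying instruction appears solely where a cond has to become a reg.

```c
#include <stdio.h>
#include <string.h>

enum op { OP_VAR, OP_LT, OP_CONDBRANCH, OP_RETURN };

struct node {
    enum op op;
    const struct node *kid[2];
};

/* Append one pseudo-instruction to the output buffer. */
static void emit(char *out, size_t cap, const char *insn) {
    size_t len = strlen(out);
    if (len < cap)
        snprintf(out + len, cap - len, "%s\n", insn);
}

/* cond: Lt(reg,reg) -- only the comparison itself. */
static void gen_cond(const struct node *n, char *out, size_t cap) {
    if (n->op == OP_LT)
        emit(out, cap, "cmp");
}

/* reg: cond -- chain rule, costs one extra reifying instruction. */
static void gen_reg(const struct node *n, char *out, size_t cap) {
    gen_cond(n, out, cap);
    emit(out, cap, "set01");
}

/* root: Condbranch(cond) uses the cond directly; Return (hypothetical)
   needs the value in a register. */
static void gen_root(const struct node *n, char *out, size_t cap) {
    if (n->op == OP_CONDBRANCH) {
        gen_cond(n->kid[0], out, cap);
        emit(out, cap, "branch");
    } else if (n->op == OP_RETURN) {
        gen_reg(n->kid[0], out, cap);
        emit(out, cap, "ret");
    }
}

/* Build Lt(a,b) under the given consumer and return the selected code. */
static const char *select_for(enum op consumer) {
    static char buf[64];
    static const struct node a = { OP_VAR, { NULL, NULL } };
    static const struct node b = { OP_VAR, { NULL, NULL } };
    static const struct node lt = { OP_LT, { &a, &b } };
    struct node root = { consumer, { &lt, NULL } };
    buf[0] = '\0';
    gen_root(&root, buf, sizeof buf);
    return buf;
}
```

select_for(OP_CONDBRANCH) yields "cmp" followed by "branch", whereas select_for(OP_RETURN) yields "cmp", "set01", "ret".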

>(and we give a special dispensation to & and | operators on booleans).

Not sure what you mean by that. You can also optimize & and | with
flags as parameters; assuming a cond is 0/-1 in a register (and not
just something in the condition codes; in that case I would introduce
additional nonterminals):

znz:  cond                # no code
cond: And(cond,cond)      # and
znz:  And(cond,reg)       # and
znz:  And(reg,cond)       # and
cond: Or(cond,cond)       # or
znz:  Or(znz,reg)         # or
znz:  Or(reg,znz)         # or

znz represents a flag as zero/nonzero value, which is useful in
combination with branch on zero instructions (i.e., if the result is
used by such a branch).
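
The distinction between cond and znz can be illustrated with a few lines of C, under the 0/-1 flag convention used in the Forth example (this is a sketch with made-up function names, and it models a bitwise AND on 0/-1 flags, not the value of C's own (a<b) & x):

```c
#include <stdint.h>

/* cond: a flag that is exactly 0 or -1 (all bits set). */
static int64_t cond_lt(int64_t a, int64_t b) {
    return -(int64_t)(a < b);
}

/* cond: And(cond,cond) -- the AND of two 0/-1 flags is again 0 or -1. */
static int64_t and_cond_cond(int64_t c1, int64_t c2) {
    return c1 & c2;
}

/* znz: And(cond,reg) -- the AND with an arbitrary register value is only
   known to be zero or nonzero. */
static int64_t and_cond_reg(int64_t c, int64_t r) {
    return c & r;
}

/* A branch-on-nonzero consumer accepts either representation unchanged. */
static int branch_taken(int64_t znz) {
    return znz != 0;
}
```

In both cases a single AND instruction suffices; the nonterminal only records how much is known about the result, so that no normalization is emitted when the consumer is a branch on zero/nonzero.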

>It seems easier to have the mask user pay for the NEG.

Certainly MIPS, Alpha, and RISC-V designers thought so. But of course
it is possible to use instruction combining to eliminate the latency
cost of the negation.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: RISC-V vs. Aarch64

<f715843c-6b9c-46dd-8064-700e670b7af4n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22668&group=comp.arch#22668

X-Received: by 2002:ac8:5b01:: with SMTP id m1mr31236015qtw.313.1640982603587;
Fri, 31 Dec 2021 12:30:03 -0800 (PST)
X-Received: by 2002:a4a:8902:: with SMTP id f2mr18755955ooi.59.1640982603287;
Fri, 31 Dec 2021 12:30:03 -0800 (PST)
Path: i2pn2.org!rocksolid2!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 31 Dec 2021 12:30:02 -0800 (PST)
In-Reply-To: <2021Dec31.203710@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:44d8:5de8:1a0e:df71;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:44d8:5de8:1a0e:df71
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me>
<RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me>
<sqmsqq$14kp$1@gioia.aioe.org> <VSFzJ.136700$7D4.47834@fx37.iad> <2021Dec31.203710@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f715843c-6b9c-46dd-8064-700e670b7af4n@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 31 Dec 2021 20:30:03 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 103

by: MitchAlsup - Fri, 31 Dec 2021 20:30 UTC

On Friday, December 31, 2021 at 2:04:34 PM UTC-6, Anton Ertl wrote:
> EricP <ThatWould...@thevillage.com> writes:
> >Terje Mathisen wrote:
> >> Marcus wrote:
> >>> On 2021-12-30, EricP wrote:
> >>>> C,C++ and a bunch of languages explicitly define booleans as 0 or 1
> >>>> so this definition won't be optimal for those languages.
> >>>> VAX Fortran used 0,-1 for LOGICAL but I don't know if that
> >>>> was defined by the language or implementation dependent.
> >>
> >> -1 is better than 1, it can be used as a mask.
> >
> >One or the other usage has to do a negate.
> Not necessarily, there are architectures that can do both.
<
And there are architectures which provide both; this is my recommendation:
provide both.
>
> E.g., on Aarch64, the Forth word < (which produces 0 or -1) looks as
> follows:
>
> 55555843A8: ldr x0, [x25,#0x8]! #load second stack item
> 55555843AC: add x26, x26, #0x8 #update instruction pointer
> 55555843B0: subs xzr, x0, x27 #compare
> 55555843B4: csinv x27, xzr, xzr, ge #and reify the ge condition as 0/-1
> 55555843B8: ldur x1, [x26,#-0x8] #dispatch next primitive
> 55555843BC: br x1 #dispatch next primitive
> >The question is which usage, boolean or mask, is more common?
> You can represent a boolean true as -1.
>
> The issue is not "boolean", but C.
> >There is a lot of C code that expects that a = b < c; produces 0 or 1.
> Not just that, the language guarantees it.
> >So if CMP produces a -1 then a C compiler would always generate a NEG too.
> Only if the flag needs to be reified (not that common in C, with
> branching operators like && and ||).
<
And then there is My 66000, which has sign control bits in integer calculation
instructions:
<
Rd = -Rs1 - Rs2
<
to avoid 90%± of negation instructions; these pair with the constant attachment
encoding to decrease entropy of the encoding.
<
> >For boolean expressions in IF statements the NEG is unnecessary
> >if the branch tests zero/non-zero and so it might be optimized away.
> >But that requires the compiler code gen knowing that boolean expressions
> >in IF statements are not quite the same as other boolean expressions
> This is easy to implement in a compiler. E.g., in tree-parsing
> instruction selection you would have stuff like:
>
> grammar rule              # generated code
> root: Condbranch(cond)    # branch on the comparison result
> cond: Lt(reg,reg)         # generate a comparison
> reg:  cond                # reify the cond as 0/1
>
> and as long as the conds are consumed only by condbranches, you get no
> reification and no negation.
<
In almost every reasonable(tm) ISA there is a way to CMP and BC without
having to jump through hoops of NEGs.
<
> >(and we give a special dispensation to & and | operators on booleans).
> Not sure what you mean by that. You can also optimize & and | with
> flags as parameters; assuming a cond is 0/-1 in a register (and not
> just something in the condition codes; in that case I would introduce
> additional nonterminals):
>
> znz:  cond                # no code
> cond: And(cond,cond)      # and
> znz:  And(cond,reg)       # and
> znz:  And(reg,cond)       # and
> cond: Or(cond,cond)       # or
> znz:  Or(znz,reg)         # or
> znz:  Or(reg,znz)         # or
>
> znz represents a flag as zero/nonzero value, which is useful in
> combination with branch on zero instructions (i.e., if the result is
> used by such a branch).
> >It seems easier to have the mask user pay for the NEG.
<
Mask consumers of conditions are less than 10% of consumers of conditions;
so, it makes sense to make them pay.
<
> Certainly MIPS, Alpha, and RISC-V designers thought so. But of course
> it is possible to use instruction combining to eliminate the latency
> cost of the negation.
<
It is even possible to make 90%+ of NEGs disappear by embedding negation/inversion
into the semantics of ISA encoding.
>
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: RISC-V vs. Aarch64

<5fccd1a5-55f2-4618-af28-bf2967270f8an@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22669&group=comp.arch#22669

X-Received: by 2002:ac8:4e8a:: with SMTP id 10mr28430757qtp.43.1640983859997;
Fri, 31 Dec 2021 12:50:59 -0800 (PST)
X-Received: by 2002:a05:6808:1448:: with SMTP id x8mr27774840oiv.84.1640983859380;
Fri, 31 Dec 2021 12:50:59 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 31 Dec 2021 12:50:59 -0800 (PST)
In-Reply-To: <RrlzJ.130558$SR4.25229@fx43.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:4d41:3991:778b:66db;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:4d41:3991:778b:66db
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me>
<RrlzJ.130558$SR4.25229@fx43.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <5fccd1a5-55f2-4618-af28-bf2967270f8an@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 31 Dec 2021 20:50:59 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 9

by: Quadibloc - Fri, 31 Dec 2021 20:50 UTC

On Thursday, December 30, 2021 at 9:51:01 AM UTC-7, EricP wrote:

> VAX Fortran used 0,-1 for LOGICAL but I don't know if that
> was defined by the language or implementation dependent.

I believe it would have been implementation dependent, since
IBM mainframe FORTRAN used 0 and 1 for .FALSE. and .TRUE.
respectively.

John Savard

Re: RISC-V vs. Aarch64

<6248bec2-10b8-4278-af01-e7836ecc491dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22670&group=comp.arch#22670

X-Received: by 2002:ac8:5a84:: with SMTP id c4mr32083394qtc.565.1640986083831;
Fri, 31 Dec 2021 13:28:03 -0800 (PST)
X-Received: by 2002:a05:6808:3097:: with SMTP id bl23mr29089708oib.0.1640986083642;
Fri, 31 Dec 2021 13:28:03 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!3.eu.feeder.erje.net!2.us.feeder.erje.net!feeder.erje.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 31 Dec 2021 13:28:03 -0800 (PST)
In-Reply-To: <172ecb29-5ba8-4a35-9a77-7fba617e7389n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:4d41:3991:778b:66db;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:4d41:3991:778b:66db
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <0a8ff16a-53de-420e-9c82-cfc9e87f62e9n@googlegroups.com>
<sq675n$tht$1@dont-email.me> <172ecb29-5ba8-4a35-9a77-7fba617e7389n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6248bec2-10b8-4278-af01-e7836ecc491dn@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 31 Dec 2021 21:28:03 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 31

by: Quadibloc - Fri, 31 Dec 2021 21:28 UTC

On Thursday, December 30, 2021 at 2:24:37 PM UTC-7, Quadibloc wrote:
> On Friday, December 24, 2021 at 9:37:14 PM UTC-7, Ivan Godard wrote:
> > On 12/24/2021 10:17 AM, MitchAlsup wrote:
>
> > > There is NO JUSTIFIABLE reason that an instruction is not entirely
> > > self-contained!
>
> > Really? I suppose that might be true for OOO that is emulating
> > sequential execution, but what about VLIW and other wide multi-issue?
>
> > Chopping off the variable and large parts so they can be recognized in
> > parallel lets the issue width no longer depend on how many constants you
> > have.
>
> This is the sort of thing I've been fooling around with finding a way to
> achieve. To avoid too much complexity, my recent designs all involved
> a self-contained 256-bit block which combined as many fixed-length
> instructions as possible with the variable or large parts they needed.

I've turned my attention to Concertina II again, having thought of a
way to simplify one aspect of that design considerably.

There would just be one instruction format, and either blocks would
have no header, or just one kind of header.

Unfortunately, that turns out to have constrained available opcode space
severely enough to require some serious compromises, so it will take
some time to make those livable. Essentially, I have only 30 bits for the
32-bit instructions, since I include a break bit in the instruction itself,
and use half the opcode space for pairs of 15-bit instructions.
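
The bit budget works out as 32 - 1 (break bit) - 1 (long/pair selector) = 30 = 2 x 15. A C sketch of such a word format follows; the bit positions, field order and names are purely hypothetical choices for illustration and are not taken from the Concertina II description.

```c
#include <stdint.h>

#define BREAK_BIT  (UINT32_C(1) << 31)          /* assumed position */
#define PAIR_BIT   (UINT32_C(1) << 30)          /* assumed position */
#define MASK30     ((UINT32_C(1) << 30) - 1)    /* payload of a long instruction */
#define MASK15     ((UINT32_C(1) << 15) - 1)    /* payload of a short instruction */

struct decoded {
    int break_after;    /* break bit was set */
    int is_pair;        /* word holds two 15-bit instructions */
    uint32_t insn[2];   /* insn[1] is only meaningful when is_pair is set */
};

/* One 30-bit instruction in a 32-bit word. */
static uint32_t encode_long(int brk, uint32_t insn30) {
    return (brk ? BREAK_BIT : 0) | (insn30 & MASK30);
}

/* Two 15-bit instructions sharing one 32-bit word. */
static uint32_t encode_pair(int brk, uint32_t first, uint32_t second) {
    return (brk ? BREAK_BIT : 0) | PAIR_BIT
         | ((first & MASK15) << 15) | (second & MASK15);
}

static struct decoded decode_word(uint32_t w) {
    struct decoded d;
    d.break_after = (w & BREAK_BIT) != 0;
    d.is_pair = (w & PAIR_BIT) != 0;
    if (d.is_pair) {
        d.insn[0] = (w >> 15) & MASK15;
        d.insn[1] = w & MASK15;
    } else {
        d.insn[0] = w & MASK30;
        d.insn[1] = 0;
    }
    return d;
}
```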

John Savard

Re: RISC-V vs. Aarch64

<sqpd0i$spj$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=22674&group=comp.arch#22674

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sat, 1 Jan 2022 11:13:22 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sqpd0i$spj$1@newsreader4.netcologne.de>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
Injection-Date: Sat, 1 Jan 2022 11:13:22 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:eb03:0:7285:c2ff:fe6c:992d";
logging-data="29491"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Sat, 1 Jan 2022 11:13 UTC

aph@littlepinkcloud.invalid <aph@littlepinkcloud.invalid> wrote:
> Thomas Koenig <tkoenig@netcologne.de> wrote:
>> Marcus <m.delete@this.bitsnbites.eu> wrote:
>>> On 2021-12-30 22:07, Thomas Koenig wrote:
>>
>>>> What is the assembly for
>>>>
>>>> _Bool foo (int a, int b)
>>>> {
>>>>   return a > b;
>>>> }
>>>>
>>>> for your architecture at a reasonable optimization level?
>>>>
>>>
>>> As a matter of fact, you can now try out MRISC32 on Compiler Explorer
>>> (support was added about a week ago): https://godbolt.org/z/z9sK3MYeM
>>
>>> POWER:
>>> subf 3,3,4
>>> srdi 3,3,63
>>> blr
>>
>> Subtract and right shift of the sign bit. That's another
>> way of doing this, I guess. When I saw this, I thought
>> "Somebody must have been feeling hackish"...
>
> Is it even correct? Surely that's (b - a < 0), not (a > b). I think
> it'd return true for foo(-0x7fffffff, 0x7fffffff). But I don't know
> the POWER ISA at all well, so pls ignore if this is cleverer than I
> think. :-).

POWER uses 64-bit data only (but it generates a 32-bit carry,
just in case), so this is indeed correct.
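
A C model of the two instructions shows why, assuming that the int arguments arrive sign-extended in 64-bit registers (the function names are made up for this sketch). The 64-bit difference of two sign-extended 32-bit values cannot overflow, so its sign bit is exact; the same trick done in 32 bits fails for exactly the case asked about:

```c
#include <stdint.h>

/* Model of "subf 3,3,4 ; srdi 3,3,63" on sign-extended int arguments. */
static int gt_via_sub64(int32_t a, int32_t b) {
    int64_t wa = a;                                /* sign-extended in r3 */
    int64_t wb = b;                                /* sign-extended in r4 */
    uint64_t diff = (uint64_t)wb - (uint64_t)wa;   /* subf: r3 = r4 - r3 */
    return (int)(diff >> 63);                      /* srdi: sign bit of b - a */
}

/* The same idea with a wrapping 32-bit subtraction, for comparison only. */
static int gt_via_sub32(int32_t a, int32_t b) {
    uint32_t diff = (uint32_t)b - (uint32_t)a;
    return (int)(diff >> 31);
}
```

For foo(-0x7fffffff, 0x7fffffff) the 64-bit version returns 0 as it should, while the 32-bit version wraps around and returns 1.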

The code emitted for

_Bool foo (long a, long b)
{
  return a > b;
}

is far less pretty and depends on the selected CPU and compiler.

gcc 4.8.5:

cmpd cr7,r3,r4
mfocrf r3,1
rlwinm r3,r3,30,31,31
blr

This moves a condition code into a register and extracts
the bit field.

gcc 12 with -mcpu=power8 (the current default):

sradi r10,r4,63
rldicl r9,r3,1,63
subfc r3,r3,r4
adde r3,r9,r10
xori r3,r3,1
clrlwi r3,r3,24
blr

That is weird. Two shifts and masks, one subtraction, one
addition, one xor, and one instruction that clears bits.
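
One way to read this sequence, modelled in C below, is as a branch-free signed "a <= b" followed by an inversion: the sign bit of a, the sign mask of b and the carry of the unsigned subtraction b - a add up to exactly 0 or 1. This is my interpretation, assuming the usual Power carry convention for subfc (CA = 1 when there is no borrow); the function name is invented.

```c
#include <stdint.h>

/* Model of the POWER8 sequence for "a > b" on 64-bit signed operands. */
static unsigned gt_power8_model(int64_t a, int64_t b) {
    uint64_t ua = (uint64_t)a, ub = (uint64_t)b;

    uint64_t r10 = (b < 0) ? UINT64_MAX : 0;  /* sradi  r10,r4,63: 0 or -1 */
    uint64_t r9  = ua >> 63;                  /* rldicl r9,r3,1,63: sign bit of a */
    uint64_t ca  = (ub >= ua);                /* subfc  r3,r3,r4: only the carry
                                                 (no borrow in b - a) is used */
    uint64_t r3  = r9 + r10 + ca;             /* adde   r3,r9,r10: 1 iff a <= b */
    r3 ^= 1;                                  /* xori   r3,r3,1: invert to a > b */
    r3 &= 0xFF;                               /* clrlwi r3,r3,24: keep low 8 bits */
    return (unsigned)r3;
}
```

When both operands have the same sign, the first two terms cancel and the unsigned carry alone decides; when the signs differ, the sign terms override the (then misleading) unsigned comparison.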

gcc 12 with -mcpu=power9:

li r9,0
li r10,1
cmpd r3,r4
iselgt r3,r10,r9
blr

Load constants into two registers and select based on a compare
instruction.

xlc:

cmpd r3,r4
li r3,1
bgt 14 <foo+0x14>
li r3,0
ori r2,r2,0
blr

Load 1, and conditionally branch around the load of 0. The or
with immediate is a nop and some sort of hint to the linker, IIRC.

Clang:

sradi r5,r4,63
rldicl r6,r3,1,63
subfc r3,r3,r4
adde r3,r6,r5
xori r3,r3,1
blr

Same as gcc 12 for POWER8 above, but without the final clear.

At least going by instruction count, the oldest solution looks
to be the best.

Re: RISC-V vs. Aarch64

<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22675&group=comp.arch#22675

X-Received: by 2002:a05:622a:1652:: with SMTP id y18mr34009527qtj.63.1641038758981;
Sat, 01 Jan 2022 04:05:58 -0800 (PST)
X-Received: by 2002:a4a:8746:: with SMTP id a6mr23379161ooi.93.1641038758656;
Sat, 01 Jan 2022 04:05:58 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 1 Jan 2022 04:05:58 -0800 (PST)
In-Reply-To: <sqpd0i$spj$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2607:fea8:1de1:fb00:5505:c070:7945:2082;
posting-account=QId4bgoAAABV4s50talpu-qMcPp519Eb
NNTP-Posting-Host: 2607:fea8:1de1:fb00:5505:c070:7945:2082
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me>
<RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me>
<sql73d$6es$2@newsreader4.netcologne.de> <sqmj5j$s31$1@dont-email.me>
<sqmmso$446$2@newsreader4.netcologne.de> <gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: robfi...@gmail.com (robf...@gmail.com)
Injection-Date: Sat, 01 Jan 2022 12:05:58 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 122

by: robf...@gmail.com - Sat, 1 Jan 2022 12:05 UTC

On Saturday, January 1, 2022 at 6:13:25 AM UTC-5, Thomas Koenig wrote:
> a...@littlepinkcloud.invalid <a...@littlepinkcloud.invalid> wrote:
> > Thomas Koenig <tko...@netcologne.de> wrote:
> >> Marcus <m.de...@this.bitsnbites.eu> wrote:
> >>> On 2021-12-30 22:07, Thomas Koenig wrote:
> >>
> >>>> What is the assembly for
> >>>>
> >>>> _Bool foo (int a, int b)
> >>>> {
> >>>>   return a > b;
> >>>> }
> >>>>
> >>>> for your architecture at a reasonable optimization level?
> >>>>
> >>>
> >>> As a matter of fact, you can now try out MRISC32 on Compiler Explorer
> >>> (support was added about a week ago): https://godbolt.org/z/z9sK3MYeM
> >>
> >>> POWER:
> >>> subf 3,3,4
> >>> srdi 3,3,63
> >>> blr
> >>
> >> Subtract and right shift of the sign bit. That's another
> >> way of doing this, I guess. When I saw this, I thought
> >> "Somebody must have been feeling hackish"...
> >
> > Is it even correct? Surely that's (b - a < 0), not (a > b). I think
> > it'd return true for foo(-0x7fffffff, 0x7fffffff). But I don't know
> > the POWER ISA at all well, so pls ignore if this is cleverer than I
> > think. :-).
> POWER uses 64-bit data only (but it generates a 32-bit carry,
> just in case), so this is indeed correct.
>
> The code emitted for
>
> _Bool foo (long a, long b)
> {
>   return a > b;
> }
>
> is far less pretty and depends on the selected CPU and compiler.
>
> gcc 4.8.5:
>
> cmpd cr7,r3,r4
> mfocrf r3,1
> rlwinm r3,r3,30,31,31
> blr
>
> This moves a condition code into a register and extracts
> the bit field.
>
> gcc 12 with -mcpu=power8 (the current default):
>
> sradi r10,r4,63
> rldicl r9,r3,1,63
> subfc r3,r3,r4
> adde r3,r9,r10
> xori r3,r3,1
> clrlwi r3,r3,24
> blr
>
> That is weird. Two shifts and masks, one subtraction, one
> addition, one xor, and one instruction that clears bits.
>
> gcc 12 with -mcpu=power9:
>
> li r9,0
> li r10,1
> cmpd r3,r4
> iselgt r3,r10,r9
> blr
>
> Load constants into two registers and select based on a compare
> instruction.
>
> xlc:
>
> cmpd r3,r4
> li r3,1
> bgt 14 <foo+0x14>
> li r3,0
> ori r2,r2,0
> blr
>
> Load 1, and conditionally branch around the load of 0. The or
> with immediate is a nop and some sort of hint to the linker, IIRC.
>
> Clang:
>
> sradi r5,r4,63
> rldicl r6,r3,1,63
> subfc r3,r3,r4
> adde r3,r6,r5
> xori r3,r3,1
> blr
>
> Same as gcc 12 for POWER8 above, but without the final clear.
>
> At least going by instruction count, the oldest solution looks
> to be the best.

The PowerPC code demonstrates one of the issues with using condition
registers instead of GPRs for comparison results: they end up being
transferred to GPRs anyway. One would think that the compiler should
return a value in a condition register for a function returning a bool, which
would be much more efficient. There are eight condition registers available.
That way the condition register could be specified in a subsequent
conditional statement without needing to convert the GPR value to a
condition code. The equals flag of the condition register could be used to
store the Boolean value. The compiler would have to track values in
condition registers. There should be a ‘set’ instruction for the condition
registers allowing the bits to be transferred to the equals bit.

Compiled code should look something like:
SLT cr0,r3,r4
RET

Re: RISC-V vs. Aarch64

<d8ada1ec-12ae-4314-86f4-0f843ad9f304n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22676&group=comp.arch#22676

X-Received: by 2002:ac8:5c50:: with SMTP id j16mr34698542qtj.255.1641043720815;
Sat, 01 Jan 2022 05:28:40 -0800 (PST)
X-Received: by 2002:aca:646:: with SMTP id 67mr30253518oig.175.1641043720601;
Sat, 01 Jan 2022 05:28:40 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 1 Jan 2022 05:28:40 -0800 (PST)
In-Reply-To: <6248bec2-10b8-4278-af01-e7836ecc491dn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:20af:dcf:4a41:e82;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:20af:dcf:4a41:e82
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <0a8ff16a-53de-420e-9c82-cfc9e87f62e9n@googlegroups.com>
<sq675n$tht$1@dont-email.me> <172ecb29-5ba8-4a35-9a77-7fba617e7389n@googlegroups.com>
<6248bec2-10b8-4278-af01-e7836ecc491dn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d8ada1ec-12ae-4314-86f4-0f843ad9f304n@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 01 Jan 2022 13:28:40 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 18

by: Quadibloc - Sat, 1 Jan 2022 13:28 UTC

On Friday, December 31, 2021 at 2:28:04 PM UTC-7, Quadibloc wrote:

> I've turned my attention to Concertina II again, having thought of a
> way to simplify one aspect of that design considerably.
>
> There would just be one instruction format, and either blocks would
> have no header, or just one kind of header.
>
> Unfortunately, that turns out to have constrained available opcode space
> severely enough to require some serious compromises, so it will take
> some time to make those livable. Essentially, I have only 30 bits for the
> 32-bit instructions, since I include a break bit in the instruction itself,
> and use half the opcode space for pairs of 15-bit instructions.

I've come up with something that should work. The sacrifice is that
16-bit instructions will only be available within code blocks with headers -
and they will not be able to be predicated.

John Savard

Re: RISC-V vs. Aarch64

<sqpn9u$mi5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=22677&group=comp.arch#22677

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sat, 1 Jan 2022 06:09:00 -0800
Organization: A noiseless patient Spider
Lines: 127
Message-ID: <sqpn9u$mi5$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 1 Jan 2022 14:09:02 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="ea7f2618af31aacb02a429a4749e0332";
logging-data="23109"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/SeLUPJTgBkloSJRAylB+y"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.4.1
Cancel-Lock: sha1:i70a2IxGUFPhuw/NjN+VHfBhDms=
In-Reply-To: <650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
Content-Language: en-US

by: Ivan Godard - Sat, 1 Jan 2022 14:09 UTC

On 1/1/2022 4:05 AM, robf...@gmail.com wrote:
> On Saturday, January 1, 2022 at 6:13:25 AM UTC-5, Thomas Koenig wrote:
>> a...@littlepinkcloud.invalid <a...@littlepinkcloud.invalid> schrieb:
>>> Thomas Koenig <tko...@netcologne.de> wrote:
>>>> Marcus <m.de...@this.bitsnbites.eu> schrieb:
>>>>> On 2021-12-30 22:07, Thomas Koenig wrote:
>>>>
>>>>>> What is the assembly for
>>>>>>
>>>>>> _Bool foo (int a, int b)
>>>>>> {
>>>>>> return a > b;
>>>>>> }
>>>>>>
>>>>>> for your architecture at a reasonable optimization level?
>>>>>>
>>>>>
>>>>> As a matter of fact, you can now try out MRISC32 on Compiler Explorer
>>>>> (support was added about a week ago): https://godbolt.org/z/z9sK3MYeM
>>>>
>>>>> POWER:
>>>>> subf 3,3,4
>>>>> srdi 3,3,63
>>>>> blr
>>>>
>>>> Subtract and right shift of the sign bit. That's another
>>>> way of doing this, I guess. When I saw this, I thought
>>>> "Somebody must have been feeling hackish"...
>>>
>>> Is it even correct? Surely that's (b - a < 0), not (a > b). I think
>>> it'd return true for foo(-0x7fffffff, 0x7fffffff). But I don't know
>>> the POWER ISA at all well, so pls ignore if this is cleverer than I
>>> think. :-).
>> POWER uses 64-bit data only (but it generates a 32-bit carry,
>> just in case), so this is indeed correct.
>>
>> The code emitted for
>>
>> _Bool foo (long a, long b)
>> {
>> return a > b;
>> }
>>
>> is far less pretty and depends on the selected CPU and compiler.
>>
>> gcc 4.8.5:
>>
>> cmpd cr7,r3,r4
>> mfocrf r3,1
>> rlwinm r3,r3,30,31,31
>> blr
>>
>> This moves a condition code into a register and extracts
>> the bit field.
>>
>> gcc 12 with -mcpu=power8 (the current default):
>>
>> sradi r10,r4,63
>> rldicl r9,r3,1,63
>> subfc r3,r3,r4
>> adde r3,r9,r10
>> xori r3,r3,1
>> clrlwi r3,r3,24
>> blr
>>
>> That is weird. Two shifts and masks, one subtraction, one
>> addition, one xor, and one instruction that clears bits.
>>
>> gcc 12 with -mcpu=power9:
>>
>> li r9,0
>> li r10,1
>> cmpd r3,r4
>> iselgt r3,r10,r9
>> blr
>>
>> Load constants into two registers and select based on a compare
>> instruction.
>>
>> xlc:
>>
>> cmpd r3,r4
>> li r3,1
>> bgt 14 <foo+0x14>
>> li r3,0
>> ori r2,r2,0
>> blr
>>
>> Load 1, and conditionally branch around the load of 0. The or
>> with immediate is a nop and some sort of hint to the linker, IIRC.
>>
>> Clang:
>>
>> sradi r5,r4,63
>> rldicl r6,r3,1,63
>> subfc r3,r3,r4
>> adde r3,r6,r5
>> xori r3,r3,1
>> blr
>>
>> Same as gcc 12 for POWER8 above, but without the final clear.
>>
>> At least going by instruction count, the oldest solution looks
>> to be the best.
>
> The PowerPC code demonstrates one of the issues with using condition
> registers instead of GPRs for comparison results. They end up being
> transferred to GPRs anyway. One would think that the compiler should
> return a value in a condition register for a function returning a bool, which
> would be much more efficient. There are eight condition registers available.
> That way the condition register could be specified in a subsequent
> conditional statement without needing to convert the GPR value to a
> condition code. The equals flag of the condition register could be used to
> store the Boolean value. The compiler would have to track values in
> condition registers. There should be a ‘set’ instruction for the condition
> registers allowing the bits to be transferred to the equals bit.
>
> Compiled code should look something like:
> SLT cr0,r3,r4
> RET

This is very similar in effect to having different regfiles for integer
and FP, only in your proposal it's integer (and FP?) and bool. Splitting
regs by type does offer some code (and encoding) advantages, but also
some drawbacks. Think about how you would do VARARGS when bools are
passed in the flags :-)
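
To make the VARARGS point concrete, here is a minimal C sketch (count_true is a hypothetical name, not something from the thread). In standard C a _Bool passed through "..." undergoes the default argument promotions and arrives as an int, so the callee has to read it with va_arg(ap, int). An ABI that passed bools in condition-register fields would presumably still have to materialize them as integers for variadic calls.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdbool.h>

/* Counts how many of the n variadic flags are true.
   Each bool argument has been promoted to int by the caller,
   so it is fetched as an int here. */
static int count_true(int n, ...)
{
    va_list ap;
    int count = 0;

    assert(n >= 0);
    va_start(ap, n);
    for (int i = 0; i < n; i++)
        count += va_arg(ap, int) != 0;
    va_end(ap);
    return count;
}
```

A call such as count_true(3, a > b, b > a, a == a) passes three comparison results. With a flags-based bool convention, each would need a register or stack slot anyway.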

Re: RISC-V vs. Aarch64

<sqpocs$1so3$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=22678&group=comp.arch#22678

Path: i2pn2.org!i2pn.org!aioe.org!rd9pRsUZyxkRLAEK7e/Uzw.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sat, 1 Jan 2022 15:27:41 +0100
Organization: Aioe.org NNTP Server
Message-ID: <sqpocs$1so3$1@gioia.aioe.org>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="62211"; posting-host="rd9pRsUZyxkRLAEK7e/Uzw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10.1
X-Notice: Filtered by postfilter v. 0.9.2

by: Terje Mathisen - Sat, 1 Jan 2022 14:27 UTC

robf...@gmail.com wrote:
> On Saturday, January 1, 2022 at 6:13:25 AM UTC-5, Thomas Koenig wrote:
>> a...@littlepinkcloud.invalid <a...@littlepinkcloud.invalid> schrieb:
>>> Thomas Koenig <tko...@netcologne.de> wrote:
>>>> Marcus <m.de...@this.bitsnbites.eu> schrieb:
>>>>> On 2021-12-30 22:07, Thomas Koenig wrote:
>>>>
>>>>>> What is the assembly for
>>>>>>
>>>>>> _Bool foo (int a, int b)
>>>>>> {
>>>>>> return a > b;
>>>>>> }
>>>>>>
>>>>>> for your architecture at a reasonable optimization level?
>>>>>>
>>>>>
>>>>> As a matter of fact, you can now try out MRISC32 on Compiler Explorer
>>>>> (support was added about a week ago): https://godbolt.org/z/z9sK3MYeM
>>>>
>>>>> POWER:
>>>>> subf 3,3,4
>>>>> srdi 3,3,63
>>>>> blr
>>>>
>>>> Subtract and right shift of the sign bit. That's another
>>>> way of doing this, I guess. When I saw this, I thought
>>>> "Somebody must have been feeling hackish"...
>>>
>>> Is it even correct? Surely that's (b - a < 0), not (a > b). I think
>>> it'd return true for foo(-0x7fffffff, 0x7fffffff). But I don't know
>>> the POWER ISA at all well, so pls ignore if this is cleverer than I
>>> think. :-).
>> POWER uses 64-bit data only (but it generates a 32-bit carry,
>> just in case), so this is indeed correct.
>>
>> The code emitted for
>>
>> _Bool foo (long a, long b)
>> {
>> return a > b;
>> }
>>
>> is far less pretty and depends on the selected CPU and compiler.
>>
>> gcc 4.8.5:
>>
>> cmpd cr7,r3,r4
>> mfocrf r3,1
>> rlwinm r3,r3,30,31,31
>> blr
>>
>> This moves a condition code into a register and extracts
>> the bit field.
>>
>> gcc 12 with -mcpu=power8 (the current default):
>>
>> sradi r10,r4,63
>> rldicl r9,r3,1,63
>> subfc r3,r3,r4
>> adde r3,r9,r10
>> xori r3,r3,1
>> clrlwi r3,r3,24
>> blr
>>
>> That is weird. Two shifts and masks, one subtraction, one
>> addition, one xor, and one instruction that clears bits.
>>
>> gcc 12 with -mcpu=power9:
>>
>> li r9,0
>> li r10,1
>> cmpd r3,r4
>> iselgt r3,r10,r9
>> blr
>>
>> Load constants into two registers and select based on a compare
>> instruction.
>>
>> xlc:
>>
>> cmpd r3,r4
>> li r3,1
>> bgt 14 <foo+0x14>
>> li r3,0
>> ori r2,r2,0
>> blr
>>
>> Load 1, and conditionally branch around the load of 0. The or
>> with immediate is a nop and some sort of hint to the linker, IIRC.
>>
>> Clang:
>>
>> sradi r5,r4,63
>> rldicl r6,r3,1,63
>> subfc r3,r3,r4
>> adde r3,r6,r5
>> xori r3,r3,1
>> blr
>>
>> Same as gcc 12 for POWER8 above, but without the final clear.
>>
>> At least going by instruction count, the oldest solution looks
>> to be the best.
>
> The PowerPC code demonstrates one of the issues with using condition
> registers instead of GPRs for comparison results. They end up being
> transferred to GPRs anyway. One would think that the compiler should
> return a value in a condition register for a function returning a bool, which
> would be much more efficient. There are eight condition registers available.
> That way the condition register could be specified in a subsequent
> conditional statement without needing to convert the GPR value to a
> condition code. The equals flag of the condition register could be used to
> store the Boolean value. The compiler would have to track values in
> condition registers. There should be a ‘set’ instruction for the condition
> registers allowing the bits to be transferred to the equals bit.
>
> Compiled code should look something like:
> SLT cr0,r3,r4
> RET

Just like x86 condition codes, POWER compilers will probably do a better
job if they can inline bool functions, so that the condition code can be
used directly instead of first having to be reified as an int, then
tested again in the calling function.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: RISC-V vs. Aarch64

<sqpqbm$7qo$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=22679&group=comp.arch#22679

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sat, 1 Jan 2022 15:01:10 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <sqpqbm$7qo$1@newsreader4.netcologne.de>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org>
Injection-Date: Sat, 1 Jan 2022 15:01:10 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-eb03-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:eb03:0:7285:c2ff:fe6c:992d";
logging-data="8024"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Sat, 1 Jan 2022 15:01 UTC

Terje Mathisen <terje.mathisen@tmsw.no> schrieb:

> Just like x86 condition codes, POWER compilers will probably do a better
> job if they can inline bool functions, so that the condition code can be
> used directly instead of first having to be reified as an int, then
> tested again in the calling function.

Very much so.

_Bool gt (long int a, long int b)
{ return a > b;
}

long int mymax(long int a,long int b)
{ return gt(a,b) ? a : b;
}

will give you, with -O3 on a current trunk,

cmpd r3,r4
isellt r3,r4,r3
blr

for mymax.

Re: RISC-V vs. Aarch64

<KC_zJ.59028$Ak2.12921@fx20.iad>

https://www.novabbs.com/devel/article-flat.php?id=22680&group=comp.arch#22680

Path: i2pn2.org!i2pn.org!paganini.bofh.team!news.dns-netz.com!news.freedyn.de!newsreader4.netcologne.de!news.netcologne.de!peer02.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx20.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me> <59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me> <sqmsqq$14kp$1@gioia.aioe.org> <VSFzJ.136700$7D4.47834@fx37.iad> <2021Dec31.203710@mips.complang.tuwien.ac.at>
In-Reply-To: <2021Dec31.203710@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 68
Message-ID: <KC_zJ.59028$Ak2.12921@fx20.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sat, 01 Jan 2022 15:41:30 UTC
Date: Sat, 01 Jan 2022 10:40:39 -0500
X-Received-Bytes: 3775

by: EricP - Sat, 1 Jan 2022 15:40 UTC

Anton Ertl wrote:
> EricP <ThatWouldBeTelling@thevillage.com> writes:
>
>> So if CMP produces a -1 then a C compiler would always generate a NEG too.
>
> Only if the flag needs to be reified (not that common in C, with
> branching operators like && and ||).

What I was getting at was that the code gen point would likely be
so far down in the compiler that it would have lost the context necessary
to know whether this expression was a non-reified bool or integer,
it would have to assume it was integer and be forced to spit out a NEG
(as required by the C standard).

That left it with the much harder job of trying to optimize NEG away later.

But I see below that you show how to retain the necessary context.

>> For boolean expressions in IF statements the NEG is unnecessary
>> if the branch tests zero/non-zero and so it might be optimized away.
>> But that requires the compiler code gen knowing that boolean expressions
>> in IF statements are not quite the same as other boolean expressions
>
> This is easy to implement in a compiler. E.g., in tree-parsing
> instruction selection you would have stuff like:
>
> grammar rule # generated code
> root: Condbranch(cond) # branch on the comparison result
> cond: Lt(reg,reg) # generate a comparison
> reg: cond # reify the cond as 0/1
>
> and as long as the conds are consumed only by condbranches, you get no
> reification and no negation.
>
>> (and we give a special dispensation to & and | operators on booleans).
>
> Not sure what you mean by that. You can also optimize & and | with
> flags as parameters; assuming a cond is 0/-1 in a register (and not
> just something in the condition codes; in that case I would introduce
> additional nonterminals):

The & and | operators normally act on integral data types and return
the same types. It would need special handling so they operate on your
cond type and don't trigger reification.

As you show below.

> znz: cond # no code
> cond: And(cond,cond) # and
> znz: And(cond,reg) # and
> znz: And(reg,cond) # and
> cond: Or(cond,cond) # or
> znz: Or(znz,reg) # or
> znz: Or(reg,znz) # or
>
> znz represents a flag as zero/nonzero value, which is useful in
> combination with branch on zero instructions (i.e., if the result is
> used by such a branch).

Yes, this overloading for & and | would do.

The compiler would have to be built to propagate cond as a distinct type.
I was assuming that none of this would already be in a C compiler (why
would it be, since C defines these all as integer expressions?), which is
why I expected that a port to this ISA would always generate the NEG and
then have to try to optimize it away.
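
A small C model of the two contexts may help. It assumes a hypothetical compare instruction that yields -1 (all ones) for true, modeled here by cmp_lt_mask; all three function names are illustrative only. In a value context the mask has to be negated to get C's 0/1; in a branch context only zero/nonzero matters, so the NEG can be dropped.

```c
#include <assert.h>
#include <stdint.h>

/* Models a hypothetical compare that yields all ones (-1) when a < b,
   and 0 otherwise. */
static int32_t cmp_lt_mask(int32_t a, int32_t b)
{
    return a < b ? -1 : 0;
}

/* Value context: C requires (a < b) to be exactly 0 or 1,
   so the mask is negated. This negation is the NEG. */
static int lt_value(int32_t a, int32_t b)
{
    int32_t m = cmp_lt_mask(a, b);
    assert(m == 0 || m == -1);
    return -m;
}

/* Branch context: the branch only tests zero/nonzero,
   so the mask is consumed directly and no NEG is needed. */
static int32_t min_branch(int32_t a, int32_t b)
{
    if (cmp_lt_mask(a, b))
        return a;
    return b;
}
```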

Re: RISC-V vs. Aarch64

<2022Jan1.173658@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=22682&group=comp.arch#22682

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sat, 01 Jan 2022 16:36:58 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 95
Message-ID: <2022Jan1.173658@mips.complang.tuwien.ac.at>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me> <59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me> <sqmsqq$14kp$1@gioia.aioe.org> <VSFzJ.136700$7D4.47834@fx37.iad> <2021Dec31.203710@mips.complang.tuwien.ac.at> <KC_zJ.59028$Ak2.12921@fx20.iad>
Injection-Info: reader02.eternal-september.org; posting-host="5322b205682380f5271dc1257dc29a51";
logging-data="13110"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/qhEDOmRdD1V2uwxwvDnmL"
Cancel-Lock: sha1:YX6nDAl9naETZlwaryD/fsuyaCE=
X-newsreader: xrn 10.00-beta-3

by: Anton Ertl - Sat, 1 Jan 2022 16:36 UTC

EricP <ThatWouldBeTelling@thevillage.com> writes:
>Anton Ertl wrote:
>What I was getting at was that the code gen point would likely be
>so far down in the compiler that it would have lost the context necessary
>to know whether this expression was a non-reified bool or integer,

For that you would have to hide the data flow from the comparison to
the branch. Even relatively simple compilers will see this for most
code, and more sophisticated compilers keep track of data flow quite
far; admittedly, it gets a little tricky if there is a phi function in
between (but that's pretty rare for the stuff we are discussing here).

>> This is easy to implement in a compiler. E.g., in tree-parsing
>> instruction selection you would have stuff like:
>>
>> grammar rule # generated code
>> root: Condbranch(cond) # branch on the comparison result
>> cond: Lt(reg,reg) # generate a comparison
>> reg: cond # reify the cond as 0/1
>>
>> and as long as the conds are consumed only by condbranches, you get no
>> reification and no negation.
>>
>>> (and we give a special dispensation to & and | operators on booleans).
>>
>> Not sure what you mean by that. You can also optimize & and | with
>> flags as parameters; assuming a cond is 0/-1 in a register (and not
>> just something in the condition codes; in that case I would introduce
>> additional nonterminals):
>
>The & and | operators normally act on integral data types and return
>the same types. It would need special handling so they operate on your
>cond type and don't trigger reification.

cond is a nonterminal of the tree grammar, not a type. The trick with
tree grammars is that they typically are ambiguous, and the costs (not
shown here) of the code generation rules are used to select the
lowest-cost tree parse. In the present context, reifying will usually
be more expensive and will not be selected if an alternative exists.

>As you show below.
>
>> znz: cond # no code
>> cond: And(cond,cond) # and
>> znz: And(cond,reg) # and
>> znz: And(reg,cond) # and
>> cond: Or(cond,cond) # or
>> znz: Or(znz,reg) # or
>> znz: Or(reg,znz) # or
>>
>> znz represents a flag as zero/nonzero value, which is useful in
>> combination with branch on zero instructions (i.e., if the result is
>> used by such a branch).
>
>Yes, this overloading for & and | would do.
>
>The compiler would have to be built to propagate cond as a distinct type.

As mentioned, cond is a nonterminal, not a type (and root, reg, and
znz are also nonterminals). The rest of the compiler knows nothing
about these nonterminals, and does not need to concern itself with
them. It delivers a fully instantiated tree (i.e., only operators and
terminals) to the tree parser, and the tree parser selects the optimal
(wrt the tree grammar) parse (and thus code) for the tree. E.g.,

if (a<b) ...

could become the tree

Condbranch(Lt(Reg,Reg))

(note that the terminal "Reg" is different from the nonterminal "reg".)
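
As a rough illustration of how the costs pick the parse, here is a simplified bottom-up labeler in C. It is only a sketch: unit costs per rule, the extra Return operator (added to show a value context), and all identifiers are assumptions, and real tree-parser generators work differently in detail. Each node records the cheapest derivation from each nonterminal; the chain rule "reg: cond" (the reification) is only paid for when a parent actually needs a reg.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

enum op { REG, LT, CONDBRANCH, RETURN };         /* terminal and operators */
enum nt { NT_REG, NT_COND, NT_ROOT, NT_COUNT };  /* nonterminals */

#define INF (INT_MAX / 4)   /* "no parse"; sums of a few INFs cannot overflow */

struct node {
    enum op op;
    struct node *kid[2];
    int cost[NT_COUNT];     /* cheapest cost of deriving this subtree from each nonterminal */
};

/* Record cost c for nonterminal nt if it beats the current best. */
static void lower(struct node *n, enum nt nt, int c)
{
    if (c > INF)
        c = INF;            /* clamp so "no parse" never grows */
    if (c < n->cost[nt])
        n->cost[nt] = c;
}

/* Label the tree bottom-up with the minimal cost per nonterminal. */
static void label(struct node *n)
{
    assert(n != NULL);
    for (int i = 0; i < NT_COUNT; i++)
        n->cost[i] = INF;
    for (int i = 0; i < 2; i++)
        if (n->kid[i] != NULL)
            label(n->kid[i]);

    switch (n->op) {
    case REG:               /* reg: Reg              (no code)    */
        lower(n, NT_REG, 0);
        break;
    case LT:                /* cond: Lt(reg,reg)     (comparison) */
        lower(n, NT_COND, n->kid[0]->cost[NT_REG] + n->kid[1]->cost[NT_REG] + 1);
        break;
    case CONDBRANCH:        /* root: Condbranch(cond) (branch)    */
        lower(n, NT_ROOT, n->kid[0]->cost[NT_COND] + 1);
        break;
    case RETURN:            /* root: Return(reg)     (hypothetical value context) */
        lower(n, NT_ROOT, n->kid[0]->cost[NT_REG] + 1);
        break;
    }
    /* chain rule reg: cond  (reify the cond as 0/1) */
    lower(n, NT_REG, n->cost[NT_COND] + 1);
}

/* Number of instructions in the cheapest parse of the whole tree. */
static int select_cost(struct node *root)
{
    label(root);
    return root->cost[NT_ROOT];
}
```

With these rules, Condbranch(Lt(Reg,Reg)) costs 2 (compare, branch) and never touches the reification rule, while Return(Lt(Reg,Reg)) costs 3 (compare, reify, return).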

Tree parsing is just one instruction selection technology, but others
(e.g., Davidson-Fraser) are just as capable of avoiding reification of
flags in the common cases. In Davidson-Fraser the conceptual process
might be along the lines of first generating the reification, and then
eliminating it through peephole optimization.

I guess that this is a major reason why we have not yet seen in the
condition handling area the kind of standardization on one particular
architectural style that we have seen in other areas (e.g., register
machines have won quite a while ago, 2s-complement has won a long
time ago, little-endian has won a short time ago, etc.).

In particular, (general-purpose) register machines have won because
compilers are so bad at making good use of special-purpose registers,
yet the MIPS or 88k approaches (both of which use general-purpose
registers for conditions) have not won out over the condition-code
register approach (AMD64, ARM A32, ARM A64).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: RISC-V vs. Aarch64

<sqq3ce$c4n$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=22684&group=comp.arch#22684

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sat, 1 Jan 2022 18:35:10 +0100
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <sqq3ce$c4n$2@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 1 Jan 2022 17:35:10 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="7f459bd6dad7740af46c98d0e5a8aaf4";
logging-data="12439"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18HJZDW1AdtXqJVUkcqOn8Y1UVUvMWQa7Y="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Cancel-Lock: sha1:Fk/gpTg+c6sKcDIavDOPA/yUqmE=
In-Reply-To: <sqpqbm$7qo$1@newsreader4.netcologne.de>
Content-Language: en-US

by: Marcus - Sat, 1 Jan 2022 17:35 UTC

On 2022-01-01, Thomas Koenig wrote:
> Terje Mathisen <terje.mathisen@tmsw.no> schrieb:
>
>> Just like x86 condition codes, POWER compilers will probably do a better
>> job if they can inline bool functions, so that the condition code can be
>> used directly instead of first having to be reified as an int, then
>> tested again in the calling function.
>
> Very much so.
>
> _Bool gt (long int a, long int b)
> {
> return a > b;
> }
>
> long int mymax(long int a,long int b)
> {
> return gt(a,b) ? a : b;
> }
>
> will give you, with -O3 on a current trunk,
>
> cmpd r3,r4
> isellt r3,r4,r3
> blr
>
> for mymax.
>

The MRISC32 version, https://godbolt.org/z/r6rTj5aWv

mymax:
max r1,r1,r2
ret

;-)

Re: RISC-V vs. Aarch64

<sqssff$a9j$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=22690&group=comp.arch#22690

Path: i2pn2.org!i2pn.org!aioe.org!UgLt14+w9tVHe1BtIa3HDQ.user.46.165.242.75.POSTED!not-for-mail
From: mess...@bottle.org (Guillaume)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Sun, 2 Jan 2022 19:55:37 +0100
Organization: Aioe.org NNTP Server
Message-ID: <sqssff$a9j$1@gioia.aioe.org>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="10547"; posting-host="UgLt14+w9tVHe1BtIa3HDQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.4.1
X-Notice: Filtered by postfilter v. 0.9.2
Content-Language: fr

by: Guillaume - Sun, 2 Jan 2022 18:55 UTC

Le 01/01/2022 à 18:35, Marcus a écrit :
> On 2022-01-01, Thomas Koenig wrote:
>> Terje Mathisen <terje.mathisen@tmsw.no> schrieb:
>>
>>> Just like x86 condition codes, POWER compilers will probably do a better
>>> job if they can inline bool functions, so that the condition code can be
>>> used directly instead of first having to be reified as an int, then
>>> tested again in the calling function.
>>
>> Very much so.
>>
>> _Bool gt (long int a, long int b)
>> {
>>    return a > b;
>> }
>>
>> long int mymax(long int a,long int b)
>> {
>>    return gt(a,b) ? a : b;
>> }
>>
>> will give you, with -O3 on a current trunk,
>>
>>     cmpd    r3,r4
>>     isellt r3,r4,r3
>>     blr
>>
>> for mymax.
>>
>
> The MRISC32 version, https://godbolt.org/z/r6rTj5aWv
>
> mymax:
>         max     r1,r1,r2
>         ret
>
> ;-)

For RISCV, it's:
mymax:
bge a0,a1,.L4
mv a0,a1
.L4:
ret

So, requires a conditional branch...
But, for floating point (if supported), there is the 'fmax' instruction.
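
For comparison, a branch-free formulation is possible in plain C. The sketch below (max_branchless is a hypothetical name) builds an all-ones/zero mask from the comparison and uses it to select a or b; on the base integer ISA this would map to something like slt, neg, xor, and, xor. Whether a given compiler picks that over the branch is a cost-model decision, so this is an illustration and not a claim about what GCC or Clang emit. Where the Zbb extension is implemented, there is also an integer max instruction.

```c
#include <stdint.h>

/* Returns the larger of a and b without a conditional branch.
   mask is all ones when a > b and zero otherwise, so
   b ^ ((a ^ b) & mask) yields a in the first case and b in the second. */
static int64_t max_branchless(int64_t a, int64_t b)
{
    int64_t mask = -(int64_t)(a > b);
    return b ^ ((a ^ b) & mask);
}
```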

Re: RISC-V vs. Aarch64

<gWnAJ.5792$yl1.5519@fx23.iad>

https://www.novabbs.com/devel/article-flat.php?id=22692&group=comp.arch#22692

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!newsfeed.xs4all.nl!newsfeed7.news.xs4all.nl!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx23.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me> <59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me> <sqmsqq$14kp$1@gioia.aioe.org> <VSFzJ.136700$7D4.47834@fx37.iad> <2021Dec31.203710@mips.complang.tuwien.ac.at> <KC_zJ.59028$Ak2.12921@fx20.iad> <2022Jan1.173658@mips.complang.tuwien.ac.at>
In-Reply-To: <2022Jan1.173658@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 93
Message-ID: <gWnAJ.5792$yl1.5519@fx23.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Sun, 02 Jan 2022 20:29:00 UTC
Date: Sun, 02 Jan 2022 15:27:10 -0500
X-Received-Bytes: 5379

by: EricP - Sun, 2 Jan 2022 20:27 UTC

Anton Ertl wrote:
> EricP <ThatWouldBeTelling@thevillage.com> writes:
>> Anton Ertl wrote:
>> What I was getting at was that the code gen point would likely be
>> so far down in the compiler that it would have lost the context necessary
>> to know whether this expression was a non-reified bool or integer,
>
> For that you would have to hide the data flow from the comparison to
> the branch. Even relatively simple compilers will see this for most
> code, and more sophisticated compilers keep track of data flow quite
> far; admittedly, it gets a little tricky if there is a phi function in
> between (but that's pretty rare for the stuff we are discussing here).
>
>>> This is easy to implement in a compiler. E.g., in tree-parsing
>>> instruction selection you would have stuff like:
>>>
>>> grammar rule # generated code
>>> root: Condbranch(cond) # branch on the comparison result
>>> cond: Lt(reg,reg) # generate a comparison
>>> reg: cond # reify the cond as 0/1
>>>
>>> and as long as the conds are consumed only by condbranches, you get no
>>> reification and no negation.
>>>
>>>> (and we give a special dispensation to & and | operators on booleans).
>>> Not sure what you mean by that. You can also optimize & and | with
>>> flags as parameters; assuming a cond is 0/-1 in a register (and not
>>> just something in the condition codes; in that case I would introduce
>>> additional nonterminals):
>> The & and | operators normally act on integral data types and return
>> the same types. It would need special handling so they operate on your
>> cond type and don't trigger reification.
>
> cond is a nonterminal of the tree grammar, not a type. The trick with
> tree grammars is that they typically are ambiguous, and the costs (not
> shown here) of the code generation rules are used to select the
> lowest-cost tree parse. In the present context, reifying will usually
> be more expensive and will not be selected if an alternative exists.
>
>> As you show below.
>>
>>> znz: cond # no code
>>> cond: And(cond,cond) # and
>>> znz: And(cond,reg) # and
>>> znz: And(reg,cond) # and
>>> cond: Or(cond,cond) # or
>>> znz: Or(znz,reg) # or
>>> znz: Or(reg,znz) # or
>>>
>>> znz represents a flag as zero/nonzero value, which is useful in
>>> combination with branch on zero instructions (i.e., if the result is
>>> used by such a branch).
>> Yes, this overloading for & and | would do.
>>
>> The compiler would have to be built to propagate cond as a distinct type.
>
> As mentioned, cond is a nonterminal, not a type (and root, reg, and
> znz are also nonterminals). The rest of the compiler knows nothing
> about these nonterminals, and does not need to concern itself with
> them. It delivers a fully instantiated tree (i.e., only operators and
> terminals) to the tree parser, and the tree parser selects the optimal
> (wrt the tree grammar) parse (and thus code) for the tree. E.g.,
>
> if (a<b) ...
>
> could become the tree
>
> Condbranch(Lt(Reg,Reg))
>
> (note that the terminal "Reg" is different from the nonterminal "reg".)
>
> Tree parsing is just one instruction selection technology, but others
> (e.g., Davidson-Fraser) are just as capable of avoiding reification of
> flags in the common cases. In Davidson-Fraser the conceptual process
> might be along the lines of first generating the reification, and then
> eliminating it through peephole optimization.

I'm looking at some papers on tree parsing peephole optimizers,
built using an LR(0) parser with extra bits to allow shift-reduce
and reduce-reduce conflicts to allow multiple matches.
Where the parse can follow multiple paths, it forks and follows all.

If I was doing this I would add an option to allow a semantic check to be
performed at each fork so you can guide it to choose particular forks
and prune unnecessary forks as soon as possible.

This breadth-wise cost based forking search is similar to how I did
syntax error repair. I also once built a lexical scanner that could
search for tens of thousands of search strings at once,
forking to follow multiple paths (I was thinking it might be useful
as a virus scanner).

Re: RISC-V vs. Aarch64

<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22693&group=comp.arch#22693

X-Received: by 2002:a05:620a:28d0:: with SMTP id l16mr27289275qkp.449.1641163079149;
Sun, 02 Jan 2022 14:37:59 -0800 (PST)
X-Received: by 2002:a05:6830:2b20:: with SMTP id l32mr31841013otv.333.1641163078908;
Sun, 02 Jan 2022 14:37:58 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 2 Jan 2022 14:37:58 -0800 (PST)
In-Reply-To: <sqssff$a9j$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:853b:b310:626e:11f7;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:853b:b310:626e:11f7
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me>
<RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me>
<sql73d$6es$2@newsreader4.netcologne.de> <sqmj5j$s31$1@dont-email.me>
<sqmmso$446$2@newsreader4.netcologne.de> <gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de> <650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sun, 02 Jan 2022 22:37:59 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 51

by: MitchAlsup - Sun, 2 Jan 2022 22:37 UTC

On Sunday, January 2, 2022 at 12:55:45 PM UTC-6, Guillaume wrote:
> On 01/01/2022 at 18:35, Marcus wrote:
> > On 2022-01-01, Thomas Koenig wrote:
> >> Terje Mathisen <terje.m...@tmsw.no> wrote:
> >>
> >>> Just like x86 condition codes, POWER compilers will probably do a better
> >>> job if they can inline bool functions, so that the condition code can be
> >>> used directly instead of first having to be reified as an int, then
> >>> tested again in the calling function.
> >>
> >> Very much so.
> >>
> >> _Bool gt (long int a, long int b)
> >> {
> >> return a > b;
> >> }
> >>
> >> long int mymax(long int a,long int b)
> >> {
> >> return gt(a,b) ? a : b;
> >> }
> >>
> >> will give you, with -O3 on a current trunk,
> >>
> >> cmpd r3,r4
> >> isellt r3,r4,r3
> >> blr
> >>
> >> for mymax.
> >>
> >
> > The MRISC32 version, https://godbolt.org/z/r6rTj5aWv
> >
> > mymax:
> > max r1,r1,r2
> > ret
> >
> > ;-)
> For RISCV, it's:
> mymax:
> bge a0,a1,.L4
> mv a0,a1
> .L4:
> ret
>
> So, requires a conditional branch...
> But, for floating point (if supported), there is the 'fmax' instruction.
<
Why is there not an IMAX instruction in every modern ISA ??

Re: RISC-V vs. Aarch64

<squ9gr$qa4$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=22696&group=comp.arch#22696

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Mon, 3 Jan 2022 08:44:26 +0100
Organization: A noiseless patient Spider
Lines: 73
Message-ID: <squ9gr$qa4$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 3 Jan 2022 07:44:27 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="22afad3d0b7586bf798eb8e6b4fdbb5d";
logging-data="26948"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/iFsOqev8mgcCc950gLpsS+7v6VLTVN/U="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Cancel-Lock: sha1:rX5CJMj3QMY5d5R0oJrrFr7B/vU=
In-Reply-To: <077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
Content-Language: en-US

by: Marcus - Mon, 3 Jan 2022 07:44 UTC

On 2022-01-02, MitchAlsup wrote:
> On Sunday, January 2, 2022 at 12:55:45 PM UTC-6, Guillaume wrote:
>> On 01/01/2022 at 18:35, Marcus wrote:
>>> On 2022-01-01, Thomas Koenig wrote:
>>>> Terje Mathisen <terje.m...@tmsw.no> wrote:
>>>>
>>>>> Just like x86 condition codes, POWER compilers will probably do a better
>>>>> job if they can inline bool functions, so that the condition code can be
>>>>> used directly instead of first having to be reified as an int, then
>>>>> tested again in the calling function.
>>>>
>>>> Very much so.
>>>>
>>>> _Bool gt (long int a, long int b)
>>>> {
>>>> return a > b;
>>>> }
>>>>
>>>> long int mymax(long int a,long int b)
>>>> {
>>>> return gt(a,b) ? a : b;
>>>> }
>>>>
>>>> will give you, with -O3 on a current trunk,
>>>>
>>>> cmpd r3,r4
>>>> isellt r3,r4,r3
>>>> blr
>>>>
>>>> for mymax.
>>>>
>>>
>>> The MRISC32 version, https://godbolt.org/z/r6rTj5aWv
>>>
>>> mymax:
>>> max r1,r1,r2
>>> ret
>>>
>>> ;-)
>> For RISCV, it's:
>> mymax:
>> bge a0,a1,.L4
>> mv a0,a1
>> .L4:
>> ret
>>
>> So, requires a conditional branch...
>> But, for floating point (if supported), there is the 'fmax' instruction.
> <
> Why is there not an IMAX instruction in every modern ISA ??
>

Agree.

Actually, RISC-V has MIN, MAX, MINU, MAXU (just like MRISC32) in the
bitmanip extension (a.k.a. the "add all instructions that should have
been part of the base ISA"-extension ;-) )

It also has VMIN, VMAX, VMINU, VMAXU in the V extension, acting on
vector registers.

...and VFMIN, VFMAX in the V extension, acting on vector registers.

So in summary, RISC-V has two sets of integer MIN/MAX instructions and
two sets of floating-point MIN/MAX instructions. Different instructions
for different register files, and none of them part of the base
instruction set.
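
As an illustration (not from the original post), here is what integer MIN/MAX looks like at the C source level. The helper names `imax`, `imin`, `umax` and `umin` are hypothetical. The assumption, which is worth checking on Compiler Explorer for a given compiler version, is that a RISC-V target with the Zbb part of bitmanip enabled can map these onto MAX/MIN/MAXU/MINU, while a base-ISA target falls back to a compare and branch.

```c
/* Plain C integer min/max helpers (hypothetical names).
 * Whether each one becomes a single MIN/MAX instruction depends on the
 * target and its enabled extensions; that mapping is an assumption here. */

long imax(long a, long b) { return a > b ? a : b; }   /* signed max */
long imin(long a, long b) { return a < b ? a : b; }   /* signed min */

unsigned long umax(unsigned long a, unsigned long b)  /* unsigned max */
{
    return a > b ? a : b;
}

unsigned long umin(unsigned long a, unsigned long b)  /* unsigned min */
{
    return a < b ? a : b;
}
```

The signed and unsigned variants are separate functions because the comparison itself differs, which is also why the ISA needs both MAX and MAXU.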

I also note that RISC-V goes against the trend and has *three* register
files (scalar integer + scalar floating-point + vector), with different
instruction sets for each register file. x86 history repeating?

/Marcus

Re: RISC-V vs. Aarch64

<2022Jan3.091811@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=22697&group=comp.arch#22697

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Mon, 03 Jan 2022 08:18:11 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 55
Message-ID: <2022Jan3.091811@mips.complang.tuwien.ac.at>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me> <59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me> <sqmsqq$14kp$1@gioia.aioe.org> <VSFzJ.136700$7D4.47834@fx37.iad> <2021Dec31.203710@mips.complang.tuwien.ac.at> <KC_zJ.59028$Ak2.12921@fx20.iad> <2022Jan1.173658@mips.complang.tuwien.ac.at> <gWnAJ.5792$yl1.5519@fx23.iad>
Injection-Info: reader02.eternal-september.org; posting-host="d2ff4cf4d9565f5ed7aa38586fbd14d6";
logging-data="21408"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18JG2U3KOqC+neFCDiHsxHj"
Cancel-Lock: sha1:/9P08h0cNwvBcE5PKFr2mgAbA4E=
X-newsreader: xrn 10.00-beta-3

by: Anton Ertl - Mon, 3 Jan 2022 08:18 UTC

EricP <ThatWouldBeTelling@thevillage.com> writes:
>I'm looking at some papers on tree parsing peephole optimizers,
>built using an LR(0) parser with extra bits to allow shift-reduce
>and reduce-reduce conflicts to allow multiple matches.

LR parsers are string parsers. The Graham-Glanville approach to code
generation used an LR parser on a prefix representation of the tree as
a poor-man's tree parser; of course they did not think about it that
way at the time, but once tree parser generators were developed, the
Graham-Glanville approach vanished.

An early paper on tree parsing is:

@InProceedings{emmelmann+89,
author = {Helmut Emmelmann and Friedrich-Wilhelm Schr\"oer and
Rudolf Landwehr},
title = {{BEG} -- a Generator for Efficient Back Ends},
crossref = "sigplan89",
pages = "227--237"
}

@Proceedings{sigplan89,
key = "SIGPLAN~'89",
booktitle = "SIGPLAN~'89 Conference on
Programming Language Design and Implementation",
title = "SIGPLAN~'89 Conference on
Programming Language Design and Implementation",
year = "1989",
}

Or you can read Section 2 of
<http://www.complang.tuwien.ac.at/papers/thier+18.pdf>, which provides
some background on this technology.

>If I was doing this I would add an option to allow a semantic check to be
>performed at each fork so you can guide it to choose particular forks
>and prune unnecessary forks as soon as possible.

For best code generation speed, the tree grammar is transformed into a
tree-parsing automaton (e.g., with the tool "burg"), where the code
generation speed is independent of the number of rules (and number of
alternative trees) involved.

Concerning semantics checks, in BEG and lburg costs can depend on
semantic things (i.e., things beyond the tree grammar), such as the
values of constants; Thier's constraints disable rules based on
semantic things (again, like the values of constants). But both
features are used for improved modeling of instructions (e.g., the
MIPS addiu instruction allows only a certain range of immediate
values), not pruning forks.
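
To make the cost-based selection concrete, here is a minimal sketch of a bottom-up labeling pass for the small grammar discussed earlier in the thread (root: Condbranch(cond), cond: Lt(reg,reg), reg: cond). It uses run-time dynamic programming in the style of iburg-like tools, not a precomputed burg automaton. The unit costs, the extra `RETURN` operator (added only to show a case where reification is needed), and all identifiers are assumptions made for this illustration.

```c
#include <limits.h>
#include <stddef.h>

enum op { REG, LT, CONDBRANCH, RETURN };        /* tree operators/terminals */
enum nt { NT_REG, NT_COND, NT_ROOT, NT_COUNT }; /* grammar nonterminals */

#define INF (INT_MAX / 4)  /* "no derivation"; small enough to add safely */

struct node {
    enum op op;
    struct node *kid[2];
    int cost[NT_COUNT];    /* cheapest known cost to derive each nonterminal */
};

/* Record a derivation of nonterminal nt if it beats the best one so far. */
static void improve(struct node *n, enum nt nt, int cost)
{
    if (cost < n->cost[nt])
        n->cost[nt] = cost;
}

/* Label the children first, then try every rule that matches this node. */
void label(struct node *n)
{
    for (int i = 0; i < 2; i++)
        if (n->kid[i])
            label(n->kid[i]);
    for (int i = 0; i < NT_COUNT; i++)
        n->cost[i] = INF;

    switch (n->op) {
    case REG:         /* reg: Reg              (cost 0, no code) */
        improve(n, NT_REG, 0);
        break;
    case LT:          /* cond: Lt(reg,reg)     (cost 1, a comparison) */
        improve(n, NT_COND,
                n->kid[0]->cost[NT_REG] + n->kid[1]->cost[NT_REG] + 1);
        break;
    case CONDBRANCH:  /* root: Condbranch(cond) (cost 1, a branch) */
        improve(n, NT_ROOT, n->kid[0]->cost[NT_COND] + 1);
        break;
    case RETURN:      /* root: Return(reg)     (cost 1, assumed rule) */
        improve(n, NT_ROOT, n->kid[0]->cost[NT_REG] + 1);
        break;
    }
    /* Chain rule reg: cond (cost 1, reify the flag as a value). */
    improve(n, NT_REG, n->cost[NT_COND] + 1);
}
```

With these costs, Condbranch(Lt(Reg,Reg)) labels its root at cost 2 (comparison plus branch, no reification), while Return(Lt(Reg,Reg)) costs 3 because the reifying chain rule is the only way to turn the cond into a reg.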

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: RISC-V vs. Aarch64

<squhht$79u$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=22699&group=comp.arch#22699

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Mon, 3 Jan 2022 02:01:33 -0800
Organization: A noiseless patient Spider
Lines: 56
Message-ID: <squhht$79u$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 3 Jan 2022 10:01:33 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="0581a63b9f9ef7850db1070783393328";
logging-data="7486"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18aP77GAVWk3IKfL5JHiI3s"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.4.1
Cancel-Lock: sha1:+u+tv7O+lAohKFxNFCZQAsmXrNM=
In-Reply-To: <077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
Content-Language: en-US

by: Ivan Godard - Mon, 3 Jan 2022 10:01 UTC

On 1/2/2022 2:37 PM, MitchAlsup wrote:
> On Sunday, January 2, 2022 at 12:55:45 PM UTC-6, Guillaume wrote:
>> On 01/01/2022 at 18:35, Marcus wrote:
>>> On 2022-01-01, Thomas Koenig wrote:
>>>> Terje Mathisen <terje.m...@tmsw.no> wrote:
>>>>
>>>>> Just like x86 condition codes, POWER compilers will probably do a better
>>>>> job if they can inline bool functions, so that the condition code can be
>>>>> used directly instead of first having to be reified as an int, then
>>>>> tested again in the calling function.
>>>>
>>>> Very much so.
>>>>
>>>> _Bool gt (long int a, long int b)
>>>> {
>>>> return a > b;
>>>> }
>>>>
>>>> long int mymax(long int a,long int b)
>>>> {
>>>> return gt(a,b) ? a : b;
>>>> }
>>>>
>>>> will give you, with -O3 on a current trunk,
>>>>
>>>> cmpd r3,r4
>>>> isellt r3,r4,r3
>>>> blr
>>>>
>>>> for mymax.
>>>>
>>>
>>> The MRISC32 version, https://godbolt.org/z/r6rTj5aWv
>>>
>>> mymax:
>>> max r1,r1,r2
>>> ret
>>>
>>> ;-)
>> For RISCV, it's:
>> mymax:
>> bge a0,a1,.L4
>> mv a0,a1
>> .L4:
>> ret
>>
>> So, requires a conditional branch...
>> But, for floating point (if supported), there is the 'fmax' instruction.
> <
> Why is there not an IMAX instruction in every modern ISA ??

because you don't need it when you have "?:".

gtr(b0, b1), pick(b0, b1, b2), retn(b0);

one bundle, one cycle.
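
For readers unfamiliar with the compare-then-select idiom, a rough C model follows. It is only a sketch of the general idea (a comparison producing a flag, then a select consuming it) and is not a description of the Mill's actual gtr/pick semantics; the names `select64` and `max_via_select` are hypothetical.

```c
#include <stdint.h>

/* Select without a branch in the source: returns a when cond is nonzero,
 * otherwise b. The mask is all ones or all zeros. The final conversion
 * back to int64_t assumes a two's-complement target. */
int64_t select64(int cond, int64_t a, int64_t b)
{
    uint64_t mask = (uint64_t)0 - (uint64_t)(cond != 0);
    return (int64_t)(((uint64_t)a & mask) | ((uint64_t)b & ~mask));
}

/* Integer max built from a comparison followed by a select. */
int64_t max_via_select(int64_t a, int64_t b)
{
    return select64(a > b, a, b);
}
```

For example, `max_via_select(3, 7)` yields 7 and `select64(0, 10, 20)` yields 20.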

Re: RISC-V vs. Aarch64

<bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22704&group=comp.arch#22704

X-Received: by 2002:ac8:5ad1:: with SMTP id d17mr41261578qtd.23.1641230909899;
Mon, 03 Jan 2022 09:28:29 -0800 (PST)
X-Received: by 2002:a05:6830:154d:: with SMTP id l13mr33216291otp.282.1641230909612;
Mon, 03 Jan 2022 09:28:29 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 3 Jan 2022 09:28:29 -0800 (PST)
In-Reply-To: <squhht$79u$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:5959:7534:4159:ef20;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:5959:7534:4159:ef20
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me>
<RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me>
<sql73d$6es$2@newsreader4.netcologne.de> <sqmj5j$s31$1@dont-email.me>
<sqmmso$446$2@newsreader4.netcologne.de> <gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de> <650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <squhht$79u$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <bb6d49bb-a676-44bd-9a6d-29386d429454n@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 03 Jan 2022 17:28:29 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 61

by: MitchAlsup - Mon, 3 Jan 2022 17:28 UTC

On Monday, January 3, 2022 at 4:01:36 AM UTC-6, Ivan Godard wrote:
> On 1/2/2022 2:37 PM, MitchAlsup wrote:
> > On Sunday, January 2, 2022 at 12:55:45 PM UTC-6, Guillaume wrote:
> >> On 01/01/2022 at 18:35, Marcus wrote:
> >>> On 2022-01-01, Thomas Koenig wrote:
> >>>> Terje Mathisen <terje.m...@tmsw.no> wrote:
> >>>>
> >>>>> Just like x86 condition codes, POWER compilers will probably do a better
> >>>>> job if they can inline bool functions, so that the condition code can be
> >>>>> used directly instead of first having to be reified as an int, then
> >>>>> tested again in the calling function.
> >>>>
> >>>> Very much so.
> >>>>
> >>>> _Bool gt (long int a, long int b)
> >>>> {
> >>>> return a > b;
> >>>> }
> >>>>
> >>>> long int mymax(long int a,long int b)
> >>>> {
> >>>> return gt(a,b) ? a : b;
> >>>> }
> >>>>
> >>>> will give you, with -O3 on a current trunk,
> >>>>
> >>>> cmpd r3,r4
> >>>> isellt r3,r4,r3
> >>>> blr
> >>>>
> >>>> for mymax.
> >>>>
> >>>
> >>> The MRISC32 version, https://godbolt.org/z/r6rTj5aWv
> >>>
> >>> mymax:
> >>> max r1,r1,r2
> >>> ret
> >>>
> >>> ;-)
> >> For RISCV, it's:
> >> mymax:
> >> bge a0,a1,.L4
> >> mv a0,a1
> >> .L4:
> >> ret
> >>
> >> So, requires a conditional branch...
> >> But, for floating point (if supported), there is the 'fmax' instruction.
> > <
> > Why is there not an IMAX instruction in every modern ISA ??
> because you don't need it when you have "?:".
>
> gtr(b0, b1), pick(b0, b1, b2), retn(b0);
>
> one bundle, one cycle.
<
and at least 4 times as much transport energy.

Re: RISC-V vs. Aarch64

<sqverm$adp$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=22708&group=comp.arch#22708

Path: i2pn2.org!i2pn.org!aioe.org!UgLt14+w9tVHe1BtIa3HDQ.user.46.165.242.75.POSTED!not-for-mail
From: mess...@bottle.org (Guillaume)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Mon, 3 Jan 2022 19:21:36 +0100
Organization: Aioe.org NNTP Server
Message-ID: <sqverm$adp$1@gioia.aioe.org>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="10681"; posting-host="UgLt14+w9tVHe1BtIa3HDQ.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.4.1
Content-Language: fr
X-Notice: Filtered by postfilter v. 0.9.2

by: Guillaume - Mon, 3 Jan 2022 18:21 UTC

On 02/01/2022 at 23:37, MitchAlsup wrote:
> Why is there not an IMAX instruction in every modern ISA ??

Well, I can't speak for every modern ISA, but for RISC-V, as Marcus said,
MIN/MAX instructions (and much more) are available in the Bitmanip
extension, which was ratified recently if I'm not mistaken.

The reason is that the RISC-V ISA is very modular, so they have chosen
to keep the base ISA minimal, and then extend it.

It has benefits of course - you can design cores that are as minimal or
as featureful as needed, while still being compliant - but it also has
drawbacks: it will tend to lead to fragmentation. RISC-V has mitigated
that by defining "supersets" such as RV64GC that contain a number of
extensions, and that are meant for more general-purpose computing, but
fragmentation can still be an issue.
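
One practical face of this fragmentation is that portable code has to ask which extensions it was compiled for. The sketch below assumes the compiler follows the RISC-V C API convention of defining `__riscv` and a per-extension macro such as `__riscv_zbb`; the function name `have_zbb` is hypothetical, and on non-RISC-V hosts it simply reports 0.

```c
/* Returns 1 when this translation unit was compiled for a RISC-V target
 * with the Zbb sub-extension enabled, 0 otherwise. The macro names are
 * assumed to follow the RISC-V C API convention. */
int have_zbb(void)
{
#if defined(__riscv) && defined(__riscv_zbb)
    return 1;
#else
    return 0;
#endif
}
```

Note that this is a compile-time property of the binary, not a run-time probe of the core it happens to execute on.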

As to why they chose to put the MIN/MAX instructions in the "Bitmanip"
extension, I'm not sure. It looks like this extension contains A LOT of
common instructions that were non-essential, and missing in the base
ISA. Like a big bag of instructions, not all of them being really bit
manipulation-related IMHO. Oh, and the Bitmanip extension is itself
sliced into 4 sub-extensions =). Good luck if you were to implement the
whole extension in its entirety: it's very large.

Don't get me wrong, I do like RISC-V and have designed a RV32/64 core.
The more I've worked with this ISA, the more I've come to
understand the choices that were made. But there's still this point
about fragmentation that concerns me.

Re: RISC-V vs. Aarch64

<sqvj2a$1lp$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=22711&group=comp.arch#22711

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: RISC-V vs. Aarch64
Date: Mon, 3 Jan 2022 20:33:30 +0100
Organization: A noiseless patient Spider
Lines: 45
Message-ID: <sqvj2a$1lp$1@dont-email.me>
References: <2021Dec24.163843@mips.complang.tuwien.ac.at>
<sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com>
<sqkcvk$n97$1@dont-email.me> <RrlzJ.130558$SR4.25229@fx43.iad>
<sql2cm$3h7$1@dont-email.me> <sql73d$6es$2@newsreader4.netcologne.de>
<sqmj5j$s31$1@dont-email.me> <sqmmso$446$2@newsreader4.netcologne.de>
<gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de>
<650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com>
<sqverm$adp$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 3 Jan 2022 19:33:30 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="22afad3d0b7586bf798eb8e6b4fdbb5d";
logging-data="1721"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19ESEzex1WrRtmK4MZ1j5B9yi0txX9dEDs="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Cancel-Lock: sha1:6H1AoVNGR8X7z8BrVgJwv6/x+e8=
In-Reply-To: <sqverm$adp$1@gioia.aioe.org>
Content-Language: en-US

by: Marcus - Mon, 3 Jan 2022 19:33 UTC

On 2022-01-03, Guillaume wrote:
> On 02/01/2022 at 23:37, MitchAlsup wrote:
>> Why is there not an IMAX instruction in every modern ISA ??
>
> Well, I can't speak for every modern ISA, but for RISC-V, as Marcus said,
> MIN/MAX instructions (and much more) are available in the Bitmanip
> extension, which was ratified recently if I'm not mistaken.
>
> The reason is that the RISC-V ISA is very modular, so they have chosen
> to keep the base ISA minimal, and then extend it.
>
> It has benefits of course - you can design cores that are as minimal or
> as featureful as needed, while still being compliant - but it also has
> drawbacks: it will tend to lead to fragmentation. RISC-V has mitigated
> that by defining "supersets" such as RV64GC that contain a number of
> extensions, and that are meant for more general-purpose computing, but
> fragmentation can still be an issue.
>
> As to why they chose to put the MIN/MAX instructions in the "Bitmanip"
> extension, I'm not sure. It looks like this extension contains A LOT of
> common instructions that were non-essential, and missing in the base
> ISA. Like a big bag of instructions, not all of them being really bit
> manipulation-related IMHO. Oh, and the Bitmanip extension is itself
> sliced into 4 sub-extensions =). Good luck if you were to implement the
> whole extension in its entirety: it's very large.

Funny thing: I recently discovered that they seem to have dropped the
bit-field instructions from bitmanip (!). I thought that it was one of
the key features of that extension. So I guess we're looking at more
bit-field extensions in the future, adding even more to the fragmentation.

https://github.com/riscv/riscv-bitmanip/issues/169

>
> Don't get me wrong, I do like RISC-V and have designed a RV32/64 core.
> The more I've worked with this ISA, the more I've come to
> understand the choices that were made. But there's still this point
> about fragmentation that concerns me.

I too like RISC-V (it's open, it works and it kind of scales), and at
the same time I dislike it (mostly the over-catering for ultra-small,
leading to some sub-optimal design decisions, and the fragmentation
part).

/Marcus

Re: RISC-V vs. Aarch64

<a4ed6b81-c17f-401a-82c5-8acc2fa2c198n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=22712&group=comp.arch#22712

X-Received: by 2002:a05:620a:4495:: with SMTP id x21mr32782732qkp.604.1641240439753;
Mon, 03 Jan 2022 12:07:19 -0800 (PST)
X-Received: by 2002:a05:6808:1248:: with SMTP id o8mr36761102oiv.157.1641240439606;
Mon, 03 Jan 2022 12:07:19 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 3 Jan 2022 12:07:19 -0800 (PST)
In-Reply-To: <sqverm$adp$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:5959:7534:4159:ef20;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:5959:7534:4159:ef20
References: <2021Dec24.163843@mips.complang.tuwien.ac.at> <sq5dj1$1q9$1@dont-email.me>
<59376149-c3d3-489e-8b41-f21bdd0ce5a9n@googlegroups.com> <sqkcvk$n97$1@dont-email.me>
<RrlzJ.130558$SR4.25229@fx43.iad> <sql2cm$3h7$1@dont-email.me>
<sql73d$6es$2@newsreader4.netcologne.de> <sqmj5j$s31$1@dont-email.me>
<sqmmso$446$2@newsreader4.netcologne.de> <gs2dnRZj-ucyZ1P8nZ2dnUU78YfNnZ2d@supernews.com>
<sqpd0i$spj$1@newsreader4.netcologne.de> <650c822a-3776-4ea9-aa72-5a6b19bdcabbn@googlegroups.com>
<sqpocs$1so3$1@gioia.aioe.org> <sqpqbm$7qo$1@newsreader4.netcologne.de>
<sqq3ce$c4n$2@dont-email.me> <sqssff$a9j$1@gioia.aioe.org>
<077afaee-009e-4860-be45-61106126934bn@googlegroups.com> <sqverm$adp$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <a4ed6b81-c17f-401a-82c5-8acc2fa2c198n@googlegroups.com>
Subject: Re: RISC-V vs. Aarch64
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 03 Jan 2022 20:07:19 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 46

by: MitchAlsup - Mon, 3 Jan 2022 20:07 UTC

On Monday, January 3, 2022 at 12:21:44 PM UTC-6, Guillaume wrote:
> On 02/01/2022 at 23:37, MitchAlsup wrote:
> > Why is there not an IMAX instruction in every modern ISA ??
> Well, I can't speak for every modern ISA, but for RISC-V, as Marcus said,
> MIN/MAX instructions (and much more) are available in the Bitmanip
> extension, which was ratified recently if I'm not mistaken.
>
> The reason is that the RISC-V ISA is very modular, so they have chosen
> to keep the base ISA minimal, and then extend it.
<
And yet they seem to have nearly 2× the instruction count of My 66000 (59, BTW,
including single and double FP along with 8 transcendental functions (sin, ...)).
>
> It has benefits of course - you can design cores that are as minimal or
> as featureful as needed, while still being compliant - but it also has
> drawbacks: it will tend to lead to fragmentation. RISC-V has mitigated
> that by defining "supersets" such as RV64GC that contain a number of
> extensions, and that are meant for more general-purpose computing, but
> fragmentation can still be an issue.
<
Having done many "cores", I can guarantee that adding MAX and MIN takes
more gates in the decoder than in the data path.
<
At one point in time I had ABS as a macro that used MAX as its base
instruction. Later I realized that ABS can be calculated at ¼ the power of
feeding MAX(x,-x) to the MAX/MIN unit.
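
A small C sketch of the two formulations, for illustration only: it shows that ABS computed as MAX(x, -x) and ABS computed as a sign-conditional negate agree functionally, and says nothing about the power figures mentioned above. The function names are hypothetical, and the negation is done in unsigned arithmetic so that the most negative value does not trigger signed overflow.

```c
#include <stdint.h>

/* Signed 64-bit max, standing in for a MAX instruction. */
int64_t max64(int64_t a, int64_t b)
{
    return a > b ? a : b;
}

/* ABS expressed as MAX(x, -x). The negation goes through uint64_t to
 * avoid undefined behaviour; converting back assumes two's complement. */
int64_t abs_via_max(int64_t x)
{
    int64_t neg = (int64_t)((uint64_t)0 - (uint64_t)x);
    return max64(x, neg);
}

/* ABS expressed directly: negate only when the sign bit is set. */
int64_t abs_direct(int64_t x)
{
    uint64_t u = (uint64_t)x;
    return (int64_t)(x < 0 ? (uint64_t)0 - u : u);
}
```

Both return 5 for an input of -5 and leave non-negative inputs unchanged.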
>
> As to why they chose to put the MIN/MAX instructions in the "Bitmanip"
> extension, I'm not sure. It looks like this extension contains A LOT of
> common instructions that were non-essential, and missing in the base
> ISA. Like a big bag of instructions, not all of them being really bit
> manipulation-related IMHO. Oh, and the Bitmanip extension is itself
> sliced into 4 sub-extensions =). Good luck if you were to implement the
> whole extension in its entirety: it's very large.
>
> Don't get me wrong, I do like RISC-V and have designed a RV32/64 core.
> The more I've worked with this ISA, the more I've come to
> understand the choices that were made. But there's still this point
> about fragmentation that concerns me.

Re: The type of Mill's belt's slots	Stefan Monnier
Re: The type of Mill's belt's slots	Ivan Godard
Re: The type of Mill's belt's slots	MitchAlsup
Re: RISC-V vs. Aarch64	Ivan Godard
Re: RISC-V vs. Aarch64	Guillaume
Re: RISC-V vs. Aarch64	Quadibloc
MRISC32 vectorization (was: RISC-V vs. Aarch64)	Thomas Koenig
Re: RISC-V vs. Aarch64	Terje Mathisen
Re: RISC-V vs. Aarch64	Quadibloc
Re: RISC-V vs. Aarch64	Anton Ertl
Re: RISC-V vs. Aarch64	aph