devel / comp.arch / Re: Automatic register spill / restore?

Subject  Author
* Automatic register spill / restore?  Andy
+* Re: Automatic register spill / restore?  EricP
|+* Re: Automatic register spill / restore?  Niklas Holsti
||+* Re: Automatic register spill / restore?  Anton Ertl
|||+* Re: Automatic register spill / restore?  Ivan Godard
||||`- Re: Automatic register spill / restore?  EricP
|||`* Re: Automatic register spill / restore?  Niklas Holsti
||| `- Re: Automatic register spill / restore?  MitchAlsup
||`* Re: Automatic register spill / restore?  EricP
|| `* Re: Automatic register spill / restore?  Niklas Holsti
||  +* Re: Automatic register spill / restore?  Niklas Holsti
||  |+- Re: Automatic register spill / restore?  MitchAlsup
||  |`- Re: Automatic register spill / restore?  EricP
||  `* Re: Automatic register spill / restore?  MitchAlsup
||   +- Re: Automatic register spill / restore?  Michael S
||   `* Re: Automatic register spill / restore?  Anton Ertl
||    +* Re: Automatic register spill / restore?  John Dallman
||    |`- Re: Automatic register spill / restore?  Marcus
||    `- Re: Automatic register spill / restore?  EricP
|`* Re: Automatic register spill / restore?  Andy
| +* Re: Automatic register spill / restore?  BGB
| |`* Re: Wither VLIW? was Automatic register spill / restore?  Andy
| | `- Re: Wither VLIW? was Automatic register spill / restore?  BGB
| `- Re: Automatic register spill / restore?  EricP
+* Re: Automatic register spill / restore?  MitchAlsup
|+* Re: Automatic register spill / restore?  John Levine
||`- Re: Automatic register spill / restore?  Niklas Holsti
|+* Re: Automatic register spill / restore?  Marcus
||`* Re: Automatic register spill / restore?  MitchAlsup
|| `- Re: Automatic register spill / restore?  John Levine
|`* Re: Automatic register spill / restore?  Andy
| `* Re: Automatic register spill / restore?  MitchAlsup
|  `* Re: Automatic register spill / restore?  Ivan Godard
|   `- Re: Automatic register spill / restore?  MitchAlsup
`- Re: Automatic register spill / restore?  Jecel Assumpção Jr

Re: Automatic register spill / restore?

<ta2ufc$el2$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=26362&group=comp.arch#26362

Path: i2pn2.org!i2pn.org!aioe.org!vxOTfS5Jn/b499FNW7Y8DA.user.46.165.242.75.POSTED!not-for-mail
From: nos...@nowhere.com (Andy)
Newsgroups: comp.arch
Subject: Re: Automatic register spill / restore?
Date: Wed, 6 Jul 2022 15:10:34 +1200
Organization: Aioe.org NNTP Server
Message-ID: <ta2ufc$el2$1@gioia.aioe.org>
References: <t9tuvg$1qgu$1@gioia.aioe.org> <OlEwK.24566$8f2.9092@fx38.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="15010"; posting-host="vxOTfS5Jn/b499FNW7Y8DA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.9.1
Content-Language: en-US
X-Notice: Filtered by postfilter v. 0.9.2
 by: Andy - Wed, 6 Jul 2022 03:10 UTC

On 5/07/22 04:14, EricP wrote:

>
> Sparc register windows. They were opaque, asynchronous lazy spill/fill
> which was done by kernel mode traps. Reportedly it had... issues.

Yep while I'm vaguely aware of SPARC register windows, that's not
exactly what I was looking for, since I'm pretty sure it was one of the
older perhaps lesser known CISCy mainframes not a newish RISC style
microprocessor.
And I think it used a fairly straightforward register file aside from the
fact the programmer never needed to manually save and restore registers
when calling subroutines and the like.

But aside from that, I was never struck with the impression that
hardware managed register windows were a particularly great idea.

If someone is going to put 120 odd registers into a CPU surely making
them all programmer visible in some way and letting them / the compiler
or operating system decide how best to carve the register file up would
be the better option.

And of course with fresh pop-corn in hand, watching the ensuing
technical / corporate / tribal / religious / jihadist arguments flying
to and fro on how best to do that, would no doubt be hugely and/or
endlessly entertaining! ;-)

Re: Automatic register spill / restore?

<2022Jul6.115355@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=26374&group=comp.arch#26374

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Automatic register spill / restore?
Date: Wed, 06 Jul 2022 09:53:55 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 35
Message-ID: <2022Jul6.115355@mips.complang.tuwien.ac.at>
References: <t9tuvg$1qgu$1@gioia.aioe.org> <OlEwK.24566$8f2.9092@fx38.iad> <jigrgdF3ft7U1@mid.individual.net> <SOZwK.281950$ssF.145632@fx14.iad> <jijg97Ffk0hU1@mid.individual.net> <dad7bdde-2b36-40fd-8b48-74448f32c158n@googlegroups.com>
Injection-Info: reader01.eternal-september.org; posting-host="014d1612568f0b4eb1496ead96ace905";
logging-data="27434"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/W/qYhzLiRpc+W67xAlI6s"
Cancel-Lock: sha1:9RahxWCL2UDyw5M67y628TDSELA=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Wed, 6 Jul 2022 09:53 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>Whereas MIPS got to high frequencies rather easily, SPARC never did.

Bullshit.

MHz Architecture, CPU
1000 MIPS R16000A (select customers), at least 800MHz in 2004
1200 SPARC UltraSPARC III Cu (released 2001 with at least 900MHz)
1500 MIPS 1074K
2000 MIPS P5600, P6600
4250 SPARC64 XII (Fujitsu)
5000 SPARC M8 (Oracle)

>Read into that what you will.

I read into that that SPARC had a ~3 year clock rate advantage on MIPS
in the early 2000s.

SGI/MIPS had difficulties competing in the GHz race and eventually
bowed out; the higher-clocked cores we see much later are embedded
cores.

Sun and Fujitsu persevered on. Fujitsu introduced the OoO Sparc64 V
in 2003 and reached competitive clock rates over time. Sun/Oracle
continued for a while with in-order cores and lower clock rates, until
they introduced the 2850-3000MHz SPARC T4 with OoO in 2011, and almost
doubled the clock rate compared to the in-order 1650MHz SPARC T3.
Later OoO CPUs from Oracle further increased the clock rate and also
the width. But apparently the customers had defected to AMD64 in the
meantime, so Oracle canceled SPARC after the M8.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Automatic register spill / restore?

<memo.20220706153640.2400B@jgd.cix.co.uk>

https://www.novabbs.com/devel/article-flat.php?id=26377&group=comp.arch#26377

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: jgd...@cix.co.uk (John Dallman)
Newsgroups: comp.arch
Subject: Re: Automatic register spill / restore?
Date: Wed, 6 Jul 2022 15:36 +0100 (BST)
Organization: A noiseless patient Spider
Lines: 13
Message-ID: <memo.20220706153640.2400B@jgd.cix.co.uk>
References: <2022Jul6.115355@mips.complang.tuwien.ac.at>
Reply-To: jgd@cix.co.uk
Injection-Info: reader01.eternal-september.org; posting-host="0e3ddffc2e328e5155d6da726d8c837b";
logging-data="73587"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19zHAo6fdFAfaZlEE9L8Evgr9FO0EL4hRc="
Cancel-Lock: sha1:QtRSyfLl/jUdi4VKXUDD53K/U0g=
 by: John Dallman - Wed, 6 Jul 2022 14:36 UTC

In article <2022Jul6.115355@mips.complang.tuwien.ac.at>,
anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:

> But apparently the customers had defected to AMD64 in
> the meantime, so Oracle canceled SPARC after the M8.

Oracle cancelled SPARC slightly before the M8 shipped, by getting rid of
the development team and a lot of the Solaris staff. They claimed that it
would carry on getting faster for years, via specialised on-chip
co-processors, but they were not very convincing in that, nor in their
claims that customers would be able to run Solaris until at least 2031.

John

Re: Automatic register spill / restore?

<05ixK.395203$X_i.145485@fx18.iad>

https://www.novabbs.com/devel/article-flat.php?id=26378&group=comp.arch#26378

Path: i2pn2.org!i2pn.org!feed1.usenet.blueworldhosting.com!usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx18.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Automatic register spill / restore?
References: <t9tuvg$1qgu$1@gioia.aioe.org> <OlEwK.24566$8f2.9092@fx38.iad> <jigrgdF3ft7U1@mid.individual.net> <SOZwK.281950$ssF.145632@fx14.iad> <jijg97Ffk0hU1@mid.individual.net> <dad7bdde-2b36-40fd-8b48-74448f32c158n@googlegroups.com> <2022Jul6.115355@mips.complang.tuwien.ac.at>
In-Reply-To: <2022Jul6.115355@mips.complang.tuwien.ac.at>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 61
Message-ID: <05ixK.395203$X_i.145485@fx18.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 06 Jul 2022 15:43:56 UTC
Date: Wed, 06 Jul 2022 11:43:05 -0400
X-Received-Bytes: 3283
 by: EricP - Wed, 6 Jul 2022 15:43 UTC

Anton Ertl wrote:
> MitchAlsup <MitchAlsup@aol.com> writes:
>> Whereas MIPS got to high frequencies rather easily, SPARC never did.
>
> Bullshit.
>
> MHz Architecture, CPU
> 1000 MIPS R16000A (select customers), at least 800MHz in 2004
> 1200 SPARC UltraSPARC III Cu (released 2001 with at least 900MHz)
> 1500 MIPS 1074K
> 2000 MIPS P5600, P6600
> 4250 SPARC64 XII (Fujitsu)
> 5000 SPARC M8 (Oracle)
>
>> Read into that what you will.
>
> I read into that that SPARC had a ~3 year clock rate advantage on MIPS
> in the early 2000s.
>
> SGI/MIPS had difficulties competing in the GHz race and eventually
> bowed out; the higher-clocked cores we see much later are embedded
> cores.
>
> Sun and Fujitsu persevered on. Fujitsu introduced the OoO Sparc64 V
> in 2003 and reached competitive clock rates over time. Sun/Oracle
> continued for a while with in-order cores and lower clock rates, until
> they introduced the 2850-3000MHz SPARC T4 with OoO in 2011, and almost
> doubled the clock rate compared to the in-order 1650MHz SPARC T3.
> Later OoO CPUs from Oracle further increased the clock rate and also
> the width. But apparently the customers had defected to AMD64 in the
> meantime, so Oracle canceled SPARC after the M8.
>
> - anton

A little poking about finds a description of the OoO SPARC64 V
microarchitecture, circa 2004, and how it handles registers to get a
higher clock.

Integer register GPR (General Purpose Registers) has 8 read ports,
2x2 read for integer ops, 2x2 read for AGU.

To access the GPRs faster it:
(a) limits the register windows to 8 sets.
(b) splits the physical register file into 2 sections:
the slower large GPR, and fast access JWR Joint Work Register.

The JWR keeps 3 windows (64*8 bytes in total) for the current window and
those either side of it. Read data from JWR is fed to execution units.
Both JWR and GPR are updated at the same time on commit.
When a window switch occurs, hardware copies window data
between JWR and GPR in the background.

(Hmmm... if JWR and GPR are both updated on commit,
how does that not put the GPR back as the critical path limit?)

On some implementations it says the results of integer operations
are maintained in a 32-entry GUB (GPR Update Buffer), which has 8R 4W ports.
There is also a FUB for floats.
I gather results are copied from GUB to GPR on commit.
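
As a rough illustration of the JWR idea described above (current window plus
the one either side kept in a small fast file, the rest in the large GPR),
here is a toy sketch. It assumes 8 windows as stated, ignores the overlap of
in/out registers between adjacent SPARC windows, and the function names are
made up for illustration; it is not taken from the Fujitsu documentation.

```python
NWINDOWS = 8  # the description limits the register windows to 8 sets

def jwr_windows(cwp):
    """Window indices held in the fast JWR: current window plus both neighbours."""
    return [(cwp - 1) % NWINDOWS, cwp, (cwp + 1) % NWINDOWS]

def windows_to_copy(old_cwp, new_cwp):
    """Windows the hardware would have to fetch from the large GPR after a switch."""
    return sorted(set(jwr_windows(new_cwp)) - set(jwr_windows(old_cwp)))

print(jwr_windows(0))         # windows resident in the JWR when CWP = 0
print(windows_to_copy(0, 1))  # what a single SAVE/RESTORE step has to bring in
```

Under these assumptions a single window switch only ever needs one new window
in the JWR, which is presumably why the copy can be done in the background.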

Re: Automatic register spill / restore?

<ta4amt$2m88$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=26379&group=comp.arch#26379

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!news.freedyn.de!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: Automatic register spill / restore?
Date: Wed, 6 Jul 2022 17:45:32 +0200
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <ta4amt$2m88$1@dont-email.me>
References: <2022Jul6.115355@mips.complang.tuwien.ac.at>
<memo.20220706153640.2400B@jgd.cix.co.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 6 Jul 2022 15:45:33 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="842820778f52677616e6d7772cdeb919";
logging-data="88328"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+JyBZylol/U22hXWiJSAl3M0eg5Q+ZR84="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.9.1
Cancel-Lock: sha1:UhbbQnRsFo7WLSa41wVuZVQ/I7I=
Content-Language: en-US
In-Reply-To: <memo.20220706153640.2400B@jgd.cix.co.uk>
 by: Marcus - Wed, 6 Jul 2022 15:45 UTC

On 2022-07-06, John Dallman wrote:
> In article <2022Jul6.115355@mips.complang.tuwien.ac.at>,
> anton@mips.complang.tuwien.ac.at (Anton Ertl) wrote:
>
>> But apparently the customers had defected to AMD64 in
>> the meantime, so Oracle canceled SPARC after the M8.
>
> Oracle cancelled SPARC slightly before the M8 shipped, by getting rid of
> the development team and a lot of the Solaris staff. They claimed that it
> would carry on getting faster for years, via specialised on-chip
> co-processors, but they were not very convincing in that, nor in their
> claims that customers would be able to run Solaris until at least 2031.

I can still run Commodore BASIC v2.0 in 2022, so I guess the claim that
Solaris will be runnable in 2031 is true.

(That does not make it a sound investment, though)

>
> John

Re: Automatic register spill / restore?

<ta4jgc$3m60$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=26395&group=comp.arch#26395

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!news.freedyn.de!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Automatic register spill / restore?
Date: Wed, 6 Jul 2022 13:15:38 -0500
Organization: A noiseless patient Spider
Lines: 66
Message-ID: <ta4jgc$3m60$1@dont-email.me>
References: <t9tuvg$1qgu$1@gioia.aioe.org> <OlEwK.24566$8f2.9092@fx38.iad>
<ta2ufc$el2$1@gioia.aioe.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 6 Jul 2022 18:15:40 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="683344d0860f73783ff4500460cfe084";
logging-data="121024"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18zsSbqsd2amN4TB7xpmf6P"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.11.0
Cancel-Lock: sha1:AS+HBEK51cInHaKEyTnrPgUCTuA=
In-Reply-To: <ta2ufc$el2$1@gioia.aioe.org>
Content-Language: en-US
 by: BGB - Wed, 6 Jul 2022 18:15 UTC

On 7/5/2022 10:10 PM, Andy wrote:
> On 5/07/22 04:14, EricP wrote:
>
>>
>> Sparc register windows. They were opaque, asynchronous lazy spill/fill
>> which was done by kernel mode traps. Reportedly it had... issues.
>
> Yep while I'm vaguely aware of SPARC register windows, that's not
> exactly what I was looking for, since I'm pretty sure it was one of the
> older perhaps lesser known CISCy mainframes not a newish RISC style
> microprocessor.
> And I think it used a fairly straightforward register file aside from the
> fact the programmer never needed to manually save and restore registers
> when calling subroutines and the like.
>
>
> But aside from that, I was never struck with the impression that
> hardware managed register windows were a particularly great idea.
>
> If someone is going to put 120 odd registers into a CPU surely making
> them all programmer visible in some way and letting them / the compiler
> or operating system decide how best to carve the register file up would
> be the better option.
>
> And of course with fresh pop-corn in hand, watching the ensuing
> technical / corporate / tribal / religious / jihadist arguments flying
> to and fro on how best to do that, would no doubt be hugely and/or
> endlessly entertaining! ;-)
>

Partial issues:
Encoding: 7-bit register fields won't really fit effectively into a
32-bit instruction format;
LUTs: For FPGA, 7-bit access in LUTRAM is significantly less
resource-efficient than 5 or 6 bits.
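
To put rough numbers on the encoding point, a small sketch, assuming a
three-register-operand instruction (two sources, one destination) in a 32-bit
word. The three-operand layout and the function name are assumptions for
illustration, not something specified in this thread.

```python
def bits_left(reg_field_bits, operands=3, word_bits=32):
    """Bits remaining for opcode and immediates after the register fields."""
    return word_bits - operands * reg_field_bits

for nregs in (32, 64, 128):
    field = (nregs - 1).bit_length()  # bits needed to name one register
    print(f"{nregs:3d} registers: {field}-bit fields, "
          f"{bits_left(field)} bits left in a 32-bit word")
```

With 128 visible registers the three fields alone take 21 bits, leaving 11
bits for everything else, versus 17 bits with a 32-register file.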

It would appear though that many of the RISC-V implementations
effectively have 3 copies of the register file (the U/S/M modes having
separate copies of the register file; so would be around 96 registers
internally).

....

Though, I guess one could also debate whether it would be viable to
implement a variant of the Itanium ISA on an FPGA (haven't looked into
it enough to figure how easily an IA-64 core could fit into an XC7A100T
or similar).

Would likely need a partial software emulation layer though to emulate
things like an SVGA card plugged into a PCI bus and similar, like if one
hopes to be able to run an IA64 build of Windows or similar on it.

This seems like an area where something like a DE10 or similar could
potentially have an advantage.

Looks it up, in a quick search it doesn't appear anyone has done IA-64
for the MiSTer or similar...

....

Re: Automatic register spill / restore?

<MxkxK.25803$Ae2.15955@fx35.iad>

https://www.novabbs.com/devel/article-flat.php?id=26396&group=comp.arch#26396

Path: i2pn2.org!i2pn.org!feed1.usenet.blueworldhosting.com!usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx35.iad.POSTED!not-for-mail
From: ThatWoul...@thevillage.com (EricP)
User-Agent: Thunderbird 2.0.0.24 (Windows/20100228)
MIME-Version: 1.0
Newsgroups: comp.arch
Subject: Re: Automatic register spill / restore?
References: <t9tuvg$1qgu$1@gioia.aioe.org> <OlEwK.24566$8f2.9092@fx38.iad> <ta2ufc$el2$1@gioia.aioe.org>
In-Reply-To: <ta2ufc$el2$1@gioia.aioe.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 55
Message-ID: <MxkxK.25803$Ae2.15955@fx35.iad>
X-Complaints-To: abuse@UsenetServer.com
NNTP-Posting-Date: Wed, 06 Jul 2022 18:31:08 UTC
Date: Wed, 06 Jul 2022 14:30:18 -0400
X-Received-Bytes: 2993
 by: EricP - Wed, 6 Jul 2022 18:30 UTC

Andy wrote:
> On 5/07/22 04:14, EricP wrote:
>
>>
>> Sparc register windows. They were opaque, asynchronous lazy spill/fill
>> which was done by kernel mode traps. Reportedly it had... issues.
>
> Yep while I'm vaguely aware of SPARC register windows, that's not
> exactly what I was looking for, since I'm pretty sure it was one of the
> older perhaps lesser known CISCy mainframes not a newish RISC style
> microprocessor.

Back around 1980..84 there were two seminal research projects that
popularized the whole RISC approach
- Stanford MIPS by Hennessy et al, which was later reworked
some and launched commercially as the MIPS R2000
- Berkeley RISC with register windows, aka RISC-I, by Patterson et al,
which was later commercialized as SPARC architecture and inspired ARM.

Patterson is one of the originators behind the current RISC-V ISA.

> And I think it used a fairly straightforward register file aside from the
> fact the programmer never needed to manually save and restore registers
> when calling subroutines and the like.
>
>
> But aside from that, I was never struck with the impression that
> hardware managed register windows were a particularly great idea.

In days of yore when memory was 500 ns it was probably really smart.
Berkeley RISC had 78 32-bit 2R1W registers which would be enough for
3 windows (3*16+16) plus sundry housekeeping. Subroutine SAVE and
RESTORE would have taken a clock or so if it didn't trigger a spill.

But the technology had to be good enough to cram 78 registers on 1 chip.
It needed 44500 4um NMOS transistors which only became possible a few
years earlier.
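
A toy model of the behaviour described above: SAVE and RESTORE cost nothing
extra until the windows run out, at which point the oldest frame has to be
spilled to memory (and refilled on the way back). The class and its counters
are hypothetical and only sketch the general mechanism; they do not model any
particular RISC-I or SPARC implementation.

```python
class WindowFile:
    """Toy register-window file with spill on overflow and fill on underflow."""

    def __init__(self, nwindows=3):
        self.nwindows = nwindows
        self.resident = 1   # frames currently held in registers
        self.spilled = 0    # frames pushed out to the memory stack
        self.traps = 0      # overflow/underflow events (the slow path)

    def save(self):
        """Procedure call: take a fresh window, spilling the oldest if full."""
        if self.resident == self.nwindows:
            self.resident -= 1
            self.spilled += 1
            self.traps += 1
        self.resident += 1

    def restore(self):
        """Procedure return: drop the window, refilling the caller's if needed."""
        self.resident -= 1
        if self.resident == 0 and self.spilled:
            self.spilled -= 1
            self.resident += 1
            self.traps += 1

wf = WindowFile(nwindows=3)
for _ in range(5):      # call chain five deep
    wf.save()
for _ in range(5):      # unwind it again
    wf.restore()
print("slow-path events:", wf.traps)
```

A call chain that stays within the available windows never hits the slow
path in this model; only deeper chains pay for memory traffic.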

> If someone is going to put 120 odd registers into a CPU surely making
> them all programmer visible in some way and letting them / the compiler
> or operating system decide how best to carve the register file up would
> be the better option.

Per-priority register banks in ARM have the same effect.
You pay for holding all that architectural state in live registers
but can't use them.

> And of course with fresh pop-corn in hand, watching the ensuing
> technical / corporate / tribal / religious / jihadist arguments flying
> to and fro on how best to do that, would no doubt be hugely and/or
> endlessly entertaining! ;-)

Re: Automatic register spill / restore?

<d758425c-ed58-48c0-bc5f-48c5e5f7c060n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=26450&group=comp.arch#26450

X-Received: by 2002:a05:620a:211d:b0:6af:9e8:f08d with SMTP id l29-20020a05620a211d00b006af09e8f08dmr453379qkl.458.1657238888668;
Thu, 07 Jul 2022 17:08:08 -0700 (PDT)
X-Received: by 2002:a05:6214:21e5:b0:470:a567:edf7 with SMTP id
p5-20020a05621421e500b00470a567edf7mr604917qvj.67.1657238888495; Thu, 07 Jul
2022 17:08:08 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 7 Jul 2022 17:08:08 -0700 (PDT)
In-Reply-To: <t9tuvg$1qgu$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2804:431:abca:3d01:482e:2ef6:899f:5200;
posting-account=qNYWnwkAAAAV29tnVQXfHVigXdWsZgZ6
NNTP-Posting-Host: 2804:431:abca:3d01:482e:2ef6:899f:5200
References: <t9tuvg$1qgu$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d758425c-ed58-48c0-bc5f-48c5e5f7c060n@googlegroups.com>
Subject: Re: Automatic register spill / restore?
From: jec...@merlintec.com (Jecel Assumpção Jr)
Injection-Date: Fri, 08 Jul 2022 00:08:08 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 8
 by: Jecel Assumpção Jr - Fri, 8 Jul 2022 00:08 UTC

On Monday, July 4, 2022 at 2:48:36 AM UTC-3, Andy wrote:
> The discussions going on about register to/from stack and load/store
> multiple instructions has got me vaguely remembering that there was some
> talk about old mainframes that could save to stack automatically any
> registers in danger of being overwritten after a jump to subroutine or such.

The AT&T Hobbit (CRISP) had a stack cache with a hardware spiller/refiller that took advantage of any otherwise unused memory cycles.

-- Jecel

Re: Wither VLIW? was Automatic register spill / restore?

<tao3p8$mq5$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=26592&group=comp.arch#26592

Path: i2pn2.org!i2pn.org!news.niel.me!aioe.org!vxOTfS5Jn/b499FNW7Y8DA.user.46.165.242.75.POSTED!not-for-mail
From: nos...@nowhere.com (Andy)
Newsgroups: comp.arch
Subject: Re: Wither VLIW? was Automatic register spill / restore?
Date: Thu, 14 Jul 2022 15:49:58 +1200
Organization: Aioe.org NNTP Server
Message-ID: <tao3p8$mq5$1@gioia.aioe.org>
References: <t9tuvg$1qgu$1@gioia.aioe.org> <OlEwK.24566$8f2.9092@fx38.iad>
<ta2ufc$el2$1@gioia.aioe.org> <ta4jgc$3m60$1@dont-email.me>
<tadaib$1msu$1@gioia.aioe.org> <tadg3l$19jd4$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="23365"; posting-host="vxOTfS5Jn/b499FNW7Y8DA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.9.1
Content-Language: en-US
X-Notice: Filtered by postfilter v. 0.9.2
 by: Andy - Thu, 14 Jul 2022 03:49 UTC

On 10/07/22 15:12, BGB wrote:

> I am left wondering if I could make it fit, at least in a basic sense,
> on something like an XC7A100T. Dev-boards with these (such as the Nexys
> A7) are available for around ~ $270 or so last I looked (and this is
> basically what I am using for my BJX2 Core).
>
>
> At least at a superficial level, the IA-64 ISA isn't *that* far beyond
> what I have already done with the BJX2 ISA.
>

If you say so, looks like Mount Everest to me though...

>
> Most obvious difference is that the IA-64 register file would be
> significantly larger. Would also probably need to omit the IA-32
> decoder, ...
>

Perhaps something smaller, Transmeta Crusoe or Efficeon maybe, only 64
registers if you include the deep speculation, 32 if you skip that, and
the IA-32 decoding is just a re-assemble of the Code Morphing firmware
you can find on the internet.

> In this case, the idea would partly be to emulate parts of the ISA on
> top of itself (likely via hardware traps).
>
>
> If I were to do it via a modified BJX2 core, would potentially replace
> the RISC-V alt-mode with an IA-64 alt-mode, and considerably expand the
> size of the register file and similar.

Hmmm,

>
> Though, this looks concerning, the amount of expansions needed would
> likely push the core beyond the resource limits of the XC7A100T.

Maybe skipping the great big Intel CPU cores is for the best. ;-)

> If I were to approach the register file design in a similar way to
> what I have done with my BJX2 core, I will effectively need a 512-entry
> register file (likely also 8R4W if using 64-bit ports). Probably "more
> sane" to use multiple smaller register files.
>
> This seems a little absurd...

Agreed

>
> This might require a bigger FPGA...

Oh no...

>
> And or come up with a more cost-effective way to implement such a
> register file.

Possibly, not sure myself.

>> Then there's the issue of the compiler to deal with, I imagine
>> progress in VLIW scheduling compiler research has continued on since
>> Itanium effectively died, but would anyone be motivated enough to
>> collect the latest advancements and update a compiler just for the
>> Itanium machines still working out there?
>>
>
> AFAIK, GCC can target IA-64.
>   Not sure how good its code generation is.
>   Apparently the target has been deprecated though.
>

Always wondered how good GCC would be at generating code for a VLIW CPU.
I just assumed those so inclined would steal whatever language front-end
they could find and write the bulk of the VLIW specific compiler from
scratch.

>> There are probably far better / smaller / easier VLIW style cores to
>> study and replicate in a FPGA than Itanium I think.
>>
>
> It seems like I am one of the (relatively few) people doing VLIW on FPGA
> (at all).
>

Aside from the odd DSP-core, you might be right.

If only Transmeta could have held on a little longer, or did things
differently, like opening up the internal instruction set so that
hackers and compiler writers could have targeted more optimal GCC code
generation to their cores... they might still be around with huge sales
in Android devices right now, and VLIW research could have got the
injection of resources it needed to gain the performance needed to stay
competitive at least.

Although Nvidia isn't exactly setting the world on fire in CPU sales
either...

> Most of the other people I know of, are doing RISC variants (and/or
> RISC-V implementations).
>

RISC is pretty much the text-book common denominator these days.

I kinda hope VLIW makes a mainstream comeback somehow, the current
CISC/RISC duopoly doesn't seem particularly healthy for the long term view.

Maybe massive machine learning trained compilers can make a dent in the
software side of the VLIW equation?

> Looks like pretty much no one is bothering with soft-core processors for
> IA-64.

I'm thinking that's probably for the best. ;-)

>> But then with VLIW the hardware is just half the battle, you still
>> need to program it so that it runs at near peak performance, which I
>> take it is the harder part, again YMMV.
>>
>
> Yeah...
>
> With my existing ISA, my C compiler gets nowhere near the full speed of
> what is possible. Can do a little better by writing hand-optimized ASM,
> but this doesn't scale very well.

Seems to me that is the nub of the issue, if your WEX hardware is pretty
much working as intended then getting decent code generation out of your
compiler might be the best bang for your buck.

I'm sure there's still plenty of research papers to be read on the
subject, and if you happen to invent some new way to efficiently pack
many operations into a string of wide words, well, fortune and glory and
that jazz could be yours for the taking...

Or possibly the 8000lb gorilla will stomp on your neck and steal your
lunch money just like it did to Transmeta...

Too early to tell I guess... :-)

Re: Wither VLIW? was Automatic register spill / restore?

<taomrn$2m1kd$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=26593&group=comp.arch#26593

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Wither VLIW? was Automatic register spill / restore?
Date: Thu, 14 Jul 2022 04:15:27 -0500
Organization: A noiseless patient Spider
Lines: 322
Message-ID: <taomrn$2m1kd$1@dont-email.me>
References: <t9tuvg$1qgu$1@gioia.aioe.org> <OlEwK.24566$8f2.9092@fx38.iad>
<ta2ufc$el2$1@gioia.aioe.org> <ta4jgc$3m60$1@dont-email.me>
<tadaib$1msu$1@gioia.aioe.org> <tadg3l$19jd4$1@dont-email.me>
<tao3p8$mq5$1@gioia.aioe.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 14 Jul 2022 09:15:35 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="f057459fe676b622c205f4f818fe8952";
logging-data="2819725"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/0OKrhfrPVHO14/16qIiTV"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.11.0
Cancel-Lock: sha1:WaRaxCZH8AtyPMJvY2IBDJg0K9E=
Content-Language: en-US
In-Reply-To: <tao3p8$mq5$1@gioia.aioe.org>
 by: BGB - Thu, 14 Jul 2022 09:15 UTC

On 7/13/2022 10:49 PM, Andy wrote:
> On 10/07/22 15:12, BGB wrote:
>
>
>> I am left wondering if I could make it fit, at least in a basic sense,
>> on something like an XC7A100T. Dev-boards with these (such as the
>> Nexys A7) are available for around ~ $270 or so last I looked (and
>> this is basically what I am using for my BJX2 Core).
>>
>>
>> At least at a superficial level, the IA-64 ISA isn't *that* far beyond
>> what I have already done with the BJX2 ISA.
>>
>
> If you say so, looks like Mount Everest to me though...
>

It is complicated, granted, but at a basic level most of the parts are
not *that* complicated. Main problem, as noted, would mostly be the
stupidly large register file.

>
>>
>> Most obvious difference is that the IA-64 register file would be
>> significantly larger. Would also probably need to omit the IA-32
>> decoder, ...
>>
>
> Perhaps something smaller, Transmeta Crusoe or Efficeon maybe, only 64
> registers if you include the deep speculation, 32 if you skip that, and
> the IA-32 decoding is just a re-assemble of the Code Morphing firmware
> you can find on the internet.
>

I had considered a few times maybe trying to do an x86 emulator on BJX2,
but this is one of those "never got around to it" issues.

Would need to go directly to JIT though, as there is pretty much no hope
of usable performance with a conventional interpreter.

And, on a 50 MHz CPU core, I would probably be lucky if it even matched
the performance of the original IBM PC.

Would likely also need instructions to allow faking the behavior of x86
style ALU and branch ops (my ISA lacks condition codes, and these would
be expensive to emulate).
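
To give a sense of the cost, here is a minimal sketch (Python for readability, not BJX2 code or any existing emulator's method) of what has to be computed to fake the main x86 flags for one 32-bit ADD on a machine without condition codes. The function name and the flags dict are made up for illustration; PF and AF, which x86 ADD also sets, are left out.

```python
MASK32 = 0xFFFFFFFF

def add32_with_flags(a, b):
    """Emulate a 32-bit ADD; returns (result, flags). Hypothetical helper."""
    a &= MASK32
    b &= MASK32
    full = a + b                 # up to 33 bits wide
    res = full & MASK32
    flags = {
        "CF": full >> 32,        # carry out of bit 31
        "ZF": int(res == 0),     # result is zero
        "SF": res >> 31,         # sign bit of the result
        # overflow: operands share a sign, result has the other sign
        "OF": ((~(a ^ b) & (a ^ res)) >> 31) & 1,
    }
    return res, flags

res, flags = add32_with_flags(0x7FFFFFFF, 1)
print(hex(res), flags)
```

That is several extra ALU ops per emulated instruction if done naively. Emulators commonly dodge part of this by computing flags lazily, only when something actually reads them.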

>
>> In this case, the idea would partly be to emulate parts of the ISA on
>> top of itself (likely via hardware traps).
>>
>>
>> If I were to do it via a modified BJX2 core, would potentially replace
>> the RISC-V alt-mode with an IA-64 alt-mode, and considerably expand
>> the size of the register file and similar.
>
> Hmmm,
>

In any case, not going to do this, it was more a hypothetical.

>>
>> Though, this looks concerning, the amount of expansions needed would
>> likely push the core beyond the resource limits of the XC7A100T.
>
> Maybe skipping the great big Intel CPU cores is for the best. ;-)
>

Probably true.

I had previously wanted to buy a board with an XC7A200T (Nexys Video),
but lacked money.

Now it seems they are sold out pretty much everywhere...

>> If I were to approach the register file design in a similar way to
>> what I have done with my BJX2 core, I will effectively need a
>> 512-entry register file (likely also 8R4W if using 64-bit ports).
>> Probably "more sane" to use multiple smaller register files.
>>
>> This seems a little absurd...
>
> Agreed
>
>>
>> This might require a bigger FPGA...
>
> Oh no...
>
>>
>> And/or come up with a more cost-effective way to implement such a
>> register file.
>
> Possibly, not sure myself.
>
>
>>> Then there's the issue of the compiler to deal with, I imagine
>>> progress in VLIW scheduling compiler research has continued on since
>>> Itanium effectively died, but would anyone be motivated enough to
>>> collect the latest advancements and update a compiler just for the
>>> Itanium machines still working out there?
>>>
>>
>> AFAIK, GCC can target IA-64.
>>    Not sure how good its code generation is.
>>    Apparently the target has been deprecated though.
>>
>
> Always wondered how good GCC would be at generating code for a VLIW CPU.
> I just assumed those so inclined would steal whatever language front-end
> they could find and write the bulk of the VLIW specific compiler from
> scratch.
>

Dunno. I wrote my whole compiler from scratch.

But, given how much it sucks with my own ISA, and what would be needed
for "good" results with IA-64, it would likely be straight up terrible...

>
>>> There are probably far better / smaller / easier VLIW style cores to
>>> study and replicate in a FPGA than Itanium I think.
>>>
>>
>> It seems like I am one of the (relatively few) people doing VLIW on
>> FPGA (at all).
>>
>
> Aside from the odd DSP-core, you might be right.
>

Possibly.

I suspect the sinking of the Itanium did a lot to sour the
reputation of VLIW in general.

Like, Itanium did for VLIW what the Hindenburg did for airships...

> If only Transmeta could have held on a little longer, or did things
> differently, like opening up the internal instruction set so that
> hackers and compiler writers could have targeted more optimal GCC code
> generation to their cores... they might still be around with huge sales
> in Android devices right now, and VLIW research could have got the
> injection of resources it needed to gain the performance needed to stay
> competitive at least.
>
> Although Nvidia isn't exactly setting the world on fire in CPU sales
> either...
>

Yeah, quite possible, it could have been interesting.

Emulation is one of those areas where one is almost invariably going to
take a loss; so if targeting the underlying ISA, it could maybe have
been more competitive.

Maybe also not try to set oneself up as "compete with Intel or bust".

>
>> Most of the other people I know of, are doing RISC variants (and/or
>> RISC-V implementations).
>>
>
> RISC is pretty much the text-book common denominator these days.
>

Pretty much.

> I kinda hope VLIW makes a mainstream comeback somehow, the current
> CISC/RISC duopoly doesn't seem particularly healthy for the long term view.
>
> Maybe massive machine learning trained compilers can make a dent in the
> software side of the VLIW equation?
>

Not sure here.

In my case, it is kinda a lot of fiddling.

I recently got things a little better, by fiddling a fair bit with the
logic for shuffling instructions around. It tries to reduce interlocking
and improve cases for bundling.

Then, I ended up needing to add in logic to limit how much shuffling it
does, and to try to hash and cache the results of intermediate
comparisons, mostly as the "more advanced" shuffling cases were starting
to result in the process taking an unreasonably long time.

Part of the issue seems to be that the shuffling process only takes a
limited window into account at a time (which was expanded from 3 to 7
instructions), and at each point the windows don't necessarily agree as
to which option is lowest cost.

The only real alternative would be to evaluate the entire basic block
for each possible swapping decision.

Though, this only really happens in a minority of cases (most cases
don't have quite so much "instruction mobility").
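
A toy sketch of this kind of pass, under loud assumptions: a single-issue in-order pipeline, a cost that only counts interlock stalls, register dependencies only (no memory ordering), and made-up names (`Insn`, `stall_cycles`, `shuffle`). It is not the actual compiler logic, just the general shape: adjacent swaps judged by a small window, window costs cached by hashing the window contents, and a cap on the number of passes.

```python
from functools import lru_cache
from typing import NamedTuple, Tuple

class Insn(NamedTuple):
    name: str
    dst: str                 # register written ("" if none)
    srcs: Tuple[str, ...]    # registers read
    latency: int = 1         # cycles until dst is usable

def independent(a: Insn, b: Insn) -> bool:
    """True if a and b may swap: no RAW, WAW, or WAR between them."""
    if a.dst and (a.dst in b.srcs or a.dst == b.dst):
        return False
    if b.dst and b.dst in a.srcs:
        return False
    return True

@lru_cache(maxsize=4096)     # cache costs, keyed on the window contents
def stall_cycles(window: Tuple[Insn, ...]) -> int:
    """Interlock stalls for a window on a toy in-order pipeline."""
    ready = {}               # register -> cycle its value becomes available
    cycle = stalls = 0
    for insn in window:
        need = max((ready.get(r, 0) for r in insn.srcs), default=0)
        if need > cycle:
            stalls += need - cycle
            cycle = need
        if insn.dst:
            ready[insn.dst] = cycle + insn.latency
        cycle += 1
    return stalls

def shuffle(block, window=7, max_passes=4):
    """Greedy adjacent swaps, each judged only by a local window."""
    block = list(block)
    for _ in range(max_passes):          # limit how much shuffling is done
        changed = False
        for i in range(len(block) - 1):
            a, b = block[i], block[i + 1]
            if not independent(a, b):
                continue
            lo = max(0, i - (window - 2) // 2)
            hi = min(len(block), lo + window)
            before = tuple(block[lo:hi])
            after = tuple(block[lo:i] + [b, a] + block[i + 2:hi])
            if stall_cycles(after) < stall_cycles(before):
                block[i], block[i + 1] = b, a
                changed = True
        if not changed:
            break
    return block

blk = [Insn("ld1", "r1", ("r10",), 3), Insn("add2", "r2", ("r1",)),
       Insn("ld3", "r3", ("r11",), 3), Insn("add4", "r4", ("r3",))]
print([i.name for i in shuffle(blk)])
```

Because each swap is only scored against its own window, two overlapping windows can prefer different orders; the exact alternative would be re-scoring the whole basic block for every candidate swap.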

Some gains here were due to adding heuristics to infer "non-aliasing"
memory accesses, say:
   Same base register but non-overlapping displacements;
   SP and non-SP in some combinations;
   SP and GBR (Stack and Globals are never the same memory);
   ...

But, can't do as much with indexed loads/stores here, since one can't do
much of anything to infer potential overlap or non-overlap between these
cases.

Have also made an inference that SP or GBR based loads/stores can't
alias with indexed load/store. While a little hand-wavy, this is
"probably true".
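
The rules above can be sketched roughly as follows. This is an illustration only: `MemOp` and `may_alias` are hypothetical names, the "same base" rule assumes the base register is not modified between the two accesses, and the unspecified "SP and non-SP in some combinations" cases are conservatively treated as possibly aliasing.

```python
from typing import NamedTuple, Optional

class MemOp(NamedTuple):
    base: str                    # base register, e.g. "SP", "GBR", "R4"
    disp: int = 0                # byte displacement
    size: int = 8                # access size in bytes
    index: Optional[str] = None  # index register, if an indexed access

FIXED = ("SP", "GBR")

def may_alias(a: MemOp, b: MemOp) -> bool:
    """Conservative: False only when one of the rules proves disjointness."""
    a_idx, b_idx = a.index is not None, b.index is not None
    if a_idx or b_idx:
        if a_idx != b_idx:
            other = b if a_idx else a
            if other.base in FIXED:
                # Hand-wavy rule: SP/GBR-relative vs indexed, "probably" disjoint.
                return False
        return True              # nothing else can be inferred for indexed ops
    if a.base == b.base:
        # Same base register: alias only if the byte ranges overlap.
        return a.disp < b.disp + b.size and b.disp < a.disp + a.size
    if a.base in FIXED and b.base in FIXED:
        return False             # SP vs GBR: stack and globals never overlap
    return True                  # unknown combination: assume possible alias

print(may_alias(MemOp("R4", 0), MemOp("R4", 8)))    # disjoint displacements
print(may_alias(MemOp("SP", 16), MemOp("GBR", 16))) # stack vs globals
```

A scheduler would only reorder two memory operations (where at least one is a store) when `may_alias` returns False.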

Though, Jumbo-form instructions and instructions with Relocations
(typically the same instructions) are classified as "immovable" (though
the window is now large enough that it can shuffle "around" these
instructions in many cases).

>
>> Looks like pretty much no one is bothering with soft-core processors
>> for IA-64.
>
> I'm thinking that's probably for the best. ;-)
>

Probably true, after thinking more on it.

>
>>> But then with VLIW the hardware is just half the battle, you still
>>> need to program it so that it runs at near peak performance, which I
>>> take it is the harder part, again YMMV.
>>>
>>
>> Yeah...
>>
>> With my existing ISA, my C compiler gets nowhere near the full speed
>> of what is possible. Can do a little better by writing hand-optimized
>> ASM, but this doesn't scale very well.
>
>
> Seems to me that is the nub of the issue, if your WEX hardware is pretty
> much working as intended then getting decent code generation out of your
> compiler might be the best bang for your buck.
>

