Re: In-order vs Out-of-order

<salia9$j66$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=17951&group=comp.arch#17951

From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: In-order vs Out-of-order
Date: Sat, 19 Jun 2021 12:59:06 -0700

by: Ivan Godard - Sat, 19 Jun 2021 19:59 UTC

On 6/19/2021 11:57 AM, Michael S wrote:
> On Saturday, June 19, 2021 at 7:57:56 PM UTC+3, Anton Ertl wrote:
>> nedbrek <ned...@yahoo.com> writes:
>>> One of the interesting results of our research was that out-of-order machines are easier to ramp to high frequency. This is because an in-order machine suffers more as L1D latency increases (you need to get the compiler to extract more and more ILP).
>> The numbers I have seen posted here (in an IA-64 discussion) are that
>> one extra L1D cycle costs 10% on in-order and 5% on OoO. But I don't
>> see why this should make it hard to ramp up clock rates. Sure, you
>> don't see as much benefit from the higher clock rate because of that
>> effect, but you should still see a benefit, because ALU ops go faster
>> and because the compiler is able to fill some of the load latency with
>> useful work.
>>
>> My impression is that in-order designs are harder to clock fast
>> because they have less localized control recurrences than OoO; and
>> last time I wrote that, the people in the know did not really
>> contradict me, but discussed some techniques to work around that, such
>> as having a backup of the pipeline state and rolling back to that.
>> - anton
>> --
>> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
>> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
>
> I never tried to think hard about it, but my uneducated impression always was that it applies to "normal" In-Order, the one that does interlocks in hardware.
> I don't see why it should apply to in-order designs with an exposed pipeline, especially to those that also expose to software a banked structure of register files.
> Of course, even designs like those have to have a way to deal with the variable-length latency of memory instructions... Maybe, if it's done by replay, the control could remain distributed?

It can be done by splitting the load's request from the retire; it's
called "deferred load".
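
As a rough illustration of the idea (a toy model, not a description of the Mill's actual mechanism): suppose the load request carries a compiler-chosen `delay` saying how many cycles later its result should retire, and the memory system actually needs `latency` cycles. In this model the in-order pipeline only has to wait if the data is not back by the scheduled retire point. The function name and both parameters are hypothetical.

```python
def stall_cycles(latency: int, delay: int) -> int:
    """Cycles an in-order pipeline waits at the load's scheduled retire point.

    latency: cycles the memory system actually needs for this load.
    delay:   cycles between issuing the request and the scheduled retire,
             fixed statically by the compiler (hypothetical field).
    """
    if latency < 0 or delay < 0:
        raise ValueError("latency and delay must be non-negative")
    # Any part of the latency covered by the delay is hidden behind other
    # work; only the uncovered remainder shows up as a stall.
    return max(0, latency - delay)


if __name__ == "__main__":
    # The same 3-cycle load, with the retire scheduled 0..4 cycles after issue.
    for delay in range(5):
        print(f"delay={delay} stall={stall_cycles(3, delay)}")
```

A load that takes longer than the scheduled delay (a cache miss, say) still stalls in this model, so the sketch covers only the predictable part of the latency.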

Re: In-order vs Out-of-order

<93121c25-39b9-4e7d-9e07-7c9b584194a5n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17953&group=comp.arch#17953

Newsgroups: comp.arch


by: Michael S - Sat, 19 Jun 2021 21:28 UTC

On Saturday, June 19, 2021 at 10:59:07 PM UTC+3, Ivan Godard wrote:
> On 6/19/2021 11:57 AM, Michael S wrote:
> > On Saturday, June 19, 2021 at 7:57:56 PM UTC+3, Anton Ertl wrote:
> >> nedbrek <ned...@yahoo.com> writes:
> >>> One of the interesting results of our research was that out-of-order machines are easier to ramp to high frequency. This is because an in-order machine suffers more as L1D latency increases (you need to get the compiler to extract more and more ILP).
> >> The numbers I have seen posted here (in an IA-64 discussion) are that
> >> one extra L1D cycle costs 10% on in-order and 5% on OoO. But I don't
> >> see why this should make it hard to ramp up clock rates. Sure, you
> >> don't see as much benefit from the higher clock rate because of that
> >> effect, but you should still see a benefit, because ALU ops go faster
> >> and because the compiler is able to fill some of the load latency with
> >> useful work.
> >>
> >> My impression is that in-order designs are harder to clock fast
> >> because they have less localized control recurrences than OoO; and
> >> last time I wrote that, the people in the know did not really
> >> contradict me, but discussed some techniques to work around that, such
> >> as having a backup of the pipeline state and rolling back to that.
> >> - anton
> >> --
> >> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> >> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>
> >
> > I never tried to think hard about it, but my uneducated impression always was that it applies to "normal" In-Order, the one that does interlocks in hardware.
> > I don't see why it should apply to in-order designs with an exposed pipeline, especially to those that also expose to software a banked structure of register files.
> > Of course, even designs like those have to have a way to deal with the variable-length latency of memory instructions... Maybe, if it's done by replay, the control could remain distributed?
> It can be done by splitting the load's request from the retire; it's
> called "deferred load".

Mill-style "deferred load" is a performance feature - you trade decode bandwidth for earlier issue of load.
On an architecture with lots of visible registers it's not needed at all, unlike the Itanium-style advanced load, which can potentially be useful even here.
Neither Mill's "deferred load" nor Itanium's "advanced load" does anything to alleviate the need for global interlock HW.

Re: In-order vs Out-of-order

<salo9d$ort$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=17954&group=comp.arch#17954

Newsgroups: comp.arch


by: Ivan Godard - Sat, 19 Jun 2021 21:41 UTC

On 6/19/2021 2:28 PM, Michael S wrote:
> On Saturday, June 19, 2021 at 10:59:07 PM UTC+3, Ivan Godard wrote:
>> On 6/19/2021 11:57 AM, Michael S wrote:

<snip>

>>> Of course, even designs like those have to have a way to deal with the variable-length latency of memory instructions... Maybe, if it's done by replay, the control could remain distributed?
>> It can be done by splitting the load's request from the retire; it's
>> called "deferred load".
>
> Mill-style "deferred load" is a performance feature - you trade decode bandwidth for earlier issue of load.

Perhaps you are thinking of "pickup load". Deferred load costs no extra
bandwidth.

Re: In-order vs Out-of-order

<76f5f93c-5a4b-46f4-ab33-1d2626ff48c0n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=17969&group=comp.arch#17969

Newsgroups: comp.arch


by: Michael S - Sun, 20 Jun 2021 11:08 UTC

On Sunday, June 20, 2021 at 12:41:03 AM UTC+3, Ivan Godard wrote:
> On 6/19/2021 2:28 PM, Michael S wrote:
> > On Saturday, June 19, 2021 at 10:59:07 PM UTC+3, Ivan Godard wrote:
> >> On 6/19/2021 11:57 AM, Michael S wrote:
> <snip>
> >>> Of course, even designs like those have to have a way to deal with the variable-length latency of memory instructions... Maybe, if it's done by replay, the control could remain distributed?
> >> It can be done by splitting the load's request from the retire; it's
> >> called "deferred load".
> >
> > Mill-style "deferred load" is a performance feature - you trade decode bandwidth for earlier issue of load.
> Perhaps you are thinking of "pickup load". Deferred load costs no extra
> bandwidth.

Yes, I was thinking of "pickup load".
But deferred load also has a decode cost, since you have to find room for the delay specifier in the "verb" word.
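
To put a rough number on that cost: a field that can express any delay from 0 to `max_delay` cycles needs ceil(log2(max_delay + 1)) bits of encoding space. The sketch below only computes that figure; the function name and the example ranges are made up for illustration and are not actual Mill encoding details.

```python
def delay_field_bits(max_delay: int) -> int:
    """Bits needed to encode any delay in the range 0..max_delay."""
    if max_delay < 0:
        raise ValueError("max_delay must be non-negative")
    # int.bit_length() is the width of the largest value the field holds;
    # a field that can only ever hold 0 needs no bits at all.
    return max_delay.bit_length()


if __name__ == "__main__":
    # Hypothetical maximum delays, in cycles.
    for max_delay in (3, 7, 15, 31):
        print(f"max_delay={max_delay} bits={delay_field_bits(max_delay)}")
```

Under this assumption, each doubling of the largest expressible delay costs one more bit in every load encoding.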
