devel / comp.arch / Re: IA-64

Re: IA-64

<s83l25$1lg$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16922&group=comp.arch#16922
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: IA-64
Date: Wed, 19 May 2021 20:23:00 +0200
Organization: A noiseless patient Spider
Lines: 40
Message-ID: <s83l25$1lg$1@dont-email.me>
References: <s7gtk1$csi$1@dont-email.me>
<memo.20210512183631.13980N@jgd.cix.co.uk> <s7h71q$76h$1@dont-email.me>
<s7hcrm$sp1$1@gal.iecc.com> <s7hffs$19e$1@dont-email.me>
<s7hklb$i82$1@dont-email.me> <60a43f23.2045341625@news.eternal-september.org>
<s814as$hm$1@dont-email.me>
<cca40ada-b0ad-4b7f-9e16-91ca67de09a3n@googlegroups.com>
<s82dne$83s$1@gioia.aioe.org> <s82neu$6ku$1@newsreader4.netcologne.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 19 May 2021 18:23:01 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="c2bc2388f8ab3b493b0311e7eda3587a";
logging-data="1712"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Lz8yzPvqbiG8alPW7f68JiXbBkkk5IMk="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.8.1
Cancel-Lock: sha1:qug94cwirJnsNye0uibBwbkhxUg=
In-Reply-To: <s82neu$6ku$1@newsreader4.netcologne.de>
Content-Language: en-US
 by: Marcus - Wed, 19 May 2021 18:23 UTC

On 2021-05-19, Thomas Koenig wrote:
> Terje Mathisen <terje.mathisen@tmsw.no> schrieb:
>
>> Given that the compiler knows _exactly_ what sort of target it is
>> working towards,
>
> That assumption can certainly be true if people compile the software
> themselves, as they often do for scientific applications.
>
> It is not the case with pre-compiled software delivered by some
> vendor or other distributor. If you use new features (such
> as these new-fangled popcnt instructions), the software will
> not run on old hardware (unless somebody puts in a feature test
> and a runtime switch for different architectures).
>
> As an example, look at the binary distributions of Stockfish for
> different CPUs.
>

Another very real example is the distribution of "apps" for
mobile devices (phones & tablets). When I worked at Opera Software
two very important goals for the mobile browser applications were:

1) The binary must be really small (e.g. to support quick and cheap
downloads over a 3G connection, and to fit in limited storage on
cheap devices).

2) It must be as fast as possible on a wide range of CPU generations
(e.g. ARMv6 vs ARMv7 vs ARMv7+NEON).

This required quite innovative solutions to be able to ship code that
is optimized for several architectures (with different ABIs) in a
single, compact binary package. (The situation may have improved since
then w.r.t. support for multiple architectures etc. in the "app stores").

There are many situations where there is great value in having a single,
stable ISA+ABI, allowing the same binary to run on a wide range of
microarchitectures.
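
To make the "feature test and a runtime switch" idea from the quoted
text concrete, here is a minimal sketch in C. It assumes GCC or Clang
on x86 for the fast path (the __builtin_cpu_supports() test and the
target("popcnt") attribute); on anything else it just uses the
portable fallback. The names count_bits, popcount_generic, popcount_hw
and select_popcount are made up for this illustration, and this is not
how any particular product did it.

```c
#include <stdint.h>

typedef int (*popcount_fn)(uint64_t);

/* Portable fallback: clears the lowest set bit until none remain. */
static int popcount_generic(uint64_t x) {
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    return n;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Only this function may be compiled to the POPCNT instruction. */
__attribute__((target("popcnt")))
static int popcount_hw(uint64_t x) {
    return __builtin_popcountll(x);
}

/* Feature test: ask the CPU once, at run time. */
static popcount_fn select_popcount(void) {
    __builtin_cpu_init();
    return __builtin_cpu_supports("popcnt") ? popcount_hw
                                            : popcount_generic;
}
#else
/* Assumption: no feature test available here, so stay portable. */
static popcount_fn select_popcount(void) {
    return popcount_generic;
}
#endif

/* Runtime switch: the first call picks an implementation.
   (Not thread-safe as written; a real program would select the
   implementation once at startup.) */
int count_bits(uint64_t x) {
    static popcount_fn impl;
    if (!impl)
        impl = select_popcount();
    return impl(x);
}
```

Every such switch costs an indirect call plus a second copy of the
code, which is exactly the size vs. speed tension described above.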

/Marcus

Re: IA-64

<s83lug$874$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16923&group=comp.arch#16923
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: m.del...@this.bitsnbites.eu (Marcus)
Newsgroups: comp.arch
Subject: Re: IA-64
Date: Wed, 19 May 2021 20:38:07 +0200
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <s83lug$874$1@dont-email.me>
References: <s7gtk1$csi$1@dont-email.me>
<memo.20210512183631.13980N@jgd.cix.co.uk> <s7h71q$76h$1@dont-email.me>
<s7hcrm$sp1$1@gal.iecc.com> <s7hffs$19e$1@dont-email.me>
<s7hklb$i82$1@dont-email.me> <60a43f23.2045341625@news.eternal-september.org>
<s814as$hm$1@dont-email.me> <2021May19.103259@mips.complang.tuwien.ac.at>
<b50a26c8-64c6-41d9-a513-ef2e997396acn@googlegroups.com>
<s83cpa$3o8$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 19 May 2021 18:38:08 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="c2bc2388f8ab3b493b0311e7eda3587a";
logging-data="8420"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX191pbDif1QAzh4lfLD8Z0BH9TQ980mhYmo="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.8.1
Cancel-Lock: sha1:j9/COsuauKiUHuD7q1LWMzab0sA=
In-Reply-To: <s83cpa$3o8$1@dont-email.me>
Content-Language: en-US
 by: Marcus - Wed, 19 May 2021 18:38 UTC

On 2021-05-19, Stephen Fuld wrote:
> On 5/19/2021 8:04 AM, MitchAlsup wrote:
>> On Wednesday, May 19, 2021 at 3:52:42 AM UTC-5, Anton Ertl wrote:
>
> snip
>
>>> The big problem for software scheduling is that compilers are much
>>> worse at branch prediction than hardware.
>> <
>> Precisely because the Branch prediction structures contain vastly
>> more correlated states than SW currently manages.
>
> I think that is part of it.  The compiler could keep such data
> structures.  After all, software simulators of CPUs do.
>
> But the inescapable part is that compilers don't have access to the
> input data that is only available at run time and that affects the
> accuracy of branch predictors.  There is profile guided optimization,
> but then you are back to the problem John Dallman pointed out about
> getting "correct" test data to guide the test runs.
>

Using the "correct data" is part of it, but it's worse. Static analysis
and even PGO can only find the statistical probabilities for each
branch, and at best optimize & schedule for one way or the other. On
the other hand the hardware branch predictor can make different
decisions for the same branch instruction at different points in time.
This is something that a compiler cannot practically do (at least not
without spewing out a large number of different code sequences, which
would kill I$ performance and probably be quite inaccurate anyway).
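
A toy model of the temporal aspect, as a sketch only: one branch that
is taken for the first half of a run and not taken for the second
half. The "static" predictor is given the whole profile and picks the
single best direction; the "dynamic" one is a plain 2-bit saturating
counter, which is far simpler than any real predictor. The function
names and the branch pattern are invented for the illustration.

```c
#include <stddef.h>

/* Branch outcome at step i: taken in the first half, not taken after. */
static int outcome(size_t i, size_t n) {
    return i < n / 2;
}

/* Static prediction: one fixed direction for the whole run, chosen
   with perfect profile knowledge (the best PGO could do here). */
size_t static_correct(size_t n) {
    size_t taken = 0;
    for (size_t i = 0; i < n; i++)
        taken += (size_t)outcome(i, n);
    size_t not_taken = n - taken;
    return taken > not_taken ? taken : not_taken;
}

/* Dynamic prediction: 2-bit saturating counter, states 0..3,
   predict "taken" when the counter is 2 or 3. */
size_t dynamic_correct(size_t n) {
    int counter = 0;
    size_t correct = 0;
    for (size_t i = 0; i < n; i++) {
        int predicted = counter >= 2;
        int actual = outcome(i, n);
        if (predicted == actual)
            correct++;
        if (actual) {
            if (counter < 3)
                counter++;
        } else {
            if (counter > 0)
                counter--;
        }
    }
    return correct;
}
```

For n = 2000 the static choice is right 1000 times no matter which
direction it picks, while the counter is right 1996 times: it only
mispredicts twice while warming up and twice at the phase change.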

/Marcus

Re: IA-64

<s83muv$e9c$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=16926&group=comp.arch#16926
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: IA-64
Date: Wed, 19 May 2021 11:55:27 -0700
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <s83muv$e9c$1@dont-email.me>
References: <s7gtk1$csi$1@dont-email.me>
<memo.20210512183631.13980N@jgd.cix.co.uk> <s7h71q$76h$1@dont-email.me>
<s7hcrm$sp1$1@gal.iecc.com> <s7hffs$19e$1@dont-email.me>
<s7hklb$i82$1@dont-email.me> <60a43f23.2045341625@news.eternal-september.org>
<s814as$hm$1@dont-email.me> <2021May19.103259@mips.complang.tuwien.ac.at>
<b50a26c8-64c6-41d9-a513-ef2e997396acn@googlegroups.com>
<s83cpa$3o8$1@dont-email.me> <s83lug$874$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 19 May 2021 18:55:27 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="7e648b9f8e0a37104a6a431191939543";
logging-data="14636"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18vVWLkmltfGNl3Fa0wTCAyWgrAM6Xc2ts="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.10.1
Cancel-Lock: sha1:LKDV5c7gVqE9zQyzHX3U8pDCJZk=
In-Reply-To: <s83lug$874$1@dont-email.me>
Content-Language: en-US
 by: Stephen Fuld - Wed, 19 May 2021 18:55 UTC

On 5/19/2021 11:38 AM, Marcus wrote:
> On 2021-05-19, Stephen Fuld wrote:
>> On 5/19/2021 8:04 AM, MitchAlsup wrote:
>>> On Wednesday, May 19, 2021 at 3:52:42 AM UTC-5, Anton Ertl wrote:
>>
>> snip
>>
>>>> The big problem for software scheduling is that compilers are much
>>>> worse at branch prediction than hardware.
>>> <
>>> Precisely because the Branch prediction structures contain vastly
>>> more correlated states than SW currently manages.
>>
>> I think that is part of it.  The compiler could keep such data
>> structures.  After all, software simulators of CPUs do.
>>
>> But the inescapable part is that compilers don't have access to the
>> input data that is only available at run time and that affects the
>> accuracy of branch predictors.  There is profile guided optimization,
>> but then you are back to the problem John Dallman pointed out about
>> getting "correct" test data to guide the test runs.
>>
>
> Using the "correct data" is part of it, but it's worse. Static analysis
> and even PGO can only find the statistical probabilities for each
> branch, and at best optimize & schedule for one way or the other. On
> the other hand the hardware branch predictor can make different
> decisions for the same branch instruction at different points in time.

True. I had missed the temporal aspect. Thanks.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: IA-64

<s83p04$4fs$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=16929&group=comp.arch#16929
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: IA-64
Date: Wed, 19 May 2021 19:30:12 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <s83p04$4fs$1@gal.iecc.com>
References: <s7gtk1$csi$1@dont-email.me> <s82neu$6ku$1@newsreader4.netcologne.de> <s83hio$2khq$1@gal.iecc.com> <04d93eaf-fb82-4f2c-8b76-daeca7b054fcn@googlegroups.com>
Injection-Date: Wed, 19 May 2021 19:30:12 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="4604"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <s7gtk1$csi$1@dont-email.me> <s82neu$6ku$1@newsreader4.netcologne.de> <s83hio$2khq$1@gal.iecc.com> <04d93eaf-fb82-4f2c-8b76-daeca7b054fcn@googlegroups.com>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
Lines: 12
 by: John Levine - Wed, 19 May 2021 19:30 UTC

According to MitchAlsup <MitchAlsup@aol.com>:
>I wonder what the Multiflow compiler would do with your std garbage collector
>"application" ?

Give up, I expect. Since their customers were mostly doing scientific
computation and this was the 1980s, they had a C compiler but the main
language was Fortran and I get the impression that the more your C was
like Fortran, the faster it was likely to run.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: IA-64

<2021May19.213612@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=16931&group=comp.arch#16931
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: IA-64
Date: Wed, 19 May 2021 19:36:12 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 29
Message-ID: <2021May19.213612@mips.complang.tuwien.ac.at>
References: <s7gtk1$csi$1@dont-email.me> <memo.20210512183631.13980N@jgd.cix.co.uk> <s7h71q$76h$1@dont-email.me> <s7hcrm$sp1$1@gal.iecc.com> <s7hffs$19e$1@dont-email.me> <s7hklb$i82$1@dont-email.me> <60a43f23.2045341625@news.eternal-september.org> <s814as$hm$1@dont-email.me> <cca40ada-b0ad-4b7f-9e16-91ca67de09a3n@googlegroups.com> <2021May19.111616@mips.complang.tuwien.ac.at> <347b9472-2b18-4323-8066-162f52701c03n@googlegroups.com>
Injection-Info: reader02.eternal-september.org; posting-host="8969735dbe9fa8af044a5c1c27b85a2b";
logging-data="17156"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+9aahsW8bXE2X3GcKeBLPe"
Cancel-Lock: sha1:+XbVsaAAvQjVaZroYZtjt3VNEGM=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Wed, 19 May 2021 19:36 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>On Wednesday, May 19, 2021 at 4:36:01 AM UTC-5, Anton Ertl wrote:
>> MitchAlsup <Mitch...@aol.com> writes:
>> >However, there are things that HW simply takes in stride
>> >that software scheduling can never get "correct", cache misses, for example;
>> >and in particular a cache miss that runs 1 miss followed by 7 hits. Does SW
>> >schedule for the hit case or for the miss case ?
><
>> Easy: This does not sound like pointer-chasing, so just schedule for
>> the miss case. If there is some reason not to, unroll by 8, and
>> schedule 1 miss followed by 7 hits. Alternatively, let the hardware
>> prefetcher do its work, and schedule for a hit.
><
>But then SW is not driving the whole process !!

Software is doing the fetches that inform the prefetcher. But if
that's too much hardware autonomy for some religious reason, the
hardware prefetcher could be set up by software: e.g., program it to
prefetch n cache lines ahead when the load/store unit sees a load
within a certain range. And then there are the prefetch instructions.
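
As a sketch of the prefetch-instruction variant: a streaming sum that
asks for a line a fixed distance ahead, once per cache line. It
assumes GCC or Clang for __builtin_prefetch(), a 64-byte line, and a
prefetch distance of 4 lines; the last two are placeholder values that
would need tuning per machine, and the function name is made up.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_BYTES 64 /* assumption: 64-byte cache lines */
#define LINES_AHEAD 4       /* assumption: untuned prefetch distance */

uint64_t sum_with_prefetch(const uint64_t *a, size_t n) {
    const size_t per_line = CACHE_LINE_BYTES / sizeof a[0];
    const size_t dist = LINES_AHEAD * per_line;
    uint64_t sum = 0;

    for (size_t i = 0; i < n; i++) {
        /* One prefetch per line, and never form an address past the
           end of the array. */
        if (i % per_line == 0 && i + dist < n) {
#if defined(__GNUC__)
            /* rw = 0: prefetch for reading; locality = 3: keep the
               line in all cache levels. */
            __builtin_prefetch(&a[i + dist], 0, 3);
#endif
        }
        sum += a[i];
    }
    return sum;
}
```

The prefetch is only a hint, so the result is the same with or without
it; what changes is how much of the miss latency overlaps with the
adds.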

But note that EPIC does not have such religious rules. The register
stack engine of IA-64 has quite a bit of autonomy. And caches are
also autonomous (cache line length, organization, replacement policy).

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: IA-64

<s851rg$c46$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=16952&group=comp.arch#16952
Path: i2pn2.org!i2pn.org!aioe.org!/FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: IA-64
Date: Thu, 20 May 2021 09:07:28 +0200
Organization: Aioe.org NNTP Server
Lines: 55
Message-ID: <s851rg$c46$1@gioia.aioe.org>
References: <s7gtk1$csi$1@dont-email.me>
<memo.20210512183631.13980N@jgd.cix.co.uk> <s7h71q$76h$1@dont-email.me>
<s7hcrm$sp1$1@gal.iecc.com> <s7hffs$19e$1@dont-email.me>
<s7hklb$i82$1@dont-email.me> <60a43f23.2045341625@news.eternal-september.org>
<s814as$hm$1@dont-email.me>
<cca40ada-b0ad-4b7f-9e16-91ca67de09a3n@googlegroups.com>
<s82dne$83s$1@gioia.aioe.org>
<fbea40d0-e56c-4538-bd5e-35a0133be726n@googlegroups.com>
NNTP-Posting-Host: /FKOcGQMirZgkZJCo9x3IA.user.gioia.aioe.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
X-Complaints-To: abuse@aioe.org
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:60.0) Gecko/20100101
Firefox/60.0 SeaMonkey/2.53.7
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Thu, 20 May 2021 07:07 UTC

MitchAlsup wrote:
> On Wednesday, May 19, 2021 at 2:11:47 AM UTC-5, Terje Mathisen wrote:
>> MitchAlsup wrote:
>>> On Tuesday, May 18, 2021 at 2:25:18 PM UTC-5, Marcus wrote:
>>>> On 2021-05-18, gree...@gmail.com wrote:
>>>>> On Thu, 13 May 2021 00:25:41 +0200, David Brown <david...@hesbynett.no> sprachen:
>>>>>
>>>>>> Thus processors
>>>>>> that made these decisions at run-time (superscalar multi-issue cpus)
>>>>>> gave you more speed for a great deal less money and power.
>>>>>
>>>>> Well no, because if you want to decide what to parallelize at run-time, write code that will do that at run-time. A mix of
>>>>> self-modding code, and a recompiler. It would just be doing in software, what is currently the job of a shitload of transistors
>>>>> doing all the prediction and hyper-threading and stuff.
>>>>>
>>>>> Admittedly those transistors can do all that in real-time, or even non-time, hidden away between and behind instruction ticks.
>>>>> They do a great job. But as ever, software is better than hardware. Scrap all that stuff, that exoskeleton that drags the poor old
>>>>> x86 along screaming in agony. Instead give the transistors to EPIC stuff. You know how many transistors you'd have spare? It'd be
>>>>> halfway to running an entire application in one clock cycle. Build an absolute fuckload of execution units, ALUs, registers. Then
>>>>> get software to figure out to run as much as possible explicitly parallel. Where it can't do that, other software tries to figure
>>>>> out what it can.
>>>>
>>>> Doesn't this reasoning fall apart when you need to take into account
>>>> the dynamic nature of program execution, such as instruction latency
>>>> variations due to cache hit/miss and branch mispredictions (unless
>>>> you want to add dozens of delay slots) etc? How do you fill unused
>>>> execution units without dynamic scheduling?
>>> <
>>> It is inarguably true that software has a much larger horizon for scheduling
>>> than HW can manage. However, there are things that HW simply takes in stride
>>> that software scheduling can never get "correct", cache misses, for example;
>>> and in particular a cache miss that runs 1 miss followed by 7 hits. Does SW
>>> schedule for the hit case or for the miss case ? HW can schedule for both
>>> cases simultaneously !
> <
>> Given that the compiler knows _exactly_ what sort of target it is
>> working towards, it should unroll code like this by a factor of 8, then
>> it can target both the first miss and the next 7 hits (but only if it
>> also pre-aligns the source address.)
> <
> But what if the sequence is 4 hits 1 miss 3 hits ??

I did note in a follow-up msg that the compiler would have to pre-align
all loops, which would add both more code and significantly slower loop
startup.

I.e. I agree with you that only the HW is in the right position to
optimize at this level.
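
For illustration, this is roughly what "pre-align, then unroll by 8"
looks like when written out by hand in C. It is only a sketch,
assuming 64-byte cache lines and 8-byte elements, so that each
unrolled iteration covers exactly one line; the function name is made
up.

```c
#include <stddef.h>
#include <stdint.h>

#define LINE_BYTES 64u /* assumption: 64-byte cache lines */

uint64_t sum_line_aligned(const uint64_t *p, size_t n) {
    uint64_t sum = 0;
    size_t i = 0;

    /* Prologue: peel elements until p + i sits on a line boundary. */
    while (i < n && ((uintptr_t)(p + i) % LINE_BYTES) != 0)
        sum += p[i++];

    /* Main loop: 8 x 8 bytes = one line per iteration, so the first
       load is the candidate miss and the other 7 are expected hits. */
    for (; i + 8 <= n; i += 8) {
        sum += p[i];
        sum += p[i + 1];
        sum += p[i + 2];
        sum += p[i + 3];
        sum += p[i + 4];
        sum += p[i + 5];
        sum += p[i + 6];
        sum += p[i + 7];
    }

    /* Epilogue: whatever is left after the last full line. */
    for (; i < n; i++)
        sum += p[i];

    return sum;
}
```

The prologue and epilogue are the extra code and slower loop startup
mentioned above, and the layout is only right for one line size.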

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
