
Good News for Ivan, Mitch, and Others

<f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27239&group=comp.arch#27239

X-Received: by 2002:a05:622a:1b01:b0:343:582f:3e07 with SMTP id bb1-20020a05622a1b0100b00343582f3e07mr6122017qtb.578.1660362980548;
Fri, 12 Aug 2022 20:56:20 -0700 (PDT)
X-Received: by 2002:a05:622a:393:b0:342:e821:674 with SMTP id
j19-20020a05622a039300b00342e8210674mr6362639qtx.151.1660362980401; Fri, 12
Aug 2022 20:56:20 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 12 Aug 2022 20:56:20 -0700 (PDT)
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:d8cf:78e:f084:48e8;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:d8cf:78e:f084:48e8
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
Subject: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 13 Aug 2022 03:56:20 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1188

by: Quadibloc - Sat, 13 Aug 2022 03:56 UTC

Just read this article on HPC Wire:

https://www.hpcwire.com/2022/08/11/google-program-to-free-chips-boosts-university-semiconductor-design/

John Savard

Re: Good News for Ivan, Mitch, and Others

<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27240&group=comp.arch#27240

X-Received: by 2002:a37:80c5:0:b0:6ba:c5af:8b8e with SMTP id b188-20020a3780c5000000b006bac5af8b8emr4762302qkd.459.1660366126133;
Fri, 12 Aug 2022 21:48:46 -0700 (PDT)
X-Received: by 2002:ac8:7d86:0:b0:342:e7c0:2545 with SMTP id
c6-20020ac87d86000000b00342e7c02545mr6014609qtd.513.1660366125987; Fri, 12
Aug 2022 21:48:45 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 12 Aug 2022 21:48:45 -0700 (PDT)
In-Reply-To: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:d8cf:78e:f084:48e8;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:d8cf:78e:f084:48e8
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 13 Aug 2022 04:48:46 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 9

by: Quadibloc - Sat, 13 Aug 2022 04:48 UTC

On Friday, August 12, 2022 at 9:56:22 PM UTC-6, Quadibloc wrote:
> Just read this article on HPC Wire:
>
> https://www.hpcwire.com/2022/08/11/google-program-to-free-chips-boosts-university-semiconductor-design/

September 13 is the deadline for the next batch. It's 130nm, which may be old, but it's good enough - the Pentium
Pro was implemented in 350 to 500 nm, although it was called 0.35 um to 0.5 um at the time, so you can make a
big core at 130nm.

John Savard

Re: Good News for Ivan, Mitch, and Others

<td88c8$dbk$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=27242&group=comp.arch#27242

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-e87a-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: Sat, 13 Aug 2022 13:19:04 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <td88c8$dbk$1@newsreader4.netcologne.de>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com>
Injection-Date: Sat, 13 Aug 2022 13:19:04 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-e87a-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:e87a:0:7285:c2ff:fe6c:992d";
logging-data="13684"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Sat, 13 Aug 2022 13:19 UTC

Quadibloc <jsavard@ecn.ab.ca> schrieb:
> On Friday, August 12, 2022 at 9:56:22 PM UTC-6, Quadibloc wrote:
>> Just read this article on HPC Wire:
>>
>> https://www.hpcwire.com/2022/08/11/google-program-to-free-chips-boosts-university-semiconductor-design/
>
> September 13 is the deadline for the next batch. It's 130nm,
> which may be old, but it's good enough - the Pentium Pro was
> implemented in 350 to 500 nm, although it was called 0.35 um to
> 0.5 um at the time, so you can make a big core at 130nm.

https://en.wikipedia.org/wiki/PowerPC_G4#PowerPC_7447_and_7457 was
done in a 130 nm process, with 512 KB of L2 cache and a transistor
count of 58 million. It ran at up to 1.7 GHz.

Sounds like a 66130 would fit very comfortably, at least for size.

Re: Good News for Ivan, Mitch, and Others

<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27244&group=comp.arch#27244

X-Received: by 2002:a05:620a:28c9:b0:6b9:592c:dc04 with SMTP id l9-20020a05620a28c900b006b9592cdc04mr6880831qkp.674.1660418864833;
Sat, 13 Aug 2022 12:27:44 -0700 (PDT)
X-Received: by 2002:ac8:7d86:0:b0:342:e7c0:2545 with SMTP id
c6-20020ac87d86000000b00342e7c02545mr8106732qtd.513.1660418864679; Sat, 13
Aug 2022 12:27:44 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!border-1.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 13 Aug 2022 12:27:44 -0700 (PDT)
In-Reply-To: <td88c8$dbk$1@newsreader4.netcologne.de>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:d8cf:78e:f084:48e8;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:d8cf:78e:f084:48e8
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 13 Aug 2022 19:27:44 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 23
X-Received-Bytes: 2498

by: Quadibloc - Sat, 13 Aug 2022 19:27 UTC

On Saturday, August 13, 2022 at 7:19:08 AM UTC-6, Thomas Koenig wrote:

> https://en.wikipedia.org/wiki/PowerPC_G4#PowerPC_7447_and_7457 was
> done in a 130 nm process, 512 kb L2 cache and a transistor count
> of 58 million. It ran at up to 1.7 GHz.

And one sees the press gloating that Russia is helpless, because they can't
make chips beyond 65nm, so sanctions have them at our mercy!

The Red Chinese designed their H-bomb on a 24-bit computer made out
of discrete transistors.

And there are even articles saying the Chinese 7nm chips are nothing to
worry about, because they're an early 7nm process that doesn't use EUV.
The Ryzen 9 3900 12-core processor in my daily driver was made that way,
and I've seen no need to update it with something later and greater.

Sanctions are good, but let's not kid ourselves; to be able to make fast
and powerful processors, a country doesn't need what constitutes a bleeding-edge
process node these days: we had fast and powerful processors for years now,
so even an out-of-date process node is enough to allow a country to make weapons
with which to make a serious nuisance of itself.

John Savard

Re: Good News for Ivan, Mitch, and Others

<td8uio$2ospr$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=27245&group=comp.arch#27245

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: Sat, 13 Aug 2022 12:38:00 -0700
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <td8uio$2ospr$1@dont-email.me>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com>
<td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 13 Aug 2022 19:38:01 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="60ba2d939230f8367d60a93da956fd29";
logging-data="2913083"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18wfNlQCUmSwUblWN4Qkp4VB0g9obt2wyk="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.1.2
Cancel-Lock: sha1:e4ZwguL+qtqcXZzVLQ9CIBpg9Ps=
Content-Language: en-US
In-Reply-To: <ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>

by: Stephen Fuld - Sat, 13 Aug 2022 19:38 UTC

On 8/13/2022 12:27 PM, Quadibloc wrote:
> On Saturday, August 13, 2022 at 7:19:08 AM UTC-6, Thomas Koenig wrote:
>
>> https://en.wikipedia.org/wiki/PowerPC_G4#PowerPC_7447_and_7457 was
>> done in a 130 nm process, 512 kb L2 cache and a transistor count
>> of 58 million. It ran at up to 1.7 GHz.
>
> And one sees the press gloating that Russia is helpless, because they can't
> make chips beyond 65nm, so sanctions have them at our mercy!
>
> The Red Chinese designed their H-bomb on a 24-bit computer made out
> of discrete transistors.

The ENIAC was used for some US H-bomb calculations. And, of course, the
calculations for the A-bomb were done by Computers, when the word was a
job description, not a thing!

snip

> Sanctions are good, but let's not kid ourselves; to be able to make fast
> and powerful processors, a country doesn't need what constitutes a bleeding-edge
> process node these days: we had fast and powerful processors for years now,
> so even an out-of-date process node is enough to allow a country to make weapons
> with which to make a serious nuisance of itself.

Yes, and some weapons don't require any computing power, e.g. passenger
jets on 9/11, or biological weapons.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Good News for Ivan, Mitch, and Others

<2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27246&group=comp.arch#27246

X-Received: by 2002:a05:622a:306:b0:343:416d:76ae with SMTP id q6-20020a05622a030600b00343416d76aemr8633571qtw.337.1660420568225;
Sat, 13 Aug 2022 12:56:08 -0700 (PDT)
X-Received: by 2002:a05:6214:622:b0:476:c145:3242 with SMTP id
a2-20020a056214062200b00476c1453242mr8175743qvx.53.1660420568070; Sat, 13 Aug
2022 12:56:08 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 13 Aug 2022 12:56:07 -0700 (PDT)
In-Reply-To: <ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:444e:4365:2589:d193;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:444e:4365:2589:d193
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sat, 13 Aug 2022 19:56:08 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3266

by: MitchAlsup - Sat, 13 Aug 2022 19:56 UTC

On Saturday, August 13, 2022 at 2:27:45 PM UTC-5, Quadibloc wrote:
> On Saturday, August 13, 2022 at 7:19:08 AM UTC-6, Thomas Koenig wrote:
>
> > https://en.wikipedia.org/wiki/PowerPC_G4#PowerPC_7447_and_7457 was
> > done in a 130 nm process, 512 kb L2 cache and a transistor count
> > of 58 million. It ran at up to 1.7 GHz.
> And one sees the press gloating that Russia is helpless, because they can't
> make chips beyond 65nm, so sanctions have them at our mercy!
>
> The Red Chinese designed their H-bomb on a 24-bit computer made out
> of discrete transistors.
>
> And there are even articles saying the Chinese 7nm chips are nothing to
> worry about, because they're an early 7nm process that doesn't use EUV.
> The Ryzen 9 3900 12-core processor in my daily driver was made that way,
> and I've seen no need to update it with something later and greater.
<
I was designing for 5GHz in 65nm............oh so long ago.
<
If you are looking for a small in-order machine at reasonable power and
frequency, you can make a sub 1mm^2 super-macro in 45nm complete
with its cache hierarchy.
<
It is only the GBOoO machines that require bleeding edge lithography--
these are 15×-30× the size of a reasonable in-order core. Of course,
Caches are the same size per unit store for both small IO and GBOoO.
>
> Sanctions are good, but let's not kid ourselves; to be able to make fast
> and powerful processors, a country doesn't need what constitutes a bleeding-edge
> process node these days: we had fast and powerful processors for years now,
> so even an out-of-date process node is enough to allow a country to make weapons
> with which to make a serious nuisance of itself.
<
You must be talking about those IOT thingies................
>
> John Savard

Re: Good News for Ivan, Mitch, and Others

<td92dl$2tle3$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=27247&group=comp.arch#27247

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: Sat, 13 Aug 2022 13:43:31 -0700
Organization: A noiseless patient Spider
Lines: 54
Message-ID: <td92dl$2tle3$1@dont-email.me>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com>
<td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>
<2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 13 Aug 2022 20:43:33 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="07d6299c97769d0b897f0f2af38d4ac2";
logging-data="3069379"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19hye2CsU0UhYf9f2JIACCT"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.12.0
Cancel-Lock: sha1:Ux+wtTaz1D6KADzveopxYu6g0bY=
Content-Language: en-US
In-Reply-To: <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>

by: Ivan Godard - Sat, 13 Aug 2022 20:43 UTC

On 8/13/2022 12:56 PM, MitchAlsup wrote:
> On Saturday, August 13, 2022 at 2:27:45 PM UTC-5, Quadibloc wrote:
>> On Saturday, August 13, 2022 at 7:19:08 AM UTC-6, Thomas Koenig wrote:
>>
>>> https://en.wikipedia.org/wiki/PowerPC_G4#PowerPC_7447_and_7457 was
>>> done in a 130 nm process, 512 kb L2 cache and a transistor count
>>> of 58 million. It ran at up to 1.7 GHz.
>> And one sees the press gloating that Russia is helpless, because they can't
>> make chips beyond 65nm, so sanctions have them at our mercy!
>>
>> The Red Chinese designed their H-bomb on a 24-bit computer made out
>> of discrete transistors.
>>
>> And there are even articles saying the Chinese 7nm chips are nothing to
>> worry about, because they're an early 7nm process that doesn't use EUV.
>> The Ryzen 9 3900 12-core processor in my daily driver was made that way,
>> and I've seen no need to update it with something later and greater.
> <
> I was designing for 5GHz in 65nm............oh so long ago.
> <
> If you are looking for a small in-order machine at reasonable power and
> frequency, you can make a sub 1mm^2 super-macro in 45nm complete
> with its cache hierarchy.
> <
> It is only the GBOoO machines that require bleeding edge lithography--
> these are 15×-30× the size of a reasonable in-order core. Of course,
> Caches are the same size per unit store for both small IO and GBOoO.

Total chip area doesn't matter: it's dominated by pins and pads and
caches, so for a given core count you get about as many dice per wafer
regardless of ISA and micro-architecture.

What matters is *yield* - how many *good* dice do you get per wafer. And
defect rates in different parts of the die have wildly disparate effects
on the yield. A defect in a pin pad is not even noticed. A defect in a
cache can be (often) fixed with a hot spare line/entry during fab; no
impact on profit. A defect in a core can bin to a model with fewer
cores: not a total loss, but still bottom line impact.

So the dollar impact of yield is dominated by the area of the die
occupied by logic that cannot use free sparing. Guess what is in that
area: the core. A chip with 10 GBOOO cores may have twice the mm^2 area
of one with 10 in-order, but it will have 30x the defect-vulnerable
area. Until the fab process has matured enough to be effectively
defect-free, the IO yield per wafer can easily be twice that of the GBOOO.

So, rough numbers IO/GBOOO:
1-2x chips/wafer
1-4x good/die (bleeding edge worst)
2-5x instructions/watt
.5-1x instructions/cycle (for legacy ISAs)
.1 marketing brags/product
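
[Editorial sketch, not part of the original post.] The yield argument above can be illustrated with the simple Poisson yield model, in which the chance that a die has no defect in its non-spared logic is exp(-D * A). This is only a rough sketch: the defect density and the in-order vulnerable area below are made-up values, and only the 30x area ratio comes from the post.

```python
import math

def poisson_yield(defect_density_per_mm2: float, vulnerable_area_mm2: float) -> float:
    """Fraction of dice with zero defects in the non-spared (vulnerable) area."""
    return math.exp(-defect_density_per_mm2 * vulnerable_area_mm2)

# Hypothetical inputs, chosen for illustration only.
DEFECT_DENSITY = 0.005      # defects per mm^2 (assumed)
IO_VULNERABLE_AREA = 5.0    # mm^2 of non-spared logic on the in-order chip (assumed)
AREA_RATIO = 30             # "30x the defect-vulnerable area", taken from the post

io_yield = poisson_yield(DEFECT_DENSITY, IO_VULNERABLE_AREA)
gbooo_yield = poisson_yield(DEFECT_DENSITY, IO_VULNERABLE_AREA * AREA_RATIO)
yield_ratio = io_yield / gbooo_yield

print(f"in-order yield: {io_yield:.3f}")
print(f"GBOoO yield:    {gbooo_yield:.3f}")
print(f"ratio IO/GBOoO: {yield_ratio:.2f}")
```

Under these assumed inputs the in-order chip yields roughly twice as many fully good dice. As the defect density falls toward zero both yields approach 1 and the ratio approaches 1, which matches the point about a mature process.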

Re: Good News for Ivan, Mitch, and Others

<nse*3zGVy@news.chiark.greenend.org.uk>

https://www.novabbs.com/devel/article-flat.php?id=27248&group=comp.arch#27248

Path: i2pn2.org!i2pn.org!aioe.org!nntp.terraraq.uk!nntp-feed.chiark.greenend.org.uk!ewrotcd!.POSTED.chiark.greenend.org.uk!not-for-mail
From: theom+n...@chiark.greenend.org.uk (Theo)
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: 13 Aug 2022 22:38:37 +0100 (BST)
Organization: University of Cambridge, England
Distribution: world
Message-ID: <nse*3zGVy@news.chiark.greenend.org.uk>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com> <e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
Injection-Info: chiark.greenend.org.uk; posting-host="chiark.greenend.org.uk:212.13.197.229";
logging-data="23089"; mail-complaints-to="abuse@chiark.greenend.org.uk"
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/5.10.0-15-amd64 (x86_64))
Originator: theom@chiark.greenend.org.uk ([212.13.197.229])

by: Theo - Sat, 13 Aug 2022 21:38 UTC

Thomas Koenig <tkoenig@netcologne.de> wrote:
> Quadibloc <jsavard@ecn.ab.ca> schrieb:
> > On Friday, August 12, 2022 at 9:56:22 PM UTC-6, Quadibloc wrote:
> >> Just read this article on HPC Wire:
> >>
> >> https://www.hpcwire.com/2022/08/11/google-program-to-free-chips-boosts-university-semiconductor-design/
> >
> > September 13 is the deadline for the next batch. It's 130nm,
> > which may be old, but it's good enough - the Pentium Pro was
> > implemented in 350 to 500 nm, although it was called 0.35 um to
> > 0.5 um at the time, so you can make a big core at 130nm.
>
> https://en.wikipedia.org/wiki/PowerPC_G4#PowerPC_7447_and_7457 was
> done in a 130 nm process, 512 kb L2 cache and a transistor count
> of 58 million. It ran at up to 1.7 GHz.

The usual problem with 'XYZ old high-end chip was made on that process' is that
it was, but it used a lot of mm2. The 7447 was, according to that
article, 83.9mm2, and the 7457 98.3mm2. Google's programme offers you 10mm2 of
user space (total chip area probably about 13-15mm2).

Which is not nothing, but nobody is going to be building a Pentium 4 in that
space (131/146mm2 in 130nm).

> Sounds like a 66130 would fit very comfortably, at least for size.

What you can fit is one thing, how much cache you can manage to squeeze in,
is another question...

That said, a high end design from a previous process (say 0.3um or bigger) could
be shrunk (or reimplemented) and might fit within the mm2 limit. Just needs
scaling the expectations down a little.

Theo
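
[Editorial sketch, not part of the original post.] The area argument can be checked with a few lines of arithmetic. The sketch below assumes ideal scaling, where area shrinks with the square of the feature-size ratio. That is optimistic, since pads, analog blocks and wiring do not shrink that well. The die sizes and the 10mm2 budget are the figures quoted in the post; the function names are made up for illustration.

```python
USER_AREA_MM2 = 10.0  # user area offered by the programme, per the post

def scaled_area_mm2(area_mm2: float, from_nm: float, to_nm: float) -> float:
    """Area after moving a design between nodes, assuming ideal quadratic scaling."""
    return area_mm2 * (to_nm / from_nm) ** 2

def fits(area_mm2: float, budget_mm2: float = USER_AREA_MM2) -> bool:
    """True if a design of this area fits in the user-area budget."""
    return area_mm2 <= budget_mm2

# Chips already built at 130nm: no shrink available, so compare directly.
for name, area in [("PowerPC 7447", 83.9), ("PowerPC 7457", 98.3)]:
    print(f"{name}: {area} mm2, fits in {USER_AREA_MM2} mm2: {fits(area)}")

# Largest 0.3um (300nm) design that would fit after an ideal shrink to 130nm.
max_area_at_300nm = scaled_area_mm2(USER_AREA_MM2, from_nm=130, to_nm=300)
print(f"largest 300nm die that fits after shrink: {max_area_at_300nm:.1f} mm2")
```

Read the last number as an upper bound. A real port would lose some of that budget to the parts that do not scale.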

Re: Good News for Ivan, Mitch, and Others

<632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27250&group=comp.arch#27250

X-Received: by 2002:a05:6214:d0e:b0:478:c84c:6c17 with SMTP id 14-20020a0562140d0e00b00478c84c6c17mr8473139qvh.63.1660428796724;
Sat, 13 Aug 2022 15:13:16 -0700 (PDT)
X-Received: by 2002:a05:622a:100d:b0:31f:25e3:7a45 with SMTP id
d13-20020a05622a100d00b0031f25e37a45mr8796642qte.365.1660428796578; Sat, 13
Aug 2022 15:13:16 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 13 Aug 2022 15:13:16 -0700 (PDT)
In-Reply-To: <td92dl$2tle3$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:d4fa:9459:c187:7734;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:d4fa:9459:c187:7734
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sat, 13 Aug 2022 22:13:16 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 4911

by: MitchAlsup - Sat, 13 Aug 2022 22:13 UTC

On Saturday, August 13, 2022 at 3:43:35 PM UTC-5, Ivan Godard wrote:
> On 8/13/2022 12:56 PM, MitchAlsup wrote:
> > On Saturday, August 13, 2022 at 2:27:45 PM UTC-5, Quadibloc wrote:
> >> On Saturday, August 13, 2022 at 7:19:08 AM UTC-6, Thomas Koenig wrote:
> >>
> >>> https://en.wikipedia.org/wiki/PowerPC_G4#PowerPC_7447_and_7457 was
> >>> done in a 130 nm process, 512 kb L2 cache and a transistor count
> >>> of 58 million. It ran at up to 1.7 GHz.
> >> And one sees the press gloating that Russia is helpless, because they can't
> >> make chips beyond 65nm, so sanctions have them at our mercy!
> >>
> >> The Red Chinese designed their H-bomb on a 24-bit computer made out
> >> of discrete transistors.
> >>
> >> And there are even articles saying the Chinese 7nm chips are nothing to
> >> worry about, because they're an early 7nm process that doesn't use EUV.
> >> The Ryzen 9 3900 12-core processor in my daily driver was made that way,
> >> and I've seen no need to update it with something later and greater.
> > <
> > I was designing for 5GHz in 65nm............oh so long ago.
> > <
> > If you are looking for a small in-order machine at reasonable power and
> > frequency, you can make a sub 1mm^2 super-macro in 45nm complete
> > with its cache hierarchy.
> > <
> > It is only the GBOoO machines that require bleeding edge lithography--
> > these are 15×-30× the size of a reasonable in-order core. Of course,
> > Caches are the same size per unit store for both small IO and GBOoO.
<
> Total chip area doesn't matter: it's dominated by pins and pads and
> caches, so for a given core count you get about as many dice per wafer
> regardless of ISA and micro-architecture.
>
> What matters is *yield* - how many *good* dice do you get per wafer. And
> defect rates in different parts of the die have wildly disparate effects
> on the yield. A defect in a pin pad is not even noticed. A defect in a
> cache can be (often) fixed with a hot spare line/entry during fab; no
> impact on profit. A defect in a core can bin to a model with fewer
> cores: not a total loss, but still bottom line impact.
>
> So the dollar impact of yield is dominated by the area of the die
> occupied by logic that cannot use free sparing. Guess what is in that
> area: the core. A chip with 10 GBOOO cores may have twice the mm^2 area
> of one with 10 in-order, but it will have 30x the defect-vulnerable
> area. Until the fab process has matured enough to be effectively
> defect-free, the IO yield per wafer can easily be twice that of the GBOOO.
>
> So, rough numbers IO/GBOOO:
> 1-2x chips/wafer
> 1-4x good/die (bleeding edge worst)
> 2-5x instructions/watt
> .5-1x instructions/cycle (for legacy ISAs)
> .1 marketing brags/product
<
With respect to the offered 130nm processes, you get one (1) GBOoO
core per die at acceptable yield. You might be able to cram on a
SouthBridge (PCIe), since Opteron (SledgeHammer) was 180nm, and
bump the L2 to 2MB.
<
In the same footprint, you can put 12 SIO cores on the die, each
with reasonable L1s, and sharing a similarly sized L2 (and a South-
Bridge too).
<
1 fatal flaw in the GBOoO and it's dead,
4 fatal flaws in the SIO and you still have 8 working cores.
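
[Editorial sketch, not part of the original post.] The "1 fatal flaw versus 4 fatal flaws" comparison can be put in numbers with a Poisson defect model per core and a binomial count of surviving cores. The defect density and the core area are assumed values. The sketch also assumes one GBOoO core occupies the same area as the twelve SIO cores, and it ignores caches and I/O.

```python
import math

def core_survival(defect_density_per_mm2: float, core_area_mm2: float) -> float:
    """Probability that a core of the given area has no fatal defect (Poisson model)."""
    return math.exp(-defect_density_per_mm2 * core_area_mm2)

def prob_at_least(k: int, n: int, p: float) -> float:
    """Probability that at least k of n independent cores work, each with probability p."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical inputs, chosen for illustration only.
DEFECT_DENSITY = 0.02                      # fatal defects per mm^2 (assumed)
SIO_CORE_AREA = 0.5                        # mm^2 per small in-order core (assumed)
N_SIO = 12                                 # twelve SIO cores, per the post
GBOOO_CORE_AREA = SIO_CORE_AREA * N_SIO    # same footprint (assumed)

p_sio = core_survival(DEFECT_DENSITY, SIO_CORE_AREA)
p_gbooo = core_survival(DEFECT_DENSITY, GBOOO_CORE_AREA)
p_sellable_sio = prob_at_least(8, N_SIO, p_sio)  # tolerate up to 4 dead cores

print(f"P(GBOoO die works):          {p_gbooo:.4f}")
print(f"P(all 12 SIO cores work):    {p_sio ** N_SIO:.4f}")
print(f"P(at least 8 SIO cores work): {p_sellable_sio:.6f}")
```

In this model the chance that all twelve small cores are good equals the chance that the single big core is good, because the total vulnerable area is the same. The advantage of the multi-core die comes entirely from being able to sell parts with a few dead cores.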

Re: Good News for Ivan, Mitch, and Others

<1140a09f-1179-43cc-84bf-cf36b356e1ffn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27254&group=comp.arch#27254

X-Received: by 2002:a05:622a:1986:b0:343:225d:f9e1 with SMTP id u6-20020a05622a198600b00343225df9e1mr10804692qtc.651.1660492148275;
Sun, 14 Aug 2022 08:49:08 -0700 (PDT)
X-Received: by 2002:a05:622a:393:b0:342:e821:674 with SMTP id
j19-20020a05622a039300b00342e8210674mr11188885qtx.151.1660492148147; Sun, 14
Aug 2022 08:49:08 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 14 Aug 2022 08:49:07 -0700 (PDT)
In-Reply-To: <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:49bf:fd13:17c:5ddc;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:49bf:fd13:17c:5ddc
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1140a09f-1179-43cc-84bf-cf36b356e1ffn@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sun, 14 Aug 2022 15:49:08 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 2190

by: Quadibloc - Sun, 14 Aug 2022 15:49 UTC

On Saturday, August 13, 2022 at 4:13:17 PM UTC-6, MitchAlsup wrote:

> With respect to the offered 130nm processes, you get one (1) GBOoO
> core per die at acceptable yield.

Sounds right, given that the Pentium III was made on 250nm down to 130nm
during its lifetime.

Of course, while it was out-of-order, it may not have been a "Great Big Out-of-Order"
chip. Which is all right: while it seems that the benefits of out-of-order designs
are such that having twelve in-order cores instead of one out-of-order core is
_not_ seen as a desirable trade-off, enlarging things like the reorder window, the
number of physical registers, and so on surely comes with diminishing returns.

John Savard

Re: Good News for Ivan, Mitch, and Others

<tdbagc$37g9a$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=27259&group=comp.arch#27259

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: paaroncl...@gmail.com (Paul A. Clayton)
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: Sun, 14 Aug 2022 13:13:47 -0400
Organization: A noiseless patient Spider
Lines: 14
Message-ID: <tdbagc$37g9a$1@dont-email.me>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com>
<td88c8$dbk$1@newsreader4.netcologne.de>
<nse*3zGVy@news.chiark.greenend.org.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 14 Aug 2022 17:13:49 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="d5b42bb7a10f0089d1e88fd19d355411";
logging-data="3391786"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Lhd3TrVLXohlZyIuWdcaO3t0KigccLOY="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.0
Cancel-Lock: sha1:6x+wfrXBgqrTrqBx1MpZkNgPDkw=
In-Reply-To: <nse*3zGVy@news.chiark.greenend.org.uk>

by: Paul A. Clayton - Sun, 14 Aug 2022 17:13 UTC

Theo wrote:
[snip]
> What you can fit is one thing, how much cache you can manage to squeeze in,
> is another question...

I wonder if one could interface with AMD's L3 extending cache
chip? (I suspect such would require too much area even with the
much smaller area per connection for such tight integration.)

(Such would require AMD to offer such for independent sale and
even then the packaging costs would be painful. I think AMD also
has a separate I/O and memory controller chip which could reduce
the chip area needed for a research processor if AMD was willing
to sell such.)

Re: Good News for Ivan, Mitch, and Others

<nuD*93KVy@news.chiark.greenend.org.uk>

https://www.novabbs.com/devel/article-flat.php?id=27260&group=comp.arch#27260

Path: i2pn2.org!i2pn.org!aioe.org!nntp.terraraq.uk!nntp-feed.chiark.greenend.org.uk!ewrotcd!.POSTED.chiark.greenend.org.uk!not-for-mail
From: theom+n...@chiark.greenend.org.uk (Theo)
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: 14 Aug 2022 18:59:17 +0100 (BST)
Organization: University of Cambridge, England
Message-ID: <nuD*93KVy@news.chiark.greenend.org.uk>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com> <e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de> <nse*3zGVy@news.chiark.greenend.org.uk> <tdbagc$37g9a$1@dont-email.me>
Injection-Info: chiark.greenend.org.uk; posting-host="chiark.greenend.org.uk:212.13.197.229";
logging-data="29661"; mail-complaints-to="abuse@chiark.greenend.org.uk"
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/5.10.0-15-amd64 (x86_64))
Originator: theom@chiark.greenend.org.uk ([212.13.197.229])

by: Theo - Sun, 14 Aug 2022 17:59 UTC

Paul A. Clayton <paaronclayton@gmail.com> wrote:
> Theo wrote:
> [snip]
> > What you can fit is one thing, how much cache you can manage to squeeze in,
> > is another question...
>
> I wonder if one could interface with AMD's L3 extending cache
> chip? (I suspect such would require too much area even with the
> much smaller area per connection for such tight integration.)

I'd guess the 130nm process used here does not support the through silicon
vias that would be needed for the interconnect. I don't know much about how
TSVs are built in the fab, but would guess you'd need some help from the
process.

If you routed the signals off as pads and then did a wire-bond maybe you
could do that, but I suspect AMD's chip is using TSVs.

The other complication is that the voltages may be different - Vcore on 130nm is
probably higher than Vcore on 7nm. It would likely need some voltage
conversion (as pads typically do).

Theo

Re: Good News for Ivan, Mitch, and Others

<tdd4sq$1ia1$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=27271&group=comp.arch#27271

Path: i2pn2.org!i2pn.org!aioe.org!NZ87pNe1TKxNDknVl4tZhw.user.46.165.242.91.POSTED!not-for-mail
From: antis...@math.uni.wroc.pl
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: Mon, 15 Aug 2022 09:50:18 -0000 (UTC)
Organization: Aioe.org NNTP Server
Message-ID: <tdd4sq$1ia1$1@gioia.aioe.org>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com> <e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de> <ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <td8uio$2ospr$1@dont-email.me>
Injection-Info: gioia.aioe.org; logging-data="51521"; posting-host="NZ87pNe1TKxNDknVl4tZhw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: tin/2.4.5-20201224 ("Glen Albyn") (Linux/5.10.0-9-amd64 (x86_64))
X-Notice: Filtered by postfilter v. 0.9.2
Cancel-Lock: sha1:wQxbieoUvtmBy5RDVXgzn4T8qNA=

by: antis...@math.uni.wroc.pl - Mon, 15 Aug 2022 09:50 UTC

Stephen Fuld <sfuld@alumni.cmu.edu.invalid> wrote:
> On 8/13/2022 12:27 PM, Quadibloc wrote:
> > On Saturday, August 13, 2022 at 7:19:08 AM UTC-6, Thomas Koenig wrote:
> >
> >> https://en.wikipedia.org/wiki/PowerPC_G4#PowerPC_7447_and_7457 was
> >> done in a 130 nm process, 512 kb L2 cache and a transistor count
> >> of 58 million. It ran at up to 1.7 GHz.
> >
> > And one sees the press gloating that Russia is helpless, because they can't
> > make chips beyond 65nm, so sanctions have them at our mercy!
> >
> > The Red Chinese designed their H-bomb on a 24-bit computer made out
> > of discrete transistors.
>
> The Eniac was used for some US H-bomb calculations. And, of course, the
> calculations for the A-bomb were done by Computers, when the word was a
> job description, not a thing!

Feynman wrote that the US A-bomb calculations were done using IBM
accounting machines. IIUC the Russians built their bomb without computers
or equivalent devices (supposedly one of the physicists involved
in the program said that any computational problem can be
solved by a sufficient number of students).

--
Waldek Hebisch

Re: Good News for Ivan, Mitch, and Others

<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27276&group=comp.arch#27276

X-Received: by 2002:ae9:ed91:0:b0:6bb:29c0:8b0c with SMTP id c139-20020ae9ed91000000b006bb29c08b0cmr4273041qkg.676.1660584336643;
Mon, 15 Aug 2022 10:25:36 -0700 (PDT)
X-Received: by 2002:a05:622a:1992:b0:344:6b4a:96b with SMTP id
u18-20020a05622a199200b003446b4a096bmr1867411qtc.220.1660584336499; Mon, 15
Aug 2022 10:25:36 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 10:25:36 -0700 (PDT)
In-Reply-To: <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:50ed:d45e:f030:8;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:50ed:d45e:f030:8
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 15 Aug 2022 17:25:36 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 5451

by: Quadibloc - Mon, 15 Aug 2022 17:25 UTC

On Saturday, August 13, 2022 at 4:13:17 PM UTC-6, MitchAlsup wrote:

> With respect to the offered 130nm processes, you get one (1) GBOoO
> core per die at acceptable yield. You might be able to cram on a
> SouthBridge (PCIe), since Opteron (SledgeHammer) was 180nm, and
> bump the L2 to 2GB.

> In the same footprint, you can put 12 SIO cores on the die, each
> with reasonable L1s, and sharing a similarly sized L2 (and a South-
> Bridge too).

> 1 fatal flaw in the GBOoO and it's dead,
> 4 fatal flaws in the SIO and you still have 8 working cores.
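
A rough sketch of the yield argument quoted above, under assumptions that are not in the post: fatal defects follow a simple Poisson model, the die area is split evenly between cores, and the shared L2 and SouthBridge area is ignored. The defect density `D`, the area `A`, and the "sell the part if 8 of 12 cores work" rule are illustrative values only.

```python
import math

def die_yield(defect_density, area):
    """Poisson model: probability that a region of `area` has no fatal defect."""
    return math.exp(-defect_density * area)

def expected_good_cores(defect_density, die_area, n_cores):
    """Expected number of working cores when the die is split into n equal cores."""
    return n_cores * die_yield(defect_density, die_area / n_cores)

def prob_at_least(defect_density, die_area, n_cores, k):
    """Probability that at least k of n equal-sized cores are defect-free."""
    p = die_yield(defect_density, die_area / n_cores)
    return sum(
        math.comb(n_cores, i) * p**i * (1 - p) ** (n_cores - i)
        for i in range(k, n_cores + 1)
    )

D = 1.0  # hypothetical: one fatal defect per die on average
A = 1.0  # die area in arbitrary units

one_big_core = die_yield(D, A)               # the single GBOoO core must be flawless
twelve_small = prob_at_least(D, A, 12, 8)    # usable if 8 or more small cores work
average_good = expected_good_cores(D, A, 12)
```

With these made-up numbers the single large core survives on roughly a third of dies, while nearly every 12-core die still has at least 8 working cores. The real ratio depends on the actual defect density and on how much of the die is shared logic.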

I think that you may have come up with a valuable idea...

At one point, some time back, you told me that a VLIW design
could never substitute for out-of-order, because while having a
lot of registers to execute code that looks like register rename
was done by the compiler... is all well and good, but it doesn't
address the most important thing OoO deals with: cache
misses. Because, of course, they're unpredictable, unlike
register hazards.

Why that was valuable was because you _also_ noted, in a
different post, that the scoreboarding of the 6600, even if it
misses some register hazards, _does_ also address what
cache misses do.

Well, maybe I misunderstood you, but it sounded to me that
you just solved the problem that Ivan Godard and others are
trying to solve: get the high speed of OoO without the
humungous overhead in transistors. Just make a VLIW
design with a lot of registers... and make a scoreboarded
implementation of it... and you've got an OoO equivalent
without the transistor overhead!!!

Now, maybe that's a mistake on my part, and there are good
reasons why this wouldn't work.

But in any case, if by in-order cores, you mean just plain old
ordinary in-order cores, like the first Intel Atom, or like the
486, then...

while what I quoted above of your post was indeed
accurate and factual, I still have to admit I disagree
with what I think you were implying and advocating by
it - which you had also explicitly said in other posts long
ago.
The Star 100 failed where the Cray I succeeded. Because
the Cray I could be efficient working with shorter
vectors, since its vector registers gave it shorter set-up
times per vector, and because it was designed to be
fast on scalar code too, so that Amdahl's law wouldn't bite
it quite so hard.
And the Illiac IV failed even harder; so hard that it's
almost forgotten today.
And so, while the failure of Dennard Scaling, and the petering-out of
Moore's Law now means we have few other choices but
parallelism for more performance... it's not as if parallelism
is something _new_.
People have been trying to find ways to efficiently put multiple
processors working in parallel to work on problems in an
efficient manner for a long time.
And there's been no general solution, only the odd
parallel algorithm for the odd individual problem.

Until some Einstein comes along and _changes_ that situation,
the gains in single-thread performance provided by
Great Big Out-of-Order chip designs are worth every transistor
they cost.
Of course, _some_ applications (e.g. database) are embarrassingly
parallel or throughput-based. More cores would be a simple
multiplier for them.
And so, with today's technology, instead of putting eight GBOoO
cores on a die, more value might be obtained by putting,
say, _four_ GBOoO cores, and *sixty-four* in-order cores on that same
die, because it might be a little slower in some applications, but
it would be a lot faster in certain other applications.

So while I'm willing to, as it were, go with you
halfway, I'm in disagreement with disparaging the value of
GBOoO designs, however wretchedly excessive they may seem.

If, by combining the scoreboard with VLIW, one can get the
benefits of OoO without the cost, that's different.
But if we can't... the benefits of GBOoO are worth their
great cost.

John Savard

Re: Good News for Ivan, Mitch, and Others

<191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27279&group=comp.arch#27279

X-Received: by 2002:ad4:5b8d:0:b0:47b:2c2c:96f with SMTP id 13-20020ad45b8d000000b0047b2c2c096fmr15177279qvp.80.1660591299176;
Mon, 15 Aug 2022 12:21:39 -0700 (PDT)
X-Received: by 2002:a0c:f254:0:b0:479:5dbf:7861 with SMTP id
z20-20020a0cf254000000b004795dbf7861mr15021617qvl.87.1660591298976; Mon, 15
Aug 2022 12:21:38 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 12:21:38 -0700 (PDT)
In-Reply-To: <2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:2d2c:6a8c:b7f:4060;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:2d2c:6a8c:b7f:4060
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 15 Aug 2022 19:21:39 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 120

by: MitchAlsup - Mon, 15 Aug 2022 19:21 UTC

On Monday, August 15, 2022 at 12:25:37 PM UTC-5, Quadibloc wrote:
> On Saturday, August 13, 2022 at 4:13:17 PM UTC-6, MitchAlsup wrote:
> > With respect to the offered 130nm processes, you get one (1) GBOoO
> > core per die at acceptable yield. You might be able to cram on a
> > SouthBridge (PCIe), since Opteron (SledgeHammer) was 180nm, and
> > bump the L2 to 2GB.
>
> > In the same footprint, you can put 12 SIO cores on the die, each
> > with reasonable L1s, and sharing a similarly sized L2 (and a South-
> > Bridge too).
>
> > 1 fatal flaw in the GBOoO and it's dead,
> > 4 fatal flaws in the SIO and you still have 8 working cores.
> I think that you may have come up with a valuable idea...
>
> At one point, some time back, you told me that a VLIW design
> could never substitute for out-of-order, because while having a
> lot of registers to execute code that looks like register rename
> was done by the compiler... is all well and good, but it doesn't
> address the most important thing OoO deals with: cache
> misses. Because, of course, they're unpredictable, unlike
> register hazards.
<
Many of them are perfectly predictable:: 7-hits, 1-miss.
>
> Why that was valuable was because you _also_ noted, in a
> different post, that the scoreboarding of the 6600, even if it
> misses some register hazards, _does_ also address what
> cache misses do.
<
The ability to absorb latency.
>
> Well, maybe I misunderstood you, but it sounded to me that
> you just solved the problem that Ivan Godard and others are
> trying to solve: get the high speed of OoO without the
> humungous overhead in transistors. Just make a VLIW
> design with a lot of registers... and make a scoreboarded
> implementation of it... and you've got an OoO equivalent
> without the transistor overhead!!!
<
To a large part, this is true. I have a clean RISC instruction set,
I can CoIssue 30% of the instruction stream, and when backed
with reasonable caches, and TLBs these give me a machine
performing close to 1 IPC. It is my hope that VVM doubles
the overall performance--and I am within spitting distance of
a GBOoO machine while only paying for two SMIO machines
of area--so it should be on the order of 6× smaller.
>
> Now, maybe that's a mistake on my part, and there are good
> reasons why this wouldn't work.
>
> But in any case, if by in-order cores, you mean just plain old
> ordinary in-order cores, like the first Intel Atom, or like the
> 486, then...
<
No, as noted above.
>
> while what I quoted above of your post was indeed
> accurate and factual, I still have to admit I disagree
> with what I think you were implying and advocating by
> it - which you had also explicitly said in other posts long
> ago.
<
> The Star 100 failed where the Cray I succeeded. Because
> the Cray I could be efficient working with shorter
> vectors, since its vector registers gave it shorter set-up
> times per vector, and because it was designed to be
> fast on scalar code too, so that Amdahl's law wouldn't bite
> it quite so hard.
<
Agreed Star failed essentially due to Amdahl's law.
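
For readers who want the arithmetic behind the Amdahl's law remark, a minimal sketch follows. The 90% vectorizable fraction and the 100x vector speedup are illustrative assumptions, not measurements of the Star 100 or the Cray I.

```python
def amdahl_speedup(parallel_fraction, n):
    """Overall speedup when `parallel_fraction` of the run time is sped up n-fold."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n)

# Hypothetical workload: 90% of the time vectorizes, vector unit is 100x faster.
fast_vectors_only = amdahl_speedup(0.90, 100)

# Same workload, but the scalar 10% is also made 2x faster (time scales by 1/2).
scalar_time = 0.10 / 2
vector_time = 0.90 / 100
fast_scalar_too = 1.0 / (scalar_time + vector_time)
```

Under these assumptions the 100x vector unit alone yields only about a 9x overall gain, and doubling scalar speed nearly doubles that again, which is the sense in which fast scalar code keeps Amdahl's law from biting so hard.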
<
> And the Illiac IV failed even harder; so hard that it's
> almost forgotten today.
>
Vectors won over arrays, in essence.
>
> And so, while the failure of Dennard Scaling, and the petering-out of
> Moore's Law now means we have few other choices but
> parallelism for more performance... it's not as if parallelism
> is something _new_.
<
> People have been trying to find ways to efficiently put multiple
> processors working in parallel to work on problems in an
> efficient manner for a long time.
> And there's been no general solution, only the odd
> parallel algorithm for the odd individual problem.
<
The HW is there and works. What is failing is SW's ability to use
the kinds of HW that has been designed and produced in the past.
>
> Until some Einstein comes along and _changes_ that situation,
> the gains in single-thread performance provided by
> Great Big Out-of-Order chip designs are worth every transistor
> they cost.
<
For your application, yes.
<
> Of course, _some_ applications (e.g. database) are embarrassingly
> parallel or throughput-based. More cores would be a simple
> multiplier for them.
> And so, with today's technology, instead of putting eight GBOoO
> cores on a die, more value might be obtained by putting,
> say, _four_ GBOoO cores, and *sixty-four* in-order cores on that same
> die, because it might be a little slower in some applications, but
> it would be a lot faster in certain other applications.
>
> So while I'm willing to, as it were, go with you
> halfway, I'm in disagreement with disparaging the value of
> GBOoO designs, however wretchedly excessive they may seem.
<
I do not disparage GBOoO, I disparage them not making forward
progress faster--and this is a direct result of them being so <well>
complicated.
>
> If, by combining the scoreboard with VLIW, one can get the
> benefits of OoO without the cost, that's different.
> But if we can't... the benefits of GBOoO are worth their
> great cost.
>
> John Savard

Re: Good News for Ivan, Mitch, and Others

<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27287&group=comp.arch#27287

X-Received: by 2002:a05:6214:4110:b0:476:d05d:71b3 with SMTP id kc16-20020a056214411000b00476d05d71b3mr15359532qvb.62.1660601735166;
Mon, 15 Aug 2022 15:15:35 -0700 (PDT)
X-Received: by 2002:a05:6214:1c48:b0:479:71ce:1498 with SMTP id
if8-20020a0562141c4800b0047971ce1498mr15524958qvb.112.1660601735022; Mon, 15
Aug 2022 15:15:35 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 15:15:34 -0700 (PDT)
In-Reply-To: <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:50ed:d45e:f030:8;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:50ed:d45e:f030:8
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com> <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 15 Aug 2022 22:15:35 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 3464

by: Quadibloc - Mon, 15 Aug 2022 22:15 UTC

On Monday, August 15, 2022 at 1:21:40 PM UTC-6, MitchAlsup wrote:

> The HW is there and works. What is failing is SW's ability to use
> the kinds of HW that has been designed and produced in the past.

In general, I found your reply interesting and informative, and had
little to take issue with. But _this_ statement is the sort of thing that
leads me to feel that I am in disagreement with you somewhere.

A pile of in-order CPUs, like, say, those in an old Xeon Phi, are there,
and work. Can't argue with that!

The fact, though, that computer programs aimed at solving certain
classes of problems can't make effective use of them... does not
necessarily indicate "failure" on the part of the software. It could just
as easily be that the hardware, which is there and working, all right,
is of an unsuitable kind, and those transistors would have been put to
more appropriate use building fewer GBOoO CPUs instead.

In other words, you seem to be making a value judgement... which
implies the assertion of "a fact not in evidence" - specifically, that
if those programmers were only doing their jobs, they would have
written better software.

Algorithms that haven't been discovered yet... will not necessarily,
with absolute inevitability, be discovered some time in the future.
For all we know, they may not exist at all.

Certainly, you're entitled to your opinion that optimism over the
prospects for software to exploit parallel computing facilities is
warranted. It's just that what I've quoted seems to elevate that
opinion to a statement of fact.

This is why I give you a hard time, even though I very much agree that
GBOoO designs do seem like a wretchedly excessive expenditure of
transistors and energy and the heat budget of chips... but if I don't
know of a better way, then I accept what I must.

John Savard

Re: Good News for Ivan, Mitch, and Others

<1ab7ea48-4cdf-47c6-86bd-cc45c989f1e8n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27288&group=comp.arch#27288

X-Received: by 2002:a05:620a:4509:b0:6b9:9987:ebbf with SMTP id t9-20020a05620a450900b006b99987ebbfmr12964158qkp.304.1660602695292;
Mon, 15 Aug 2022 15:31:35 -0700 (PDT)
X-Received: by 2002:a05:620a:4149:b0:6bb:2c53:4702 with SMTP id
k9-20020a05620a414900b006bb2c534702mr5202873qko.656.1660602695154; Mon, 15
Aug 2022 15:31:35 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 15:31:34 -0700 (PDT)
In-Reply-To: <100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:50ed:d45e:f030:8;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:50ed:d45e:f030:8
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com> <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1ab7ea48-4cdf-47c6-86bd-cc45c989f1e8n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Mon, 15 Aug 2022 22:31:35 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 3233

by: Quadibloc - Mon, 15 Aug 2022 22:31 UTC

On Monday, August 15, 2022 at 4:15:36 PM UTC-6, Quadibloc wrote:

> Certainly, you're entitled to your opinion that optimism over the
> prospects for software to exploit parallel computing facilities is
> warranted. It's just that what I've quoted seems to elevate that
> opinion to a statement of fact.

And _why_ is it that whenever I run across it, it sticks in my craw
enough that I feel a need to comment on it?

I think I still need to explain myself some more.

Naturally, as an engineer working on chip designs for major
semiconductor manufacturers, your formal education would have
included courses in calculus and differential equations, and of
course you would be able to program computers in BASIC,
FORTRAN, C, and no doubt a few assembler languages as well.

But while you _are_ Mitch Alsup, to whose great hardware
knowledge I defer, none of those things make you *Donald E.
Knuth*, who at least collected a lot of algorithms in his multi-volume
series _The Art of Computer Programming_.

Whether or not he also did important work on developing _new_
algorithms, I know that there are mathematicians who work in
that field. Every now and again, they will be mentioned in news
reports, say about progress being made in solving the travelling
salesman problem. (That's the one about the most efficient order
in which to visit a given set of cities; not the one faced by farmers'
daughters.)

If I want to know if there's any hope that breakthroughs in using
parallel computers are on the horizon, I'd be asking them.

John Savard

Re: Good News for Ivan, Mitch, and Others

<813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27290&group=comp.arch#27290

X-Received: by 2002:a0c:cb0c:0:b0:476:d5e7:e0ca with SMTP id o12-20020a0ccb0c000000b00476d5e7e0camr15404374qvk.57.1660605043395;
Mon, 15 Aug 2022 16:10:43 -0700 (PDT)
X-Received: by 2002:ad4:5be8:0:b0:479:6ba3:f08c with SMTP id
k8-20020ad45be8000000b004796ba3f08cmr15675010qvc.85.1660605043268; Mon, 15
Aug 2022 16:10:43 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border-2.nntp.ord.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 16:10:43 -0700 (PDT)
In-Reply-To: <100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:2d2c:6a8c:b7f:4060;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:2d2c:6a8c:b7f:4060
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com> <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Mon, 15 Aug 2022 23:10:43 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 84

by: MitchAlsup - Mon, 15 Aug 2022 23:10 UTC

On Monday, August 15, 2022 at 5:15:36 PM UTC-5, Quadibloc wrote:
> On Monday, August 15, 2022 at 1:21:40 PM UTC-6, MitchAlsup wrote:
>
> > The HW is there and works. What is failing is SW's ability to use
> > the kinds of HW that has been designed and produced in the past.
<
> In general, I found your reply interesting and informative, and had
> little to take issue with. But _this_ statement is the sort of thing that
> leads me to feel that I am in disagreement with you somewhere.
>
> A pile of in-order CPUs, like, say, those in an old Xeon Phi, are there,
> and work. Can't argue with that!
>
> The fact, though, that computer programs aimed at solving certain
> classes of problems can't make effective use of them... does not
> necessarily indicate "failure" on the part of the software. It could just
> as easily be that the hardware, which is there and working, all right,
> is of an unsuitable kind, and those transistors would have been put to
> more appropriate use building fewer GBOoO CPUs instead.
>
> In other words, you seem to be making a value judgement... which
> implies the assertion of "a fact not in evidence" - specifically, that
> if those programmers were only doing their jobs, they would have
> written better software.
<
Over the past 40 years (has it been that long already) I have watched
the computer architects come up with a small variety of things that
purported to "solve the synchronization problem". Test and Set, Test
and test and Set, Compare and Swap, DCAS, Operation-on-Memory, Load
Locked-Store Conditional, My own ASF and my own ESM as extensions
of LL-SC. And then there is the failure of Transactional Memory,.....
<
While these provide an adequate way to control synchronization, they
are completely unattractive for controlling parallelism--which is the real
goal here.
<
We are at a point where the SW people don't know what to ask HW people
to build into the next CPU. Without being asked (told) what to add, and
what the mechanics of such an addition would be composed of, HW
designers don't know what to try next. I tried with ASF and ESM.
>
> Algorithms that haven't been discovered yet... will not necessarily,
> with absolute inevitability, be discovered some time in the future.
> For all we know, they may not exist at all.
<
It seems the old insurance model would be reasonable in spreading
work out over a large number of workers. A "bus boy" walks around
a matrix of desks, each desk manned by a worker. The bus boy drops
off a packet of work to the person at the desk, picks up completed
work, and continues on his rounds. The workers are supplied with
enough work to stay busy, work flows around fast enough that each
task is completed in 'reasonable' time. This model worked for nearly
100 years in insurance, advertising,...
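
The office model described above can be mimicked in a few lines as a single-threaded simulation. This is only a sketch of the description, not a proposal for the missing parallel programming model: the names `run_bus_boy`, `inbox` and `outbox` are hypothetical, and each desk holds at most one packet at a time.

```python
from collections import deque

EMPTY = object()  # marks a desk tray with nothing in it

def run_bus_boy(tasks, n_desks, work_fn):
    """Simulate one bus boy circulating over n_desks workers until all work is done."""
    if n_desks < 1:
        raise ValueError("need at least one desk")
    pending = deque(tasks)       # work the bus boy is still carrying
    inbox = [EMPTY] * n_desks    # packet waiting on each desk
    outbox = [EMPTY] * n_desks   # finished work waiting for pickup
    completed = []

    def desks_busy():
        return any(tray is not EMPTY for tray in inbox + outbox)

    while pending or desks_busy():
        # One round of the bus boy: pick up finished work, drop off new work.
        for desk in range(n_desks):
            if outbox[desk] is not EMPTY:
                completed.append(outbox[desk])
                outbox[desk] = EMPTY
            if inbox[desk] is EMPTY and pending:
                inbox[desk] = pending.popleft()
        # Between rounds, every worker with a packet finishes it.
        for desk in range(n_desks):
            if inbox[desk] is not EMPTY and outbox[desk] is EMPTY:
                outbox[desk] = work_fn(inbox[desk])
                inbox[desk] = EMPTY
    return completed

squares = run_bus_boy([1, 2, 3, 4, 5], n_desks=2, work_fn=lambda x: x * x)
```

The workers never talk to each other and never wait on a shared lock; all coordination is the bus boy's fixed route.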
<
But we don't have such a model for computer applications.
>
> Certainly, you're entitled to your opinion that optimism over the
> prospects for software to exploit parallel computing facilities is
> warranted. It's just that what I've quoted seems to elevate that
> opinion to a statement of fact.
<
As I stated here about 18 months ago (more targeting Ivan than the
group as a whole), what we have run out of is ideas as to what would
be a good model to perform work on a parallel, possibly heterogeneous,
computing system(s).
<
This is not a HW problem--it may (MAY) not be a SW problem either.
But the ability to put zillions of cores on a die is already here. How
we use them or abuse them is our problem, not the HW guys'.
>
> This is why I give you a hard time, even though I very much agree that
> GBOoO designs do seem like a wretchedly excessive expenditure of
> transistors and energy and the heat budget of chips... but if I don't
> know of a better way, then I accept what I must.
<
In the beginning GBOoO was a way around Amdahl's law--it got a factor
of 2× and then petered out. Now, to make the next transition, we need
a model for controlling parallelism, where user applications can spawn,
communicate, and control other threads (applications-threads,...) without
excursions through GuestOS.
>
> John Savard

Re: Good News for Ivan, Mitch, and Others

<f0a92092-1f47-4a75-8bf6-44347178b961n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27291&group=comp.arch#27291

X-Received: by 2002:ad4:596f:0:b0:484:10b3:4653 with SMTP id eq15-20020ad4596f000000b0048410b34653mr16066516qvb.86.1660609648591;
Mon, 15 Aug 2022 17:27:28 -0700 (PDT)
X-Received: by 2002:ad4:5946:0:b0:477:42fd:6492 with SMTP id
eo6-20020ad45946000000b0047742fd6492mr15919975qvb.88.1660609648453; Mon, 15
Aug 2022 17:27:28 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 17:27:28 -0700 (PDT)
In-Reply-To: <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=73.188.126.34; posting-account=ujX_IwoAAACu0_cef9hMHeR8g0ZYDNHh
NNTP-Posting-Host: 73.188.126.34
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com> <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com> <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f0a92092-1f47-4a75-8bf6-44347178b961n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: timcaff...@aol.com (Timothy McCaffrey)
Injection-Date: Tue, 16 Aug 2022 00:27:28 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 4550

by: Timothy McCaffrey - Tue, 16 Aug 2022 00:27 UTC

On Monday, August 15, 2022 at 7:10:44 PM UTC-4, MitchAlsup wrote:
> On Monday, August 15, 2022 at 5:15:36 PM UTC-5, Quadibloc wrote:

> Over the past 40 years (has it been that long already) I have watched
> the computer architects come up with a small variety of things that
> purported to "solve the synchronization problem". Test and Set, Test
> and test and Set, Compare and Swap, DCAS, Operation-on-Memory, Load
> Locked-Store Conditional, My own ASF and my own ESM as extensions
> of LL-SC. And then there is the failure of Transactional Memory,.....
> <
> While these provide an adequate way to control synchronization, they
> are completely unattractive for controlling parallelism--which is the real
> goal here.
> <
> We are at a point where the SW people don't know what to ask HW people
> to build into the next CPU. Without being asked (told), and told what the mechanics
> of such an addition would be composed from, HW designers don't know
> what to try next. I tried with ASF and ESM.
> It seems the old insurance model would be reasonable in spreading
> work out over a large number of workers. A "bus boy" walks around
> a matrix of desks, each desk manned by a worker. Bus boy drops
> off a packet of work to the person at the desk, picks up completed
> work, and continues on his rounds. The workers are supplied with
> enough work to stay busy, work flows around fast enough that each
> task is completed in 'reasonable' time. This model worked for nearly
> 100 years in insurance, advertising,...
> <
> But we don't have such a model for computer applications.

> This is not a HW problem--it may (MAY) not be a SW problem either.
> But the ability to put zillions of cores on a die is already here. How
> we use them or abuse them is our problem not the HW guys.

Now, given a clean sheet for the OS and APIs, I would ask for a
per-CPU queue, where I could push data from one CPU to another.
In fact, it would be one queue per CPU combination. Then you could
a) send a message from CPU A to CPU B with no synchronization
overhead.
b) it is a "push", so CPU A basically does a "fire and forget".
c) CPU B (somehow) knows there is no need to deal with the
message until the whole thing gets there.
(Yes, I am skipping over the details, but I'm sure they are solvable).
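
A minimal sketch of the queue layout described in (a) to (c), written in Python purely for illustration. The class and method names are hypothetical, and the queues here live in ordinary memory, so this shows only the structure (one FIFO per sender/receiver pair, whole messages only), not the hardware mechanism being asked for.

```python
from collections import deque


class PairwiseMailboxes:
    """One FIFO per (sender, receiver) pair; all names are hypothetical."""

    def __init__(self, n_cpus):
        self.n_cpus = n_cpus
        # queues[src][dst] has exactly one producer (src) and one consumer (dst),
        # so senders never contend with each other for the same queue.
        self.queues = [[deque() for _ in range(n_cpus)] for _ in range(n_cpus)]

    def send(self, src, dst, message):
        # "Fire and forget": enqueue the complete message and return at once.
        self.queues[src][dst].append(message)

    def receive(self, dst):
        # Collect every complete message that has arrived, tagged with its sender.
        arrived = []
        for src in range(self.n_cpus):
            q = self.queues[src][dst]
            while q:
                arrived.append((src, q.popleft()))
        return arrived


boxes = PairwiseMailboxes(n_cpus=3)
boxes.send(0, 1, "block A")
boxes.send(2, 1, "block B")
boxes.send(0, 1, "block C")
inbox = boxes.receive(1)
print(inbox)
```

Because each queue has a single writer and a single reader, ordering is preserved per sender, and the receiver only ever sees messages that were enqueued whole.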

The usual response I get to this is, "just put the message in memory".
That reintroduces synchronization overhead, memory bandwidth use, memory latency, etc.,
all over again.

With this feature you could efficiently build a real software pipeline, splitting processing
steps between CPUs and passing results on to the next step in the pipeline.
This does mean your typical multitasking OS with its semi-random CPU
scheduling is NOT appropriate, which is why I specified a clean sheet for the OS and the APIs above.
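
The software pipeline idea can be sketched as follows, assuming one worker per stage and a FIFO between neighbouring stages. This is an illustration only: `run_pipeline`, `stage`, and `STOP` are made-up names, and Python's `queue.Queue` is a lock-protected in-memory queue, which is exactly the overhead the proposed hardware queues are meant to avoid.

```python
import queue
import threading

STOP = object()  # sentinel that tells each stage to shut down


def stage(func, inbox, outbox):
    """Apply func to every item from inbox and push the result to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            outbox.put(STOP)  # pass the shutdown signal downstream
            return
        outbox.put(func(item))


def run_pipeline(funcs, items):
    # One queue in front of each stage, plus one for the final results.
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    workers = [
        threading.Thread(target=stage, args=(f, queues[i], queues[i + 1]))
        for i, f in enumerate(funcs)
    ]
    for w in workers:
        w.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(STOP)

    results = []
    while True:
        out = queues[-1].get()
        if out is STOP:
            break
        results.append(out)
    for w in workers:
        w.join()
    return results


pipeline_result = run_pipeline([int, lambda x: x * x, str], ["1", "2", "3"])
print(pipeline_result)
```

Each stage stays on its own worker for the whole run, which mirrors the point about fixed placement rather than semi-random scheduling.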

- Tim

Re: Good News for Ivan, Mitch, and Others

<f313dd43-9a67-48cf-b91d-91474fa0557en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27293&group=comp.arch#27293

X-Received: by 2002:a05:622a:1651:b0:344:5d06:7449 with SMTP id y17-20020a05622a165100b003445d067449mr6466723qtj.292.1660610080503;
Mon, 15 Aug 2022 17:34:40 -0700 (PDT)
X-Received: by 2002:ac8:7d53:0:b0:344:6545:5c02 with SMTP id
h19-20020ac87d53000000b0034465455c02mr4785125qtb.365.1660610080259; Mon, 15
Aug 2022 17:34:40 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 17:34:40 -0700 (PDT)
In-Reply-To: <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.14.162; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.14.162
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com> <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com> <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f313dd43-9a67-48cf-b91d-91474fa0557en@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jim.brak...@ieee.org (JimBrakefield)
Injection-Date: Tue, 16 Aug 2022 00:34:40 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 7579

by: JimBrakefield - Tue, 16 Aug 2022 00:34 UTC

On Monday, August 15, 2022 at 6:10:44 PM UTC-5, MitchAlsup wrote:
> On Monday, August 15, 2022 at 5:15:36 PM UTC-5, Quadibloc wrote:
> > On Monday, August 15, 2022 at 1:21:40 PM UTC-6, MitchAlsup wrote:
> >
> > > The HW is there and works. What is failing is SW's ability to use
> > > the kinds of HW that has been designed and produced in the past.
> <
> > In general, I found your reply interesting and informative, and had
> > little to take issue with. But _this_ statement is the sort of thing that
> > leads me to feel that I am in disagreement with you somewhere.
> >
> > A pile of in-order CPUs, like, say, those in an old Xeon Phi, are there,
> > and work. Can't argue with that!
> >
> > The fact, though, that computer programs aimed at solving certain
> > classes of problems can't make effective use of them... does not
> > necessarily indicate "failure" on the part of the software. It could just
> > as easily be that the hardware, which is there and working, all right,
> > is of an unsuitable kind, and those transistors would have been put to
> > more appropriate use building fewer GBOoO CPUs instead.
> >
> > In other words, you seem to be making a value judgement... which
> > implies the assertion of "a fact not in evidence" - specifically, that
> > if those programmers were only doing their jobs, they would have
> > written better software.
> <
> Over the past 40 years (has it been that long already) I have watched
> the computer architects come up with a small variety of things that
> purported to "solve the synchronization problem". Test and Set, Test
> and test and Set, Compare and Swap, DCAS, Operation-on-Memory, Load
> Locked-Store Conditional, My own ASF and my own ESM as extensions
> of LL-SC. And then there is the failure of Transactional Memory,.....
> <
> While these provide an adequate way to control synchronization, they
> are completely unattractive for controlling parallelism--which is the real
> goal here.
> <
> We are at a point where the SW people don't know what to ask HW people
> to build into the next CPU. Without being asked (told), and told what the mechanics
> of such an addition would be composed from, HW designers don't know
> what to try next. I tried with ASF and ESM.
> >
> > Algorithms that haven't been discovered yet... will not necessarily,
> > with absolute inevitability, be discovered some time in the future.
> > For all we know, they may not exist at all.
> <
> It seems the old insurance model would be reasonable in spreading
> work out over a large number of workers. A "bus boy" walks around
> a matrix of desks, each desk manned by a worker. Bus boy drops
> off a packet of work to the person at the desk, picks up completed
> work, and continues on his rounds. The workers are supplied with
> enough work to stay busy, work flows around fast enough that each
> task is completed in 'reasonable' time. This model worked for nearly
> 100 years in insurance, advertising,...
> <
> But we don't have such a model for computer applications.
> >
> > Certainly, you're entitled to your opinion that optimism over the
> > prospects for software to exploit parallel computing facilities is
> > warranted. It's just that what I've quoted seems to elevate that
> > opinion to a statement of fact.
> <
> As I stated here about 18 months ago (more targeting Ivan than the
> group as a whole), what we have run out of is ideas as to what would
> be a good model to perform work on a parallel, possibly heterogeneous,
> computing system(s).
> <
> This is not a HW problem--it may (MAY) not be a SW problem either.
> But the ability to put zillions of cores on a die is already here. How
> we use them or abuse them is our problem not the HW guys.
> >
> > This is why I give you a hard time, even though I very much agree that
> > GBOoO designs do seem like a wretchedly excessive expenditure of
> > transistors and energy and the heat budget of chips... but if I don't
> > know of a better way, then I accept what I must.
> <
> In the beginning GBOoO was a way around Amdahl's law--it got a factor
> of 2× and then petered out. Now, to make the next transition, we need
> a model for controlling parallelism, where user applications can spawn,
> communicate, and control other threads (applications-threads,...) without
> excursions through GuestOS.
> >
> > John Savard

Ugh, my two cents:
> It seems the old insurance model would be reasonable in spreading
> work out over a large number of workers. A "bus boy" walks around
> a matrix of desks, each desk manned by a worker. Bus boy drops
> off a packet of work to the person at the desk, picks up completed
> work, and continues on his rounds. The workers are supplied with
> enough work to stay busy, work flows around fast enough that each
> task is completed in 'reasonable' time. This model worked for nearly
> 100 years in insurance, advertising,...

And why not?
The natural unit of work would be a very small number of DRAM bursts.
It is important to realize one needs multiple "desks", i.e. processors
capable of working on these "burst" units, so that the work (data and program
blocks) can go to any available processor. Chopping up data may not be optimal
for big data. IMHO sizeable percentages of internet browsing and
database workloads would fit. A gigabit++ network takes care of moving the work
blocks to the nearest available processor or obtaining the next data or program
blocks. Block size assumed to be ~1KB or so?
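
A rough sketch of that dispatch scheme, assuming 1 KB blocks and a small pool of interchangeable workers. The function names and the per-block "work" (a byte sum) are placeholders, and a thread pool stands in for the network of processors.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 1024  # bytes; follows the "~1KB or so" guess above


def split_blocks(data, size=BLOCK_SIZE):
    """Chop the input into fixed-size work blocks (the last one may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def process_block(block):
    # Stand-in for real work on one block: add up its bytes.
    return sum(block)


def process_all(data, workers=4):
    blocks = split_blocks(data)
    # Any idle worker picks up the next block, like a desk receiving a packet.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_block, blocks))
    return len(blocks), sum(partials)


data = bytes(range(256)) * 10  # 2560 bytes of sample input
n_blocks, total = process_all(data)
print(n_blocks, total)
```

The blocks are independent here, which is the easy case; work with dependencies between blocks would need ordering on top of this.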

Re: Good News for Ivan, Mitch, and Others

<1cc6647f-73ba-42ad-b0ac-2e78134fcc2bn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27294&group=comp.arch#27294

X-Received: by 2002:ad4:5bad:0:b0:476:e202:32eb with SMTP id 13-20020ad45bad000000b00476e20232ebmr15836324qvq.3.1660612833564;
Mon, 15 Aug 2022 18:20:33 -0700 (PDT)
X-Received: by 2002:a37:58c6:0:b0:6b5:d169:7b99 with SMTP id
m189-20020a3758c6000000b006b5d1697b99mr13518015qkb.709.1660612833409; Mon, 15
Aug 2022 18:20:33 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 18:20:33 -0700 (PDT)
In-Reply-To: <f0a92092-1f47-4a75-8bf6-44347178b961n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:2d2c:6a8c:b7f:4060;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:2d2c:6a8c:b7f:4060
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com> <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com> <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
<f0a92092-1f47-4a75-8bf6-44347178b961n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1cc6647f-73ba-42ad-b0ac-2e78134fcc2bn@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Tue, 16 Aug 2022 01:20:33 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 6594

by: MitchAlsup - Tue, 16 Aug 2022 01:20 UTC

On Monday, August 15, 2022 at 7:27:29 PM UTC-5, timca...@aol.com wrote:
> On Monday, August 15, 2022 at 7:10:44 PM UTC-4, MitchAlsup wrote:
> > On Monday, August 15, 2022 at 5:15:36 PM UTC-5, Quadibloc wrote:
>
> > Over the past 40 years (has it been that long already) I have watched
> > the computer architects come up with a small variety of things that
> > purported to "solve the synchronization problem". Test and Set, Test
> > and test and Set, Compare and Swap, DCAS, Operation-on-Memory, Load
> > Locked-Store Conditional, My own ASF and my own ESM as extensions
> > of LL-SC. And then there is the failure of Transactional Memory,.....
> > <
> > While these provide an adequate way to control synchronization, they
> > are completely unattractive for controlling parallelism--which is the real
> > goal here.
> > <
> > We are at a point where the SW people don't know what to ask HW people
> > to build into the next CPU. Without being asked (told), and told what the mechanics
> > of such an addition would be composed from, HW designers don't know
> > what to try next. I tried with ASF and ESM.
> > It seems the old insurance model would be reasonable in spreading
> > work out over a large number of workers. A "bus boy" walks around
> > a matrix of desks, each desk manned by a worker. Bus boy drops
> > off a packet of work to the person at the desk, picks up completed
> > work, and continues on his rounds. The workers are supplied with
> > enough work to stay busy, work flows around fast enough that each
> > task is completed in 'reasonable' time. This model worked for nearly
> > 100 years in insurance, advertising,...
> > <
> > But we don't have such a model for computer applications.
> > This is not a HW problem--it may (MAY) not be a SW problem either.
> > But the ability to put zillions of cores on a die is already here. How
> > we use them or abuse them is our problem not the HW guys.
<
> Now, given a clean sheet for the OS and APIs, I would ask for a
> per CPU queue, where I could push data from one CPU to another.
<
Change/CPU/thread/ and you have just defined my Messaging model.
Given a "method" any thread with said method can send a request to
any other thread (even across GuestOSs) and this request will be
queued on that other thread. Should that thread be at a priority where
it can find a core within its affinity set running at a lower priority, that
thread can be scheduled--without anyone making an excursion through
any of the GuestOSs.
<
> In fact, it would be one queue per CPU combination. Then you could
> a) send a message from CPU A to CPU B with no synchronization
> overhead.
<
Change CPU to thread. You don't want a CPU to tell another CPU
what to do, you want a thread to tell another thread to do something.
Then you have some kind of orderly mechanism whereby that thread
can pick up the work and start working on it. You don't want to be in
a position where 1 instruction of the sending thread runs on CPU[k]
and the next instruction runs on CPU[m]. So, by altering the notion
away from CPU and towards "the unit of scheduled state" you get
what you really want, without dragging the mechanism into the policy.
<
> b) it is a "push", so CPU A basically does a "fire and forget".
<
My 66000 can fire and forget, or fire and wait.
In both cases, the message arrives as if the target of the message
were CALLed from within its own thread--the std ABI. The return
follows this lead and sends back any response as if it came from
a subroutine within the caller's thread.
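
As a loose analogy only (this is not My 66000 messaging, whose details are not given here), the difference between "fire and forget" and "fire and wait" can be sketched with a single-worker executor standing in for the target thread. The names `method` and `handled` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

handled = []  # record of requests the target has processed, in order


def method(x):
    """The 'method' invoked on the target; it looks like an ordinary call."""
    handled.append(x)
    return x + 1


# One worker plays the role of the target thread with its request queue.
with ThreadPoolExecutor(max_workers=1) as target:
    target.submit(method, 10)                   # fire and forget: ignore the reply
    reply = target.submit(method, 41).result()  # fire and wait: block for the reply

print(handled, reply)
```

In both cases the callee sees a plain function call with arguments, and the waiting caller receives the result as if a local subroutine had returned it.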
<
> c) CPU B (somehow) knows there is no need to deal with the
> message until the whole thing gets there.
> (Yes, I am skipping over the details, but I'm sure they are solvable).
<
Do all of the above without having to pass through the GuestOS interfaces.
{No excursion through GuestOS is required. No change in GuestOS is
required even if the GuestOSs are disjoint under different hypervisors.}
>
> The usual response I get for this is that, "just put the message in memory".
> That re-introduces synchronization overhead, memory bandwidth, memory latency, etc.
> all over again.
<
In addition to the above, what if you wanted to send such a message, but you did
not want GuestOS to be able to read said message? {The Paranoid Application}
>
> With this feature you could efficiently build a real software pipeline, splitting processing
> steps between CPUs and passing results on to the next step in the pipeline.
> This does mean your typical multitasking OS with its semi-random CPU
> scheduling is NOT appropriate, which is why I specified a clean sheet for the OS and the APIs above.
<
My 66000 messaging--all I require for you to read about it is a signed NDA.
>
> - Tim

Re: Good News for Ivan, Mitch, and Others

<e1ef42eb-8b71-49a1-97f8-9ec894376a83n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27296&group=comp.arch#27296

X-Received: by 2002:a37:43d4:0:b0:6b8:e3ba:ddfc with SMTP id q203-20020a3743d4000000b006b8e3baddfcmr13807112qka.192.1660627179941;
Mon, 15 Aug 2022 22:19:39 -0700 (PDT)
X-Received: by 2002:a05:6214:d88:b0:46e:64aa:842a with SMTP id
e8-20020a0562140d8800b0046e64aa842amr16140780qve.101.1660627179839; Mon, 15
Aug 2022 22:19:39 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 22:19:39 -0700 (PDT)
In-Reply-To: <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:c890:cfb8:c705:4c35;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:c890:cfb8:c705:4c35
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com> <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com> <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e1ef42eb-8b71-49a1-97f8-9ec894376a83n@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Tue, 16 Aug 2022 05:19:39 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 3072

by: Quadibloc - Tue, 16 Aug 2022 05:19 UTC

On Monday, August 15, 2022 at 5:10:44 PM UTC-6, MitchAlsup wrote:

> It seems the old insurance model would be reasonable in spreading
> work out over a large number of workers. A "bus boy" walks around
> a matrix of desks, each desk manned by a worker. Bus boy drops
> off a packet of work to the person at the desk, picks up completed
> work, and continues on his rounds. The workers are supplied with
> enough work to stay busy, work flows around fast enough that each
> task is completed in 'reasonable' time. This model worked for nearly
> 100 years in insurance, advertising,...

> But we don't have such a model for computer applications.

I thought that's at least equivalent to the way computers work
now, when a lot of processors in parallel are being utilized to
provide maximum throughput.

Every processor is kept busy, and individual tasks are completed
as fast as their dependencies permit.

There's nothing in the "bus boy" problem that gets around
dependencies - the only thing it seems to be saying is that we
aren't splitting our programs into enough threads. Well,
AMD is making 128 core Threadripper and EPYC chips, and
so there is an incentive to split up programs into at least 128
threads. I agree we need to do that, but I doubt it will "solve
the problem" the way people are hoping.
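
One way to put numbers on that doubt is Amdahl's law, used here as an illustration. The parallel fractions below are assumed values, not measurements of any real program.

```python
def amdahl_speedup(parallel_fraction, n_cores):
    """Speedup over one core when only parallel_fraction of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cores)


for p in (0.50, 0.90, 0.95, 0.99):
    on_128 = amdahl_speedup(p, 128)
    ceiling = 1.0 / (1.0 - p)  # limit as the core count grows without bound
    print(f"parallel fraction {p:.2f}: {on_128:6.2f}x on 128 cores, "
          f"ceiling {ceiling:6.1f}x")

speedup_95 = amdahl_speedup(0.95, 128)
```

Even with 95% of the work parallelized, 128 cores give a bit over 17x and no number of cores gets past roughly 20x, so splitting into more threads helps only as far as the dependencies allow.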

John Savard

Re: Good News for Ivan, Mitch, and Others

<73b9fb3c-d62c-472d-ad58-c239ff55497fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27297&group=comp.arch#27297

X-Received: by 2002:a05:620a:4442:b0:6b2:844e:ee67 with SMTP id w2-20020a05620a444200b006b2844eee67mr13444227qkp.625.1660627701884;
Mon, 15 Aug 2022 22:28:21 -0700 (PDT)
X-Received: by 2002:ad4:5ba5:0:b0:47b:4f54:cd1f with SMTP id
5-20020ad45ba5000000b0047b4f54cd1fmr16153557qvq.38.1660627701727; Mon, 15 Aug
2022 22:28:21 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 15 Aug 2022 22:28:21 -0700 (PDT)
In-Reply-To: <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:c890:cfb8:c705:4c35;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:c890:cfb8:c705:4c35
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com> <td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com> <2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me> <632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com> <191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com> <813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <73b9fb3c-d62c-472d-ad58-c239ff55497fn@googlegroups.com>
Subject: Re: Good News for Ivan, Mitch, and Others
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Tue, 16 Aug 2022 05:28:21 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 3078

by: Quadibloc - Tue, 16 Aug 2022 05:28 UTC

On Monday, August 15, 2022 at 5:10:44 PM UTC-6, MitchAlsup wrote:

> In the beginning GBOoO was a way around Amdahl's law--it got a factor
> of 2× and then petered out.

Absolutely. No disagreement there.

> Now, to make the next transition, we need
> a model for controlling parallelism, where user applications can spawn,
> communicate, and control other threads (applications-threads,...) without
> excursions through GuestOS.

I definitely agree that since we need to make applications more
multi-threaded, we also need to pare down the overhead for
multi-threading.

That many applications haven't been made as capable of using
multiple threads as they could and should be is certainly true.
But fixing that _is_ happening, and while the fix is going to
result in significant performance improvements to the
affected applications on many-core systems, I don't see that
as the "next breakthrough".

It's the _current_ best practices state of the art, and people doing
fancy scientific processing on what passes for a supercomputer
these days are doing that. It will trickle down; by the time further
process shrinks change 128 cores from a top-of-the-line
Threadripper to a top-of-the-line Ryzen in consumer boxes, it will
be achieved.

John Savard

Re: Good News for Ivan, Mitch, and Others

<tdfquo$7il$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=27299&group=comp.arch#27299

Path: i2pn2.org!i2pn.org!news.swapon.de!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-1faa-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: Tue, 16 Aug 2022 10:19:04 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <tdfquo$7il$1@newsreader4.netcologne.de>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com>
<td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>
<2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me>
<632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com>
<191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
Injection-Date: Tue, 16 Aug 2022 10:19:04 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-1faa-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:1faa:0:7285:c2ff:fe6c:992d";
logging-data="7765"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)

by: Thomas Koenig - Tue, 16 Aug 2022 10:19 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:
> On Monday, August 15, 2022 at 12:25:37 PM UTC-5, Quadibloc wrote:

>> People have been trying to find ways to efficiently put multiple
>> processors working in parallel to work on problems in an
>> efficient manner for a long time.
>> And there's been no general solution, only the odd
>> parallel algorithm for the odd individual problem.
><
> The HW is there and works. What is failing is SW's ability to use
> the kinds of HW that has been designed and produced in the past.

At the time, I had the feeling that the hardware designers just increased the
number of cores because they could not increase frequency by much,
and they didn't know what else to do, in the hope that software
would catch up... and to a large degree, it has not.

Programming is hard, as can be seen by the numerous programming
bugs that plague us. And they are expensive to us all. I don't
quite believe the estimates putting the cost of software bugs at
2.E12 dollars per year; that would be more than 2% of the world's
GDP and more than the 6.E11 dollars of annual software revenue
cited by https://www.statista.com/outlook/tmo/software/worldwide ,
but there can be no doubt that the impact is huge.

And parallel programming is _much_ harder than serial programming.
It introduces potential race conditions, which make bugs sporadic
and harder to track down.

There is also the question of how to program in parallel. I do
not particularly like pthreads or OpenMP, because both share data by
default, which is a recipe for race conditions.
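
A small sketch of the contrast, in Python threads rather than pthreads or OpenMP, with made-up function names. The first version shares one counter among all threads and is only correct because of the lock; the second keeps data private and hands over a single result per thread.

```python
import threading


def count_shared(n_threads, increments):
    """Shared-by-default style: every thread updates one counter."""
    state = {"counter": 0}
    lock = threading.Lock()

    def work():
        for _ in range(increments):
            with lock:  # without the lock, the read-modify-write could interleave
                state["counter"] += 1

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["counter"]


def count_private(n_threads, increments):
    """Explicit-sharing style: each thread owns its data and hands over one result."""
    partial = [0] * n_threads

    def work(slot):
        local = 0
        for _ in range(increments):
            local += 1
        partial[slot] = local  # the only write visible to other threads

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)


shared_total = count_shared(4, 1000)
private_total = count_private(4, 1000)
print(shared_total, private_total)
```

In the second version there is nothing to forget to lock, because nothing is shared until the one explicit hand-off at the end.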

Something like Fortran's PGAS model, where data is only shared
explicitly, but where it is possible to access memory on another
image, and where there are strict rules for synchronization,
is far better (IMHO). It also scales well (also to clusters,
using MPI as the underlying mechanism), but it is certainly not
a good fit for everything.

Functional programming may help in making things parallelizable,
but has not caught on to a very large extent, because people
are not quite used to the style, and this has its drawbacks (see
the absence of a direct DO loop in Haskell, you use recursion or
iterators instead).
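
To make the recursion-or-iterator point concrete, here is the same summation written three ways. Python is used only for illustration and the function names are invented; Haskell's syntax differs, but the shapes correspond.

```python
from functools import reduce


def sum_loop(n):
    """Imperative DO-loop style: a mutable accumulator updated in place."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total


def sum_recursive(n):
    """Recursion replaces the loop; no variable is ever reassigned."""
    return 0 if n == 0 else n + sum_recursive(n - 1)


def sum_fold(n):
    """A fold over an iterator, the other common replacement for a loop."""
    return reduce(lambda acc, i: acc + i, range(1, n + 1), 0)


loop_total = sum_loop(10)
recursive_total = sum_recursive(10)
fold_total = sum_fold(10)
print(loop_total, recursive_total, fold_total)
```

The fold version has no loop-carried mutable state, and because addition is associative it could in principle be split across workers.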

So... no ready solution, unfortunately.

Re: Good News for Ivan, Mitch, and Others

<tdg2ei$pup$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=27301&group=comp.arch#27301

Path: i2pn2.org!i2pn.org!aioe.org!y0sttPrO1OAcON/g+jAtOw.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Good News for Ivan, Mitch, and Others
Date: Tue, 16 Aug 2022 14:27:04 +0200
Organization: Aioe.org NNTP Server
Message-ID: <tdg2ei$pup$1@gioia.aioe.org>
References: <f6135252-c63a-4895-a69d-2af4f872abb1n@googlegroups.com>
<e0ab4da2-42a3-4f11-a8ae-8a3ec3578c6an@googlegroups.com>
<td88c8$dbk$1@newsreader4.netcologne.de>
<ddbf0da1-6184-46d8-ae6d-894cf9e02a69n@googlegroups.com>
<2cad9ad0-c113-4f41-a48a-dd2f01de79f4n@googlegroups.com>
<td92dl$2tle3$1@dont-email.me>
<632e95b4-08b8-4240-aebf-0cd21267db14n@googlegroups.com>
<2fc52827-2d71-4380-ad14-01e3ea2e9f13n@googlegroups.com>
<191f11f8-709a-4821-9a36-47cfb7aae515n@googlegroups.com>
<100d9c60-a3ba-4d01-8ee2-30812e19df12n@googlegroups.com>
<813652fd-f874-492d-ba33-35555c3ca681n@googlegroups.com>
<f0a92092-1f47-4a75-8bf6-44347178b961n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="26585"; posting-host="y0sttPrO1OAcON/g+jAtOw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.13
X-Notice: Filtered by postfilter v. 0.9.2

by: Terje Mathisen - Tue, 16 Aug 2022 12:27 UTC

Timothy McCaffrey wrote:
> On Monday, August 15, 2022 at 7:10:44 PM UTC-4, MitchAlsup wrote:
>> On Monday, August 15, 2022 at 5:15:36 PM UTC-5, Quadibloc wrote:
>
>> Over the past 40 years (has it been that long already) I have watched
>> the computer architects come up with a small variety of things that
>> purported to "solve the synchronization problem". Test and Set, Test
>> and test and Set, Compare and Swap, DCAS, Operation-on-Memory, Load
>> Locked-Store Conditional, My own ASF and my own ESM as extensions
>> of LL-SC. And then there is the failure of Transactional Memory,.....
>> <
>> While these provide an adequate way to control synchronization, they
>> are completely unattractive for controlling parallelism--which is the real
>> goal here.
>> <
>> We are at a point where the SW people don't know what to ask HW people
>> to build into the next CPU. Without being asked (told), and told what the mechanics
>> of such an addition would be composed from, HW designers don't know
>> what to try next. I tried with ASF and ESM.
>> It seems the old insurance model would be reasonable in spreading
>> work out over a large number of workers. A "bus boy" walks around
>> a matrix of desks, each desk manned by a worker. Bus boy drops
>> off a packet of work to the person at the desk, picks up completed
>> work, and continues on his rounds. The workers are supplied with
>> enough work to stay busy, work flows around fast enough that each
>> task is completed in 'reasonable' time. This model worked for nearly
>> 100 years in insurance, advertising,...
>> <
>> But we don't have such a model for computer applications.
>
>> This is not a HW problem--it may (MAY) not be a SW problem either.
>> But the ability to put zillions of cores on a die is already here. How
>> we use them or abuse them is our problem not the HW guys.
>
> Now, given a clean sheet for the OS and APIs, I would ask for a
> per CPU queue, where I could push data from one CPU to another.
> In fact, it would be one queue per CPU combination. Then you could
> a) send a message from CPU A to CPU B with no synchronization
> overhead.
> b) it is a "push", so CPU A basically does a "fire and forget".
> c) CPU B (somehow) knows there is no need to deal with the
> message until the whole thing gets there.
> (Yes, I am skipping over the details, but I'm sure they are solvable).
>
> The usual response I get for this is that, "just put the message in memory".
> That re-introduces synchronization overhead, memory bandwidth, memory latency, etc.
> all over again.

Transputers, anyone?

>
> With this feature you could efficiently build a real software pipeline, splitting processing
> steps between CPUs and passing results on to the next step in the pipeline.
> This does mean your typical multitasking OS with its semi-random CPU
> scheduling is NOT appropriate, which is why I specified a clean sheet for the OS and the APIs above.

Afair, the Transputer failed to move on to upgraded versions?

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"
