Subject  Author
* Why We Can't Have Nice Things  Quadibloc
`* The age of the CPU as God is OVER.  Brett
 +* Re: The age of the CPU as God is OVER.  Quadibloc
 |`* Re: The age of the CPU as God is OVER.  Brett
 | +* Re: The age of the CPU as God is OVER.  Thomas Koenig
 | |`* Re: The age of the CPU as God is OVER.  BGB
 | | `- Re: The age of the CPU as God is OVER.  Brett
 | `* Re: The age of the CPU as God is OVER.  Quadibloc
 |  `- Re: The age of the CPU as God is OVER.  JimBrakefield
 `* Re: The age of the CPU as God is OVER.  Andy
  `* Re: The age of the CPU as God is OVER.  Brett
   `* Re: The age of the CPU as God is OVER.  Andy
    `- Re: The age of the CPU as God is OVER.  Thomas Koenig

Why We Can't Have Nice Things

<913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27569&group=comp.arch#27569
X-Received: by 2002:a05:6214:2387:b0:496:c9db:82b0 with SMTP id fw7-20020a056214238700b00496c9db82b0mr17862537qvb.111.1661904937578;
Tue, 30 Aug 2022 17:15:37 -0700 (PDT)
X-Received: by 2002:a05:622a:134f:b0:344:df2b:9afe with SMTP id
w15-20020a05622a134f00b00344df2b9afemr17523237qtk.279.1661904937447; Tue, 30
Aug 2022 17:15:37 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Tue, 30 Aug 2022 17:15:37 -0700 (PDT)
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
Subject: Why We Can't Have Nice Things
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Wed, 31 Aug 2022 00:15:37 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1873
 by: Quadibloc - Wed, 31 Aug 2022 00:15 UTC

Thermals are thermals, and die area is die area.
So I can't blame AMD too much for the fact that the new generation of Ryzen chips, although they include AVX-512, are, at least to me, a disappointment.

However, single-thread performance is often crucial for many applications. Of course, only *newer* applications will be written to make use of the relatively new AVX-512 feature, and those applications are also new enough to be properly thread-aware.

So I don't suppose that pairing up the cores into core complexes, where they can link their 256-bit vector units into a single shared 512-bit unit, is necessarily that urgently needed. (It would be odd if vectorized code wasn't parallelizable...)

But still, it's a pity that the lawsuit may be preventing AMD from re-using this potentially useful idea from Bulldozer.

John Savard

The age of the CPU as God is OVER.

<ten79s$1ourf$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=27572&group=comp.arch#27572
Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: The age of the CPU as God is OVER.
Date: Wed, 31 Aug 2022 08:49:00 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 46
Message-ID: <ten79s$1ourf$1@dont-email.me>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 31 Aug 2022 08:49:00 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="44673d2478d6c9c9b1c43bc7ff7b1507";
logging-data="1866607"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18UpHE3dutdnblsnUqkJfcm"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:Yn8KO6v6gnLJtWtsByYuLYBLSiI=
sha1:3T6AtwDU33aJKqYmgCvyT1JE+/c=
 by: Brett - Wed, 31 Aug 2022 08:49 UTC

Quadibloc <jsavard@ecn.ab.ca> wrote:
> Thermals are thermals, and die area is die area.
> So I can't blame AMD too much for the fact that the new generation of
> Ryzen chips, although they include AVX-512, are, at least to me, a disappointment.

The new Ryzen chips clock at 5.7 GHz; what is not to like?

> However, single-thread performance is often crucial for many
> applications. Of course, only *newer* applications will be written to
> make use of the relatively new AVX-512 feature, and those applications
> are also new enough to be properly thread-aware.

No point in doubling the actual vector unit if you do not double the
load/store bus width to all levels of cache.

But if you double all those busses that is going to cost you power and
clocks.

I am on record that Apple will never copy AVX-512 for its ARM chips.
Heavy lifting is better done on more efficient custom math units, and the
Apple chips use the AI or graphics engines' libraries for bulk data
transforms, which nets you far higher performance at far lower power.

The age of the CPU as God is OVER.

Look at Apple silicon: the CPUs are a tiny block in a corner, with half the
die graphics and most of the rest dedicated compute engines for different
tasks, like real-time video encoding at a tenth the power of the CPU, a
tenth the die space, and four times the performance. There are dozens of
these units with different levels of programmability. The more programmable
they are, the more they cost in power and die size, with the CPUs being
the worst.

> So I don't suppose that pairing up the cores into core complexes, where
> they can link their 256-bit vector units into a single shared 512-bit
> unit, is necessarily that urgently needed. (It would be odd if vectorized
> code wasn't parallelizable...)
>
> But still, it's a pity that the lawsuit may be preventing AMD from
> re-using this potentially useful idea from Bulldozer.
>
> John Savard
>

Re: The age of the CPU as God is OVER.

<f649d624-adf3-4813-8539-d20e3af6e733n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27576&group=comp.arch#27576
X-Received: by 2002:ae9:e404:0:b0:6bb:d8c0:381c with SMTP id q4-20020ae9e404000000b006bbd8c0381cmr16835508qkc.459.1661972781154;
Wed, 31 Aug 2022 12:06:21 -0700 (PDT)
X-Received: by 2002:a0c:8c8b:0:b0:496:e83c:9c4a with SMTP id
p11-20020a0c8c8b000000b00496e83c9c4amr21357198qvb.16.1661972780947; Wed, 31
Aug 2022 12:06:20 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Wed, 31 Aug 2022 12:06:20 -0700 (PDT)
In-Reply-To: <ten79s$1ourf$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com> <ten79s$1ourf$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f649d624-adf3-4813-8539-d20e3af6e733n@googlegroups.com>
Subject: Re: The age of the CPU as God is OVER.
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Wed, 31 Aug 2022 19:06:21 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Received-Bytes: 2312
 by: Quadibloc - Wed, 31 Aug 2022 19:06 UTC

On Wednesday, August 31, 2022 at 2:49:03 AM UTC-6, gg...@yahoo.com wrote:

> Look at Apple silicon, the CPUs are a tiny block in a corner with half the
> die graphics and most of the rest dedicated compute engines for different
> tasks like real time video encoding at a tenth the power of the CPU and a
> tenth the die space and four times the performance.

And just how do you use a video encoding engine for computational fluid
dynamics?

I don't want a computer that only lets me be a passive consumer of media.

I want a _supercomputer_ on my desktop.

So I want something which is highly programmable, highly flexible, suitable
for any application, and yet gives me as many TFLOPS as a GPU. Of course
that will take power, die size, and memory bandwidth. Fine. I'm OK with a
single-core CPU with Cray-style vectors packaged in a Threadripper/EPYC
style package. Since I still want it subsidized by being a mass-market
consumer item, it's okay for it to only be _half_ as powerful as a (14nm!)
NEC SX-Aurora TSUBASA computer.

John Savard

Re: The age of the CPU as God is OVER.

<teopsk$1udih$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=27577&group=comp.arch#27577
Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: The age of the CPU as God is OVER.
Date: Wed, 31 Aug 2022 23:12:21 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <teopsk$1udih$1@dont-email.me>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me>
<f649d624-adf3-4813-8539-d20e3af6e733n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 31 Aug 2022 23:12:21 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="1b3cf41ca27a489130fee3575f437590";
logging-data="2045521"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/oJdWeT3O+d8zYxKFxwRTO"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:w4/gr03WksvdbTRc/0RZqxW8hY4=
sha1:+W1h85BZ2ORUYVtsDemM0bwHqxk=
 by: Brett - Wed, 31 Aug 2022 23:12 UTC

Quadibloc <jsavard@ecn.ab.ca> wrote:
> On Wednesday, August 31, 2022 at 2:49:03 AM UTC-6, gg...@yahoo.com wrote:
>
>> Look at Apple silicon, the CPUs are a tiny block in a corner with half the
>> die graphics and most of the rest dedicated compute engines for different
>> tasks like real time video encoding at a tenth the power of the CPU and a
>> tenth the die space and four times the performance.
>
> And just how do you use a video encoding engine for computational fluid
> dynamics?

Write your code in Metal and Apple will schedule that code on the fastest
hardware capable of running that code, be it CPU or GPU, or AI or one of
the other dedicated blocks.

Metal is designed for bulk transform of data.

> I don't want a computer that only lets me be a passive consumer of media.
>
> I want a _supercomputer_ on my desktop.
>
> So I want something which is highly programmable, highly flexible, suitable
> for any application, and yet gives me as many TFLOPS as a GPU. Of course
> that will take power, die size, and memory bandwidth. Fine. I'm OK with a
> single-core CPU with Cray-style vectors packaged in a Threadripper/EPYC
> style package. Since I still want it subsidized by being a mass-market
> consumer item, it's okay for it to only be _half_ as powerful as a (14nm!)
> NEC SX-Aurora TSUBASA computer.
>
> John Savard
>

Re: The age of the CPU as God is OVER.

<tepecm$rc8$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=27578&group=comp.arch#27578
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2001-4dd7-4e56-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: The age of the CPU as God is OVER.
Date: Thu, 1 Sep 2022 05:02:14 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <tepecm$rc8$1@newsreader4.netcologne.de>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me>
<f649d624-adf3-4813-8539-d20e3af6e733n@googlegroups.com>
<teopsk$1udih$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 1 Sep 2022 05:02:14 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2001-4dd7-4e56-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2001:4dd7:4e56:0:7285:c2ff:fe6c:992d";
logging-data="28040"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Thu, 1 Sep 2022 05:02 UTC

Brett <ggtgp@yahoo.com> schrieb:
> Quadibloc <jsavard@ecn.ab.ca> wrote:
>> On Wednesday, August 31, 2022 at 2:49:03 AM UTC-6, gg...@yahoo.com wrote:
>>
>>> Look at Apple silicon, the CPUs are a tiny block in a corner with half the
>>> die graphics and most of the rest dedicated compute engines for different
>>> tasks like real time video encoding at a tenth the power of the CPU and a
>>> tenth the die space and four times the performance.
>>
>> And just how do you use a video encoding engine for computational fluid
>> dynamics?
>
> Write your code in Metal and Apple will schedule that code on the fastest
> hardware capable of running that code, be it CPU or GPU, or AI or one of
> the other dedicated blocks.
>
> Metal is designed for bulk transform of data.

I tried reading some of the introductions by Apple, but my bullshit
meter went off-scale high.

Re: The age of the CPU as God is OVER.

<tepnim$2460v$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=27580&group=comp.arch#27580
Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: The age of the CPU as God is OVER.
Date: Thu, 1 Sep 2022 02:38:59 -0500
Organization: A noiseless patient Spider
Lines: 111
Message-ID: <tepnim$2460v$1@dont-email.me>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me>
<f649d624-adf3-4813-8539-d20e3af6e733n@googlegroups.com>
<teopsk$1udih$1@dont-email.me> <tepecm$rc8$1@newsreader4.netcologne.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 1 Sep 2022 07:39:02 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="36214ea7491356e287e930ebec51d49f";
logging-data="2234399"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19IL/W6qsqn3nHzFGT7qWoV"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.13.0
Cancel-Lock: sha1:9LHEvF6ViQ8/zTrsEBIcXWktaTc=
Content-Language: en-US
In-Reply-To: <tepecm$rc8$1@newsreader4.netcologne.de>
 by: BGB - Thu, 1 Sep 2022 07:38 UTC

On 9/1/2022 12:02 AM, Thomas Koenig wrote:
> Brett <ggtgp@yahoo.com> schrieb:
>> Quadibloc <jsavard@ecn.ab.ca> wrote:
>>> On Wednesday, August 31, 2022 at 2:49:03 AM UTC-6, gg...@yahoo.com wrote:
>>>
>>>> Look at Apple silicon, the CPUs are a tiny block in a corner with half the
>>>> die graphics and most of the rest dedicated compute engines for different
>>>> tasks like real time video encoding at a tenth the power of the CPU and a
>>>> tenth the die space and four times the performance.
>>>
>>> And just how do you use a video encoding engine for computational fluid
>>> dynamics?
>>
>> Write your code in Metal and Apple will schedule that code on the fastest
>> hardware capable of running that code, be it CPU or GPU, or AI or one of
>> the other dedicated blocks.
>>
>> Metal is designed for bulk transform of data.
>
> I tried reading some of the introductions by Apple, but my bullshit
> meter went off-scale high.

Yeah, it seems like much outside of specific high-level APIs or general
categories of workload, making something like this work in the general
case would be difficult.

Though, if one has several classes of hardware which can all run a given
piece of code (CPU, GPU, DSP cores, etc.), it does seem like it would be
possible to compile code for all of them and then see which type of core
could run it fastest.

There are likely to be limits though, for example, if one has dedicated
video decoder hardware features, it is rather unlikely one can
efficiently pattern match these from generic C or C++ source of a video
codec or similar (as opposed to code written specifically to use these
features).

So, ultimately, there are limits, and one would need either specialized
APIs or language extensions to make it work.

Even in my own ISA efforts, I am running into this issue:
* I don't have auto vectorization, or fancy pattern recognition;
* So, some stuff depends on language extensions
** Which in turn only works with code which uses these extensions.
** Which creates issues with keeping the code portable and efficient.
** ...

Some level of portability can be maintained by falling back to
structures and function calls in the generic case, but then cases where
things diverge are harder to fake efficiently (some amount of code
optimized for BJX2 is worse off on x86 due to divergence).

Similarly, Intel/MSVC and GCC using two different ways of approaching
SIMD support doesn't really help either.

One could argue that maybe it would be nice if the C standard could
support (explicit) SIMD operations, but alas this implicitly assumes
that the targets roughly agree as to SIMD hardware capabilities and/or
there is a defined way to deal with this (say, the compiler has to fake
it if the actual ISA can't do it).

While one could argue against needing to fall back to runtime support to
fake SIMD operations that don't exist in hardware, it is not as if
something like: "for(i=0; i<4; i++) vc[i]=va[i]+vb[i];" is necessarily
a high performance option either (and it leaves it up to the C
compiler to try to figure out that a "float *" pointer is really meant
to be understood as a 4 element vector).
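
To make the contrast concrete, here is a minimal sketch, assuming GCC or
Clang: the "vector_size" attribute is a compiler extension, not standard C,
which is exactly the portability problem described above. The names
vec4_add_loop and vec4_add_ext are hypothetical and do not come from any
existing codebase.

```c
#include <assert.h>
#include <string.h>

/* Portable version: the compiler has to work out on its own that
   these three pointers really describe 4-element vectors. */
void vec4_add_loop(const float *va, const float *vb, float *vc)
{
    assert(va && vb && vc);
    for (int i = 0; i < 4; i++)
        vc[i] = va[i] + vb[i];
}

#if defined(__GNUC__) || defined(__clang__)
/* GCC/Clang vector extension: the 4 x float shape is part of the type. */
typedef float v4sf __attribute__((vector_size(16)));

void vec4_add_ext(const float *va, const float *vb, float *vc)
{
    v4sf a, b, c;
    assert(va && vb && vc);
    memcpy(&a, va, sizeof a);   /* load without assuming 16-byte alignment */
    memcpy(&b, vb, sizeof b);
    c = a + b;                  /* element-wise add on the vector type */
    memcpy(vc, &c, sizeof c);
}
#else
/* Other compilers: "fake it" by falling back to the scalar loop. */
void vec4_add_ext(const float *va, const float *vb, float *vc)
{
    vec4_add_loop(va, vb, vc);
}
#endif
```

Both functions give the same result for the same inputs; what differs is
how much the compiler is told. Whether the extension version actually
compiles to a single SIMD add depends on the target and the optimization
flags.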

Granted, "How about we start gluing stuff inspired by GLSL onto C?" (and
maybe start combining this with async/join semantics, *, etc) isn't
really a popular solution either...

*: where async can allow a sub-task to run asynchronously (either fully
independent or with a "promise") and "join" allows blocking until an
async sub-task is complete (and potentially fetching its final result),
and implicitly using these as a compiler optimization hint (we don't
care about the relative order of these sub-tasks, only that maybe we
care that they finish before the program "joins" on the promise handle).

But then one can debate between finer grained / lighter weight
solutions (compiler optimization hint, but otherwise mostly implemented
in terms of conventional linear execution semantics), and a more
powerful but heavyweight solution (the async/promise effectively
schedules a sub-task as a green-thread or similar). Or, if both are
possible, how exactly the compiler should know when to choose which is
better (eg: you wouldn't want to green-thread a 4x4 matrix multiply, the
performance overhead on this would *suck*; but a 128x128 matrix multiply
is a different beast entirely).
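
One way to picture the "which is better" decision is a plain work
estimate with a cutoff. The sketch below is only an illustration: the enum,
the function names and the threshold value are all assumptions, and a real
compiler or runtime would need measured costs rather than a fixed constant.

```c
#include <assert.h>

/* Hypothetical strategies a compiler or runtime might pick for an async. */
enum async_strategy {
    ASYNC_INLINE_HINT,   /* run in place; the async is only a reordering hint */
    ASYNC_GREEN_THREAD   /* schedule the sub-task as a green thread */
};

/* Assumed cutoff in multiply-adds; a real value would come from measurement. */
#define ASYNC_WORK_THRESHOLD 100000u

/* A naive n x n matrix multiply performs n*n*n multiply-adds. */
unsigned long long matmul_work(unsigned n)
{
    assert(n <= 2097152u);   /* keeps n^3 within unsigned long long */
    return (unsigned long long)n * n * n;
}

/* Small jobs stay inline; large jobs are worth the scheduling overhead. */
enum async_strategy choose_strategy(unsigned long long work)
{
    return work < ASYNC_WORK_THRESHOLD ? ASYNC_INLINE_HINT
                                       : ASYNC_GREEN_THREAD;
}
```

With this assumed cutoff, a 4x4 multiply (64 multiply-adds) stays inline,
while a 128x128 multiply (2,097,152 multiply-adds) is handed to a green
thread.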

Though, some of this could be inferred based on how the feature is used
(if used on a "clean" expression or a "sufficiently pure" function (*2),
assume it is a hint unless the expression is doing something
particularly crazy; else assume the more heavyweight "use green-threads
or something" option). Well, or alternately provide additional keywords
to allow the programmer to further specify what sort of semantics they
expect from the async/promise (well, or infer it based on whether the
async and its corresponding join exist within the same control-path, or
assume green-threads if no immediately visible join exists, etc).

*2: Actually, what mostly matters for this is if there is some way the
called function can potentially block on IO or lock a mutex or similar,
which would lead to visibly different behaviors between "schedule this
as a sub-task" and "allow lazy reordering but otherwise still execute
the code sequentially".

....

Re: The age of the CPU as God is OVER.

<terjbk$2an4l$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=27587&group=comp.arch#27587
Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: The age of the CPU as God is OVER.
Date: Fri, 2 Sep 2022 00:39:30 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 133
Message-ID: <terjbk$2an4l$1@dont-email.me>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me>
<f649d624-adf3-4813-8539-d20e3af6e733n@googlegroups.com>
<teopsk$1udih$1@dont-email.me>
<tepecm$rc8$1@newsreader4.netcologne.de>
<tepnim$2460v$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 2 Sep 2022 00:39:30 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="97194dfafced47eb1aed3d107e999b80";
logging-data="2448533"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19B7iWA1varmP12IPmrsxGV"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:uCMI1KeMiur+u9LoFHM1IIe/SEw=
sha1:v1v5tDWScVVs8fyWqShupegOMW8=
 by: Brett - Fri, 2 Sep 2022 00:39 UTC

BGB <cr88192@gmail.com> wrote:
> On 9/1/2022 12:02 AM, Thomas Koenig wrote:
>> Brett <ggtgp@yahoo.com> schrieb:
>>> Quadibloc <jsavard@ecn.ab.ca> wrote:
>>>> On Wednesday, August 31, 2022 at 2:49:03 AM UTC-6, gg...@yahoo.com wrote:
>>>>
>>>>> Look at Apple silicon, the CPUs are a tiny block in a corner with half the
>>>>> die graphics and most of the rest dedicated compute engines for different
>>>>> tasks like real time video encoding at a tenth the power of the CPU and a
>>>>> tenth the die space and four times the performance.
>>>>
>>>> And just how do you use a video encoding engine for computational fluid
>>>> dynamics?
>>>
>>> Write your code in Metal and Apple will schedule that code on the fastest
>>> hardware capable of running that code, be it CPU or GPU, or AI or one of
>>> the other dedicated blocks.
>>>
>>> Metal is designed for bulk transform of data.
>>
>> I tried reading some of the introductions by Apple, but my bullshit
>> meter went off-scale high.
>
> Yeah, it seems like much outside of specific high-level APIs or general
> categories of workload, making something like this work in the general
> case would be difficult.
>
> Though, if one has several classes of hardware which can all run a given
> piece of code (CPU, GPU, DSP cores, etc.), it does seem like it would be
> possible to compile code for all of them and then see which type of core
> could run it fastest.
>
> There are likely to be limits though, for example, if one has dedicated
> video decoder hardware features, it is rather unlikely one can
> efficiently pattern match these from generic C or C++ source of a video
> codec or similar (as opposed to code written specifically to use these
> features).

The video decoder is a source, though half of decode is shifting data from
a previous frame for motion compensation and may be done on the GPU.
Then you may add an overlay like TV News in Metal, and output to video
encoder hardware, which again may be using the GPU (or DSP or AI) to detect
motion in the frames.

For extra bonus points use the source stream to hint the output stream on
what to try first for huge power savings. But that would be custom Apple
code. ;)

What is important is having a huge L3 so all these tasks can fit and all
sources and destinations have equal fast access to the data, be it CPU,
GPU, DSP, AI, and all the different encode and decode engines. All at the
same time, in sync.

For video you need to fit three 16 pixel tall stripes for the source and
another three stripes for the output, times the number of frames you can
borrow from.
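
As a rough back-of-the-envelope version of that sizing, here is a minimal
sketch. It reads the sentence above literally (source plus output stripes,
multiplied by the number of reference frames); the function name, the frame
width and the bytes-per-pixel figure are illustrative assumptions, not
numbers for any real codec or Apple part.

```c
#include <assert.h>
#include <stddef.h>

#define STRIPE_ROWS      16  /* stripe height in pixels, from the post */
#define STRIPES_PER_SIDE 3   /* three source stripes, three output stripes */

/* Bytes needed to hold the source and output stripes for every frame
   that can be referenced. All three inputs are illustrative. */
size_t stripe_working_set(size_t width_px, size_t bytes_per_px,
                          size_t ref_frames)
{
    size_t one_stripe = STRIPE_ROWS * width_px * bytes_per_px;
    size_t per_frame  = 2 * STRIPES_PER_SIDE * one_stripe; /* source + output */
    assert(ref_frames > 0);
    return per_frame * ref_frames;
}
```

For a 3840-pixel-wide frame at an assumed 2 bytes per pixel and 4
reference frames, stripe_working_set(3840, 2, 4) comes to 2,949,120 bytes,
or about 2.8 MiB.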

The old way was that the CPU did all of this, which was slow and hot.
The new way is an orchestra of hardware musicians efficiently producing
excellence.

Metal is the conductor of the orchestra and provides the play sheets.

> So, ultimately, there are limits, and one would need either specialized
> APIs or language extensions to make it work.
>
> Even in my own ISA efforts, I am running into this issue:
> * I don't have auto vectorization, or fancy pattern recognition;
> * So, some stuff depends on language extensions
> ** Which in turn only works with code which uses these extensions.
> ** Which creates issues with keeping the code portable and efficient.
> ** ...
>
> Some level of portability can be maintained by falling back to
> structures and function calls in the generic case, but then cases where
> things diverge are harder to fake efficiently (some amount of code
> optimized for BJX2 is worse off on x86 due to divergence).
>
> Similarly, Intel/MSVC and GCC using two different ways of approaching
> SIMD support doesn't really help either.
>
> One could argue that maybe it would be nice if the C standard could
> support (explicit) SIMD operations, but alas this implicitly assumes
> that the targets roughly agree as to SIMD hardware capabilities and/or
> there is a defined way to deal with this (say, the compiler has to fake
> it if the actual ISA can't do it).
>
> While one could argue against needing to fall back to runtime support to
> fake SIMD operations that don't exist in hardware, it is not as if
> something like: "for(i=0; i<4; i++) vc[i]=va[i]+vb[i];" is necessarily
> a high performance option either (and it leaves it up to the C
> compiler to try to figure out that a "float *" pointer is really meant
> to be understood as a 4 element vector).
>
> Granted, "How about we start gluing stuff inspired by GLSL onto C?" (and
> maybe start combining this with async/join semantics, *, etc) isn't
> really a popular solution either...
>
> *: where async can allow a sub-task to run asynchronously (either fully
> independent or with a "promise") and "join" allows blocking until an
> async sub-task is complete (and potentially fetching its final result),
> and implicitly using these as a compiler optimization hint (we don't
> care about the relative order of these sub-tasks, only that maybe we
> care that they finish before the program "joins" on the promise handle).
>
> But then one can debate between finer grained / lighter weight
> solutions (compiler optimization hint, but otherwise mostly implemented
> in terms of conventional linear execution semantics), and a more
> powerful but heavyweight solution (the async/promise effectively
> schedules a sub-task as a green-thread or similar). Or, if both are
> possible, how exactly the compiler should know when to choose which is
> better (eg: you wouldn't want to green-thread a 4x4 matrix multiply, the
> performance overhead on this would *suck*; but a 128x128 matrix multiply
> is a different beast entirely).
>
> Though, some of this could be inferred based on how the feature is used
> (if used on a "clean" expression or a "sufficiently pure" function (*2),
> assume it is a hint unless the expression is doing something
> particularly crazy; else assume the more heavyweight "use green-threads
> or something" option). Well, or alternately provide additional keywords
> to allow the programmer to further specify what sort of semantics they
> expect from the async/promise (well, or infer it based on whether the
> async and its corresponding join exist within the same control-path, or
> assume green-threads if no immediately visible join exists, etc).
>
> *2: Actually, what mostly matters for this is if there is some way the
> called function can potentially block on IO or lock a mutex or similar,
> which would lead to visibly different behaviors between "schedule this
> as a sub-task" and "allow lazy reordering but otherwise still execute
> the code sequentially".

Re: The age of the CPU as God is OVER.

<9bd8b96d-1f5e-4511-bd76-25e5f2329c99n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27588&group=comp.arch#27588
X-Received: by 2002:a05:6214:20a8:b0:477:1882:3e7 with SMTP id 8-20020a05621420a800b00477188203e7mr27532324qvd.44.1662080972783;
Thu, 01 Sep 2022 18:09:32 -0700 (PDT)
X-Received: by 2002:a05:620a:294e:b0:6a7:750b:abf8 with SMTP id
n14-20020a05620a294e00b006a7750babf8mr21305690qkp.513.1662080972674; Thu, 01
Sep 2022 18:09:32 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 1 Sep 2022 18:09:32 -0700 (PDT)
In-Reply-To: <teopsk$1udih$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:b0a7:469b:dcf9:baf5;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:b0a7:469b:dcf9:baf5
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me> <f649d624-adf3-4813-8539-d20e3af6e733n@googlegroups.com>
<teopsk$1udih$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <9bd8b96d-1f5e-4511-bd76-25e5f2329c99n@googlegroups.com>
Subject: Re: The age of the CPU as God is OVER.
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 02 Sep 2022 01:09:32 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1692
 by: Quadibloc - Fri, 2 Sep 2022 01:09 UTC

On Wednesday, August 31, 2022 at 5:12:24 PM UTC-6, gg...@yahoo.com wrote:

> Write your code in Metal

I think I'd rather write it in Fortran if at all possible. That has been what
the serious people doing this kind of stuff have used, and therefore
Fortran compilers continue to be designed to produce highly efficient
object code, continuing the tradition set by the very first FORTRAN
compiler.

John Savard

Re: The age of the CPU as God is OVER.

<e46e0d3d-89b2-45ad-bcb8-30b4a8df2f12n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=27589&group=comp.arch#27589
X-Received: by 2002:a05:6214:2b0c:b0:499:c34:5f99 with SMTP id jx12-20020a0562142b0c00b004990c345f99mr16746795qvb.40.1662086710545;
Thu, 01 Sep 2022 19:45:10 -0700 (PDT)
X-Received: by 2002:a05:622a:214:b0:342:f97c:1706 with SMTP id
b20-20020a05622a021400b00342f97c1706mr26147153qtx.291.1662086710306; Thu, 01
Sep 2022 19:45:10 -0700 (PDT)
Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!feed1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 1 Sep 2022 19:45:09 -0700 (PDT)
In-Reply-To: <9bd8b96d-1f5e-4511-bd76-25e5f2329c99n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=136.50.14.162; posting-account=AoizIQoAAADa7kQDpB0DAj2jwddxXUgl
NNTP-Posting-Host: 136.50.14.162
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me> <f649d624-adf3-4813-8539-d20e3af6e733n@googlegroups.com>
<teopsk$1udih$1@dont-email.me> <9bd8b96d-1f5e-4511-bd76-25e5f2329c99n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e46e0d3d-89b2-45ad-bcb8-30b4a8df2f12n@googlegroups.com>
Subject: Re: The age of the CPU as God is OVER.
From: jim.brak...@ieee.org (JimBrakefield)
Injection-Date: Fri, 02 Sep 2022 02:45:10 +0000
Content-Type: text/plain; charset="UTF-8"
X-Received-Bytes: 1973
 by: JimBrakefield - Fri, 2 Sep 2022 02:45 UTC

On Thursday, September 1, 2022 at 8:09:33 PM UTC-5, Quadibloc wrote:
> On Wednesday, August 31, 2022 at 5:12:24 PM UTC-6, gg...@yahoo.com wrote:
>
> > Write your code in Metal
> I think I'd rather write it in Fortran if at all possible. That has been what
> the serious people doing this kind of stuff have used, and therefore
> Fortran compilers continue to be designed to produce highly efficient
> object code, continuing the tradition set by the very first FORTRAN
> compiler.
>
> John Savard

Fortran is burned into my head. Julia is its worthy successor (IMHO).
I relied on Fortran's subscript optimization to keep my code readable.

Re: The age of the CPU as God is OVER.

<teuhq1$9ng$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=27594&group=comp.arch#27594

Path: i2pn2.org!i2pn.org!aioe.org!vxOTfS5Jn/b499FNW7Y8DA.user.46.165.242.75.POSTED!not-for-mail
From: nos...@nowhere.com (Andy)
Newsgroups: comp.arch
Subject: Re: The age of the CPU as God is OVER.
Date: Sat, 3 Sep 2022 15:31:11 +1200
Organization: Aioe.org NNTP Server
Message-ID: <teuhq1$9ng$1@gioia.aioe.org>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="9968"; posting-host="vxOTfS5Jn/b499FNW7Y8DA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.11.0
X-Notice: Filtered by postfilter v. 0.9.2
Content-Language: en-US
 by: Andy - Sat, 3 Sep 2022 03:31 UTC

On 31/08/22 20:49, Brett wrote:

>
> The new Ryzen chips clock at 5.7 GHz, what is not to like.
>

If the reports of Ryzen CPUs containing Microsoft's PLUTON 'security'
feature are true and it is indeed impossible to install any operating
system other than Windows 11 on the things, then I for one
would say there is quite a lot not to like about them.

Is AMD expecting Linux users to run out and buy a Zen4-based server
board to use as a home PC? Is that even possible, let alone affordable?

Re: The age of the CPU as God is OVER.

<tf0lhl$30i3n$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=27600&group=comp.arch#27600

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: ggt...@yahoo.com (Brett)
Newsgroups: comp.arch
Subject: Re: The age of the CPU as God is OVER.
Date: Sat, 3 Sep 2022 22:47:18 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 28
Message-ID: <tf0lhl$30i3n$1@dont-email.me>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me>
<teuhq1$9ng$1@gioia.aioe.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 3 Sep 2022 22:47:18 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="a7e745f0a5b1852f93246ff953e9cced";
logging-data="3164279"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18QpCCvisKrOhFi0vqIo+tp"
User-Agent: NewsTap/5.5 (iPad)
Cancel-Lock: sha1:QU03YNFw83016uwtBIDnDR/mMx0=
sha1:tGzDXY9500rSWkGtr6dreKpOH1s=
 by: Brett - Sat, 3 Sep 2022 22:47 UTC

Andy <nospam@nowhere.com> wrote:
> On 31/08/22 20:49, Brett wrote:
>
>>
>> The new Ryzen chips clock at 5.7 GHz, what is not to like.
>>
>
> If the reports of Ryzen CPUs containing Microsoft's PLUTON 'security'
> feature are true and it is indeed impossible to install any operating
> system other than Windows 11 on the things, then I for one
> would say there is quite a lot not to like about them.
>
> Is AMD expecting Linux users to run out and buy a Zen4-based server
> board to use as a home PC? Is that even possible, let alone affordable?

Fake news.
You can turn off Pluton and install Linux, assuming you have your password.

https://www.kitguru.net/components/cpu/joao-silva/amd-ryzen-pro-chips-with-microsoft-pluton-wont-boot-linux/

The reason for Pluton is to stop resale of stolen laptops, like iPhones
today.

Even worse is a thief using a stolen laptop to empty your bank account and
max out the credit cards you used on the laptop, or stealing your identity to
get credit cards in your name, etc. Pluton encrypts your data, making this
impossible for anyone but a government. Same as the iPhone.

Re: The age of the CPU as God is OVER.

<tf1bqc$rrj$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=27602&group=comp.arch#27602

Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!aioe.org!vxOTfS5Jn/b499FNW7Y8DA.user.46.165.242.75.POSTED!not-for-mail
From: nos...@nowhere.com (Andy)
Newsgroups: comp.arch
Subject: Re: The age of the CPU as God is OVER.
Date: Sun, 4 Sep 2022 17:07:22 +1200
Organization: Aioe.org NNTP Server
Message-ID: <tf1bqc$rrj$1@gioia.aioe.org>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me> <teuhq1$9ng$1@gioia.aioe.org>
<tf0lhl$30i3n$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="28531"; posting-host="vxOTfS5Jn/b499FNW7Y8DA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.11.0
Content-Language: en-US
X-Notice: Filtered by postfilter v. 0.9.2
 by: Andy - Sun, 4 Sep 2022 05:07 UTC

On 4/09/22 10:47, Brett wrote:
> Andy <nospam@nowhere.com> wrote:
>> On 31/08/22 20:49, Brett wrote:
>>
>>>
>>> The new Ryzen chips clock at 5.7 GHz, what is not to like.
>>>
>>
>> If the reports of Ryzen CPUs containing Microsoft's PLUTON 'security'
>> feature are true and it is indeed impossible to install any operating
>> system other than Windows 11 on the things, then I for one
>> would say there is quite a lot not to like about them.
>>
>> Is AMD expecting Linux users to run out and buy a Zen4-based server
>> board to use as a home PC? Is that even possible, let alone affordable?
>
> Fake news.

You have an odd definition of 'fake news' when the article you point to
essentially backs up what I heard, the only saving grace being the last
line which points out you can turn Pluton off in the UEFI in order to
install other operating systems, so yay! I guess.

Why is this Microsoft toxic sludge on any CPU again?

> You can turn off Pluton and install Linux, assuming you have your password.

Wait, you need a password to do so?
So forget/misplace your Pluton password and your otherwise fully
functional computer becomes little more than useless e-waste?

> https://www.kitguru.net/components/cpu/joao-silva/amd-ryzen-pro-chips-with-microsoft-pluton-wont-boot-linux/
>
> The reason for Pluton is to stop resale of stolen laptops, like iPhones
> today.

Which, if you're not running Windows 11, you had to turn off in order to
install your preferred operating system.

> Even worse is a thief using a stolen laptop to empty your bank account and
> max out the credit cards you used on the laptop, or stealing your identity to
> get credit cards in your name, etc. Pluton encrypts your data, making this
> impossible for anyone but a government. Same as the iPhone.

Given that this can already be accomplished by criminals over the internet,
I have little faith MS is going to fix the problem. I just hope the Pluton
'cure' doesn't end up being worse than the disease!

Re: The age of the CPU as God is OVER.

<tf1pel$8t4$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=27603&group=comp.arch#27603

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!newsreader4.netcologne.de!news.netcologne.de!.POSTED.2a0a-a540-1471-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de!not-for-mail
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Subject: Re: The age of the CPU as God is OVER.
Date: Sun, 4 Sep 2022 09:00:05 -0000 (UTC)
Organization: news.netcologne.de
Distribution: world
Message-ID: <tf1pel$8t4$1@newsreader4.netcologne.de>
References: <913f27f9-0d29-4d31-af6a-df51e05a13f2n@googlegroups.com>
<ten79s$1ourf$1@dont-email.me> <teuhq1$9ng$1@gioia.aioe.org>
<tf0lhl$30i3n$1@dont-email.me> <tf1bqc$rrj$1@gioia.aioe.org>
Injection-Date: Sun, 4 Sep 2022 09:00:05 -0000 (UTC)
Injection-Info: newsreader4.netcologne.de; posting-host="2a0a-a540-1471-0-7285-c2ff-fe6c-992d.ipv6dyn.netcologne.de:2a0a:a540:1471:0:7285:c2ff:fe6c:992d";
logging-data="9124"; mail-complaints-to="abuse@netcologne.de"
User-Agent: slrn/1.0.3 (Linux)
 by: Thomas Koenig - Sun, 4 Sep 2022 09:00 UTC

Andy <nospam@nowhere.com> wrote:
> On 4/09/22 10:47, Brett wrote:
>> Andy <nospam@nowhere.com> wrote:
>>> On 31/08/22 20:49, Brett wrote:
>>>
>>>>
>>>> The new Ryzen chips clock at 5.7 GHz, what is not to like.
>>>>
>>>
>>> If the reports of Ryzen CPUs containing Microsoft's PLUTON 'security'
>>> feature are true and it is indeed impossible to install any operating
>>> system other than Windows 11 on the things, then I for one
>>> would say there is quite a lot not to like about them.
>>>
>>> Is AMD expecting Linux users to run out and buy a Zen4-based server
>>> board to use as a home PC? Is that even possible, let alone affordable?
>>
>> Fake news.
>
> You have an odd definition of 'fake news' when the article you point to
> essentially backs up what I heard, the only saving grace being the last
> line which points out you can turn Pluton off in the UEFI in order to
> install other operating systems, so yay! I guess.
>
> Why is this Microsoft toxic sludge on any CPU again?

Microsoft appears to have blocked some Linux boot loaders
with a "security" update, among them Ubuntu 20.04 LTS, according to
https://www.heise.de/hintergrund/Bootloader-Signaturen-per-Update-zurueckgezogen-Microsoft-bootet-Linux-aus-7250544.html
(German, unfortunately behind a paywall).

Hmm...
