Rocksolid Light



devel / comp.arch / Re: Jason Cong's future of high performance computing

Subject  Author
* Jason Cong's future of high performance computing  JimBrakefield
+* Re: Jason Cong's future of high performance computing  MitchAlsup
|`* Re: Jason Cong's future of high performance computing  Scott Lurndal
| +- Re: Jason Cong's future of high performance computing  MitchAlsup
| `* Re: Jason Cong's future of high performance computing  EricP
|  `- Re: Jason Cong's future of high performance computing  Thomas Koenig
+* Re: Jason Cong's future of high performance computing  Quadibloc
|`* Re: Jason Cong's future of high performance computing  MitchAlsup
| +- Re: Jason Cong's future of high performance computing  robf...@gmail.com
| +* Re: Jason Cong's future of high performance computing  Michael S
| |`* Re: Jason Cong's future of high performance computing  BGB
| | `* Re: Jason Cong's future of high performance computing  MitchAlsup
| |  `- Re: Jason Cong's future of high performance computing  BGB
| `* Re: Jason Cong's future of high performance computing  Quadibloc
|  +* Re: Jason Cong's future of high performance computing  MitchAlsup
|  |`* Re: Jason Cong's future of high performance computing  Quadibloc
|  | `* Re: Jason Cong's future of high performance computing  MitchAlsup
|  |  `* Re: Jason Cong's future of high performance computing  Scott Lurndal
|  |   +- Re: Jason Cong's future of high performance computing  Scott Lurndal
|  |   `- Re: Jason Cong's future of high performance computing  MitchAlsup
|  +- Re: Jason Cong's future of high performance computing  JimBrakefield
|  `- Re: Jason Cong's future of high performance computing  BGB
+* Re: Jason Cong's future of high performance computing  Terje Mathisen
|`* Re: Jason Cong's future of high performance computing  Michael S
| +- Re: Jason Cong's future of high performance computing  JimBrakefield
| `- Re: Jason Cong's future of high performance computing  BGB
`* Re: Jason Cong's future of high performance computing  Scott Lurndal
 +* Re: Jason Cong's future of high performance computing  Anton Ertl
 |`- Re: Jason Cong's future of high performance computing  MitchAlsup
 `* Re: Jason Cong's future of high performance computing  EricP
  `* Re: Jason Cong's future of high performance computing  Michael S
   `* Re: Jason Cong's future of high performance computing  Anton Ertl
    +* Re: Jason Cong's future of high performance computing  Michael S
    |`* Re: Jason Cong's future of high performance computing  Scott Lurndal
    | `* Re: Jason Cong's future of high performance computing  Scott Lurndal
    |  `* Re: Jason Cong's future of high performance computing  MitchAlsup
    |   `* Re: Jason Cong's future of high performance computing  Scott Lurndal
    |    `* Re: Jason Cong's future of high performance computing  MitchAlsup
    |     `* Re: Jason Cong's future of high performance computing  JimBrakefield
    |      `* Re: Jason Cong's future of high performance computing  Scott Lurndal
    |       `* Re: Jason Cong's future of high performance computing  Scott Lurndal
    |        `- Re: Jason Cong's future of high performance computing  Michael S
    +* Re: Jason Cong's future of high performance computing  MitchAlsup
    |+- Re: Jason Cong's future of high performance computing  BGB
    |`* Re: Jason Cong's future of high performance computing  Anton Ertl
    | `* Re: Jason Cong's future of high performance computing  MitchAlsup
    |  `* Re: Jason Cong's future of high performance computing  Scott Lurndal
    |   `* Re: Jason Cong's future of high performance computing  MitchAlsup
    |    `- Re: Jason Cong's future of high performance computing  Scott Lurndal
    `* Re: Jason Cong's future of high performance computing  Michael S
     +* Re: Jason Cong's future of high performance computing  David Brown
     |+* Re: Jason Cong's future of high performance computing  Niklas Holsti
     ||+- Re: Jason Cong's future of high performance computing  MitchAlsup
     ||`* Re: Jason Cong's future of high performance computing  David Brown
     || +- Re: Jason Cong's future of high performance computing  Niklas Holsti
     || `* Re: Jason Cong's future of high performance computing  MitchAlsup
     ||  `* Re: Jason Cong's future of high performance computing  Michael S
     ||   +- Re: Jason Cong's future of high performance computing  MitchAlsup
     ||   +- Re: Jason Cong's future of high performance computing  JimBrakefield
     ||   `- Re: Jason Cong's future of high performance computing  Anton Ertl
     |+- Re: Jason Cong's future of high performance computing  Michael S
     |`- Re: Jason Cong's future of high performance computing  BGB
     `- Re: Jason Cong's future of high performance computing  Thomas Koenig

Re: Jason Cong's future of high performance computing

<8a8e6b50-215a-4a1c-b675-184992060940n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30608&group=comp.arch#30608
Newsgroups: comp.arch
Date: Sun, 29 Jan 2023 13:11:04 -0800 (PST)
In-Reply-To: <b38f6677-5e13-4358-8a8f-9f0e70d84aedn@googlegroups.com>
From: jsav...@ecn.ab.ca (Quadibloc)
 by: Quadibloc - Sun, 29 Jan 2023 21:11 UTC

On Sunday, January 29, 2023 at 10:31:45 AM UTC-7, MitchAlsup wrote:
> On Sunday, January 29, 2023 at 11:07:44 AM UTC-6, Quadibloc wrote:

> > So I was advocating FPGAs with real FPUs as a component, not synthesizing
> > an FPU on an FPGA, which is much slower.

> The IP to which might cost even more than the FPGA it goes in.

Wouldn't that be an argument that the cost of CPUs and GPUs
would (also) be prohibitive?

I am being serious here. An FPGA that included a large number of
full-bore 64-bit floating point ALUs could indeed be designed to
accelerate the inner loops of a lot of programs, particularly in
scientific computing, which is the field that makes the most use
of HPC.

That might still be a special-purpose device, but no more so - and
from some viewpoints, considerably less so - than the typical FPGA,
which seems only to be applicable to things which are otherwise
difficult to do on a CPU.

I suppose a joke to the effect that a special-purpose computing
device is one that's good for somebody else's purpose might fit in
here.

John Savard

Re: Jason Cong's future of high performance computing

<9b0af5e7-933a-4c75-8654-9edda742626fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30609&group=comp.arch#30609
Newsgroups: comp.arch
Date: Sun, 29 Jan 2023 14:07:09 -0800 (PST)
In-Reply-To: <8a8e6b50-215a-4a1c-b675-184992060940n@googlegroups.com>
From: MitchAl...@aol.com (MitchAlsup)
 by: MitchAlsup - Sun, 29 Jan 2023 22:07 UTC

On Sunday, January 29, 2023 at 3:11:06 PM UTC-6, Quadibloc wrote:
> On Sunday, January 29, 2023 at 10:31:45 AM UTC-7, MitchAlsup wrote:
> > On Sunday, January 29, 2023 at 11:07:44 AM UTC-6, Quadibloc wrote:
>
> > > So I was advocating FPGAs with real FPUs as a component, not synthesizing
> > > an FPU on an FPGA, which is much slower.
>
> > The IP to which might cost even more than the FPGA it goes in.
<
> Wouldn't that be an argument that the cost of CPUs and GPUs
> would (also) be prohibitive?
<
This falls into the category where there might be excellent engineering
reasons that something should be done with an FPGA added to a
system, but the practicality of getting there is impracticable (licensing,
legal, intellectual property $$$s,...)

Re: Jason Cong's future of high performance computing

<tr6ra9$3dlpe$2@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=30610&group=comp.arch#30610
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.arch
Date: Sun, 29 Jan 2023 22:17:45 -0000 (UTC)
Organization: news.netcologne.de
 by: Thomas Koenig - Sun, 29 Jan 2023 22:17 UTC

EricP <ThatWouldBeTelling@thevillage.com> schrieb:
> Scott Lurndal wrote:
>> MitchAlsup <MitchAlsup@aol.com> writes:
>>> On Saturday, January 28, 2023 at 6:11:30 PM UTC-6, JimBrakefield wrote:
>>>> https://www.youtube.com/watch?v=-XuMWvGUocI&t=123s
>>>> Starts at the 2 minute mark. He argues that computer performance
>>>> has plateaued and that FPGAs offer a route to higher performance.
>>> <
>>
>>> <
>>> I should note: you cannot debug HW by single stepping !! as there is no
>>> definition of what single stepping means at the gate level. No, HW designers
>>> use simulators where they can stop at ½ clock intervals and then examine
>>> millions of signals--some of them X (unknown value) and Z (high impedance).
>>
>> Actually, that's how we debugged the Burroughs mainframes, by stepping a
>> single cycle at a time (using an external "maintenance processor" to
>> drive the processor logic using scan chains). This was late 70's.
>
> Luxury! I used a dual trace oscilloscope and an In-Circuit Emulator.

https://dilbert.com/strip/1992-09-08 comes to mind.
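MitchAlsup's remark above about signals showing X and Z refers to the 4-state logic that Verilog-style simulators use. As a rough illustration only (the encoding and helper function here are invented for this sketch, though the truth table follows the usual 0/1/X/Z conventions), a minimal model of 4-state AND resolution:

```python
# Minimal sketch of 4-state logic: '0', '1', 'x' (unknown), 'z' (high impedance).
# Illustrative only; real simulators implement the full IEEE 1364 value tables.

def and4(a: str, b: str) -> str:
    """4-state AND. A floating (z) input is treated as unknown."""
    a = 'x' if a == 'z' else a
    b = 'x' if b == 'z' else b
    if a == '0' or b == '0':
        return '0'   # a known 0 forces the output low, even against an unknown
    if a == 'x' or b == 'x':
        return 'x'   # otherwise any unknown input propagates
    return '1'

# A signal stuck at x or z is exactly what a designer hunts for when
# stopping the simulation at half-clock intervals and inspecting nets.
print(and4('0', 'x'))  # -> 0
print(and4('1', 'z'))  # -> x
```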

Re: Jason Cong's future of high performance computing

<rXCBL.523654$vBI8.190519@fx15.iad>

https://www.novabbs.com/devel/article-flat.php?id=30611&group=comp.arch#30611
From: sco...@slp53.sl.home (Scott Lurndal)
Newsgroups: comp.arch
Organization: UsenetServer - www.usenetserver.com
Date: Sun, 29 Jan 2023 23:03:51 GMT
 by: Scott Lurndal - Sun, 29 Jan 2023 23:03 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>On Sunday, January 29, 2023 at 3:11:06 PM UTC-6, Quadibloc wrote:
>> On Sunday, January 29, 2023 at 10:31:45 AM UTC-7, MitchAlsup wrote:
>> > On Sunday, January 29, 2023 at 11:07:44 AM UTC-6, Quadibloc wrote:
>>
>> > > So I was advocating FPGAs with real FPUs as a component, not synthesizing
>> > > an FPU on an FPGA, which is much slower.
>>
>> > The IP to which might cost even more than the FPGA it goes in.
><
>> Wouldn't that be an argument that the cost of CPUs and GPUs
>> would (also) be prohibitive?
><
>This falls into the category where there might be excellent engineering
>reasons that something should be done with an FPGA added to a
>system, but the practicality of getting there is impracticable (licensing,
>legal, intellectual property $$$s,...)

Have you priced out a 3mm mask recently?

Re: Jason Cong's future of high performance computing

<cYCBL.523655$vBI8.316638@fx15.iad>

https://www.novabbs.com/devel/article-flat.php?id=30612&group=comp.arch#30612
From: sco...@slp53.sl.home (Scott Lurndal)
Newsgroups: comp.arch
Organization: UsenetServer - www.usenetserver.com
Date: Sun, 29 Jan 2023 23:04:40 GMT
 by: Scott Lurndal - Sun, 29 Jan 2023 23:04 UTC

scott@slp53.sl.home (Scott Lurndal) writes:
>MitchAlsup <MitchAlsup@aol.com> writes:
>>On Sunday, January 29, 2023 at 3:11:06 PM UTC-6, Quadibloc wrote:
>>> On Sunday, January 29, 2023 at 10:31:45 AM UTC-7, MitchAlsup wrote:
>>> > On Sunday, January 29, 2023 at 11:07:44 AM UTC-6, Quadibloc wrote:
>>>
>>> > > So I was advocating FPGAs with real FPUs as a component, not synthesizing
>>> > > an FPU on an FPGA, which is much slower.
>>>
>>> > The IP to which might cost even more than the FPGA it goes in.
>><
>>> Wouldn't that be an argument that the cost of CPUs and GPUs
>>> would (also) be prohibitive?
>><
>>This falls into the category where there might be excellent engineering
>>reasons that something should be done with an FPGA added to a
>>system, but the practicality of getting there is impracticable (licensing,
>>legal, intellectual property $$$s,...)
>
>Have you priced out a 3mm mask recently?
^^ ---> nm.

Re: Jason Cong's future of high performance computing

<444b061a-fce2-4da1-b69c-7b35fd9a71e8n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30613&group=comp.arch#30613
Newsgroups: comp.arch
Date: Sun, 29 Jan 2023 16:04:55 -0800 (PST)
In-Reply-To: <rXCBL.523654$vBI8.190519@fx15.iad>
From: MitchAl...@aol.com (MitchAlsup)
 by: MitchAlsup - Mon, 30 Jan 2023 00:04 UTC

On Sunday, January 29, 2023 at 5:03:55 PM UTC-6, Scott Lurndal wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >On Sunday, January 29, 2023 at 3:11:06 PM UTC-6, Quadibloc wrote:
> >> On Sunday, January 29, 2023 at 10:31:45 AM UTC-7, MitchAlsup wrote:
> >> > On Sunday, January 29, 2023 at 11:07:44 AM UTC-6, Quadibloc wrote:
> >>
> >> > > So I was advocating FPGAs with real FPUs as a component, not synthesizing
> >> > > an FPU on an FPGA, which is much slower.
> >>
> >> > The IP to which might cost even more than the FPGA it goes in.
> ><
> >> Wouldn't that be an argument that the cost of CPUs and GPUs
> >> would (also) be prohibitive?
> ><
> >This falls into the category where there might be excellent engineering
> >reasons that something should be done with an FPGA added to a
> >system, but the practicality of getting there is impracticable (licensing,
> >legal, intellectual property $$$s,...)
<
> Have you priced out a 3mm mask recently?
<
22nm and 14nm are not that expensive right now.

Re: Jason Cong's future of high performance computing

<2023Jan30.092557@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=30615&group=comp.arch#30615
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Date: Mon, 30 Jan 2023 08:25:57 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
 by: Anton Ertl - Mon, 30 Jan 2023 08:25 UTC

Michael S <already5chosen@yahoo.com> writes:
>Pay attention that programming FPGAs in Verilog is almost exclusively
>USA trait. The rest of the world does it in VHDL.

Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
<https://github.com/forthy42/b16-small>. It has been used in custom
silicon, not (to my knowledge) in FPGA, but does that make a
difference?

From the HOPL talk about Verilog, my impression is: Around 2000 all
the buzz was for VHDL, and that Verilog was doomed. Verilog survived
and won in large projects, because it was designed for efficient
implementation of simulators, while the design of VHDL necessarily
leads to less efficiency. For large projects this efficiency is very
important, while for smaller projects the VHDL simulators are fast
enough.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Jason Cong's future of high performance computing

<857fc84f-241d-4136-bed4-9b759d4299e7n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30618&group=comp.arch#30618
Newsgroups: comp.arch
Date: Mon, 30 Jan 2023 07:23:06 -0800 (PST)
In-Reply-To: <2023Jan30.092557@mips.complang.tuwien.ac.at>
From: already5...@yahoo.com (Michael S)
 by: Michael S - Mon, 30 Jan 2023 15:23 UTC

On Monday, January 30, 2023 at 10:39:35 AM UTC+2, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >Pay attention that programming FPGAs in Verilog is almost exclusively
> >USA trait. The rest of the world does it in VHDL.
> Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
> <https://github.com/forthy42/b16-small>. It has been used in custom
> silicon, not (to my knowledge) in FPGA, but does that make a
> difference?
>

It absolutely does.
FPGA development and ASIC development are different cultures.
Naturally, use of FPGAs for ASIC prototyping is part of ASIC culture.

I could imagine that "FPGAs as compute accelerators" is yet another
culture if there are enough people involved to form the culture.
Likely with different set of preferred tools. I know nothing about
it except that I know that it does not really work. But even that
knowledge is not 1st hand.

> From the HOPL talk about Verilog, my impression is: Around 2000 all
> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
> and won in large projects, because it was designed for efficient
> implementation of simulators, while the design of VHDL necessarily
> leads to less efficiency. For large projects this efficiency is very
> important, while for smaller projects the VHDL simulators are fast
> enough.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Jason Cong's future of high performance computing

<UWSBL.445233$8_id.34107@fx09.iad>

https://www.novabbs.com/devel/article-flat.php?id=30619&group=comp.arch#30619
From: sco...@slp53.sl.home (Scott Lurndal)
Newsgroups: comp.arch
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 30 Jan 2023 17:15:32 GMT
 by: Scott Lurndal - Mon, 30 Jan 2023 17:15 UTC

Michael S <already5chosen@yahoo.com> writes:
>On Monday, January 30, 2023 at 10:39:35 AM UTC+2, Anton Ertl wrote:
>> Michael S <already...@yahoo.com> writes:
>> >Pay attention that programming FPGAs in Verilog is almost exclusively
>> >USA trait. The rest of the world does it in VHDL.
>> Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
>> <https://github.com/forthy42/b16-small>. It has been used in custom
>> silicon, not (to my knowledge) in FPGA, but does that make a
>> difference?
>>

>
>I could imagine that "FPGAs as compute accelerators" is yet another
>culture if there are enough people involved to form the culture.
>Likely with different set of preferred tools. I know nothing about
>it except that I know that it does not really work. But even that
>knowledge is not 1st hand.

You "know that it does not really work". But not from first-hand
experience. So, what data (other than off-hand anecdotal data) do
you have to support your position?

https://www.researchgate.net/publication/354063174_FPGA-based_HPC_accelerators_An_evaluation_on_performance_and_energy_efficiency
"Results show that while FPGAs struggle to compete in absolute
terms with GPUs on memory- and compute- intensive kernels,
they require far less power and can deliver nearly the same
energy efficiency."

https://ieeexplore.ieee.org/document/9556357

"FPGAs are already known to provide interesting speedups in
several application fields, but to estimate their expected
performance in the context of typical HPC workloads is not
straightforward."

https://evision-systems.com/high-performance-computing/

FPGAs have been prominent at SC for the last twenty years;
see the program for SC22, e.g.

Advances in FPGA Programming and Technology for HPC

"FPGAs have gone from niche components to being a central
part of many data centers worldwide to being considered for
core HPC installations. The last year has seen tremendous advances in
FPGA programmability and technology, and FPGAs for general HPC is
apparently within reach."

Task Scheduling on FPGA-Based Accelerators without Partial Reconfiguration

etc.

Re: Jason Cong's future of high performance computing

<5_VBL.494480$iS99.384203@fx16.iad>

https://www.novabbs.com/devel/article-flat.php?id=30623&group=comp.arch#30623
From: sco...@slp53.sl.home (Scott Lurndal)
Newsgroups: comp.arch
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 30 Jan 2023 20:43:45 GMT
 by: Scott Lurndal - Mon, 30 Jan 2023 20:43 UTC

jgd@cix.co.uk (John Dallman) writes:
>In article <UWSBL.445233$8_id.34107@fx09.iad>, scott@slp53.sl.home (Scott
>Lurndal) wrote:
>
>> You "know that it does not really work". But not from first-hand
>> experience.
>
>From the last time I looked at add-on accelerators, how fast does data
>get in and out of them? What's the minimum size block of doubles when you
>can save time, overall, on getting the data into the FPGA, processing it,
>and getting it back out again?

Most use PCI-Express, so the bandwidth depends on which generation
and the number of lanes. Many FPGAs have hard PCIe Gen4 or Gen5
controllers using 25Gbps SERDES.
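John Dallman's break-even question can be put into rough numbers. The sketch below is a hedged back-of-envelope model, not a measurement: the PCIe rate, host and FPGA compute rates, and fixed offload overhead in the example call are all assumed figures chosen only for illustration.

```python
import math

def min_profitable_doubles(pcie_GBs, flops_per_double,
                           host_gflops, fpga_gflops, overhead_s):
    """Smallest block of doubles for which offloading beats the host.

    Offload wins when fixed overhead + transfer (in and out) + FPGA
    compute time is less than host compute time. Returns None when the
    per-element offload cost never drops below the host's.
    """
    host_per = flops_per_double / (host_gflops * 1e9)   # s/element on host
    xfer_per = 2 * 8 / (pcie_GBs * 1e9)                 # s/element, both directions
    fpga_per = xfer_per + flops_per_double / (fpga_gflops * 1e9)
    if fpga_per >= host_per:
        return None                                     # offload never pays off
    return math.ceil(overhead_s / (host_per - fpga_per))

# Assumed figures: 12 GB/s effective PCIe, 100 FLOPs per double,
# 50 GFLOP/s host, 500 GFLOP/s FPGA, 10 us fixed offload overhead.
print(min_profitable_doubles(12, 100, 50, 500, 1e-5))   # -> 21429
```

With these assumptions the break-even block is on the order of tens of thousands of doubles; a memory-bound kernel (few FLOPs per double) makes the transfer term dominate and offload may never win.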

Re: Jason Cong's future of high performance computing

<fbe81176-03c1-4c80-ae31-d06b3ef74daan@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=30624&group=comp.arch#30624
Newsgroups: comp.arch
Date: Mon, 30 Jan 2023 12:58:31 -0800 (PST)
In-Reply-To: <2023Jan30.092557@mips.complang.tuwien.ac.at>
From: MitchAl...@aol.com (MitchAlsup)
 by: MitchAlsup - Mon, 30 Jan 2023 20:58 UTC

On Monday, January 30, 2023 at 2:39:35 AM UTC-6, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >Pay attention that programming FPGAs in Verilog is almost exclusively
> >USA trait. The rest of the world does it in VHDL.
> Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
> <https://github.com/forthy42/b16-small>. It has been used in custom
> silicon, not (to my knowledge) in FPGA, but does that make a
> difference?
>
> From the HOPL talk about Verilog, my impression is: Around 2000 all
> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
> and won in large projects, because it was designed for efficient
> implementation of simulators, while the design of VHDL necessarily
> leads to less efficiency. For large projects this efficiency is very
> important, while for smaller projects the VHDL simulators are fast
> enough.
<
In other words--with today's fast CPUs--one can dispense with the
pipeline timing simulators written in C and proceed directly to System
Verilog, which can serve as the pipeline timing simulator, the FPGA
output, the ASIC, and the standard cell library implementations,
saving design team effort.
<
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Jason Cong's future of high performance computing

<ba3858a3-2dce-4b8d-b15d-c035d634c368n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=30625&group=comp.arch#30625

 by: MitchAlsup - Mon, 30 Jan 2023 21:03 UTC

On Monday, January 30, 2023 at 2:43:49 PM UTC-6, Scott Lurndal wrote:
> j...@cix.co.uk (John Dallman) writes:
> >In article <UWSBL.445233$8_id....@fx09.iad>, sc...@slp53.sl.home (Scott
> >Lurndal) wrote:
> >
> >> You "know that it does not really work". But not from first-hand
> >> experience.
> >
> >From the last time I looked at add-on accelerators, how fast does data
> >get in and out of them? What's the minimum size block of doubles when you
> >can save time, overall, on getting the data into the FPGA, processing it,
> >and getting it back out again?
<
> Most use PCI-Express. So the bandwidth depends on which generation
> and the number of lanes. Many FPGA have hard PCI-E gen 4 or Gen5
> using 25Gbps SERDES.
<
Yes, but (the BIG but) the cores on the "chip(s)" access DRAM with ½
cache line width busses at full core speeds, while PCIe has a) lots
of added latency, b) way less than 256-bits per cycle, and c) slower
cycles than the cores.
<
So, if it is a DRAM-bound application, FPGAs are not going to win
accessing DRAM via PCIe.

Re: Jason Cong's future of high performance computing

<OAWBL.545920$vBI8.267239@fx15.iad>


https://www.novabbs.com/devel/article-flat.php?id=30626&group=comp.arch#30626

 by: Scott Lurndal - Mon, 30 Jan 2023 21:25 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>On Monday, January 30, 2023 at 2:43:49 PM UTC-6, Scott Lurndal wrote:
>> j...@cix.co.uk (John Dallman) writes:
>> >In article <UWSBL.445233$8_id....@fx09.iad>, sc...@slp53.sl.home (Scott
>> >Lurndal) wrote:
>> >
>> >> You "know that it does not really work". But not from first-hand
>> >> experience.
>> >
>> >From the last time I looked at add-on accelerators, how fast does data
>> >get in and out of them? What's the minimum size block of doubles when you
>> >can save time, overall, on getting the data into the FPGA, processing it,
>> >and getting it back out again?
><
>> Most use PCI-Express. So the bandwidth depends on which generation
>> and the number of lanes. Many FPGA have hard PCI-E gen 4 or Gen5
>> using 25Gbps SERDES.
><
>Yes, but (the BIG but) the cores on the "chip(s)" access DRAM with ½
>cache line width busses at full core speeds, while PCIe has a) lots
>of added latency, b) way less than 256-bits per cycle, and c) slower
>cycles than the cores.
><
>So, if it is a DRAM-bound application, FPGAs are not going to win
>accessing DRAM via PCIe.

Have you looked at CXL at all? That's all transported by PCIe and
is expected to be used for DRAM-bound applications. The FPGA
can access the PCI host memory _coherently_, with no DMA
required.

Re: Jason Cong's future of high performance computing

<23e42dd0-541b-4a1d-bb1a-46aaed023c03n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=30627&group=comp.arch#30627

 by: MitchAlsup - Mon, 30 Jan 2023 22:08 UTC

On Monday, January 30, 2023 at 3:25:06 PM UTC-6, Scott Lurndal wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >On Monday, January 30, 2023 at 2:43:49 PM UTC-6, Scott Lurndal wrote:
> >> j...@cix.co.uk (John Dallman) writes:
> >> >In article <UWSBL.445233$8_id....@fx09.iad>, sc...@slp53.sl.home (Scott
> >> >Lurndal) wrote:
> >> >
> >> >> You "know that it does not really work". But not from first-hand
> >> >> experience.
> >> >
> >> >From the last time I looked at add-on accelerators, how fast does data
> >> >get in and out of them? What's the minimum size block of doubles when you
> >> >can save time, overall, on getting the data into the FPGA, processing it,
> >> >and getting it back out again?
> ><
> >> Most use PCI-Express. So the bandwidth depends on which generation
> >> and the number of lanes. Many FPGA have hard PCI-E gen 4 or Gen5
> >> using 25Gbps SERDES.
> ><
> >Yes, but (the BIG but) the cores on the "chip(s)" access DRAM with ½
> >cache line width busses at full core speeds, while PCIe has a) lots
> >of added latency, b) way less than 256-bits per cycle, and c) slower
> >cycles than the cores.
> ><
> >So, if it is a DRAM-bound application, FPGAs are not going to win
> >accessing DRAM via PCIe.
>
> Have you looked at CXL at all? That's all transported by PCIe and
> is expected to be used for DRAM-bound applications. The FPGA
> can accesses _coherently_ the PCI host memory directly, no DMA
> required.
<
I have looked at it briefly. What CXL is, to me, is coherence across
PCIe PHYs (i.e., pins).
It still has PCIe latency, which is more like 50 clocks instead of 10
clocks to the DRAM controller.
And it has PCIe width limitations which the on-die DRAM controller
does not.
So, you go from <say> two (2) 128-bit HBM channels at 2 GHz DDR
down to 8 differential pins at 5 GHz and you start to see the BW
problem. The latency through PCIe is "not all that great".
<
If you can afford to spend 128 PCIe pins on DRAM, you can probably
make CXL work reasonably for BW-limited applications but not for
latency-bound applications.
<
You may or may not get the added reliability of PCIe data transfers.
<
On-die DRAM controllers have the advantage of spending lots of pins
on the DRAM DIMM profile busses connecting the DIMMs to the die.
If you can afford this number of pins for PCIe DRAM, then you can
get into the comparative performance game; but latency remains
a stumbling block.
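The bandwidth gap in the rough numbers above can be checked with a few lines of arithmetic. The pin accounting is an assumption on my part: each PCIe lane uses one TX and one RX differential pair (four pins), so 8 differential pins is roughly 2 lanes, and encoding/protocol overhead is ignored (which flatters the PCIe side):

```python
# On-die side, per the post: 2 HBM channels, each 128 bits wide,
# DDR (2 transfers per cycle) at a 2 GHz clock.
hbm_bps = 2 * (128 // 8) * 2 * 2e9        # bytes/second

# PCIe side: 8 differential pins ~ 2 lanes; 5 GT/s per lane, one
# direction, no encoding or protocol overhead (so optimistic).
pcie_bps = 2 * 5e9 / 8                    # bytes/second

print(hbm_bps / 1e9)                      # 128.0 GB/s on-die
print(pcie_bps / 1e9)                     # 1.25 GB/s over the narrow link
print(round(hbm_bps / pcie_bps))          # roughly a 100x gap
```

With these assumed numbers the narrow link is about two orders of magnitude behind, which is the "BW problem" the post describes.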

Re: Jason Cong's future of high performance computing

<391cb38e-17c6-44c8-b16c-adf0fc64383cn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=30630&group=comp.arch#30630

 by: JimBrakefield - Tue, 31 Jan 2023 02:59 UTC

On Monday, January 30, 2023 at 4:08:37 PM UTC-6, MitchAlsup wrote:
> On Monday, January 30, 2023 at 3:25:06 PM UTC-6, Scott Lurndal wrote:
> > MitchAlsup <Mitch...@aol.com> writes:
> > >On Monday, January 30, 2023 at 2:43:49 PM UTC-6, Scott Lurndal wrote:
> > >> j...@cix.co.uk (John Dallman) writes:
> > >> >In article <UWSBL.445233$8_id....@fx09.iad>, sc...@slp53.sl.home (Scott
> > >> >Lurndal) wrote:
> > >> >
> > >> >> You "know that it does not really work". But not from first-hand
> > >> >> experience.
> > >> >
> > >> >From the last time I looked at add-on accelerators, how fast does data
> > >> >get in and out of them? What's the minimum size block of doubles when you
> > >> >can save time, overall, on getting the data into the FPGA, processing it,
> > >> >and getting it back out again?
> > ><
> > >> Most use PCI-Express. So the bandwidth depends on which generation
> > >> and the number of lanes. Many FPGA have hard PCI-E gen 4 or Gen5
> > >> using 25Gbps SERDES.
> > ><
> > >Yes, but (the BIG but) the cores on the "chip(s)" access DRAM with ½
> > >cache line width busses at full core speeds, while PCIe has a) lots
> > >of added latency, b) way less than 256-bits per cycle, and c) slower
> > >cycles than the cores.
> > ><
> > >So, if it is a DRAM-bound application, FPGAs are not going to win
> > >accessing DRAM via PCIe.
> >
> > Have you looked at CXL at all? That's all transported by PCIe and
> > is expected to be used for DRAM-bound applications. The FPGA
> > can accesses _coherently_ the PCI host memory directly, no DMA
> > required.
> <
> I have looked at it briefly. What CXL is, to me, is coherence across
> PCIe PHYs (i.e., pins).
> It still has PCIe latency, which is more like 50 clocks instead of 10
> clocks to the DRAM controller.
> And it has PCIe width limitations which the on-die DRAM controller
> does not.
> So, you go from <say> two (2) 128-bit HBM channels at 2 GHz DDR
> down to 8 differential pins at 5 GHz and you start to see the BW
> problem. The latency through PCIe is "not all that great".
> <
> If you can afford to spend 128 PCIe pins on DRAM, you can probably
> make CXL work reasonably for BW-limited applications but not for
> latency-bound applications.
> <
> You may or may not get the added reliability of PCIe data transfers.
> <
> On-die DRAM controllers have the advantage of spending lots of pins
> on the DRAM DIMM profile busses connecting the DIMMs to the die.
> If you can afford this number of pins for PCIe DRAM, then you can
> get into the comparative performance game; but latency remains
> a stumbling block.

Most high-performance FPGAs support external DDR memory.
It looks like a complicated topic; in particular,
https://www.xilinx.com/products/intellectual-property/ddr4.html
appears to be licensed IP.
I would find a board or module with DDR and with example RTL.
The situation is similar for Intel/Altera.

Re: Jason Cong's future of high performance computing

<tra92h$3m91b$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=30631&group=comp.arch#30631

 by: BGB - Tue, 31 Jan 2023 05:30 UTC

On 1/30/2023 2:58 PM, MitchAlsup wrote:
> On Monday, January 30, 2023 at 2:39:35 AM UTC-6, Anton Ertl wrote:
>> Michael S <already...@yahoo.com> writes:
>>> Pay attention that programming FPGAs in Verilog is almost exclusively
>>> USA trait. The rest of the world does it in VHDL.
>> Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
>> <https://github.com/forthy42/b16-small>. It has been used in custom
>> silicon, not (to my knowledge) in FPGA, but does that make a
>> difference?
>>
>> From the HOPL talk about Verilog, my impression is: Around 2000 all
>> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
>> and won in large projects, because it was designed for efficient
>> implementation of simulators, while the design of VHDL necessarily
>> leads to less efficiency. For large projects this efficiency is very
>> important, while for smaller projects the VHDL simulators are fast
>> enough.
> <
> In other words--with today's fast CPUs--one can dispense with the
> pipeline timing simulators written in C and proceed directly to System
> Verilog, which can serve as the pipeline timing simulator, the FPGA
> output, the ASIC, and the standard cell library implementations,
> saving design team effort.
> <

In my case, I am mostly using a combination of an emulator and simulations.

The emulator is written in C, and also models the pipeline timing,
branch predictor, cache hierarchy, and similar. Modeling this stuff
isn't ideal for emulation performance, but given its main goal is mostly
to emulate a CPU running at 50MHz, it works. To be useful, it does
generally need to be fast enough to keep up with real-time.

Early on, this is easier, but with some newer and "more complicated"
instructions, maintaining real-time emulation speed is more difficult
(this would mostly include things like Binary16 SIMD ops and
compressed-texture instructions and similar, which are relatively
expensive to emulate).

The simulations mostly run the Verilog code (via Verilator), and are
further divided:
Partial simulation only simulates the CPU core, with the bus and all
MMIO devices being implemented in a mix of C and C++;
Full simulation runs everything that would run in the FPGA in Verilog,
mostly providing an interface at the level of external components (DDR
RAM module, SDcard pins, VGA pins, ...).

The former is ~200x slower than real-time (*1); the latter is ~1000x
slower than real-time.

Generally seems to work...

*1: Despite operating in kHz territory, for the most part its
command-line interface is still surprisingly responsive.
However, something like Doom is "one frame every 10 to 15 seconds or
so.", and GLQuake is roughly "one frame per minute".

....

I guess arguably, if one had an FPGA accelerator card, they could use it
to run Verilog at somewhat faster speeds than if using a simulation...

But, likely, one would be limited to one simulation at a time, vs on my
PC where I can often run 5 or 6 simulations at the same time (mostly to
keep watch for bugs and crashes).

Some bugs remain elusive though. Despite my efforts, I have not resolved
the "alias models in GLQuake sometimes get mangled" bug.

Have at least resolved the "robot enemies are broken in ROTT" bug:
It turns out a bug in the "expression reducer" was causing expressions
to sometimes be reduced in ways which did not respect lexical binding
semantics (so it would resolve a symbol in terms of an outer scope
before the inner scope comes into being; in cases where the intended
inner scope variable shadows a definition in an outer scope).

This turns out to have also been the cause of "Third demo in Doom
desyncs on BJX2 in a different way than on x86" bug.

Had also found and fixed another bug where "label lookup for a given
address" in BGBCC would sometimes fail if the label was at the same
address as a line-number. This was also resulting in a few minor bugs
(mostly involving the WEXifier incorrectly shuffling instructions across
label boundaries).

>> - anton
>> --
>> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
>> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Jason Cong's future of high performance computing

<2023Jan31.110053@mips.complang.tuwien.ac.at>


https://www.novabbs.com/devel/article-flat.php?id=30633&group=comp.arch#30633

 by: Anton Ertl - Tue, 31 Jan 2023 10:00 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>On Monday, January 30, 2023 at 2:39:35 AM UTC-6, Anton Ertl wrote:
>> From the HOPL talk about Verilog, my impression is: Around 2000 all
>> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
>> and won in large projects, because it was designed for efficient
>> implementation of simulators, while the design of VHDL necessarily
>> leads to less efficiency. For large projects this efficiency is very
>> important, while for smaller projects the VHDL simulators are fast
>> enough.
><
>In other words--with today's fast CPUs--one can dispense with the
>pipeline timing simulators written in C and proceed directly to System
>Verilog, which can serve as the pipeline timing simulator, the FPGA
>output, the ASIC, and the standard cell library implementations,
>saving design team effort.

I am not an expert, but

1) "today's fast CPUs" don't help, because programs are now written to
require faster CPUs, and therefore faster simulators. If the
simulator is slower by a factor X, the development of faster CPUs
means that the simulator is still a factor X slower than the CPU you
want to simulate.

2) I expect you still want architecture simulators, microarchitecture
simulators, and circuit-level simulators. You seem to be discussing
microarchitecture simulators above. Switching from C to System
Verilog for that may be useful for consistency between the
microarchitecture simulator and the circuit-level simulator (in
Verilog), but otherwise has little to do with what I wrote.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Jason Cong's future of high performance computing

<bc2f7034-1352-4c02-b90f-6f887d21a225n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=30634&group=comp.arch#30634

 by: Michael S - Tue, 31 Jan 2023 11:29 UTC

On Monday, January 30, 2023 at 10:39:35 AM UTC+2, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >Pay attention that programming FPGAs in Verilog is almost exclusively
> >USA trait. The rest of the world does it in VHDL.
> Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
> <https://github.com/forthy42/b16-small>. It has been used in custom
> silicon, not (to my knowledge) in FPGA, but does that make a
> difference?
>
> From the HOPL talk about Verilog, my impression is: Around 2000 all
> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
> and won in large projects, because it was designed for efficient
> implementation of simulators, while the design of VHDL necessarily
> leads to less efficiency.

According to my understanding, VHDL is hard to simulate efficiently
with interpreted simulators. With so-called compiled-code simulators,
the speed of simulation either does not depend on the HDL used at
all, or depends on it very little.

> For large projects this efficiency is very
> important, while for smaller projects the VHDL simulators are fast
> enough.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Jason Cong's future of high performance computing

<PzaCL.506164$iS99.46596@fx16.iad>


https://www.novabbs.com/devel/article-flat.php?id=30640&group=comp.arch#30640

 by: Scott Lurndal - Tue, 31 Jan 2023 15:36 UTC

JimBrakefield <jim.brakefield@ieee.org> writes:
>On Monday, January 30, 2023 at 4:08:37 PM UTC-6, MitchAlsup wrote:
>> On Monday, January 30, 2023 at 3:25:06 PM UTC-6, Scott Lurndal wrote:
>> > MitchAlsup <Mitch...@aol.com> writes:

>> <
>> If you can afford to spend 128-PCIe-pins on DRAM, you can probably
>> make CLX word reasonably for BW limited applications but not for
>> latency bound applications.

A x16 requires 82 pins. A x4, 21. PCIe 5.0 bandwidth is 4GB/sec per lane,
so x4 gives you 16GB/sec; x16 64GB/sec.
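The per-lane figure above can be derived from the raw signalling rate. This sketch assumes the PCIe 5.0 rate of 32 GT/s per lane with 128b/130b line encoding, and ignores packet/protocol overhead:

```python
# PCIe 5.0: 32 GT/s per lane, 128b/130b line code -> payload bytes/sec.
raw_gts = 32e9
lane_bps = raw_gts * (128 / 130) / 8      # ~3.94 GB/s, rounded to "4 GB/s"

print(round(lane_bps * 4 / 1e9, 1))       # x4  link: ~15.8 GB/s
print(round(lane_bps * 16 / 1e9, 1))      # x16 link: ~63.0 GB/s
```

Which lands, after rounding, on the 16 GB/s and 64 GB/s figures quoted for x4 and x16.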

Re: Jason Cong's future of high performance computing

<trbflb$3sk3h$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=30642&group=comp.arch#30642

 by: David Brown - Tue, 31 Jan 2023 16:29 UTC

On 31/01/2023 12:29, Michael S wrote:
> On Monday, January 30, 2023 at 10:39:35 AM UTC+2, Anton Ertl wrote:
>> Michael S <already...@yahoo.com> writes:
>>> Pay attention that programming FPGAs in Verilog is almost exclusively
>>> USA trait. The rest of the world does it in VHDL.
>> Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
>> <https://github.com/forthy42/b16-small>. It has been used in custom
>> silicon, not (to my knowledge) in FPGA, but does that make a
>> difference?
>>
>> From the HOPL talk about Verilog, my impression is: Around 2000 all
>> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
>> and won in large projects, because it was designed for efficient
>> implementation of simulators, while the design of VHDL necessarily
>> leads to less efficiency.
>
> According to my understanding, VHDL is hard to simulate efficiently
> with interpreted simulators. On so called compiled-code simulators
> the speed of simulation either does not depend at all on the HDL
> used or depends very little.
>
>> For large projects this efficiency is very
>> important, while for smaller projects the VHDL simulators are fast
>> enough.

My understanding is that for big projects, neither Verilog nor VHDL
is used, because both languages are designed for modelling analogue
circuits, not for designing digital circuits. There is a variety of
high-level digital design languages in use that are far easier to
write correctly (and, in some cases, to prove correct). Simulation is
orders of magnitude more efficient. They generally output VHDL and/or
Verilog so that synthesis tools can generate FPGA bitstreams.

An example of such a language would be SpinalHDL, which has been used
for RISC-V implementations: <https://github.com/SpinalHDL>
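
Michael S's point above, that compiled-code simulation makes speed largely
independent of the source HDL, can be sketched in miniature. In the following
Python toy (the netlist format, gate set, and function names are all invented
for illustration), the interpreter re-walks a netlist data structure for every
input vector, while the compiled path translates the netlist into straight-line
code once and then reuses it:

```python
# A hypothetical two-gate netlist: y = (a AND b) XOR c.
NETLIST = [
    ("and", "t", ("a", "b")),   # t = a & b
    ("xor", "y", ("t", "c")),   # y = t ^ c
]

OPS = {"and": lambda p, q: p & q, "xor": lambda p, q: p ^ q}

def interpret(netlist, inputs):
    # Interpreted simulation: dispatch on the netlist for every input vector.
    signals = dict(inputs)
    for op, out, (x, y) in netlist:
        signals[out] = OPS[op](signals[x], signals[y])
    return signals

def compile_netlist(netlist):
    # Compiled-code simulation: emit Python source for the whole netlist once,
    # so each evaluation runs straight-line code with no per-gate dispatch.
    sym = {"and": "&", "xor": "^"}
    lines = ["def step(signals):"]
    for op, out, (x, y) in netlist:
        lines.append(f"    signals['{out}'] = signals['{x}'] {sym[op]} signals['{y}']")
    lines.append("    return signals")
    namespace = {}
    exec("\n".join(lines), namespace)
    return namespace["step"]

step = compile_netlist(NETLIST)
assert interpret(NETLIST, {"a": 1, "b": 1, "c": 0})["y"] == 1
assert step({"a": 1, "b": 1, "c": 0})["y"] == 1
```

Real simulators emit C or machine code rather than Python source, but the
structure is the same: once the front end has lowered the design, the cost of
the original HDL's semantics has largely been paid.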

Re: Jason Cong's future of high performance computing

 by: Scott Lurndal - Tue, 31 Jan 2023 16:58 UTC

scott@slp53.sl.home (Scott Lurndal) writes:
>JimBrakefield <jim.brakefield@ieee.org> writes:
>>On Monday, January 30, 2023 at 4:08:37 PM UTC-6, MitchAlsup wrote:
>>> On Monday, January 30, 2023 at 3:25:06 PM UTC-6, Scott Lurndal wrote:
>>> > MitchAlsup <Mitch...@aol.com> writes:
>
>>> <
>>> If you can afford to spend 128 PCIe pins on DRAM, you can probably
>>> make CXL work reasonably for BW-limited applications but not for
>>> latency-bound applications.
>
>A x16 requires 82 pins. A x4, 21. PCIe 5.0 bandwidth is 4GB/sec per lane,
>so x4 gives you 16GB/sec; x16 64GB/sec.

It is effectively a NUMA system, not much different from Westmere latencies.
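
The lane arithmetic above can be checked mechanically. A minimal sketch,
assuming the stated figure of roughly 4 GB/s of usable PCIe 5.0 bandwidth per
lane (per direction):

```python
PCIE5_GB_PER_LANE = 4  # stated assumption: ~4 GB/s per lane, per direction

def link_bandwidth_gb(lanes, per_lane=PCIE5_GB_PER_LANE):
    """Aggregate one-direction bandwidth of a PCIe link, in GB/s."""
    return lanes * per_lane

print(link_bandwidth_gb(4))   # x4  -> 16 GB/s
print(link_bandwidth_gb(16))  # x16 -> 64 GB/s
```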

Re: Jason Cong's future of high performance computing

 by: Niklas Holsti - Tue, 31 Jan 2023 17:02 UTC

On 2023-01-31 18:29, David Brown wrote:
> On 31/01/2023 12:29, Michael S wrote:
>> On Monday, January 30, 2023 at 10:39:35 AM UTC+2, Anton Ertl wrote:
>>> Michael S <already...@yahoo.com> writes:
>>>> Pay attention that programming FPGAs in Verilog is an almost exclusively
>>>> USA trait. The rest of the world does it in VHDL.
>>> Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
>>> <https://github.com/forthy42/b16-small>. It has been used in custom
>>> silicon, not (to my knowledge) in FPGA, but does that make a
>>> difference?
>>>
>>>  From the HOPL talk about Verilog, my impression is: Around 2000 all
>>> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
>>> and won in large projects, because it was designed for efficient
>>> implementation of simulators, while the design of VHDL necessarily
>>> leads to less efficiency.
>>
>> According to my understanding, VHDL is hard to simulate efficiently
>> with interpreted simulators. On so called compiled-code simulators
>> the speed of simulation either does not depend at all on the HDL
>> used or depends very little.
>>
>>> For large projects this efficiency is very
>>> important, while for smaller projects the VHDL simulators are fast
>>> enough.
>
> My understanding is that for big projects, neither Verilog nor VHDL are
> used because both languages are designed for modelling analogue
> circuits, not designing digital circuits.

Not so for VHDL, at least. For example, the standard ESA on-board
spacecraft processors of the past decade and of today, the ERC32 and
LEON series of SPARC v8 processors, were designed in VHDL by ESA and
Gaisler Research. Those are of course digital, not analogue. Whether
they can be considered "big" is subjective.

> There are a variety of high-level digital design languages that are
> used that are far easier to write correctly (and in some cases, prove
> correctness). Simulation is orders of magnitude more efficient.
> They generally output VHDL and/or Verilog so that synthesis tools can
> generate FPGA bitstreams.
>
> An example of such a language would be SpinalHDL, which has been used
> for RISC-V implementations: <https://github.com/SpinalHDL>

How could that work if VHDL and Verilog were intended for analogue
circuits? RISC-V is a digital system.

Re: Jason Cong's future of high performance computing

 by: MitchAlsup - Tue, 31 Jan 2023 17:49 UTC

On Tuesday, January 31, 2023 at 4:09:09 AM UTC-6, Anton Ertl wrote:
> MitchAlsup <Mitch...@aol.com> writes:
> >On Monday, January 30, 2023 at 2:39:35 AM UTC-6, Anton Ertl wrote:
> >> From the HOPL talk about Verilog, my impression is: Around 2000 all
> >> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
> >> and won in large projects, because it was designed for efficient
> >> implementation of simulators, while the design of VHDL necessarily
> >> leads to less efficiency. For large projects this efficiency is very
> >> important, while for smaller projects the VHDL simulators are fast
> >> enough.
> ><
> >In other words--with today's fast CPUs--one can dispense with the
> >pipeline timing simulators written in C and proceed directly to System
> >Verilog which can be used as pipeline timing simulator, FPGA output
> >ASIC, and Standard cell library implementations. Saving design team
> >effort.
> I am not an expert, but
>
> 1) "todays fast CPUs" don't help, because programs are now written to
> require faster CPUs, and therefore faster simulators. If the
> simulator is slower by a factor X, the development of faster CPUs
> means that the simulator is still a factor X slower than the CPU you
> want to simulate.
<
It is not so much speed as having every nuance of the microarchitecture;
such as:: interrupts, exceptions, privileges, protection, all cycle-accurate
with the microarchitecture. You want the simulator able to BOOT the
Hypervisor and fork off GuestOSs (in less than an hour). You want the simulator
to be capable of debugging the Operating System, the compilers, the linker,
the dynamic linker, the file system, and the network, as well as the I/O MMU.
>
> 2) I expect you still want architecture simulators, microarchitecture
> simulators, and circuit-level simulators. You seem to be discussing
> microarchitecture simulators above. Switching from C to System
> Verilog for that may be useful for consistency between the
> microarchitecture simulator and the circuit-level simulator (in
> Verilog), but otherwise has little to do with what I wrote.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

Re: Jason Cong's future of high performance computing

 by: MitchAlsup - Tue, 31 Jan 2023 17:52 UTC

On Tuesday, January 31, 2023 at 11:02:36 AM UTC-6, Niklas Holsti wrote:
> On 2023-01-31 18:29, David Brown wrote:
> > On 31/01/2023 12:29, Michael S wrote:
> >> On Monday, January 30, 2023 at 10:39:35 AM UTC+2, Anton Ertl wrote:
> >>> Michael S <already...@yahoo.com> writes:
> >>>> Pay attention that programming FPGAs in Verilog is an almost exclusively
> >>>> USA trait. The rest of the world does it in VHDL.
> >>> Bernd Paysan from Europe wrote b16(-dsp) and b16-small in Verilog
> >>> <https://github.com/forthy42/b16-small>. It has been used in custom
> >>> silicon, not (to my knowledge) in FPGA, but does that make a
> >>> difference?
> >>>
> >>> From the HOPL talk about Verilog, my impression is: Around 2000 all
> >>> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
> >>> and won in large projects, because it was designed for efficient
> >>> implementation of simulators, while the design of VHDL necessarily
> >>> leads to less efficiency.
> >>
> >> According to my understanding, VHDL is hard to simulate efficiently
> >> with interpreted simulators. On so called compiled-code simulators
> >> the speed of simulation either does not depend at all on the HDL
> >> used or depends very little.
> >>
> >>> For large projects this efficiency is very
> >>> important, while for smaller projects the VHDL simulators are fast
> >>> enough.
> >
> > My understanding is that for big projects, neither Verilog nor VHDL are
> > used because both languages are designed for modelling analogue
> > circuits, not designing digital circuits.
> Not so for VHDL, at least. For example, the standard ESA on-board
> spacecraft processors for the last decade and currently are/were the
> ERC32 and LEON series of SPARC v8 processors which were designed in VHDL
> by ESA and Gaisler Research. Those are of course digital, not analogue.
> Whether they can be considered "big" is subjective.
> > There are a variety of high-level digital design languages that are
> > used that are far easier to write correctly (and in some cases, prove
> > correctness). Simulation is orders of magnitude more efficient.
> > They generally output VHDL and/or Verilog so that synthesis tools can
> > generate FPGA bitstreams.
> >
> > An example of such a language would be SpinalHDL, which has been used
> > for RISC-V implementations: <https://github.com/SpinalHDL>
<
> How could that work if VHDL and Verilog were intended for analogue
> circuits? RISC-V is a digital system.
<
Neither one is suitable for analog--design of Operational Amplifiers,
analog comparators, analog multipliers:: but they may be suitable
for some parts of A/D and D/A converters.

Re: Jason Cong's future of high performance computing

 by: Scott Lurndal - Tue, 31 Jan 2023 18:05 UTC

MitchAlsup <MitchAlsup@aol.com> writes:
>On Tuesday, January 31, 2023 at 4:09:09 AM UTC-6, Anton Ertl wrote:
>> MitchAlsup <Mitch...@aol.com> writes:
>> >On Monday, January 30, 2023 at 2:39:35 AM UTC-6, Anton Ertl wrote:
>> >> From the HOPL talk about Verilog, my impression is: Around 2000 all
>> >> the buzz was for VHDL, and that Verilog was doomed. Verilog survived
>> >> and won in large projects, because it was designed for efficient
>> >> implementation of simulators, while the design of VHDL necessarily
>> >> leads to less efficiency. For large projects this efficiency is very
>> >> important, while for smaller projects the VHDL simulators are fast
>> >> enough.
>> ><
>> >In other words--with today's fast CPUs--one can dispense with the
>> >pipeline timing simulators written in C and proceed directly to System
>> >Verilog which can be used as pipeline timing simulator, FPGA output
>> >ASIC, and Standard cell library implementations. Saving design team
>> >effort.
>> I am not an expert, but
>>
>> 1) "todays fast CPUs" don't help, because programs are now written to
>> require faster CPUs, and therefore faster simulators. If the
>> simulator is slower by a factor X, the development of faster CPUs
>> means that the simulator is still a factor X slower than the CPU you
>> want to simulate.
><
>It is not so much speed as having every nuance of the microarchitecture;
>such as:: interrupts, exceptions, privileges, protection, all cycle-accurate
>with the microarchitecture. You want the simulator able to BOOT the
>Hypervisor and fork off GuestOSs (in less than an hour). You want the simulator
>to be capable of debugging the Operating System, the compilers, the linker,
>the dynamic linker, the file system, and the network, as well as the I/O MMU.

Indeed, although even more important is to support the custom
logical blocks in the simulator to allow driver development in
advance of tapeout.

