devel / comp.arch / The Tera MTA

Subject  Author
* The Tera MTA  Quadibloc
+* Re: The Tera MTA  MitchAlsup
|`* Re: The Tera MTA  BGB
| `* Re: The Tera MTA  MitchAlsup
|  +* Re: The Tera MTA  BGB
|  |+* Re: The Tera MTA  MitchAlsup
|  ||+* Re: The Tera MTA  Quadibloc
|  |||`- Re: The Tera MTA  Anton Ertl
|  ||`- Re: The Tera MTA  BGB
|  |`* Re: The Tera MTA  Marcus
|  | `* Re: The Tera MTA  Quadibloc
|  |  +- Re: The Tera MTA  Quadibloc
|  |  +* Re: The Tera MTA  Stephen Fuld
|  |  |+- Re: The Tera MTA  BGB
|  |  |`* Re: The Tera MTA  Quadibloc
|  |  | +* Re: The Tera MTA  MitchAlsup
|  |  | |`* Re: The Tera MTA  Quadibloc
|  |  | | +* Re: The Tera MTA  Terje Mathisen
|  |  | | |`- Re: The Tera MTA  Ivan Godard
|  |  | | +- Re: The Tera MTA  Stephen Fuld
|  |  | | `- Re: The Tera MTA  Bill Findlay
|  |  | `* Re: The Tera MTA  Marcus
|  |  |  +- Re: The Tera MTA  Chris M. Thomasson
|  |  |  `* Re: The Tera MTA  BGB
|  |  |   `- Re: The Tera MTA  Marcus
|  |  `* Re: The Tera MTA  MitchAlsup
|  |   `* Re: The Tera MTA  Quadibloc
|  |    `- Re: The Tera MTA  MitchAlsup
|  `* Re: The Tera MTA  Paul A. Clayton
|   `* Re: The Tera MTA  Quadibloc
|    `* Re: The Tera MTA  Paul A. Clayton
|     +* Re: The Tera MTA  Stephen Fuld
|     |+* Re: The Tera MTA  Paul A. Clayton
|     ||`- Re: The Tera MTA  Stefan Monnier
|     |`- Re: The Tera MTA  Quadibloc
|     `- Re: The Tera MTA  Paul A. Clayton
+* Re: The Tera MTA  Tom Gardner
|+* Re: The Tera MTA  Quadibloc
||`* Re: The Tera MTA  Quadibloc
|| +* Re: The Tera MTA  Quadibloc
|| |`* Re: The Tera MTA  MitchAlsup
|| | +- Re: The Tera MTA  Quadibloc
|| | `* Re: The Tera MTA  Paul A. Clayton
|| |  +* Re: The Tera MTA  Quadibloc
|| |  |`- Re: The Tera MTA  Ivan Godard
|| |  `* Re: The Tera MTA  Paul A. Clayton
|| |   `* Re: The Tera MTA  Anton Ertl
|| |    `* Re: The Tera MTA  Michael S
|| |     `* Re: The Tera MTA  Anton Ertl
|| |      `- Re: The Tera MTA  Michael S
|| `- Re: The Tera MTA  Anton Ertl
|`* Re: The Tera MTA  Anton Ertl
| +- Re: The Tera MTA  John Dallman
| `- Re: The Tera MTA  Quadibloc
`- Re: The Tera MTA  Quadibloc

Re: The Tera MTA

<d6280cda-708f-4a7c-a5c0-11b21dc7a750n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19052&group=comp.arch#19052
X-Received: by 2002:a0c:ee2a:: with SMTP id l10mr1512419qvs.22.1626981832769; Thu, 22 Jul 2021 12:23:52 -0700 (PDT)
X-Received: by 2002:aca:c7cb:: with SMTP id x194mr1054928oif.119.1626981832555; Thu, 22 Jul 2021 12:23:52 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1.feed.usenet.farm!feed.usenet.farm!tr2.eu1.usenetexpress.com!feeder.usenetexpress.com!tr2.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 22 Jul 2021 12:23:52 -0700 (PDT)
In-Reply-To: <8b75c15f-e926-4391-a6f9-05ca1ef62709n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=64.26.99.248; posting-account=6JNn0QoAAAD-Scrkl0ClrfutZTkrOS9S
NNTP-Posting-Host: 64.26.99.248
References: <d8e86c4a-44a4-4db2-b92b-ddd9c966b9fdn@googlegroups.com> <127e7a94-d498-4467-8834-c6b639515c3bn@googlegroups.com> <scv1hp$gh9$1@dont-email.me> <1b43802a-3438-4aa8-bdc4-0b86f976eda9n@googlegroups.com> <b669dfad-aafc-4933-9b2c-07900162d9fbn@googlegroups.com> <2d30e258-5cfc-47e6-b5e1-262e5c5f2701n@googlegroups.com> <8b75c15f-e926-4391-a6f9-05ca1ef62709n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <d6280cda-708f-4a7c-a5c0-11b21dc7a750n@googlegroups.com>
Subject: Re: The Tera MTA
From: paaroncl...@gmail.com (Paul A. Clayton)
Injection-Date: Thu, 22 Jul 2021 19:23:52 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 11
 by: Paul A. Clayton - Thu, 22 Jul 2021 19:23 UTC

On Thursday, July 22, 2021 at 11:50:00 AM UTC-4, Paul A. Clayton wrote:
[snip]
> (Another interesting technique was using three storage arrays
> composed of two banks and an array with the XOR of the
> entries from the other banks. This allows the alternate bank
> and XOR copy to be read when there would otherwise be a
> bank conflict. I would have to search for the paper.)

The paper was "CRAM: Coded Registers for Amplified Multiporting",
Vignyan Reddy Kothinti Naresh et al., 2011. (I have not yet
read through the paper carefully; I merely remembered the basic
concept from an earlier "skimming".)
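
For readers who want to see the idea concretely, here is a minimal sketch of the two-banks-plus-XOR arrangement described in the quoted paragraph. It is a toy software model only, not the CRAM paper's actual design; the class and method names (`XorCodedBanks`, `read_pair`, and so on) are made up for illustration.

```python
class XorCodedBanks:
    """Toy model: two data banks plus a parity array holding bank0[i] ^ bank1[i].

    All names here are hypothetical; this is not taken from the CRAM paper.
    """

    def __init__(self, entries_per_bank):
        self.bank = [[0] * entries_per_bank, [0] * entries_per_bank]
        self.parity = [0] * entries_per_bank

    def write(self, b, i, value):
        # A write touches two of the three arrays: the data bank and the parity array.
        self.bank[b][i] = value
        self.parity[i] = self.bank[0][i] ^ self.bank[1][i]

    def read_direct(self, b, i):
        return self.bank[b][i]

    def read_reconstructed(self, b, i):
        # Avoids bank b entirely: reads the other bank and the parity array.
        return self.bank[1 - b][i] ^ self.parity[i]

    def read_pair(self, req0, req1):
        """Serve two reads in one 'cycle'; each request is a (bank, index) tuple."""
        (b0, i0), (b1, i1) = req0, req1
        if b0 != b1:
            return self.read_direct(b0, i0), self.read_direct(b1, i1)
        # Bank conflict: the first read goes to the bank itself, the second is
        # rebuilt from the alternate bank and the XOR copy, so each of the
        # three arrays is accessed exactly once.
        return self.read_direct(b0, i0), self.read_reconstructed(b1, i1)


rf = XorCodedBanks(4)
rf.write(0, 1, 0x11)
rf.write(0, 2, 0x22)
rf.write(1, 2, 0x33)
print(rf.read_pair((0, 1), (0, 2)))  # both requests target bank 0
```

The point of the sketch is only that two same-bank reads can be satisfied with one access per physical array.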

Re: The Tera MTA

<5f1756e4-1553-4569-af6f-92bdbc48e8d2n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19053&group=comp.arch#19053
X-Received: by 2002:a37:9b44:: with SMTP id d65mr1809857qke.71.1626990790604;
Thu, 22 Jul 2021 14:53:10 -0700 (PDT)
X-Received: by 2002:a9d:3a49:: with SMTP id j67mr1262097otc.114.1626990786347;
Thu, 22 Jul 2021 14:53:06 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Thu, 22 Jul 2021 14:53:06 -0700 (PDT)
In-Reply-To: <sdc5n4$5qq$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:f39a:d100:2919:cb81:b8ba:bba3;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:f39a:d100:2919:cb81:b8ba:bba3
References: <d8e86c4a-44a4-4db2-b92b-ddd9c966b9fdn@googlegroups.com>
<127e7a94-d498-4467-8834-c6b639515c3bn@googlegroups.com> <scv1hp$gh9$1@dont-email.me>
<1b43802a-3438-4aa8-bdc4-0b86f976eda9n@googlegroups.com> <b669dfad-aafc-4933-9b2c-07900162d9fbn@googlegroups.com>
<2d30e258-5cfc-47e6-b5e1-262e5c5f2701n@googlegroups.com> <8b75c15f-e926-4391-a6f9-05ca1ef62709n@googlegroups.com>
<sdc5n4$5qq$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <5f1756e4-1553-4569-af6f-92bdbc48e8d2n@googlegroups.com>
Subject: Re: The Tera MTA
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Thu, 22 Jul 2021 21:53:10 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Quadibloc - Thu, 22 Jul 2021 21:53 UTC

On Thursday, July 22, 2021 at 10:17:42 AM UTC-6, Stephen Fuld wrote:
> On 7/22/2021 8:49 AM, Paul A. Clayton wrote:

> > (Another interesting technique was using three storage arrays
> > composed of two banks and an array with the XOR of the
> > entries from the other banks. This allows the alternate bank
> > and XOR copy to be read when there would otherwise be a
> > bank conflict. I would have to search for the paper.)

> RAIR (Redundant Array of Independent Registers) :-)

But does it kill bugs? That does seem like an overly complex way
of adding a read port to a register bank, rather than a practical one.

John Savard

Re: The Tera MTA

<sddjks$9gi$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19056&group=comp.arch#19056
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: The Tera MTA
Date: Fri, 23 Jul 2021 00:21:30 -0500
Organization: A noiseless patient Spider
Lines: 462
Message-ID: <sddjks$9gi$1@dont-email.me>
References: <d8e86c4a-44a4-4db2-b92b-ddd9c966b9fdn@googlegroups.com>
<127e7a94-d498-4467-8834-c6b639515c3bn@googlegroups.com>
<scv1hp$gh9$1@dont-email.me>
<1b43802a-3438-4aa8-bdc4-0b86f976eda9n@googlegroups.com>
<scvp1v$dgh$1@dont-email.me>
<98aaf00d-0858-4a5c-bad1-1022cb3f5f9dn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 23 Jul 2021 05:21:32 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="f2d0ea6b3e8abb37b40547670b01634c";
logging-data="9746"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19jtHrA1FfqZBXddDaXVZ8q"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.12.0
Cancel-Lock: sha1:xqkM6vQg50MJOsaYa9+JNtM4sRo=
In-Reply-To: <98aaf00d-0858-4a5c-bad1-1022cb3f5f9dn@googlegroups.com>
Content-Language: en-US
X-Mozilla-News-Host: news://news.albasani.net
 by: BGB - Fri, 23 Jul 2021 05:21 UTC

On 7/17/2021 9:45 PM, MitchAlsup wrote:
> On Saturday, July 17, 2021 at 6:28:02 PM UTC-5, BGB wrote:
>> On 7/17/2021 2:39 PM, MitchAlsup wrote:
>>> On Saturday, July 17, 2021 at 11:46:52 AM UTC-5, BGB wrote:
>>>> On 7/16/2021 10:40 AM, MitchAlsup wrote:
>>>>> On Friday, July 16, 2021 at 10:00:17 AM UTC-5, Quadibloc wrote:
>>>>>> The Cray XMT (not to be confused with the X-MP) recently came to my
>>>>>> attention again when I happened to stumble upon an eBay sale for a
>>>>>> Threadcrusher 4.0 chip.
>>>>>>
>>>>>> I remembered that at one point, AMD allowed other companies to make
>>>>>> chips that would fit into the same sockets as an AMD processor; in
>>>>>> connection with that, I remember hearing of an FPGA accelerator chip that
>>>>>> did this.
>>>>>>
>>>>>> But I didn't know there was also the Threadcrusher 4.0 from Cray, which could
>>>>>> fit in an Opteron socket also.
>>>>>>
>>>>>> I knew that Sun and/or Oracle made versions of the SPARC chip that took
>>>>>> simultaneous multi-threading (SMT) beyond what Intel did with
>>>>>> Hyper-Threading - instead of two simultaneous threads, their chips could
>>>>>> offer up to _eight_.
>>>>>>
>>>>>> But the Threadcrusher chip from Cray (and it fit into an AMD socket... and
>>>>>> currently AMD has a product it calls the Threadripper...) took SMT to a
>>>>>> rather higher level, with 128 simultaneous threads.
>>>>> <
>>>>> Look up Burton Smith
>>>>> <
>>>>> Tera was another Denelcor HEP-like processor architecture.
>>>>>>
>>>>>> It seems, from the history, that the Tera MTA couldn't have been a
>>>>>> complete failure.
>>>>>>
>>>>>> The Tera MTA later became known as the Cray MTA. When I heard that
>>>>>> I thought, oh, Cray bought Tera. But no, this happened after Tera bought
>>>>>> Cray - from Silicon Graphics. And then decided that the name "Cray" had
>>>>>> a bit more of a _cachet_ than the name "Tera", and adopted, therefore,
>>>>>> that name for the combined company.
>>>>>>
>>>>>> The same architecture was used in later versions of the machine; the
>>>>>> chip for sale on eBay was, after all, a Threadcrusher 4.0, and the original
>>>>>> Tera MTA didn't use a single-chip processor.
>>>>>>
>>>>>> The rationale behind using so many threads on a processor was so that
>>>>>> the processor could be doing something useful while waiting, and
>>>>>> waiting, and waiting for requested data to arrive from main memory.
>>>>>> So, if one had enough threads, _memory latency_ would not be an
>>>>>> issue.
>>>>> <
>>>>> This same approach is used in GPUs, where one switches threads every
>>>>> instruction! I don't remember a lot about Tera, but at Denelcor the processor
>>>>> had a number of threads and a number of calculation units and memory.
>>>>> A thread was not eligible to run a second instruction until all aspects of
>>>>> the first instruction had been performed. Each word (64-bits) of memory
>>>>> could be {Read, Written, Read-if-Full, Written-if-empty} So a memory ref
>>>>> could be sent out and literally not return for thousands of clocks. The
>>>>> registers had the same kind of stuff but we didn't use that--just the memory.
>>>>>>
>>>>>> Apparently, this processor had eight data registers, and eight address
>>>>>> modification registers. A bit like a 68000, or even my original
>>>>>> Concertina architecture. But I haven't been able to find any information
>>>>>> about its instruction set.
>>>>>>
>>>>>> Of course, having 128 threads on a single-core chip, while it does,
>>>>>> without diminishing throughput, slow each individual thread down to
>>>>>> the pace of memory, might seem to create bandwidth issues. However,
>>>>>> that didn't stop Intel from giving the world Xeon Phi.
>>>>>>
>>>>>> In any case, was this a valid approach, or is there a good reason why
>>>>>> it is no longer viable?
>>>>> <
>>>>> It migrated into GPUs, where a single instruction may cause 32 (or 64) calculations
>>>>> or memory references to 64 different cache lines, in 64 different pages,.....
>>>>> The average memory reference takes something like 400 cycles, so the GPU
>>>>> better have other things to do while it waits. So, switching to a new set of
>>>>> threads every cycle dramatically improves throughput even if dragging out the
>>>>> latency on a per thread basis.
>>>> Hmm...
>>>>
>>>> Reminds me of an observation I made while fiddling around with stuff for
>>>> my own ISA:
>>> <
>>>> If the CPU core ran at half the effective clock-speed, but only had to
>>>> spend half as many cycles on cache misses, the relative impact on
>>>> performance would be "surprisingly minor" for things like Doom (the
>>>> clock speed drops but the IPC increases enough to compensate).
>> As noted, this seems to be because Doom is mostly bound by memory
>> accesses, spending the majority of its clock cycles waiting on L2 misses.
>>
>>
>> Poking at it some more:
>>
>> If memory stays at the same speed, Doom is still fairly playable at
>> around 12MHz or 16MHz, though at these speeds it transitions over into
>> being mostly instruction-bound.
>>
>> Its behavior and properties seem to change slightly, and the "low/high
>> detail" option becomes "actually relevant" (has a more obvious effect on
>> framerate).
>>
>> A similar property seems to hold for Quake, which only seems to suffer a
>> modest slowdown (relative to 50MHz) when running at 12MHz (still low
>> single-digit framerates).
>>
>> GLQuake performance tanks pretty badly at 12MHz though (it averages 0 fps;
>> this scenario somewhat favors software-rendered Quake).

I looked at it some more, and decided I would not go this route.

Even if Doom tolerates a drop in MHz, my software OpenGL rasterizer does
not, I suspect because it is closer to being cycle-bound than mem-access
bound.
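
A rough way to picture the difference, assuming (as above) that memory stays at the same speed when the core clock drops: split the time per frame into a part that scales with the core clock and a part that does not. The function names and all the numbers below are invented for illustration; none of them are measurements from the actual core.

```python
def frame_time_us(core_cycles, core_mhz, mem_stall_us):
    """Time per frame in microseconds.

    core_cycles  : cycles spent executing instructions (scales with clock)
    core_mhz     : core clock in MHz (cycles / MHz gives microseconds)
    mem_stall_us : time spent waiting on memory, assumed independent of core clock
    """
    return core_cycles / core_mhz + mem_stall_us


def slowdown(core_cycles, mem_stall_us, fast_mhz, slow_mhz):
    """How many times longer a frame takes at slow_mhz than at fast_mhz."""
    return (frame_time_us(core_cycles, slow_mhz, mem_stall_us)
            / frame_time_us(core_cycles, fast_mhz, mem_stall_us))


# Hypothetical memory-bound workload: most of the frame is spent in stalls.
print(slowdown(1_000_000, 60_000, fast_mhz=50, slow_mhz=25))  # 1.25

# Hypothetical cycle-bound workload: most of the frame is instruction execution.
print(round(slowdown(3_000_000, 10_000, fast_mhz=50, slow_mhz=25), 2))
```

In this toy model, halving the clock costs the memory-bound case 25% while the cycle-bound case gets close to the full 2x slowdown, which matches the qualitative behavior described for Doom versus the software rasterizer.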

>>> <
>>> I have noticed several occurrences of similar merit::
>>> <
>>> As the length of the pipeline is decreased the pressure on the predictors
>>> {branch, Jump, Call/Return} is reduced because the recovery multiplier is
>>> smaller. Conversely, a 15 stage pipeline needs a predictor that has no more
>>> than 1/3rd of the mispredicts of a 5 stage pipeline to spend a similar number
>>> of cycles on misprediction recovery.
>> OK. Branch misses don't appear to be too major of a problem with an
>> 8-stage pipeline. The predictors I have mostly seem to work OK.
>>> <
>>> As the execution width becomes wider, one needs to service more AGENs
>>> per cycle:: 1-wide needs 1-AGEN, 4-wide needs 2 AGENs, 6-wide needs
>>> 3 AGENs--all these AGENs end up colliding at the cache ports and if you
>>> can't service this many on a per cycle basis, you might want to reconsider
>>> the width you are trying to achieve.
>> OK, in my 3-wide core, there is only 1 memory port.
>>
>> So, I guess from the pattern above, the idea is that, as the number of
>> lanes increases, one needs roughly n/2 memory ports to be worthwhile (as
>> opposed to, say, doing a 6-wide core with 1 memory port, and the other 5
>> lanes only able to do ALU ops).
> <
> MATRIX300 is DGEMM at its core and this has 3 memrefs and 2 Fops
> (unrolled 4 times) so the ADD-CMP-BC is amortized over 4 loops or
> 5.75 I/C (if your execution window can absorb an L1 DCache miss).
> {Modern equivalent uses FMAC instead of FMUL and FADD.} We
> specifically targeted this application (it was 1991) for our 6-wide
> machine.

OK.
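
For reference, the 5.75 figure in the quoted text can be reproduced with simple arithmetic, assuming it means instructions per original loop iteration (and therefore instructions per cycle if the machine sustains one iteration per cycle). That reading of "I/C" is my assumption; the helper name is made up.

```python
from fractions import Fraction


def instrs_per_iteration(memrefs, fops, loop_overhead, unroll):
    """Instructions per original iteration after unrolling.

    memrefs, fops : per-iteration memory references and FP operations
    loop_overhead : instructions paid once per unrolled body (ADD, CMP, BC = 3)
    unroll        : unroll factor
    """
    body = unroll * (memrefs + fops) + loop_overhead
    return Fraction(body, unroll)


per_iter = instrs_per_iteration(memrefs=3, fops=2, loop_overhead=3, unroll=4)
print(float(per_iter))              # 5.75
print(float(Fraction(3) / per_iter))  # fraction of the instructions that are memory references
```

With 3 memory references out of 5.75 instructions per iteration, roughly half of the issue slots are memory operations, which lines up with the "6-wide needs 3 AGENs" pattern.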

I did an early spec for a possible approach to SMT; it would have 5 or 6
lanes internally and 2 memory ports.

https://github.com/cr88192/bgbtech_btsr1arch/wiki/BJX2-ISA#Symmetric_Multithreading_SMT


Re: The Tera MTA

<jwvmtqdrqn2.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=19069&group=comp.arch#19069
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: The Tera MTA
Date: Fri, 23 Jul 2021 11:35:42 -0400
Organization: A noiseless patient Spider
Lines: 10
Message-ID: <jwvmtqdrqn2.fsf-monnier+comp.arch@gnu.org>
References: <d8e86c4a-44a4-4db2-b92b-ddd9c966b9fdn@googlegroups.com>
<127e7a94-d498-4467-8834-c6b639515c3bn@googlegroups.com>
<scv1hp$gh9$1@dont-email.me>
<1b43802a-3438-4aa8-bdc4-0b86f976eda9n@googlegroups.com>
<b669dfad-aafc-4933-9b2c-07900162d9fbn@googlegroups.com>
<2d30e258-5cfc-47e6-b5e1-262e5c5f2701n@googlegroups.com>
<8b75c15f-e926-4391-a6f9-05ca1ef62709n@googlegroups.com>
<sdc5n4$5qq$1@dont-email.me>
<d815ba44-a659-402a-b616-d06b98a6e8c0n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="2311b37170273f139f2906767a7b89dd";
logging-data="6951"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19dr3WFh949i5noeHivC5WO"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.0.50 (gnu/linux)
Cancel-Lock: sha1:U+LXthucxlIkyKMGxtoU5IgveSY=
sha1:EMRNINnEFY+Mpf2wDCYXNXZ8JeU=
 by: Stefan Monnier - Fri, 23 Jul 2021 15:35 UTC

> I think part of the idea (I do not remember) was that writes can be
> delayed (with forwarding/caching) and bank conflicts tend to
> average out over time. A single write has to go to two of the three
> arrays.

Plus writes need to read the third array (or equivalently read the old
value of the two arrays to compute the value held in the third).

Stefan
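
To make the two options for the write path concrete, here is a small sketch, assuming the third array simply holds the XOR of the two data banks. The function names are hypothetical and the values are random test data, not anything from the paper.

```python
import random


def parity_from_other_bank(new_value, other_bank_value):
    # Option 1: read the bank that is NOT being written and recompute the XOR.
    return new_value ^ other_bank_value


def parity_from_old_values(new_value, old_value, old_parity):
    # Option 2: read the old data value and the old XOR entry, then patch the XOR.
    return old_parity ^ old_value ^ new_value


random.seed(0)
for _ in range(1000):
    old_a = random.getrandbits(64)   # current entry in the bank being written
    b = random.getrandbits(64)       # matching entry in the other bank
    old_parity = old_a ^ b           # what the XOR array currently holds
    new_a = random.getrandbits(64)   # value being written
    assert (parity_from_other_bank(new_a, b)
            == parity_from_old_values(new_a, old_a, old_parity))
print("both update rules agree")
```

Either way the write costs at least one extra read on top of the two array writes, which is the overhead being pointed out.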

Re: The Tera MTA

<90397382-54a3-41fc-a3d6-237d78ce6dffn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19699&group=comp.arch#19699
X-Received: by 2002:a05:6214:6af:: with SMTP id s15mr8059124qvz.52.1628541345328;
Mon, 09 Aug 2021 13:35:45 -0700 (PDT)
X-Received: by 2002:a05:6808:10d5:: with SMTP id s21mr17466497ois.7.1628541345123;
Mon, 09 Aug 2021 13:35:45 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.snarked.org!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Mon, 9 Aug 2021 13:35:44 -0700 (PDT)
In-Reply-To: <2021Jul22.104200@mips.complang.tuwien.ac.at>
Injection-Info: google-groups.googlegroups.com; posting-host=87.68.183.224; posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 87.68.183.224
References: <d8e86c4a-44a4-4db2-b92b-ddd9c966b9fdn@googlegroups.com>
<51548316-5543-41d9-8ec6-822b861ea6afn@googlegroups.com> <aeab88d8-c9c6-49aa-9738-c778ac4174den@googlegroups.com>
<02416980-bf6b-4094-8cea-34bb69937472n@googlegroups.com> <19d3513e-ceb0-4e4a-821b-c699fd542981n@googlegroups.com>
<7482c2f6-9d84-4bdb-a6e5-bebfc45b33ecn@googlegroups.com> <2021Jul21.162714@mips.complang.tuwien.ac.at>
<446fed80-1f97-42f3-bf0c-96c25c075bd1n@googlegroups.com> <2021Jul22.104200@mips.complang.tuwien.ac.at>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <90397382-54a3-41fc-a3d6-237d78ce6dffn@googlegroups.com>
Subject: Re: The Tera MTA
From: already5...@yahoo.com (Michael S)
Injection-Date: Mon, 09 Aug 2021 20:35:45 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 63
 by: Michael S - Mon, 9 Aug 2021 20:35 UTC

On Thursday, July 22, 2021 at 12:10:43 PM UTC+3, Anton Ertl wrote:
> Michael S <already...@yahoo.com> writes:
> >On Wednesday, July 21, 2021 at 6:06:57 PM UTC+3, Anton Ertl wrote:
> >> "Paul A. Clayton" <paaron...@gmail.com> writes:
> >> >On Tuesday, July 20, 2021 at 3:31:00 PM UTC-4, Paul A. Clayton wrote:
> >> >> On Saturday, July 17, 2021 at 5:34:02 PM UTC-4, MitchAlsup wrote:
> >> >[snip]
> >> >>> Secondly, 20nm still is cheaper per transistor than 14nm, 10nm, 7nm or 5nm.
> >> >>
> >> >> Really (in 2021)? Obviously, it depends on volume (NRE is significant)
> >> >> and chip size (dicing overhead/pad limits), less scaling transistors (I/O),
> >> >> wiring constraints (SRAMs might be relatively cheaper, i.e., scaling better),
> >> >> and probably other factors, but I thought I had read that Intel claimed
> >> >> its 14nm was cheaper per transistor than its 20nm (not that Intel lacked
> >> >> motivation to fudge figures).
> >> >>
> >> >> Are power and transistors per die the only benefits then of 14nm?
> >> If you look at Intel, they had their difficulties with the 14nm
> >> processes, but they have produced their mainstream CPUs on 14nm since
> >> ~2016 (Skylake was still scarce in 2015, so I guess they did not
> >> produce that many at that time); they produced (server) CPUs with more
> >> transistors in 22nm, so the reason for mainstream 14nm was not
> >> transistors per die.
> >>
> >> Both the 4770 (22nm, 65W TDP) and 6700 (14nm, 65W TDP) have 3.4GHz
> >> base clock, so they are probably comparable in power consumption for a
> >> given clock rate (although admittedly the Skylake has more IPC and
> >> more transistors, so it's not an apples-to-apples comparison; Haswell
> >> vs. Broadwell would be better, but Broadwell seemed to suffer from
> >> Intel's 14nm woes, and the 5775C (14nm, 65W TDP) has 3.3GHz base
> >> clock, lower than either). So it's not power, either, at least at
> >> first.
> >>
> >
> >D-1541 (Broadwell) vs E5-1428L v3 (Haswell)
> >Broadwell runs both faster and cooler.
> >It's not exactly apples-to-apples because the E5 is a higher-end product with 3 memory channels vs 2 and 20MB LLC vs 12 MB,
> >but still, the difference in Watt per GHz is rather bigger than can be attributed to these factors.
> But then, when I look at the E5-2609 v4 with 20MB cache and 8 cores, I
> see that it has a higher TDP and a lower clock rate than the E5-1428L
> v3. And when I look at the E5-2608L v4 (also with 20MB), it has a
> slightly lower TDP (50W vs. 60W), but roughly proportionally less
> clock rate (1.6GHz base, 1.7GHz Turbo). And there's the E5-2630L v3
> with 55W TDP, 1.8GHz base and 2.9GHz Turbo, while the E5-2428L v3 has
> the same TDP and only 1.8GHz base without turbo.
>
> I guess if we want to learn something about Intel processes, we need
> to look at the top-end CPUs for a given TDP. Looking at CPUs like the
> E5-1428L v3 and E5-2428L v3 is misleading. And even that should be
> taken with a grain of salt, because for some TDPs, Intel may not offer
> the maximum that's technically possible.
> - anton
> --
> 'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
> Mitch Alsup, <c17fcd89-f024-40e7...@googlegroups.com>

I think the E5-1428L v3 is actually the top Haswell Xeon in terms of performance per watt.
Is that not so?

On the other hand, the E5-2609 v4 and E5-2608L v4 look like defective dies that were sold cheaply instead of being trashed.
More successful examples of the same die were sold as the E5-2618L v4.
