
Subject (Author)
* Changing the Width of Memory is Easy (Quadibloc)
+* Re: Changing the Width of Memory is Easy (Theo Markettos)
|+* Re: Changing the Width of Memory is Easy (Michael S)
||+- Re: Changing the Width of Memory is Easy (Theo Markettos)
||`* Re: Changing the Width of Memory is Easy (Quadibloc)
|| `- Re: Changing the Width of Memory is Easy (Quadibloc)
|+- Re: Changing the Width of Memory is Easy (Anton Ertl)
|+- Re: Changing the Width of Memory is Easy (Quadibloc)
|`- Re: Changing the Width of Memory is Easy (BGB)
+* Re: Changing the Width of Memory is Easy (Quadibloc)
|`* Re: Changing the Width of Memory is Easy (Quadibloc)
| +- Re: Changing the Width of Memory is Easy (Quadibloc)
| `* Re: Changing the Width of Memory is Easy (BGB)
|  `* Re: Changing the Width of Memory is Easy (Quadibloc)
|   `* Re: Changing the Width of Memory is Easy (Quadibloc)
|    `* Re: Changing the Width of Memory is Easy (Quadibloc)
|     `- Re: Changing the Width of Memory is Easy (Quadibloc)
`* Re: Changing the Width of Memory is Easy (MitchAlsup)
 `- Re: Changing the Width of Memory is Easy (Quadibloc)

Changing the Width of Memory is Easy

<1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24951&group=comp.arch#24951

Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 02:16:18 -0700 (PDT)
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
Subject: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 29 Apr 2022 09:16:18 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 20
 by: Quadibloc - Fri, 29 Apr 2022 09:16 UTC

On further reflection, I see that I have been overthinking things a bit.
How can one get 48-bit words in a 64-bit world?
If one is reconciled to the fact that DRAM is only efficient when blocks of
memory are accessed, then it should be clear that the problem goes away.
To allocate an area of 96-bit memory words, allocate one area of the same
number of 64-bit memory words, and another area, half as large, of 64-bit
memory words. Access two blocks of words from the first area, and one
block of words from the second area - and one has a block, with twice as
many words in it of the 96-bit size.
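
A rough sketch of that layout in Python, under stated assumptions: the class name, the choice to keep the low 64 bits in the first area, and the packing of two 32-bit tops per 64-bit word in the second area are all hypothetical, since the post does not fix those details.

```python
# Hypothetical model of the two-area layout; names and bit assignment are
# illustrative assumptions, not taken from the post.
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


class SplitWordArray:
    """Array of 96-bit words backed by two areas of 64-bit words."""

    def __init__(self, n_words):
        self.wide = [0] * n_words                 # low 64 bits of every word
        self.narrow = [0] * ((n_words + 1) // 2)  # two 32-bit tops per entry

    def store(self, i, value):
        assert 0 <= value < (1 << 96)
        self.wide[i] = value & MASK64
        top = value >> 64                # upper 32 bits of the 96-bit word
        shift = 32 * (i & 1)             # even index: low half, odd: high half
        cell = self.narrow[i >> 1] & ~(MASK32 << shift) & MASK64
        self.narrow[i >> 1] = cell | (top << shift)

    def load(self, i):
        shift = 32 * (i & 1)
        top = (self.narrow[i >> 1] >> shift) & MASK32
        return (top << 64) | self.wide[i]
```

Fetching two blocks of `wide` and the matching block of `narrow` yields twice as many 96-bit words as one block holds 64-bit words, which is the ratio described above.
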
By itself, this technique isn't very flexible. One could multiply the number of
words in a block by two again to handle 80-bit words or 112-bit words if one
wanted.
But if one wants memory organized around a 12-bit fundamental unit to
efficiently handle 36-bit and 60-bit data items... the idea of using dual-
channel memory to allow handling unaligned data easily is applicable, but it
should be applied at the appropriate level. (Although it can also be applied
to main memory as well, perhaps with some complexity.)
If there's an L2 cache organized with cache lines made up out of 96-bit
words for this type of computing, make that cache dual-channel!

John Savard

Re: Changing the Width of Memory is Easy

<Kdx*z7UMy@news.chiark.greenend.org.uk>

https://www.novabbs.com/devel/article-flat.php?id=24952&group=comp.arch#24952

From: theom+n...@chiark.greenend.org.uk (Theo Markettos)
Newsgroups: comp.arch
Subject: Re: Changing the Width of Memory is Easy
Date: 29 Apr 2022 11:18:12 +0100 (BST)
Organization: University of Cambridge, England
Lines: 33
Message-ID: <Kdx*z7UMy@news.chiark.greenend.org.uk>
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
NNTP-Posting-Host: chiark.greenend.org.uk
X-Trace: chiark.greenend.org.uk 1651227494 23356 212.13.197.229 (29 Apr 2022 10:18:14 GMT)
X-Complaints-To: abuse@chiark.greenend.org.uk
NNTP-Posting-Date: Fri, 29 Apr 2022 10:18:14 +0000 (UTC)
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/3.16.0-11-amd64 (x86_64))
Originator: theom@chiark.greenend.org.uk ([212.13.197.229])
 by: Theo Markettos - Fri, 29 Apr 2022 10:18 UTC

Quadibloc <jsavard@ecn.ab.ca> wrote:
> On further reflection, I see that I have been overthinking things a bit.
> How can one get 48-bit words in a 64-bit world?
> If one is reconciled to the fact that DRAM is only efficient when blocks of
> memory are accessed, then it should be clear that the problem goes away.

I haven't read the earlier thread, if this came up there, but I'm not sure
what the problem is.

Your DRAM *bus* might be 64 bits wide, but your DRAM rows aren't. Your LLC
would ideally be as wide as your DRAM row or, failing that, a power-of-two
fraction of it (half, quarter, etc.). When you do a cache miss it is most efficient to pull
the entire DRAM row in a burst. Let's say your LLC is 512 or 1024 bits
wide, so that's the size of data you fetch in a DRAM transaction.

What you do above your LLC is up to you. If you want to have 48 bit words,
you can make your L1, L2, etc a multiple of that size.

Most of the time a fetch will come from a single LLC line, and that's fine.
In rare cases they will span two LLC lines, and you need to deal with that
(which is an annoying state machine but not intractable).

The width of the external memory makes no difference - you can do the
same with 32 or 128 bit wide memory. Multiple memory channels don't really
help you (the timings on each channel may be different, and it's better not
to have to wait for the 'slow' channel).

> If there's an L2 cache organized with cache lines made up out of 96-bit
> words for this type of computing, make that cache dual-channel!

What does dual channel buy you that single channel doesn't?

Theo

Re: Changing the Width of Memory is Easy

<5198c09e-a4fb-4e8e-be24-180e1dfefe32n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24953&group=comp.arch#24953

Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 05:02:41 -0700 (PDT)
In-Reply-To: <Kdx*z7UMy@news.chiark.greenend.org.uk>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:98cb:c316:8e74:644c;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:98cb:c316:8e74:644c
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com> <Kdx*z7UMy@news.chiark.greenend.org.uk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <5198c09e-a4fb-4e8e-be24-180e1dfefe32n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: already5...@yahoo.com (Michael S)
Injection-Date: Fri, 29 Apr 2022 12:02:41 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 45
 by: Michael S - Fri, 29 Apr 2022 12:02 UTC

On Friday, April 29, 2022 at 1:18:18 PM UTC+3, Theo Markettos wrote:
> Quadibloc <jsa...@ecn.ab.ca> wrote:
> > On further reflection, I see that I have been overthinking things a bit.
> > How can one get 48-bit words in a 64-bit world?
> > If one is reconciled to the fact that DRAM is only efficient when blocks of
> > memory are accessed, then it should be clear that the problem goes away.
> I haven't read it if this came up as a previous thread, but I'm not sure
> what the problem is.
>

I'm not sure that John can explain what the problem is :(

> Your DRAM *bus* might be 64 bits wide, but your DRAM rows aren't. Your LLC
> would ideally be as wide as your DRAM row, or if not a power of two fraction
> (half, quarter, etc). When you do a cache miss it is most efficient to pull
> the entire DRAM row in a burst. Let's say your LLC is 512 or 1024 bits
> wide, so that's the size of data you fetch in a DRAM transaction.
>

What you call LLC width, most people call cache line size. And, indeed, the
most popular sizes nowadays are 64B (512 bits) and 128B (1024 bits).
Both are much smaller than the DRAM DIMM row (page) size, which tends to
be 8 KB, but for some server-class high-capacity DIMMs that are made
of x4 individual devices it can be as high as 16 KB.

> What you do above your LLC is up to you. If you want to have 48 bit words,
> you can make your L1, L2, etc a multiple of that size.
>
> Most of the time a fetch will come from a single LLC line, and that's fine.
> In rare cases they will span two LLC lines, and you need to deal with that
> (which is an annoying state machine but not intractable).

I'd think that in modern designs the LLC does not see requests that span two lines.
Even the L2 cache does not see them. They are split at the level of the core itself
or, at worst, of L1D.

>
> It makes no difference the width of the external memory - you can do the
> same with 32 or 128 bit wide memory. Multiple memory channels don't really
> help you (the timings on each channel may be different, and it's better not
> to have to wait for the 'slow' channel).
> > If there's an L2 cache organized with cache lines made up out of 96-bit
> > words for this type of computing, make that cache dual-channel!
> What does dual channel buy you that single channel doesn't?
>
> Theo

Re: Changing the Width of Memory is Easy

<Mdx*uxVMy@news.chiark.greenend.org.uk>

https://www.novabbs.com/devel/article-flat.php?id=24954&group=comp.arch#24954

From: theom+n...@chiark.greenend.org.uk (Theo Markettos)
Newsgroups: comp.arch
Subject: Re: Changing the Width of Memory is Easy
Date: 29 Apr 2022 13:17:22 +0100 (BST)
Organization: University of Cambridge, England
Lines: 39
Message-ID: <Mdx*uxVMy@news.chiark.greenend.org.uk>
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com> <Kdx*z7UMy@news.chiark.greenend.org.uk> <5198c09e-a4fb-4e8e-be24-180e1dfefe32n@googlegroups.com>
NNTP-Posting-Host: chiark.greenend.org.uk
X-Trace: chiark.greenend.org.uk 1651234644 13538 212.13.197.229 (29 Apr 2022 12:17:24 GMT)
X-Complaints-To: abuse@chiark.greenend.org.uk
NNTP-Posting-Date: Fri, 29 Apr 2022 12:17:24 +0000 (UTC)
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/3.16.0-11-amd64 (x86_64))
Originator: theom@chiark.greenend.org.uk ([212.13.197.229])
 by: Theo Markettos - Fri, 29 Apr 2022 12:17 UTC

Michael S <already5chosen@yahoo.com> wrote:
> On Friday, April 29, 2022 at 1:18:18 PM UTC+3, Theo Markettos wrote:
> > Your DRAM *bus* might be 64 bits wide, but your DRAM rows aren't. Your LLC
> > would ideally be as wide as your DRAM row, or if not a power of two fraction
> > (half, quarter, etc). When you do a cache miss it is most efficient to pull
> > the entire DRAM row in a burst. Let's say your LLC is 512 or 1024 bits
> > wide, so that's the size of data you fetch in a DRAM transaction.
> >
>
> What you call LLC width, most people call cache line size. And, indeed,
> most popular sizes nowadays are 64B (512 bits) and 128B (1024 bits).
> Both much smaller than DRAM DIMM row (page) size which tends to
> be 8 KB, but for some server-class high-capacity DIMMs that are made
> of x4 individual devices can be as high as 16 KB.

Agreed, depending on the density of the DRAM chip. You want to get as much
of a row as your line size allows.

> > What you do above your LLC is up to you. If you want to have 48 bit words,
> > you can make your L1, L2, etc a multiple of that size.
> >
> > Most of the time a fetch will come from a single LLC line, and that's fine.
> > In rare cases they will span two LLC lines, and you need to deal with that
> > (which is an annoying state machine but not intractable).
>
> I'd think, in modern designs LLC does not see requests that span two lines.
> Even L2 cache does not see them. They are split at the level of core itself
> or, at worst, of L1D.

John is proposing a scheme where words aren't power-of-two sized, e.g. 48
bits. 48 = 3*2^4, so that prime factor of 3 will mean there will inevitably
be some words that cross a 2^N sized line, irrespective of how you structure
your power-of-two sized caches. My point is to attack that in two ways: we
make N large so that we reduce the probability that any arbitrary access
will cross the boundary, and then we build something to make a two-line
access atomic when that crossing happens and you need to cover that corner
case. This is annoying and potentially slow, but not fundamentally hard.
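
To put a number on "reduce the probability", here is a small Python sketch. It assumes the 48-bit words are packed back to back as 6-byte items starting at address 0; the function name and that packing are assumptions made for illustration only.

```python
from fractions import Fraction
from math import gcd


def crossing_fraction(word_bytes, line_bytes):
    """Fraction of densely packed words that straddle a line boundary."""
    # The pattern repeats every lcm(word_bytes, line_bytes) bytes.
    period = word_bytes * line_bytes // gcd(word_bytes, line_bytes)
    words = period // word_bytes
    crossing = 0
    for i in range(words):
        first_byte = i * word_bytes
        last_byte = first_byte + word_bytes - 1
        if first_byte // line_bytes != last_byte // line_bytes:
            crossing += 1
    return Fraction(crossing, words)


for line in (64, 128, 256):
    print(line, crossing_fraction(6, line))
```

Under those assumptions, doubling the line size halves the share of words that need the two-line path, but it never reaches zero.
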

Theo

Re: Changing the Width of Memory is Easy

<2022Apr29.124153@mips.complang.tuwien.ac.at>

https://www.novabbs.com/devel/article-flat.php?id=24955&group=comp.arch#24955

From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.arch
Subject: Re: Changing the Width of Memory is Easy
Date: Fri, 29 Apr 2022 10:41:53 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 58
Message-ID: <2022Apr29.124153@mips.complang.tuwien.ac.at>
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com> <Kdx*z7UMy@news.chiark.greenend.org.uk>
Injection-Info: reader02.eternal-september.org; posting-host="730bad5f58573e73be5bb41042daa406";
logging-data="12408"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18ulyZ4Mf7wz5cKu5nbEo3c"
Cancel-Lock: sha1:mlKFUuaCE6Ip8iS51vyrupeEcy8=
X-newsreader: xrn 10.00-beta-3
 by: Anton Ertl - Fri, 29 Apr 2022 10:41 UTC

Theo Markettos <theom+news@chiark.greenend.org.uk> writes:
>Your LLC
>would ideally be as wide as your DRAM row,

It seems to me that cache line sizes are very much influenced by
spatial locality (or the lack of it). We have seen bandwidths soar
and latencies stagnate, so one might expect cache lines to become
longer, yet cache lines on general-purpose machines have stayed at 64
bytes, i.e., 512 bits (which are transferred in 2.5ns on DDR4-3200 and
DDR5-6400, compared to ~50ns load-to-load latency); with a typical
8-device-per-rank DIMM, this is 64 bits per device (transferred in 8
beats of 8 bits with DDR4, 16 beats of 4 bits with DDR5).

Looking at Figure 3 of
<https://www.micron.com/-/media/client/global/documents/products/data-sheet/dram/ddr4/16gb_ddr4_sdram.pdf>,
I see that the row width of a 16Gb (2Gx8b) device is 8Kb (depending on
the device), i.e. 128 times as long as that device's part of the cache
line.

If you wanted to transfer the whole row, this would take 320ns. This
might be ok if we had very good spatial locality. But I think the
problem is that spatial locality is not so good, and having such long
cache lines would result in the caches having many fewer cache lines
than they do now, and with not-so-great spatial locality, the amount
of useful data in the cache would be much smaller than with 64-byte
cache lines.
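
The burst arithmetic can be checked with a few lines of Python. This is only a sketch: the function name is made up, and it assumes a 64-bit data bus for DDR4, a 32-bit sub-channel for DDR5, and a full row of 8 devices times 8Kb each, ignoring all command and activation overhead.

```python
def burst_time_ns(payload_bits, bus_bits, mega_transfers_per_s):
    """Time to move payload_bits over the bus, data beats only."""
    beats = payload_bits // bus_bits
    return beats * 1000 / mega_transfers_per_s


# Assumed bus widths: 64 bits (DDR4 channel), 32 bits (DDR5 sub-channel).
ddr4_line = burst_time_ns(512, 64, 3200)       # one 64-byte cache line
ddr5_line = burst_time_ns(512, 32, 6400)       # same line, 16 beats
ddr4_row = burst_time_ns(8 * 8192, 64, 3200)   # 8 devices x 8Kb row each

print(ddr4_line, ddr5_line, ddr4_row)
```

With these inputs the whole-row transfer comes out 128 times longer than a single line, matching the 320ns figure.
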

> When you do a cache miss it is most efficient to pull
>the entire DRAM row in a burst. Let's say your LLC is 512 or 1024 bits
>wide, so that's the size of data you fetch in a DRAM transaction.

Note that for a DIMM made up of 8 of the 2Gx8b devices above, a row (i.e.,
8 times the 8Kb rows of the individual devices) would be 65536 bits.

>What you do above your LLC is up to you.

That would require a cache organization that always loads into the LLC.
The LLC could not be treated as a victim cache.

>What does dual channel buy you that single channel doesn't?

In the old days (before DDR3) the minimum burst from a single DIMM was
shorter than a cache line, so with dual-channel organization you could
reduce the transfer time of a cache line (at the cost of not being
able to serve two independent accesses on the two channels). The faster
the transfer rates got the lower the benefit of dual-channel was; plus
the more cores and hardware prefetchers we got, the more benefits we
see from being able to perform independent accesses on different
channels.

Actually hardware prefetchers seem to be the preferable way to exploit
the spatial locality that long cache lines and dual-channel memory
controllers were also designed to exploit.

- anton
--
'Anyone trying for "industrial quality" ISA should avoid undefined behavior.'
Mitch Alsup, <c17fcd89-f024-40e7-a594-88a85ac10d20o@googlegroups.com>

Re: Changing the Width of Memory is Easy

<4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24956&group=comp.arch#24956

Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 10:06:43 -0700 (PDT)
In-Reply-To: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 29 Apr 2022 17:06:44 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 23
 by: Quadibloc - Fri, 29 Apr 2022 17:06 UTC

On Friday, April 29, 2022 at 3:16:20 AM UTC-6, Quadibloc wrote:

> But if one wants memory organized around a 12-bit fundamental unit to
> efficiently handle 36-bit and 60-bit data items... the idea of using dual-
> channel memory to allow handling unaligned data easily is applicable, but it
> should be applied at the appropriate level. (Although it can also be applied
> to main memory as well, perhaps with some complexity.)
> If there's an L2 cache organized with cache lines made up out of 96-bit
> words for this type of computing, make that cache dual-channel!

Another thing occurred to me. I figured it would also help if the external
memory was dual-channel.
But if my goal is to take advantage of the fact that I can choose which word
in a DRAM line to fetch first, so that I can get at a specific 64-bit word in memory
with less latency, and to take advantage of it for a 60-bit word that happens to cross
between one block of 96-bit words and the next block of 96-bit words... then I'm
going to need quad-channel DRAM connected to the chip, so that I can fetch,
at once, the 64-bit and 32-bit parts of two different blocks.

Oh, dear. The memory organization will have to be complicated, and it doesn't
appear that maximum bandwidth, with all lanes busy at once, is feasible. I may
need six-channel memory.

John Savard

Re: Changing the Width of Memory is Easy

<adc19c68-50cc-445b-9aaa-de22e539aa01n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24957&group=comp.arch#24957

Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 10:08:11 -0700 (PDT)
In-Reply-To: <Kdx*z7UMy@news.chiark.greenend.org.uk>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com> <Kdx*z7UMy@news.chiark.greenend.org.uk>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <adc19c68-50cc-445b-9aaa-de22e539aa01n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 29 Apr 2022 17:08:11 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 9
 by: Quadibloc - Fri, 29 Apr 2022 17:08 UTC

On Friday, April 29, 2022 at 4:18:18 AM UTC-6, Theo Markettos wrote:

> What does dual channel buy you that single channel doesn't?

It allows the entirety of an _unaligned_ word to be fetched in
a single memory read operation. Otherwise, I would have to do
two reads in the case of an unaligned word that crosses a
memory word boundary.
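
A minimal Python sketch of why two channels are enough for this, assuming 64-bit memory words interleaved so that even-numbered words sit on one channel and odd-numbered words on the other. The function name and the interleaving rule are assumptions for illustration.

```python
WORD_BITS = 64  # assumed width of one memory word per channel


def banks_touched(bit_addr, width):
    """Return (channel, row) pairs needed to read `width` bits at bit_addr."""
    first = bit_addr // WORD_BITS
    last = (bit_addr + width - 1) // WORD_BITS
    # Even words live on channel 0, odd words on channel 1.
    return [(w & 1, w >> 1) for w in range(first, last + 1)]


print(banks_touched(40, 48))    # crosses from word 0 into word 1
print(banks_touched(100, 48))   # crosses from word 1 into word 2
```

Any item no wider than one memory word touches at most two consecutive words, and consecutive words always fall on different channels, so both halves can be read in the same cycle.
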

John Savard

Re: Changing the Width of Memory is Easy

<cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24958&group=comp.arch#24958

Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 10:17:13 -0700 (PDT)
In-Reply-To: <4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com> <4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 29 Apr 2022 17:17:14 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 23
 by: Quadibloc - Fri, 29 Apr 2022 17:17 UTC

On Friday, April 29, 2022 at 11:06:45 AM UTC-6, Quadibloc wrote:

> Oh, dear. The memory organization will have to be complicated, and it doesn't
> appear that maximum bandwidth, with all lanes busy at once, is feasible. I may
> need six-channel memory.

I am going to need a divide by three circuit, then, for conventional memory
accesses. But I can save a few pins and go to "four-channel" memory of a sort.

That is, I can have only four address buses out of the CPU. Two will be associated
with 128-bit-wide data buses, and the other two with 64-bit-wide data buses.

So one can, at max bandwidth, fetch six blocks of 64-bit words at a time, or
four blocks of 96-bit words at a time.
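
One possible reading of the "divide by three" remark, sketched in Python: if each 192-bit slice of a channel group holds three 64-bit words (two on the 128-bit bus, one on the 64-bit bus), a conventional word index has to be split into a slice number and a lane. That mapping and the names below are assumptions. The multiply-and-shift form of the division is a standard identity for 32-bit unsigned values.

```python
def div3(n):
    """floor(n / 3) for 0 <= n < 2**32, using one multiply and one shift."""
    return (n * 0xAAAAAAAB) >> 33


def locate_64bit_word(n):
    """Hypothetical mapping of 64-bit word n onto one channel group."""
    row = div3(n)                # which 192-bit slice
    lane = n - 3 * row           # position 0..2 inside the slice
    channel = "wide" if lane < 2 else "narrow"   # 128-bit vs 64-bit bus
    return row, lane, channel


print(locate_64bit_word(3))
print(locate_64bit_word(5))
```

In hardware the constant multiply reduces to a handful of shifted adds, so the divider need not be a general-purpose one.
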

So the effective block size is increased, which is a penalty.

The channels are divided into two groups, each with one 128-bit bus
and one 64-bit bus. And blocks of 96-bit words are grouped in pairs; a
pair goes on one group of channels, and then the next pair goes on the other
group of channels. So one can choose to fetch a pair of blocks, and _either_
the pair before _or_ the pair after - the requirement for handling unaligned data.

John Savard

Re: Changing the Width of Memory is Easy

<b6a59a31-ca3d-4d45-8fe4-3246e390106en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24959&group=comp.arch#24959

Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 10:51:55 -0700 (PDT)
In-Reply-To: <cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com> <cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b6a59a31-ca3d-4d45-8fe4-3246e390106en@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 29 Apr 2022 17:51:56 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 28
 by: Quadibloc - Fri, 29 Apr 2022 17:51 UTC

On Friday, April 29, 2022 at 11:17:16 AM UTC-6, Quadibloc wrote:
> On Friday, April 29, 2022 at 11:06:45 AM UTC-6, Quadibloc wrote:
>
> > Oh, dear. The memory organization will have to be complicated, and it doesn't
> > appear that maximum bandwidth, with all lanes busy at once, is feasible. I may
> > need six-channel memory.
> I am going to need a divide by three circuit, then, for conventional memory
> accesses. But I can save a few pins and go to "four-channel" memory of a sort.
>
> That is, I can have only four address buses out of the CPU. Two will be associated
> with 128-bit-wide data buses, and the other two with 64-bit-wide data buses.
>
> So one can, at max bandwidth, fetch six blocks of 64-bit words at a time, or
> four blocks of 96-bit words at a time.
>
> So the effective block size is increased, which is a penalty.
>
> The channels are divided into two groups, each with one 128-bit bus
> and one 64-bit bus. And blocks of 96-bit words are grouped in pairs; a
> pair goes on one group of channels, and then the next pair goes on the other
> group of channels. So one can choose to fetch a pair of blocks, and _either_
> the pair before _or_ the pair after - the requirement for handling unaligned data.

But what about the case when unaligned data is _inside_ a single block, instead
of crossing block boundaries? Then, we don't want to fetch two consecutive words
in a single block. So we need to have the odd words on one side and the even
words on the other side.

John Savard

Re: Changing the Width of Memory is Easy

<t4h8up$m3f$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24960&group=comp.arch#24960

From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Changing the Width of Memory is Easy
Date: Fri, 29 Apr 2022 12:57:38 -0500
Organization: A noiseless patient Spider
Lines: 114
Message-ID: <t4h8up$m3f$1@dont-email.me>
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<Kdx*z7UMy@news.chiark.greenend.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Apr 2022 17:57:45 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="9aa8fcc635b3f40b88cb63f89e102c5e";
logging-data="22639"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+6ZwUrNMlNvZqvOkdpr9sD"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.8.0
Cancel-Lock: sha1:kYKuG4Q0VcqNEuTsZrI8ViqtsgI=
In-Reply-To: <Kdx*z7UMy@news.chiark.greenend.org.uk>
Content-Language: en-US
 by: BGB - Fri, 29 Apr 2022 17:57 UTC

On 4/29/2022 5:18 AM, Theo Markettos wrote:
> Quadibloc <jsavard@ecn.ab.ca> wrote:
>> On further reflection, I see that I have been overthinking things a bit.
>> How can one get 48-bit words in a 64-bit world?
>> If one is reconciled to the fact that DRAM is only efficient when blocks of
>> memory are accessed, then it should be clear that the problem goes away.
>
> I haven't read it if this came up as a previous thread, but I'm not sure
> what the problem is.
>
> Your DRAM *bus* might be 64 bits wide, but your DRAM rows aren't. Your LLC
> would ideally be as wide as your DRAM row, or if not a power of two fraction
> (half, quarter, etc). When you do a cache miss it is most efficient to pull
> the entire DRAM row in a burst. Let's say your LLC is 512 or 1024 bits
> wide, so that's the size of data you fetch in a DRAM transaction.
>

With DIMMs, maybe.
With typical FPGA boards, it is more often 4/8/16 bits.
You might get 32 bits if buying a more expensive FPGA board.

In my case, L2 line size is 64B (512 bits), mostly because this had
lower overheads for DRAM transfers when compared to 16B (128 bits).

This difference increases raw RAM bandwidth by a factor of around 3x.

> What you do above your LLC is up to you. If you want to have 48 bit words,
> you can make your L1, L2, etc a multiple of that size.
>
> Most of the time a fetch will come from a single LLC line, and that's fine.
> In rare cases they will span two LLC lines, and you need to deal with that
> (which is an annoying state machine but not intractable).
>
> It makes no difference the width of the external memory - you can do the
> same with 32 or 128 bit wide memory. Multiple memory channels don't really
> help you (the timings on each channel may be different, and it's better not
> to have to wait for the 'slow' channel).
>

NPOT (non-power-of-two) lines are possible, though they would be annoying to work with IMO.

It is extra annoying if one wants, say, memory that is "almost" a power
of 2, but then wants a collection of tag bits per cache line (such as
would be needed for implementing tripwires and/or capabilities).

At some level (probably in the L2), one needs to deal with a sort of
indirection mechanism such that they can have a separate sub-cache of
tag-bits lines or similar.

Arguably, a 3:2 split (3x DRAM lines become 2x logical lines) is an
easier case; here it is more of an addressing quirk.

Though, one would need to do multiple DRAM transfers in cases where the
group-of-2 crosses a row boundary, vs non-crossing cases where it could
probably be done as a single burst.

Or pay a speed penalty, and always do 3 DRAM transfers (whether or not
it would cross a boundary).

This is less of an issue with power-of-2 line sizes since rows are a
power-of-2, and thus a line will never cross a boundary.
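
The row-crossing case for a 3:2 split can be sketched as below. The 64-byte DRAM line and 2048-byte row are assumed values picked for illustration, and the function names are hypothetical; the only thing taken from the discussion is that each group of 2 logical lines occupies 3 consecutive DRAM lines.

```python
DRAM_LINE_BYTES = 64   # assumed DRAM-side line size
ROW_BYTES = 2048       # hypothetical row size (a power of 2)
LINES_PER_ROW = ROW_BYTES // DRAM_LINE_BYTES

def group_dram_lines(group):
    """DRAM lines backing one group of 2 logical lines."""
    first = group * 3
    return [first, first + 1, first + 2]

def crosses_row(group):
    """True if the 3 DRAM lines of this group span two rows."""
    lines = group_dram_lines(group)
    return lines[0] // LINES_PER_ROW != lines[-1] // LINES_PER_ROW

def bursts_needed(group):
    """One burst if the group sits in a single row, else two."""
    return 2 if crosses_row(group) else 1

crossing = [g for g in range(32) if crosses_row(g)]
print(crossing)
```

With 32 lines per row, only the groups that straddle a multiple of 32 need the second burst; everything else could go out as a single burst.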

Potentially, the logic in the DDR controller would be similar in both
cases, just in the "crosses row boundary" case, it would close and
re-open the row for each burst transfer.

This doesn't really make sense in my case though, since the RAM is
power-of-2, apart from a few cases where tag bits might help.

Though, even in my emulator, I ended up disabling a few features which
had used memory tagging, mostly as they turned out to be incompatible
with my swapfile mechanism.

I would need, in effect, to figure out a good way to turn the tag bits
into explicit architectural state which can also be remapped and/or
saved/restored with the swapped pages.

This would be roughly an additional 512B of state for each 16K page.
Simplest option would be making the swapfile pages also NPOT, but this
would likely be worse for the SDcard (main alternative being, say, to
cut off the end of the swapfile and use it mostly for "tag-bits pages").

Though, this turns it less into an issue of "make cache lines slightly
bigger", and more "there is some memory 'over there' which is mapped to
holding tag bits".
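
A minimal sketch of the "cut off the end of the swapfile" option, assuming the 16K page and 512B-of-tags-per-page figures above. The layout (data pages first, tag pages after them) and the function names are hypothetical, not an existing implementation.

```python
PAGE_BYTES = 16 * 1024
TAG_BYTES_PER_PAGE = 512
DATA_PAGES_PER_TAG_PAGE = PAGE_BYTES // TAG_BYTES_PER_PAGE  # 32

def split_swapfile(total_pages):
    """Return (data_pages, tag_pages) for a swapfile of total_pages.

    One tag page is reserved for every 33 pages (rounded up), which
    always leaves enough tag space for the remaining data pages.
    """
    group = DATA_PAGES_PER_TAG_PAGE + 1
    tag_pages = -(-total_pages // group)  # ceiling division
    return total_pages - tag_pages, tag_pages

def tag_location(data_page, total_pages):
    """Byte offset in the swapfile of the tag bits for data_page."""
    data_pages, _ = split_swapfile(total_pages)
    if not 0 <= data_page < data_pages:
        raise IndexError("data page out of range")
    tag_page = data_pages + data_page // DATA_PAGES_PER_TAG_PAGE
    offset = (data_page % DATA_PAGES_PER_TAG_PAGE) * TAG_BYTES_PER_PAGE
    return tag_page * PAGE_BYTES + offset

print(split_swapfile(33), tag_location(31, 33))
```

This keeps every swapfile page a power-of-2 size; the cost is that swapping a page in or out also touches a second location for its tag bits.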

Ironically, due to a quirk of my memory addressing, ~ 16MB of the RAM is
in part of a DMZ area (not readily addressable otherwise). Of this, some
of this area is used for VRAM, but the space is big enough that I could
potentially also map tag-bits RAM into this area, if I did decide to
make it part of the architecture proper.

Would still need to add a mechanism though for dealing with the tag-bits
cache, effectively another small cache glued onto the main L2 cache.

>> If there's an L2 cache organized with cache lines made up out of 96-bit
>> words for this type of computing, make that cache dual-channel!
>
> What does dual channel buy you that single channel doesn't?
>
> Theo

Re: Changing the Width of Memory is Easy

<e69a2ae5-5ac8-4dcd-bd56-efd93b14faafn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24961&group=comp.arch#24961
X-Received: by 2002:ac8:7f4d:0:b0:2f1:f967:52bd with SMTP id g13-20020ac87f4d000000b002f1f96752bdmr647848qtk.597.1651257293745;
Fri, 29 Apr 2022 11:34:53 -0700 (PDT)
X-Received: by 2002:a05:6808:6d7:b0:325:67ff:a21b with SMTP id
m23-20020a05680806d700b0032567ffa21bmr334773oih.105.1651257293517; Fri, 29
Apr 2022 11:34:53 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 11:34:53 -0700 (PDT)
In-Reply-To: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:7802:3ad8:9b8d:ecdd;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:7802:3ad8:9b8d:ecdd
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e69a2ae5-5ac8-4dcd-bd56-efd93b14faafn@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Fri, 29 Apr 2022 18:34:53 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 35
 by: MitchAlsup - Fri, 29 Apr 2022 18:34 UTC

On Friday, April 29, 2022 at 4:16:20 AM UTC-5, Quadibloc wrote:
> On further reflection, I see that I have been overthinking things a bit.
> How can one get 48-bit words in a 64-bit world?
<
? <
> If one is reconciled to the fact that DRAM is only efficient when blocks of
> memory are accessed, then it should be clear that the problem goes away.
<
This is the wrong way of thinking about the problem:: The right way::
If you pay 20ns to get at the data you require (RAS),
You should spend 20ns funneling data in/out (multiple CAS).
This has nothing to do with the width of DRAM (except that a DRAM Word is
......big enough to support that long a data burst).
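
As a worked example of that rule of thumb, the sketch below computes how much data fits in a transfer window equal to the access latency. The bus widths and transfer rates are assumptions picked for illustration, and `burst_for_window` is a hypothetical helper.

```python
def burst_for_window(window_ns, bus_bits, mega_transfers_per_s):
    """Return (transfers, bytes) that fit in window_ns on the given bus.

    mega_transfers_per_s is the assumed data rate in MT/s.
    """
    transfers = window_ns * mega_transfers_per_s // 1000
    return transfers, transfers * bus_bits // 8

# Assumed: 20 ns window, 64-bit bus at 1600 MT/s.
print(burst_for_window(20, 64, 1600))
# Assumed: same window on a 16-bit bus at 800 MT/s.
print(burst_for_window(20, 16, 800))
```

The point of the arithmetic is that the sensible burst length follows from latency times data rate, whatever the bus width happens to be.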
<
> To allocate an area of 96-bit memory words, allocate one area of the same
> number of 64-bit memory words, and another area, half as large, of 64-bit
> memory words. Access two blocks of words from the first area, and one
> block of words from the second area - and one has a block, with twice as
> many words in it of the 96-bit size.
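
The two-area allocation described in the quoted paragraph can be sketched as follows. This is one possible reading, assuming the low 64 bits of each 96-bit word go in the first area and the remaining 32 bits are packed two per 64-bit word in the second area; the names are hypothetical.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1

def store_words(words):
    """Split 96-bit words across two areas of 64-bit words."""
    area_a = [w & MASK64 for w in words]         # same count as words
    area_b = [0] * ((len(words) + 1) // 2)       # half as large
    for i, w in enumerate(words):
        high = (w >> 64) & MASK32
        area_b[i // 2] |= high << (32 * (i % 2))
    return area_a, area_b

def load_word(area_a, area_b, i):
    """Reassemble the i-th 96-bit word from the two areas."""
    high = (area_b[i // 2] >> (32 * (i % 2))) & MASK32
    return (high << 64) | area_a[i]

words = [(1 << 96) - 1, 0x123456789ABCDEF012345678, 5]
a, b = store_words(words)
print([hex(load_word(a, b, i)) for i in range(len(words))])
```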
<
What do you do when you have an array of structs, each field of which is a
different number of bits in size ?
<
> By itself, this technique isn't very flexible. One could multiply the number of
> words in a block by two again to handle 80-bit words or 112-bit words if one
> wanted.
> But if one wants memory organized around a 12-bit fundamental unit to
> efficiently handle 36-bit and 60-bit data items... the idea of using dual-
> channel memory to allow handling unaligned data easily is applicable, but it
> should be applied at the appropriate level. (Although it can also be applied
> to main memory as well, perhaps with some complexity.)
> If there's an L2 cache organized with cache lines made up out of 96-bit
> words for this type of computing, make that cache dual-channel!
>
> John Savard

Re: Changing the Width of Memory is Easy

<8df51e7a-71a6-459d-ad23-8ea50f33a5d7n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24962&group=comp.arch#24962
X-Received: by 2002:a05:622a:1827:b0:2f3:6d90:1504 with SMTP id t39-20020a05622a182700b002f36d901504mr720478qtc.268.1651258013687;
Fri, 29 Apr 2022 11:46:53 -0700 (PDT)
X-Received: by 2002:a05:6870:d1cd:b0:e1:e7ee:faa0 with SMTP id
b13-20020a056870d1cd00b000e1e7eefaa0mr1940241oac.5.1651258013336; Fri, 29 Apr
2022 11:46:53 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 11:46:53 -0700 (PDT)
In-Reply-To: <e69a2ae5-5ac8-4dcd-bd56-efd93b14faafn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com> <e69a2ae5-5ac8-4dcd-bd56-efd93b14faafn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8df51e7a-71a6-459d-ad23-8ea50f33a5d7n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 29 Apr 2022 18:46:53 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 19
 by: Quadibloc - Fri, 29 Apr 2022 18:46 UTC

On Friday, April 29, 2022 at 12:34:55 PM UTC-6, MitchAlsup wrote:

> What do you do when you have an array of structs, each field of which is a
> different number of bits in size ?

I don't modify the memory to accommodate this, except for the fact
that the memory can handle unaligned data items with no additional
delay, any more than conventional computer systems would do so.

This would be true both for 64-bit mode memory blocks and 96-bit mode
memory blocks.

The goal is simply to make it transparent to the user that the memory
happens to be built out of 64-bit DRAM modules instead of 96-bit
DRAM modules when a program is being run that uses 36, 48, and 60-bit
floating-point numbers. Arrays of _those_ have to run at the full blazing
speed as if the computer was built around them. Structs, designed
by the user, of course might be more awkward to handle.

John Savard

Re: Changing the Width of Memory is Easy

<t4hd6u$pll$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24963&group=comp.arch#24963
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Changing the Width of Memory is Easy
Date: Fri, 29 Apr 2022 14:10:15 -0500
Organization: A noiseless patient Spider
Lines: 69
Message-ID: <t4hd6u$pll$1@dont-email.me>
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com>
<cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 29 Apr 2022 19:10:22 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="9aa8fcc635b3f40b88cb63f89e102c5e";
logging-data="26293"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18oivv5wLw7cYT42/WeTEFK"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.8.0
Cancel-Lock: sha1:pq4FyEb+fQinroKVc/BROoxTrgI=
In-Reply-To: <cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
Content-Language: en-US
 by: BGB - Fri, 29 Apr 2022 19:10 UTC

On 4/29/2022 12:17 PM, Quadibloc wrote:
> On Friday, April 29, 2022 at 11:06:45 AM UTC-6, Quadibloc wrote:
>
>> Oh, dear. The memory organization will have to be complicated, and it doesn't
>> appear that maximum bandwidth, with all lanes busy at once, is feasible. I may
>> need six-channel memory.
>
> I am going to need a divide by three circuit, then, for conventional memory
> accesses. But I can save a few pins and go to "four-channel" memory of a sort.
>

Presumably, one could do a mapping like:
P_LineB = (V_Line>>1)*3

So, V_Line 0/1 map to P_Line 0/1/2, 2/3 to 3/4/5, ...
The LSB of V_Line selecting whether to use the low or high half of this
3-line pair.
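
Spelled out as code, the mapping looks roughly like this. `physical_base` is the P_LineB formula above; `half_line_span` is one interpretation of "low or high half", measured in units of half a physical line, and is an assumption rather than a fixed design.

```python
def physical_base(v_line):
    """First physical line of the 3-line group holding v_line."""
    return (v_line >> 1) * 3

def half_line_span(v_line):
    """(start, end) of v_line in half-physical-line units.

    Each logical line is 1.5 physical lines, i.e. 3 half-lines; the
    LSB of v_line selects the low or high 3 of the group's 6.
    """
    start = physical_base(v_line) * 2 + 3 * (v_line & 1)
    return start, start + 3

for v in range(4):
    print(v, physical_base(v), half_line_span(v))
```

So V_Line 0 covers all of P_Line 0 plus the low half of P_Line 1, and V_Line 1 covers the high half of P_Line 1 plus all of P_Line 2.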

Though, this is assuming that the addressing is itself relative to the
non-power-of-2 size.

If you want a non-power-of-2 line-size with power-of-2 addressing, that
is gonna suck.

One could have non-power-of-2 memory accesses, though, without switching
to non-power-of-2 cache line sizes.

Either:
L1 cache supports non-power-of-2 access sizes, at byte alignment.
Or:
L1 cache itself does the address-remapping.

Though, the latter would get a bit wonky for things like virtual memory.

The former is more like:
Well, RAM addressing is byte-based, but supports 24, 48 and 96 bit
load/store operations or similar.

This is pretty doable, but would come at the cost of giving up any real
pretense of traditional power-of-2 struct member alignment (either that
or design overly convoluted rules for struct member alignment; probably
easier to be like, "all structs are packed, deal with it...").
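
A minimal sketch of the byte-addressed option, with 24, 48 and 96 bit loads and stores at arbitrary byte alignment. Little-endian byte order and the helper names are assumptions made for the example.

```python
def store_bits(mem, byte_addr, nbits, value):
    """Store an nbits-wide value (nbits a multiple of 8) at byte_addr."""
    nbytes = nbits // 8
    masked = value & ((1 << nbits) - 1)
    mem[byte_addr:byte_addr + nbytes] = masked.to_bytes(nbytes, "little")

def load_bits(mem, byte_addr, nbits):
    """Load an nbits-wide value (nbits a multiple of 8) from byte_addr."""
    nbytes = nbits // 8
    return int.from_bytes(mem[byte_addr:byte_addr + nbytes], "little")

# A packed "struct": a 24-bit, a 48-bit and a 96-bit member back to back.
mem = bytearray(21)
store_bits(mem, 0, 24, 0xABCDEF)
store_bits(mem, 3, 48, 0x112233445566)
store_bits(mem, 9, 96, (1 << 96) - 2)
print(hex(load_bits(mem, 3, 48)))
```

The members land at byte offsets 0, 3 and 9, which is the "all structs are packed" layout: nothing is aligned to a power of 2.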

> That is, I can have only four address buses out of the CPU. Two will be associated
> with 128-bit-wide data buses, and the other two with 64-bit-wide data buses.
>
> So one can, at max bandwidth, fetch six blocks of 64-bit words at a time, or
> four blocks of 96-bit words at a time.
>
> So the effective block size is increased, which is a penalty.
>
> The channels are divided into two groups, each having one channel with a 128-bit bus
> and one with a 64-bit bus. And blocks of 96-bit words are grouped in pairs; a
> pair goes on one group of channels, and then the next pair goes on the other
> group of channels. So one can choose to fetch a pair of blocks, and _either_
> the pair before _or_ the pair after - the requirement for handling unaligned data.
>

Not really sure I understand what you are thinking of here.

> John Savard

Re: Changing the Width of Memory is Easy

<dba9e4bd-8175-42fc-aaa6-dc973dc26899n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24964&group=comp.arch#24964
X-Received: by 2002:a05:6214:1cc4:b0:435:b8a0:1fe9 with SMTP id g4-20020a0562141cc400b00435b8a01fe9mr1151076qvd.54.1651268968792;
Fri, 29 Apr 2022 14:49:28 -0700 (PDT)
X-Received: by 2002:a05:6870:d254:b0:e9:5d17:9e35 with SMTP id
h20-20020a056870d25400b000e95d179e35mr602795oac.154.1651268968588; Fri, 29
Apr 2022 14:49:28 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 14:49:28 -0700 (PDT)
In-Reply-To: <t4hd6u$pll$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com> <cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
<t4hd6u$pll$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <dba9e4bd-8175-42fc-aaa6-dc973dc26899n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Fri, 29 Apr 2022 21:49:28 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 28
 by: Quadibloc - Fri, 29 Apr 2022 21:49 UTC

On Friday, April 29, 2022 at 1:10:25 PM UTC-6, BGB wrote:

> > The channels are divided into two groups, each having one channel with a 128-bit bus
> > and one with a 64-bit bus. And blocks of 96-bit words are grouped in pairs; a
> > pair goes on one group of channels, and then the next pair goes on the other
> > group of channels. So one can choose to fetch a pair of blocks, and _either_
> > the pair before _or_ the pair after - the requirement for handling unaligned data.

> Not really sure I understand what you are thinking of here.

With further thought, I see that if I don't need to handle unaligned items wider
than 60 bits, I don't need to have the memory with that wide a path to the CPU.

Essentially, I'm looking for a way to have a system:

Suitable for operating either with all power-of-2 sizes, or sizes that are of the
form 3 times a power of 2. In either case, addresses are binary.

At first, I thought it would work well with memory connected for the power
of 2 case, but then I realized that wouldn't quite work. I would need at least
a dual-channel connection to handle unaligned items.

And once that's the case, there would be a conflict with sizes including the
factor 3 that would lead to bandwidth inefficiencies. Or at least so it seemed.
Maybe I still have to think harder about this, and perhaps an appropriate scheme
of memory allocation could indeed allow things to run at full speed, except for
the penalty of a larger block size.

John Savard

Re: Changing the Width of Memory is Easy

<f3560cf3-b900-4778-b206-e60e5bd597f1n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24965&group=comp.arch#24965
X-Received: by 2002:a05:6214:e6a:b0:446:154a:7e02 with SMTP id jz10-20020a0562140e6a00b00446154a7e02mr2096585qvb.82.1651301811633;
Fri, 29 Apr 2022 23:56:51 -0700 (PDT)
X-Received: by 2002:a05:6830:1259:b0:605:d104:fa9e with SMTP id
s25-20020a056830125900b00605d104fa9emr1075828otp.298.1651301811404; Fri, 29
Apr 2022 23:56:51 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.160.216.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Fri, 29 Apr 2022 23:56:51 -0700 (PDT)
In-Reply-To: <dba9e4bd-8175-42fc-aaa6-dc973dc26899n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com> <cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
<t4hd6u$pll$1@dont-email.me> <dba9e4bd-8175-42fc-aaa6-dc973dc26899n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <f3560cf3-b900-4778-b206-e60e5bd597f1n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 30 Apr 2022 06:56:51 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Quadibloc - Sat, 30 Apr 2022 06:56 UTC

On Friday, April 29, 2022 at 3:49:30 PM UTC-6, Quadibloc wrote:

> With further thought, I see that if I don't need to handle unaligned items wider
> than 60 bits, I don't need to have the memory with that wide a path to the CPU.
>
> Essentially, I'm looking for a way to have a system:
>
> Suitable for operating either with all power-of-2 sizes, or sizes that are of the
> form 3 times a power of 2. In either case, addresses are binary.
>
> At first, I thought it would work well with memory connected for the power
> of 2 case, but then I realized that wouldn't quite work. I would need at least
> a dual-channel connection to handle unaligned items.
>
> And once that's the case, there would be a conflict with sizes including the
> factor 3 that would lead to bandwidth inefficiencies. Or at least so it seemed.
> Maybe I still have to think harder about this, and perhaps an appropriate scheme
> of memory allocation could indeed allow things to run at full speed, except for
> the penalty of a larger block size.

I have finally organized my thoughts on this matter, and come up with two
arrangements that meet the conditions I seek; one favors conventional
memory widths, and is quad-channel, with each channel being 64 bits wide,
the other favors the 12-bit unit, and is dual-channel, with one 128-bit wide
channel and the other channel being 64 bits wide.

A diagram showing how it would be organized is on the page

http://www.quadibloc.com/arch/per14.htm

John Savard

Re: Changing the Width of Memory is Easy

<acc5976a-5332-4314-9fd8-9b8b1f45404fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24967&group=comp.arch#24967
X-Received: by 2002:a05:620a:24c5:b0:69e:e777:4323 with SMTP id m5-20020a05620a24c500b0069ee7774323mr3227893qkn.465.1651335035059;
Sat, 30 Apr 2022 09:10:35 -0700 (PDT)
X-Received: by 2002:a05:6870:ec8c:b0:e9:365:7a53 with SMTP id
eo12-20020a056870ec8c00b000e903657a53mr3588481oab.269.1651335034729; Sat, 30
Apr 2022 09:10:34 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 30 Apr 2022 09:10:34 -0700 (PDT)
In-Reply-To: <f3560cf3-b900-4778-b206-e60e5bd597f1n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com> <cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
<t4hd6u$pll$1@dont-email.me> <dba9e4bd-8175-42fc-aaa6-dc973dc26899n@googlegroups.com>
<f3560cf3-b900-4778-b206-e60e5bd597f1n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <acc5976a-5332-4314-9fd8-9b8b1f45404fn@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sat, 30 Apr 2022 16:10:35 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 16
 by: Quadibloc - Sat, 30 Apr 2022 16:10 UTC

On Saturday, April 30, 2022 at 12:56:53 AM UTC-6, Quadibloc wrote:

> I have finally organized my thoughts on this matter, and come up with two
> arrangements that meet the conditions I seek; one favors conventional
> memory widths, and is quad-channel, with each channel being 64 bits wide,
> the other favors the 12-bit unit, and is dual-channel, with one 128-bit wide
> channel and the other channel being 64 bits wide.
>
> A diagram showing how it would be organized is on the page
>
> http://www.quadibloc.com/arch/per14.htm

And now this page has been augmented with an additional diagram, showing
an additional alternative: a conventional dual-channel arrangement, with
each channel 192 bits wide.

John Savard

Re: Changing the Width of Memory is Easy

<47643080-8db9-4c6a-9898-cea82b3f8d48n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24973&group=comp.arch#24973
X-Received: by 2002:a05:620a:d87:b0:67b:311c:ecbd with SMTP id q7-20020a05620a0d8700b0067b311cecbdmr4290733qkl.146.1651363830760;
Sat, 30 Apr 2022 17:10:30 -0700 (PDT)
X-Received: by 2002:a05:6870:f619:b0:e9:6d65:4aae with SMTP id
ek25-20020a056870f61900b000e96d654aaemr3992521oab.126.1651363830514; Sat, 30
Apr 2022 17:10:30 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 30 Apr 2022 17:10:30 -0700 (PDT)
In-Reply-To: <acc5976a-5332-4314-9fd8-9b8b1f45404fn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<4a665dac-42ff-4c43-ae13-ec33bb9462b5n@googlegroups.com> <cedb8419-1ca2-4f15-aacc-1d4fcd68dcd6n@googlegroups.com>
<t4hd6u$pll$1@dont-email.me> <dba9e4bd-8175-42fc-aaa6-dc973dc26899n@googlegroups.com>
<f3560cf3-b900-4778-b206-e60e5bd597f1n@googlegroups.com> <acc5976a-5332-4314-9fd8-9b8b1f45404fn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <47643080-8db9-4c6a-9898-cea82b3f8d48n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sun, 01 May 2022 00:10:30 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 50
 by: Quadibloc - Sun, 1 May 2022 00:10 UTC

On Saturday, April 30, 2022 at 10:10:36 AM UTC-6, Quadibloc wrote:
> On Saturday, April 30, 2022 at 12:56:53 AM UTC-6, Quadibloc wrote:
>
> > I have finally organized my thoughts on this matter, and come up with two
> > arrangements that meet the conditions I seek; one favors conventional
> > memory widths, and is quad-channel, with each channel being 64 bits wide,
> > the other favors the 12-bit unit, and is dual-channel, with one 128-bit wide
> > channel and the other channel being 64 bits wide.
> >
> > A diagram showing how it would be organized is on the page
> >
> > http://www.quadibloc.com/arch/per14.htm

> And now this page has been augmented with an additional diagram, showing
> an additional alternative: a conventional dual-channel arrangement, with
> each channel 192 bits wide.

And now, in a Kekulé benzene moment, it has finally come to me how a
conventional dual-channel arrangement, with each channel 128 bits wide,
can be used, and meet the conditions that I have! It just required a little
bit of original thinking.

And what is the point of this? Well, given that the success of the x86 has shown
that computer architectures tend towards being a monoculture, or a "natural
monopoly" like the telephone company...

I think if there must be only one computer architecture, it _must_ be capable
of emulating, at near-native speeds, the major historic architectures of the past...

the Control Data 1604
the Control Data 6600
the IBM 7094
the PDP-10

as well as the easy conventional case of the IBM 360! And thus, I have struggled
to find a memory architecture ideally suited to handling dusty decks. Of course,
by now, they're _very_ dusty indeed, if indeed any of them can even be found.

John Savard

Re: Changing the Width of Memory is Easy

<8a338f02-76dc-4ae6-b2e7-b575392c9d93n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24976&group=comp.arch#24976
X-Received: by 2002:a05:620a:4115:b0:69f:c00e:7aec with SMTP id j21-20020a05620a411500b0069fc00e7aecmr4828720qko.631.1651388660760;
Sun, 01 May 2022 00:04:20 -0700 (PDT)
X-Received: by 2002:a05:6830:2475:b0:605:4339:dbc9 with SMTP id
x53-20020a056830247500b006054339dbc9mr2379918otr.313.1651388660497; Sun, 01
May 2022 00:04:20 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 1 May 2022 00:04:20 -0700 (PDT)
In-Reply-To: <5198c09e-a4fb-4e8e-be24-180e1dfefe32n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<Kdx*z7UMy@news.chiark.greenend.org.uk> <5198c09e-a4fb-4e8e-be24-180e1dfefe32n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <8a338f02-76dc-4ae6-b2e7-b575392c9d93n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sun, 01 May 2022 07:04:20 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 89
 by: Quadibloc - Sun, 1 May 2022 07:04 UTC

On Friday, April 29, 2022 at 6:02:43 AM UTC-6, Michael S wrote:

> I'm not sure that John can explain what the problem is :(

Whatever the problem is, I think I've finally, at long last,
found an acceptable solution, as shown at the bottom of the
page:

http://www.quadibloc.com/arch/per14.htm

However, that question _is_ food for thought. What _is_ the
problem?

The goal, as I've stated in another post, is to be able to handle,
with the same efficiency as native data types, the data types used
by computers like these:

the Control Data 1604
the Control Data 6600
the IBM 7094
the PDP-10

which don't fit neatly and tidily into our power-of-2 world.

Why are such data types a problem?

When a computer is built up from memory modules that are 64
bits wide, addressing and fetching is most efficient if all the data
types have power-of-2 lengths - 8 bits, 16 bits, 32 bits, 64 bits -
_and_ they're all fully aligned; 64 bit data is aligned on 64-bit
boundaries, and so on.

Then, if your memory bus is wide enough, everything can be
read in a single fetch.

If you split the memory bus into two halves, giving each half its
own address bus, then data that's half the width of the memory
data bus can always be read in a single fetch, even if it crosses
boundaries because it isn't aligned. That's because you can read
the low-address half of the data bus with an address that's one
higher than used for the high-address half of the data bus.
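
The split-bus trick can be modelled in a few lines. This is a sketch under assumed parameters (a 128-bit bus split into two 64-bit halves, modelled as two banks holding alternate 8-byte words); the names are hypothetical and it is a software model, not a hardware description.

```python
HALF_BYTES = 8  # assumed: each half of the data bus is 64 bits wide

def make_banks(data):
    """Distribute memory over two banks, alternating 8-byte words."""
    padded = data + bytes(-len(data) % (2 * HALF_BYTES)) + bytes(2 * HALF_BYTES)
    words = [padded[i:i + HALF_BYTES] for i in range(0, len(padded), HALF_BYTES)]
    return words[0::2], words[1::2]

def bank_rows(byte_addr):
    """Row address sent to (bank 0, bank 1) for one fetch."""
    k = byte_addr // HALF_BYTES
    row = k // 2
    # If the item starts in bank 1, bank 0 reads the next row up.
    return (row, row) if k % 2 == 0 else (row + 1, row)

def fetch(banks, byte_addr, nbytes):
    """Read an item of up to half the bus width in a single access."""
    assert nbytes <= HALF_BYTES
    rows = bank_rows(byte_addr)
    first = (byte_addr // HALF_BYTES) % 2
    window = banks[first][rows[first]] + banks[1 - first][rows[1 - first]]
    offset = byte_addr % HALF_BYTES
    return window[offset:offset + nbytes]

data = bytes(range(64))
banks = make_banks(data)
print(list(fetch(banks, 13, 6)))  # a 48-bit item crossing a 64-bit boundary
```

Both banks are read once per fetch; the only difference in the crossing case is that the two halves are given different row addresses.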

Now, then, consider 48-bit numbers. In a power-of-2 system they can
only be aligned on 16-bit boundaries, so in a packed array of them
some will cross boundaries. They stop being a problem if you can
handle unaligned data.

This will work for 72-bit floats as well, since they're nine bytes
long.

36-bit numbers and 60-bit numbers, though, are still problems, since
they are not a whole number of bytes long.

But they, too, could be handled without leaving the power-of-2
world. Once one can handle unaligned operands, the only _other_
thing that's absolutely needed... would be to go from *byte*
addressing to *nibble* addressing!

However, that makes _all addresses_ one bit longer, for what is
presumably a rarely used case.

So, instead, to avoid that, I've gone to the scheme I'm talking
about here. One can put the computer into a special mode where
data memory is treated as if the data bus is 48 or 96 bits wide,
and addressing is based on a 12-bit unit instead of the 8-bit
byte. (Perhaps _only_ instructions accessing *floating point*
data are modified, with instructions for integer data staying in
power-of-2 memory space.)

That way, I don't lose an address bit.

This works great for 48-bit floating-point numbers, since they're
now the native length.

36-bit and 60-bit floats, on the other hand, are like 48-bit floats
were in power-of-2 memory space. They need to be addressed
with 12-bit alignment, and so this new 3 times a power-of-2
space still has to support unaligned operands.

Once that support is in place, though, it's trivial to make it look
to the programmer that the computer has a 36-bit word or a 60-bit
word - just multiply addresses (within 12 bit space) by 3 or 5.
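
The address multiplication can be sketched like this, assuming addresses in the special mode count 12-bit units. The function names are hypothetical.

```python
UNIT_BITS = 12

def unit_address(word_index, word_bits):
    """Address, in 12-bit units, of the word_index-th packed word."""
    units, rem = divmod(word_bits, UNIT_BITS)
    assert rem == 0, "word size must be a multiple of 12 bits"
    return word_index * units

def crosses_48bit_boundary(word_index, word_bits):
    """True if the word is not contained in one aligned 48-bit cell."""
    start = unit_address(word_index, word_bits)
    units = word_bits // UNIT_BITS
    return start % 4 + units > 4

for i in range(4):
    print(i, unit_address(i, 36), crosses_48bit_boundary(i, 36))
```

A 60-bit word always spills past a 48-bit cell, and most 36-bit words do too, which is why unaligned support is still needed in this space.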

And that does suggest, though, that there is a simpler way to
do things that I may not have considered. Going to nibble
addressing in general may not be desirable, but one could go into
12-bit space by multiplying addresses by 3 and _then_ feeding them
to an otherwise hidden addressing system that does nibble
addressing.
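
A small software model of that idea: the visible address counts 12-bit units, and it is multiplied by 3 to get a nibble address into ordinary byte-organized memory. Treating the whole memory as one little-endian integer is a simplification for the sketch, and the names are hypothetical.

```python
def pack_words(words, word_bits):
    """Pack words back to back into bytes (little-endian bit order)."""
    packed = 0
    for i, w in enumerate(words):
        packed |= (w & ((1 << word_bits) - 1)) << (i * word_bits)
    nbytes = (len(words) * word_bits + 7) // 8
    return packed.to_bytes(nbytes, "little")

def load_nibbles(mem, nibble_addr, count):
    """Read count nibbles starting at a nibble address."""
    value = int.from_bytes(mem, "little")
    return (value >> (4 * nibble_addr)) & ((1 << (4 * count)) - 1)

def load_from_12bit_space(mem, unit_addr, nbits):
    """Read nbits starting at an address given in 12-bit units."""
    return load_nibbles(mem, unit_addr * 3, nbits // 4)

words = [0x123456789, 0xABCDEF012, 0xFFFFFFFFF]  # three 36-bit values
mem = pack_words(words, 36)
print([hex(load_from_12bit_space(mem, 3 * i, 36)) for i in range(3)])
```

Every second 36-bit word starts in the middle of a byte here, which is the case the hidden nibble addressing has to cover.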

John Savard

Re: Changing the Width of Memory is Easy

<6b97b44b-c7c8-400c-a4f1-00f1bb2a6750n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24977&group=comp.arch#24977
X-Received: by 2002:a05:6214:2a8e:b0:443:8a10:c1ca with SMTP id jr14-20020a0562142a8e00b004438a10c1camr5364996qvb.88.1651390068200;
Sun, 01 May 2022 00:27:48 -0700 (PDT)
X-Received: by 2002:a05:6808:1141:b0:325:cd92:ef8d with SMTP id
u1-20020a056808114100b00325cd92ef8dmr2995462oiu.228.1651390067918; Sun, 01
May 2022 00:27:47 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sun, 1 May 2022 00:27:47 -0700 (PDT)
In-Reply-To: <8a338f02-76dc-4ae6-b2e7-b575392c9d93n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2001:56a:fb70:6300:6947:3c86:73e1:a64e;
posting-account=1nOeKQkAAABD2jxp4Pzmx9Hx5g9miO8y
NNTP-Posting-Host: 2001:56a:fb70:6300:6947:3c86:73e1:a64e
References: <1b7446d6-171a-471b-ab3d-139b5ae062c4n@googlegroups.com>
<Kdx*z7UMy@news.chiark.greenend.org.uk> <5198c09e-a4fb-4e8e-be24-180e1dfefe32n@googlegroups.com>
<8a338f02-76dc-4ae6-b2e7-b575392c9d93n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6b97b44b-c7c8-400c-a4f1-00f1bb2a6750n@googlegroups.com>
Subject: Re: Changing the Width of Memory is Easy
From: jsav...@ecn.ab.ca (Quadibloc)
Injection-Date: Sun, 01 May 2022 07:27:48 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 35
 by: Quadibloc - Sun, 1 May 2022 07:27 UTC

On Sunday, May 1, 2022 at 1:04:22 AM UTC-6, Quadibloc wrote:

> And that does suggest, though, that there is a simpler way to
> do things that I may not have considered. Going to nibble
> addressing in general may not be desirable, but one could go into
> 12-bit space by multiplying addresses by 3 and _then_ feeding them
> to an otherwise hidden addressing system that does nibble
> addressing.

And asking myself the question of why I didn't do it that way all
along, instead of going to the somewhat elaborate lengths
described at

http://www.quadibloc.com/arch/per14.htm

has allowed me to articulate what the "problem" was I was
trying to solve there.

Basically, multiplying addresses by 3, and then using them
as nibble addresses, in a system that can handle unaligned
data, certainly _does_ allow efficient fetching of 36-bit or
60-bit quantities.

But then, after they're fetched from memory, they have to be
moved into alignment to be sent to registers or to ALUs. This
would be done on a barrel shifter with a minimum movement
step of four bits.

The elaborate scheme on the page

http://www.quadibloc.com/arch/per14.htm

on the other hand, moves _large_ chunks of data around,
128 bits or 64 bits in length, and then produces memory that
can now be accessed with shifts on a step size of *12* bits.

So what I was (perhaps subconsciously!) doing is trying to
shave a layer or two of gates off of the data path from memory.
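
To put a rough number on "a layer or two", here is a sketch that counts the stages of a logarithmic shifter built from 2:1 multiplexers. The 96-bit datapath width and the 2:1-mux structure are assumptions for illustration, and `shifter_stages` is a hypothetical helper.

```python
def shifter_stages(width_bits, step_bits):
    """Return (shift positions, 2:1 mux stages) for a log shifter."""
    positions = width_bits // step_bits
    stages = (positions - 1).bit_length()  # ceil(log2(positions))
    return positions, stages

print(shifter_stages(96, 4))   # nibble-granular alignment
print(shifter_stages(96, 12))  # 12-bit-granular alignment
```

Under these assumptions the 4-bit step needs 24 positions and the 12-bit step only 8, so the coarser step saves two mux stages on the path from memory.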

John Savard
