
Re: Misc: Cache Sizes / Observations

Newsgroups: comp.arch
Date: Wed, 2 Mar 2022 15:43:19 -0800 (PST)
In-Reply-To: <svokin$rtp$1@dont-email.me>
References: <svj5qj$19u$1@dont-email.me> <svjak4$9kl$1@dont-email.me>
<svkeq1$phm$1@dont-email.me> <svm7r9$3ec$1@dont-email.me> <2022Mar2.101217@mips.complang.tuwien.ac.at>
<svoa5q$ud$1@dont-email.me> <2022Mar2.192515@mips.complang.tuwien.ac.at>
<svogd7$o9s$1@dont-email.me> <svokin$rtp$1@dont-email.me>
Message-ID: <3215daef-0814-4dd1-b2ae-a00cd22d0d1dn@googlegroups.com>
Subject: Re: Misc: Cache Sizes / Observations
From: MitchAl...@aol.com (MitchAlsup)

On Wednesday, March 2, 2022 at 2:37:47 PM UTC-6, Stephen Fuld wrote:
> On 3/2/2022 11:26 AM, BGB wrote:
> > On 3/2/2022 12:25 PM, Anton Ertl wrote:
> >> BGB <cr8...@gmail.com> writes:
> >>> On 3/2/2022 3:12 AM, Anton Ertl wrote:
> >>>> A 512KB cache with 64-byte lines has 8192 cache lines, just like a
> >>>> 256KB cache with 32-byte lines and a 128KB cache with 16-byte lines.
> >>>> Without spatial locality, I would expect similar miss rates for all of
> >>>> them for the same associativity; and given that programs have spatial
> >>>> locality, I would expect the larger among these configurations to have
> >>>> an advantage. Are you sure that your cache simulator has no bugs?
> >> ...
> >>> After fixing this bug (columns: cache size in bytes, total miss
> >>> rate, estimated conflict-miss rate):
> >>> 131072 2.004% 1.318% (1 way, 16 line)
> >>> 65536 2.851% 1.516% (1 way, 16 line)
> >>> 32768 3.540% 1.445% (1 way, 16 line)
> >>> 16384 6.604% 4.043% (1 way, 16 line)
> >>> 8192 9.112% 6.052% (1 way, 16 line)
> >>> 4096 11.310% 4.504% (1 way, 16 line)
> >>> 2048 14.326% 5.990% (1 way, 16 line)
> >>> 1024 17.632% 7.066% (1 way, 16 line)
> >>>
> >>> 131072 1.966% 0.821% (2 way, 16 line)
> >>> 65536 2.550% 0.513% (2 way, 16 line)
> >>> 32768 3.303% 0.791% (2 way, 16 line)
> >>> 16384 6.779% 3.842% (2 way, 16 line)
> >>> 8192 8.484% 1.766% (2 way, 16 line)
> >>> 4096 10.905% 2.773% (2 way, 16 line)
> >>> 2048 13.588% 3.319% (2 way, 16 line)
> >>> 1024 16.022% 3.229% (2 way, 16 line)
> >>>
> >>> 262144 15.748% 14.750% (1 way, 32 line)
> >>> 131072 15.929% 14.382% (1 way, 32 line)
> >>> 65536 16.274% 14.057% (1 way, 32 line)
> >>> 32768 16.843% 14.122% (1 way, 32 line)
> >>> 16384 19.819% 16.567% (1 way, 32 line)
> >>> 8192 22.360% 18.319% (1 way, 32 line)
> >>> 4096 24.583% 16.422% (1 way, 32 line)
> >>> 2048 27.325% 16.513% (1 way, 32 line)
> >>>
> >>> 262144 2.093% 0.872% (2 way, 32 line)
> >>> 131072 2.893% 0.836% (2 way, 32 line)
> >>> 65536 3.540% 0.949% (2 way, 32 line)
> >>> 32768 4.854% 1.808% (2 way, 32 line)
> >>> 16384 8.788% 4.997% (2 way, 32 line)
> >>> 8192 11.456% 3.638% (2 way, 32 line)
> >>> 4096 14.699% 4.352% (2 way, 32 line)
> >>> 2048 17.580% 4.251% (2 way, 32 line)
> >>>
> >>> 524288 10.825% 10.370% (1 way, 64 line)
> >>> 262144 11.019% 10.412% (1 way, 64 line)
> >>> 131072 11.188% 10.232% (1 way, 64 line)
> >>> 65536 11.652% 10.320% (1 way, 64 line)
> >>> 32768 12.388% 10.699% (1 way, 64 line)
> >>> 16384 15.806% 13.581% (1 way, 64 line)
> >>> 8192 18.753% 15.604% (1 way, 64 line)
> >>> 4096 21.158% 13.259% (1 way, 64 line)
> >>>
> >>> 524288 0.863% 0.439% (2 way, 64 line)
> >>> 262144 1.297% 0.535% (2 way, 64 line)
> >>> 131072 1.896% 0.700% (2 way, 64 line)
> >>> 65536 2.505% 0.979% (2 way, 64 line)
> >>> 32768 3.876% 1.923% (2 way, 64 line)
> >>> 16384 8.341% 5.535% (2 way, 64 line)
> >>> 8192 11.346% 3.956% (2 way, 64 line)
> >>> 4096 14.262% 3.722% (2 way, 64 line)
> >>>
> >>>
> >>> It would appear that 1-way still does poorly with larger cache lines,
> >>
> >> Probably still a bug, see above.
> >>
> >>> As for why 1-way 32B now appears to be doing worse than 1-way 64B,
> >>> I have no idea; this doesn't really make sense.
> >>
> >> For the same number of cache lines, that's expected, due to spatial
> >> locality.
> >>
> >
> > After posting this, I did find another bug:
> > the cache index pairs were always being calculated as if the cache line
> > were 16B. After fixing this, all the lines got much closer together
> > (as shown in another graph posted to Twitter after the previous one).
> >
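As an illustration of the bug just described, here is a sketch in C (hypothetical names and parameters, not BGB's actual code) of how a simulator's set index has to be derived from the configured line size rather than a hard-coded 16 bytes:

#include <stdint.h>

/* Sketch of direct-mapped index calculation (example parameters). */
#define CACHE_SIZE 32768
#define LINE_SIZE  64
#define NUM_SETS   (CACHE_SIZE / LINE_SIZE)

static uint32_t set_index(uint32_t addr)
{
    /* Correct: drop the offset bits of the *configured* line size,
       then take the low bits as the set index. */
    return (addr / LINE_SIZE) % NUM_SETS;
}

/* The bug described above amounts to always using 16-byte offsets,
   i.e. (addr / 16) % NUM_SETS, which selects the wrong set whenever
   the simulated line size is not 16B. */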
> > In this case, the overall hit/miss ratio is now more consistent between
> > cache-line sizes. The main difference is that larger cache-line sizes
> > still have a higher proportion of conflict misses.
> >
> >
> > Values following the more recent bugfix (same columns as above):
> > 131072 2.004% 1.318% (1 way, 16 line)
> > 65536 2.851% 1.516% (1 way, 16 line)
> > 32768 3.540% 1.445% (1 way, 16 line)
> > 16384 6.604% 4.043% (1 way, 16 line)
> > 8192 9.112% 6.052% (1 way, 16 line)
> > 4096 11.310% 4.504% (1 way, 16 line)
> > 2048 14.326% 5.990% (1 way, 16 line)
> > 1024 17.632% 7.066% (1 way, 16 line)
> >
> > 131072 1.966% 0.821% (2 way, 16 line)
> > 65536 2.550% 0.513% (2 way, 16 line)
> > 32768 3.303% 0.791% (2 way, 16 line)
> > 16384 6.779% 3.842% (2 way, 16 line)
> > 8192 8.484% 1.766% (2 way, 16 line)
> > 4096 10.905% 2.773% (2 way, 16 line)
> > 2048 13.588% 3.319% (2 way, 16 line)
> > 1024 16.022% 3.229% (2 way, 16 line)
> >
> > 262144 0.905% 0.630% (1 way, 32 line)
> > 131072 1.326% 0.912% (1 way, 32 line)
> > 65536 1.976% 1.160% (1 way, 32 line)
> > 32768 2.712% 1.504% (1 way, 32 line)
> > 16384 5.964% 4.456% (1 way, 32 line)
> > 8192 8.728% 6.671% (1 way, 32 line)
> > 4096 11.262% 5.347% (1 way, 32 line)
> > 2048 14.579% 6.659% (1 way, 32 line)
> >
> > 262144 0.671% 0.329% (2 way, 32 line)
> > 131072 1.148% 0.436% (2 way, 32 line)
> > 65536 1.514% 0.348% (2 way, 32 line)
> > 32768 2.259% 0.795% (2 way, 32 line)
> > 16384 6.041% 4.169% (2 way, 32 line)
> > 8192 8.102% 2.335% (2 way, 32 line)
> > 4096 10.690% 3.071% (2 way, 32 line)
> > 2048 13.354% 3.134% (2 way, 32 line)
> >
> > 524288 0.442% 0.311% (1 way, 64 line)
> > 262144 0.717% 0.561% (1 way, 64 line)
> > 131072 1.023% 0.759% (1 way, 64 line)
> > 65536 1.661% 1.154% (1 way, 64 line)
> > 32768 2.538% 1.818% (1 way, 64 line)
> > 16384 6.168% 5.201% (1 way, 64 line)
> > 8192 9.279% 7.640% (1 way, 64 line)
> > 4096 12.084% 6.163% (1 way, 64 line)
> >
> > 524288 0.219% 0.079% (2 way, 64 line)
> > 262144 0.428% 0.215% (2 way, 64 line)
> > 131072 0.697% 0.251% (2 way, 64 line)
> > 65536 0.984% 0.298% (2 way, 64 line)
> > 32768 1.924% 1.005% (2 way, 64 line)
> > 16384 6.169% 4.797% (2 way, 64 line)
> > 8192 8.642% 2.983% (2 way, 64 line)
> > 4096 10.878% 2.647% (2 way, 64 line)
> >
> >
> > Though, it seems 16B cache lines are no longer a clear winner in this
> > case...
> I am not sure how you determined this. To do an apples-to-apples
> comparison, as Anton explained, you have to compare the same amount of
> total cache. Thus, for example, a 16 byte line at a particular cache
> size should be compared against a 32 byte line at twice the cache size.
> Otherwise, you can't distinguish between the total-cache-size effect
> and the cache-line-size effect. To me, it seems like for equivalent
> cache sizes, 16B lines are a clear win. This reflects a higher
> proportion of temporal locality than spatial locality.
<
And then there is the bus occupancy issue: longer cache lines keep the
memory bus busy longer on every refill.
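For example, on a hypothetical 16-byte-wide memory bus, refilling a 16B line takes one bus beat while a 64B line takes four, so at an equal miss rate the 64B configuration consumes roughly four times the refill bandwidth.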
>
> BTW, with a little more work, you can distinguish spatial locality hits
> from temporal locality hits. You have to keep the actual address that
> caused the miss, then on subsequent hits, see if they are to the same or
> a different address.
<
We figured out (around 1992) that one could model all cache sizes
and associativity levels simultaneously.
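One classic way to do this (not necessarily the method Mitch alludes to) is Mattson-style LRU stack simulation: a single pass over the trace records each reference's stack (reuse) distance, and a fully associative LRU cache of C lines hits exactly when that distance is below C, so one histogram yields the miss ratio for every capacity at once. A minimal C sketch, assuming 64-byte lines and one hex address per line on stdin:

#include <stdio.h>
#include <string.h>
#include <inttypes.h>

#define LINE_SHIFT 6            /* assume 64-byte lines */
#define MAX_LINES  8192         /* deepest LRU stack we track */

static uint64_t stack[MAX_LINES];    /* LRU stack, stack[0] = MRU */
static int      depth;               /* lines currently on the stack */
static uint64_t hist[MAX_LINES + 1]; /* hist[d] = refs at distance d;
                                        hist[MAX_LINES] = cold/deep */

/* Record one reference: find the line's stack distance, count it,
   then move the line to the top of the stack. */
static void touch(uint64_t addr)
{
    uint64_t line = addr >> LINE_SHIFT;
    int d = 0;
    while (d < depth && stack[d] != line)
        d++;
    if (d == depth) {                 /* not found: cold miss */
        if (depth < MAX_LINES)
            depth++;
        d = MAX_LINES;
    }
    hist[d]++;
    int shift = (d == MAX_LINES) ? depth - 1 : d;
    memmove(&stack[1], &stack[0], (size_t)shift * sizeof stack[0]);
    stack[0] = line;
}

int main(void)
{
    uint64_t addr, total = 0;
    while (scanf("%" SCNx64, &addr) == 1) {
        touch(addr);
        total++;
    }
    /* A fully associative LRU cache of C lines hits iff the stack
       distance is < C, so every size falls out of one histogram. */
    for (int c = 64; c <= MAX_LINES && total; c *= 2) {
        uint64_t misses = hist[MAX_LINES];
        for (int d = c; d < MAX_LINES; d++)
            misses += hist[d];
        printf("%8d bytes: %.3f%% miss\n",
               c << LINE_SHIFT, 100.0 * misses / total);
    }
    return 0;
}

Keeping one such stack per set extends the same single pass across associativities (Hill and Smith's all-associativity simulation).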
>
> One more suggestion. If you are not outputting a trace, but keeping the
> statistics "on the fly" in the emulator, this is suboptimal. Yes,
> outputting the full trace (of all loads and stores) will slow the
> emulator down significantly, but you only have to do it once. Then, by
> creating another program that just simulates the cache behavior, you can
> run lots of tests on the same trace data (different line sizes, cache
> sizes, replacement policies, number of ways, etc.) without rerunning the
> full simulator. This second program should be quite fast, as it doesn't
> have to emulate the whole CPU.
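A minimal sketch of such a standalone, trace-driven simulator in C; the trace format (one hex address per line) and the command-line interface are assumptions, not an existing tool:

#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>

/* One cache configuration, replayed over a captured address trace.
   Rerun with different parameters to sweep sizes/lines/ways. */
typedef struct {
    uint64_t *tag;            /* sets * ways entries */
    uint64_t *age;            /* LRU timestamps, same layout */
    int sets, ways, line;
} Cache;

static uint64_t tick;

static int lookup(Cache *c, uint64_t addr)
{
    uint64_t tag = addr / (uint64_t)c->line;   /* line number */
    size_t base = (size_t)(tag % c->sets) * c->ways;
    int victim = 0;
    tick++;
    for (int w = 0; w < c->ways; w++) {
        if (c->tag[base + w] == tag) {         /* hit: refresh LRU */
            c->age[base + w] = tick;
            return 1;
        }
        if (c->age[base + w] < c->age[base + victim])
            victim = w;                        /* least recently used */
    }
    c->tag[base + victim] = tag;               /* miss: LRU refill */
    c->age[base + victim] = tick;
    return 0;
}

int main(int argc, char **argv)
{
    if (argc != 4) {
        fprintf(stderr, "usage: %s cache_bytes line_bytes ways < trace\n",
                argv[0]);
        return 1;
    }
    Cache c;
    int bytes = atoi(argv[1]);
    c.line = atoi(argv[2]);
    c.ways = atoi(argv[3]);
    c.sets = bytes / (c.line * c.ways);
    size_t n = (size_t)c.sets * c.ways;
    c.tag = malloc(n * sizeof *c.tag);
    c.age = calloc(n, sizeof *c.age);
    for (size_t i = 0; i < n; i++)
        c.tag[i] = UINT64_MAX;                 /* "invalid" sentinel */

    uint64_t addr, refs = 0, misses = 0;
    while (scanf("%" SCNx64, &addr) == 1) {
        refs++;
        misses += !lookup(&c, addr);
    }
    printf("%" PRIu64 " refs, %.3f%% miss\n",
           refs, refs ? 100.0 * misses / refs : 0.0);
    return 0;
}

Replaying one captured trace under many configurations, e.g. a hypothetical "./sim 32768 16 2 < trace.txt" next to "./sim 32768 64 2 < trace.txt", then gives line-size comparisons at equal total capacity without re-running the CPU emulator.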
> > There is still a hump in the conflict miss estimates; I still suspect
> > this is due to the smaller caches hitting the limit of the 4-way
> > estimator.
> >
> > Well, either that, or I am not estimating the conflict miss rate correctly...
> See
> https://en.wikipedia.org/wiki/Cache_performance_measurement_and_metric#Conflict_misses
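For reference, the definition behind that link: holding capacity and line size fixed, conflict misses = misses(set-associative, LRU) - misses(fully associative, LRU). BGB's "4-way estimator" presumably stands in a 4-way cache for the fully associative reference, which stops being a good stand-in once the cache is small enough that the 4-way version itself thrashes; that could explain a hump at small sizes.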
> --
> - Stephen Fuld
> (e-mail address disguised to prevent spam)
