comp.arch / Packing Hash Tables

Subject - Author
* Packing Hash Tables - robf...@gmail.com
+* Re: Packing Hash Tables - BGB
|`- Re: Packing Hash Tables - robf...@gmail.com
`* Re: Packing Hash Tables - EricP
 `* Re: Packing Hash Tables - BGB
  `* Re: Packing Hash Tables - robf...@gmail.com
   `* Re: Packing Hash Tables - BGB
    `* Re: Packing Hash Tables - robf...@gmail.com
     `* Re: Packing Hash Tables - BGB
      `* Re: Packing Hash Tables - robf...@gmail.com
       `* Re: Packing Hash Tables - MitchAlsup
        +* Re: Packing Hash Tables - robf...@gmail.com
        |`* Re: Packing Hash Tables - MitchAlsup
        | `* Re: Packing Hash Tables - EricP
        |  +- Re: Packing Hash Tables - BGB
        |  +* Re: Packing Hash Tables - MitchAlsup
        |  |`- Re: Packing Hash Tables - EricP
        |  `* Re: Packing Hash Tables - MitchAlsup
        |   +* Re: Packing Hash Tables - robf...@gmail.com
        |   |`* Re: Packing Hash Tables - MitchAlsup
        |   | `* Re: Packing Hash Tables - robf...@gmail.com
        |   |  `* Re: Packing Hash Tables - robf...@gmail.com
        |   |   +- Re: Packing Hash Tables - EricP
        |   |   `* Re: Packing Hash Tables - MitchAlsup
        |   |    `* Re: Packing Hash Tables - EricP
        |   |     `* Re: Packing Hash Tables - MitchAlsup
        |   |      +- Re: Packing Hash Tables - robf...@gmail.com
        |   |      +- Re: Packing Hash Tables - MitchAlsup
        |   |      +* Re: Packing Hash Tables - BGB
        |   |      |+* Re: Packing Hash Tables - MitchAlsup
        |   |      ||`* Re: Packing Hash Tables - BGB
        |   |      || +* Re: Packing Hash Tables - robf...@gmail.com
        |   |      || |+* Re: Packing Hash Tables - Thomas Koenig
        |   |      || ||+* Re: Packing Hash Tables - robf...@gmail.com
        |   |      || |||`- Re: Packing Hash Tables - Thomas Koenig
        |   |      || ||`- Re: Packing Hash Tables - MitchAlsup
        |   |      || |+- Re: Packing Hash Tables - MitchAlsup
        |   |      || |`- Re: Packing Hash Tables - MitchAlsup
        |   |      || +* Re: Packing Hash Tables - robf...@gmail.com
        |   |      || |`* Re: Packing Hash Tables - BGB
        |   |      || | `* Re: Packing Hash Tables - aph
        |   |      || |  `* Re: Packing Hash Tables - BGB
        |   |      || |   `* Re: Packing Hash Tables - MitchAlsup
        |   |      || |    +* Re: Packing Hash Tables - robf...@gmail.com
        |   |      || |    |`- Re: Packing Hash Tables - EricP
        |   |      || |    `* Re: Packing Hash Tables - BGB
        |   |      || |     `* Re: Packing Hash Tables - robf...@gmail.com
        |   |      || |      `* Re: Packing Hash Tables - BGB
        |   |      || |       `- Re: Packing Hash Tables - Ivan Godard
        |   |      || `- Re: Packing Hash Tables - robf...@gmail.com
        |   |      |+- Re: Packing Hash Tables - robf...@gmail.com
        |   |      |`* Re: Packing Hash Tables - EricP
        |   |      | +- Re: Packing Hash Tables - robf...@gmail.com
        |   |      | `* Re: Packing Hash Tables - MitchAlsup
        |   |      |  +* Re: Packing Hash Tables - EricP
        |   |      |  |`* Re: Packing Hash Tables - MitchAlsup
        |   |      |  | `* Re: Packing Hash Tables - EricP
        |   |      |  |  `* Re: Packing Hash Tables - MitchAlsup
        |   |      |  |   `* Re: Packing Hash Tables - EricP
        |   |      |  |    +* Re: Packing Hash Tables - robf...@gmail.com
        |   |      |  |    |`- Re: Packing Hash Tables - MitchAlsup
        |   |      |  |    `* Re: Packing Hash Tables - robf...@gmail.com
        |   |      |  |     `* Re: Packing Hash Tables - BGB
        |   |      |  |      `* Re: Packing Hash Tables - robf...@gmail.com
        |   |      |  |       +- Re: Packing Hash Tables - BGB
        |   |      |  |       +- Re: Packing Hash Tables - MitchAlsup
        |   |      |  |       `- Re: Packing Hash Tables - robf...@gmail.com
        |   |      |  `* Re: Packing Hash Tables - Ivan Godard
        |   |      |   `* Re: Packing Hash Tables - MitchAlsup
        |   |      |    `* Re: Packing Hash Tables - Thomas Koenig
        |   |      |     `* Re: Packing Hash Tables - MitchAlsup
        |   |      |      `* Re: Packing Hash Tables - Thomas Koenig
        |   |      |       `* Re: Packing Hash Tables - Stephen Fuld
        |   |      |        `- Re: Packing Hash Tables - EricP
        |   |      `* Re: Packing Hash Tables - EricP
        |   |       `- Re: Packing Hash Tables - MitchAlsup
        |   `- Re: Packing Hash Tables - EricP
        `- Re: Packing Hash Tables - BGB

Packing Hash Tables

<43b7bfee-85be-4bab-842d-ef0d966f6dcdn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24083&group=comp.arch#24083
 by: robf...@gmail.com - Wed, 9 Mar 2022 06:46 UTC

Tonight’s quandary: packing hash tables using a background process.
Tried to find info on this on the web. Anybody got any pointers? Using
a hash table to store address translations for memory management.
When there is a miss the table is searched with quadratic probing. If
it’s a real miss and no translation is present, then the first empty slot
in the hash table should be used to store the translation. As entries
get removed from the table, following entries could be moved into those
empty positions to reduce search times. I think it requires keeping
track of the number of times the entry bounced before finding an
open spot. I was going to use a four-bit count to record this. The
bounce count needs to be present in the PTE. Entries that hash to the
same value are grouped together in a page table group, PTG, that
contains five PTEs.
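
A minimal C sketch of the insert path described above (five-entry PTGs,
quadratic probing, and a 4-bit bounce count); the table size, hash
function, and field names are assumptions for illustration, not an
actual PTE format:

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

/* Illustrative only: the 5-PTE group size and 4-bit bounce count follow
   the description above, but names, table size, and hash are assumed. */
#define PTES_PER_PTG  5
#define NUM_PTGS      1024

typedef struct {
    uint64_t vpn;        /* virtual page number (tag) */
    uint64_t ppn;        /* physical page number */
    uint8_t  bounce;     /* 4-bit bounce count, 0..15 */
    bool     valid;
} pte_t;

typedef struct { pte_t pte[PTES_PER_PTG]; } ptg_t;

static ptg_t table[NUM_PTGS];

/* Quadratic probe: group index to try on the i-th miss. */
static size_t probe(uint64_t hash, unsigned i)
{
    return (size_t)((hash + (uint64_t)i * i) % NUM_PTGS);
}

/* Store a translation in the first empty slot along the probe sequence,
   recording how many groups were skipped as the bounce count. */
static bool pt_insert(uint64_t vpn, uint64_t ppn)
{
    uint64_t h = vpn;                       /* stand-in hash function */
    for (unsigned i = 0; i < 16; i++) {     /* limit of the 4-bit count */
        ptg_t *g = &table[probe(h, i)];
        for (int s = 0; s < PTES_PER_PTG; s++) {
            if (!g->pte[s].valid) {
                g->pte[s] = (pte_t){ .vpn = vpn, .ppn = ppn,
                                     .bounce = (uint8_t)i, .valid = true };
                return true;
            }
        }
    }
    return false;                           /* no slot within 16 probes */
}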

Re: Packing Hash Tables

<t09pl6$nfp$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24087&group=comp.arch#24087
 by: BGB - Wed, 9 Mar 2022 08:48 UTC

On 3/9/2022 12:46 AM, robf...@gmail.com wrote:
> To-nights quandary, packing hash tables using a background process.
> Tried to find info on this on the web. Anybody got any pointers? Using
> a hash table to store address translations for memory management.
> When there is a miss the table is searched with quadratic probing. If
> it’s a real miss and no translation is present, then the first empty slot
> in the hash table should be used to store the translation. As entries get removed from the table, following entries could be moved into those
> empty positions to reduce search times. I think it requires keeping
> track of the number of times the entry bounced before finding an
> open spot. I was going to use a four-bit count to record this. The
> bounce count needs to be present in the PTE. Entries that hash to the
> same value are grouped together in a page table group, PTG, that
> contains five PTEs.

It is not really at all obvious what you mean by this...

But, yeah, assuming you are talking about a TLB:
In my case, it is 4 way associative, so all 4 entries get fetched from
the TLB array at the same time;
It checks for a hit;
Whichever entry hits is moved to the front of the list (with some
special cases for 96-bit addresses);
All 4 entries are written back to the TLB (in their updated order).

Currently, the index into the TLB array is address modulo, eg,
Addr[21:14]. This was because I had done some testing and found that
modulo had some useful properties when compared with an XOR hashed index.

A linear probing approach is probably not worthwhile, IMO.
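
A small C sketch of the lookup and ordering scheme described above (256
sets indexed by Addr[21:14], 4 ways, the hit entry moved to the front);
structure and field names are illustrative, not the actual BJX2 TLB:

#include <stdint.h>
#include <stdbool.h>
#include <string.h>

#define TLB_WAYS 4
#define TLB_SETS 256

typedef struct {
    uint64_t vpn;      /* virtual page number */
    uint64_t ppn;      /* physical page number */
    bool     valid;
} tlbe_t;

static tlbe_t tlb[TLB_SETS][TLB_WAYS];   /* way 0 = most recently used */

/* Look up a virtual address; on a hit, rotate the hit entry to way 0. */
static bool tlb_lookup(uint64_t vaddr, uint64_t *ppn_out)
{
    uint64_t vpn  = vaddr >> 14;                    /* 16K pages assumed */
    unsigned set  = (vaddr >> 14) & (TLB_SETS - 1); /* Addr[21:14] */
    tlbe_t  *ways = tlb[set];

    for (int w = 0; w < TLB_WAYS; w++) {
        if (ways[w].valid && ways[w].vpn == vpn) {
            tlbe_t hit = ways[w];
            memmove(&ways[1], &ways[0], (size_t)w * sizeof(tlbe_t));
            ways[0] = hit;                 /* move-to-front */
            *ppn_out = hit.ppn;
            return true;
        }
    }
    return false;                          /* miss: raise TLB miss interrupt */
}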

Re: Packing Hash Tables

<c06e5240-8a21-4f43-bd87-38f201e4b04cn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24091&group=comp.arch#24091
 by: robf...@gmail.com - Wed, 9 Mar 2022 13:46 UTC

On Wednesday, March 9, 2022 at 3:48:43 AM UTC-5, BGB wrote:
> On 3/9/2022 12:46 AM, robf...@gmail.com wrote:
> > To-nights quandary, packing hash tables using a background process.
> > Tried to find info on this on the web. Anybody got any pointers? Using
> > a hash table to store address translations for memory management.
> > When there is a miss the table is searched with quadratic probing. If
> > it’s a real miss and no translation is present, then the first empty slot
> > in the hash table should be used to store the translation. As entries get removed from the table, following entries could be moved into those
> > empty positions to reduce search times. I think it requires keeping
> > track of the number of times the entry bounced before finding an
> > open spot. I was going to use a four-bit count to record this. The
> > bounce count needs to be present in the PTE. Entries that hash to the
> > same value are grouped together in a page table group, PTG, that
> > contains five PTEs.
> It is not really at all obvious what you mean by this...
>
> But, yeah, assuming you are talking about a TLB:
> In my case, it is 4 way associative, so all 4 entries get fetched from
> the TLB array at the same time;
> It checks for a hit;
> Whichever entry hits is moved to the front of the list (with some
> special cases for 96-bit addresses);
> All 4 entries are written back to the TLB (in their updated order).
>
>
> Currently, the index into the TLB array is address modulo, eg,
> Addr[21:14]. This was because I had done some testing and found that
> modulo had some useful properties when compared with an XOR hashed index.
>
> A linear probing approach is probably not particularly likely to be
> worthwhile IMO.

Entries in the TLB are searched in parallel for a match. The TLB uses a modulo
index. Only part of the virtual address page number is stored in the TLB since
the modulo index gives bits 12 to 21 of the virtual address. Translations are being
stored in a hash table. What I am wondering about is accelerating searches of the
hash table on a TLB miss. If entries in the hash table can be moved around so
they are closer to the initial hash position then the search time can be reduced. I
assume this could be done by a background process. I am wondering if it is
worth doing.

This is for a hashed page table where the entries are stored in groups called PTGs. This
borrows an idea gleaned from the PowerPC architecture, but it could apply to a
hash table in general. The PowerPC design I studied uses a double hash to select two
different PTGs to search. That makes a lot of sense, since translations are usually found
within two or three steps. I decided to use quadratic probing instead of a double hash. It has the
property that PTGs are selected close by the original target, getting progressively
further away as the miss count increases. I think the idea is to try and make use of
some locality of reference.

The hashed table is searched on a TLB miss. The hashed table is organized into
groups of PTEs so that they can be searched in parallel for a translation match. If
there is one, then it is inserted into the TLB. If there is not one, then the next PTG
is searched and so on. The next PTG is selected according to quadratic probing.
Since the probe function always returns the same distance for a given miss count,
if the miss count is known then related PTGs can be found by searching backwards
or forwards. It is possible to choose a backward location in a single calculation or
table lookup. For example, if a PTE is sitting in a PTG a distance of nine away from
its ideal location and there is an empty spot closer to the ideal location, it could be
moved. This is a bit like garbage collection. I am wondering if it is worthwhile to
garbage collect the table?

If it can be guaranteed that PTGs will always fill up before another one is chosen to
store a PTE in, then searching the PTGs can be made to stop as soon as an empty
entry in a PTG is found. Otherwise, searches must continue for some time since
empty entries might be due to deletes.
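
Reusing the pte_t / ptg_t / probe() assumptions from the sketch after the
opening post, the search and a single packing step might look roughly like
this; the early stop on an empty slot relies on the fill guarantee described
in the last paragraph:

/* Search the probe sequence for vpn.  If groups are guaranteed to fill
   before the next group is used (and there are no deletes), an empty
   slot means "not present". */
static pte_t *pt_lookup(uint64_t vpn)
{
    uint64_t h = vpn;
    for (unsigned i = 0; i < 16; i++) {
        ptg_t *g = &table[probe(h, i)];
        for (int s = 0; s < PTES_PER_PTG; s++) {
            if (!g->pte[s].valid)
                return NULL;            /* early stop under the guarantee */
            if (g->pte[s].vpn == vpn)
                return &g->pte[s];
        }
    }
    return NULL;
}

/* One packing step, as a background process might do it: if an entry sits
   bounce groups away from its ideal position and an earlier group in its
   own probe sequence has a free slot, move it there and shrink the count. */
static void pack_entry(pte_t *e)
{
    uint64_t h = e->vpn;
    for (unsigned j = 0; j < e->bounce; j++) {
        ptg_t *g = &table[probe(h, j)];
        for (int s = 0; s < PTES_PER_PTG; s++) {
            if (!g->pte[s].valid) {
                g->pte[s] = *e;
                g->pte[s].bounce = (uint8_t)j;
                e->valid = false;       /* old slot becomes free */
                return;
            }
        }
    }
}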

Re: Packing Hash Tables

<Ct4WJ.121302$z688.64338@fx35.iad>

https://www.novabbs.com/devel/article-flat.php?id=24096&group=comp.arch#24096
 by: EricP - Wed, 9 Mar 2022 16:20 UTC

robf...@gmail.com wrote:
> To-nights quandary, packing hash tables using a background process.
> Tried to find info on this on the web. Anybody got any pointers? Using
> a hash table to store address translations for memory management.
> When there is a miss the table is searched with quadratic probing. If
> it’s a real miss and no translation is present, then the first empty slot
> in the hash table should be used to store the translation. As entries get removed from the table, following entries could be moved into those
> empty positions to reduce search times. I think it requires keeping
> track of the number of times the entry bounced before finding an
> open spot. I was going to use a four-bit count to record this. The
> bounce count needs to be present in the PTE. Entries that hash to the
> same value are grouped together in a page table group, PTG, that
> contains five PTEs.

Note that in operating systems such as Linux, the hierarchical page table is
part of the design. To port it to Power with its hash/inverted page table
they kept the hierarchical table managed in memory as before,
and moved valid PTE entries into and out of the hash table.

In effect, the hierarchical table is still the OS page table,
and the hash table is treated like a giant software managed TLB.
Software could implement, say, a FIFO mechanism to rotate a working set
of Present hierarchical PTE's through hash PTE slots,
similar to how a HW TLB might rotate through its entries.

In this arrangement, Address Space ID's (ASID) for the hash table entries
would be very important for efficiency, as without them a process switch
would require invalidating all hash table entries just like a real TLB.
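
A toy C sketch of the ASID point: when the hash-table entries carry an
address-space ID, a process switch just changes the current ASID, and only
one address space's entries ever need invalidating. Names, sizes, and the
single-slot hash below are illustrative, not the PowerPC HPT format:

#include <stdint.h>
#include <stddef.h>
#include <stdbool.h>

typedef struct {
    uint16_t asid;     /* address space ID tag */
    uint64_t vpn;
    uint64_t ppn;
    bool     valid;
} hpte_t;

#define HPT_SIZE 4096
static hpte_t hpt[HPT_SIZE];

static bool hpt_lookup(uint16_t cur_asid, uint64_t vpn, uint64_t *ppn)
{
    size_t idx = (size_t)((vpn ^ cur_asid) % HPT_SIZE);   /* stand-in hash */
    hpte_t *e = &hpt[idx];
    if (e->valid && e->asid == cur_asid && e->vpn == vpn) {
        *ppn = e->ppn;
        return true;
    }
    return false;    /* fall back to walking the OS's hierarchical table */
}

/* Tear down one address space without touching the others. */
static void hpt_invalidate_asid(uint16_t asid)
{
    for (size_t i = 0; i < HPT_SIZE; i++)
        if (hpt[i].asid == asid)
            hpt[i].valid = false;
}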

Re: Packing Hash Tables

<t0aokj$nnc$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24098&group=comp.arch#24098
 by: BGB - Wed, 9 Mar 2022 17:37 UTC

On 3/9/2022 10:20 AM, EricP wrote:
> robf...@gmail.com wrote:
>> To-nights quandary, packing hash tables using a background process.
>> Tried to find info on this on the web. Anybody got any pointers? Using
>> a hash table to store address translations for memory management.
>> When there is a miss the table is searched with quadratic probing. If
>> it’s a real miss and no translation is present, then the first empty slot
>> in the hash table should be used to store the translation. As entries
>> get removed from the table, following entries could be moved into those
>> empty positions to reduce search times. I think it requires keeping
>> track of the number of times the entry bounced before finding an
>> open spot. I was going to use a four-bit count to record this. The
>> bounce count needs to be present in the PTE. Entries that hash to the
>> same value are grouped together in a page table group, PTG, that
>> contains five PTEs.
>
> Note that operating systems such as Linux the hierarchical page table is
> part of its design. To port it to Power with its hash/inverted page table
> they kept the hierarchical table managed in memory as before,
> and move valid PTE entries into and out of the hash table.
>
> In effect, the hierarchical table is still the OS page table,
> and the hash table is treated like a giant software managed TLB.
> Software could implement, say, a FIFO mechanism to rotate a working set
> of Present hierarchical PTE's through hash PTE slots,
> similar to how a HW TLB might rotate through its entries.
>
> In this arrangement, Address Space ID's (ASID) for the hash table entries
> would be very important for efficiency, as without them a process switch
> would require invalidating all hash table entries just like a real TLB.
>

OK. This gives more context for what I was confused about in my response.

In my case, I use a software managed TLB, with hierarchical page tables,
and no sort of intermediate stage representation.

TLB miss interrupt occurs, handler walks the page table, and loads the
resulting entry into the TLB. Basic design for the mechanism was
more-or-less lifted from SH-4, but with the major difference of going
from a 64-entry fully associative TLB to a 64x or 256x 4-way TLB.

Arguably, a 6-way or 8-way TLB could be better, but more expensive.

I had considered potentially using B-Trees for the upper levels (instead
of hierarchical nesting) for the 96-bit mode, mostly because as-is this
leads to a "very sparse" 8-level table, and a B-Tree could be more
compact (with more traditional nested page tables for the lower 2 or 3
levels).

Though, there are other (potentially simpler/cheaper) ways to deal with
this.

This would not really affect the hardware in my case.
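
To make the software-managed flow above concrete, here is a rough C sketch
of a TLB miss handler walking a hierarchical table and installing the leaf
PTE; the three-level layout, 16K page size, and the two hook functions are
assumptions for illustration, not the actual BJX2 handler:

#include <stdint.h>

#define PTE_VALID   1ull
#define PAGE_SHIFT  14
#define LEVEL_BITS  10

extern void tlb_load_entry(uint64_t vpn, uint64_t pte);  /* assumed hook */
extern void raise_page_fault(uint64_t vaddr);            /* assumed hook */

void tlb_miss_handler(uint64_t miss_vaddr, uint64_t *root)
{
    uint64_t  vpn   = miss_vaddr >> PAGE_SHIFT;
    uint64_t *level = root;              /* tables assumed identity-mapped */

    for (int shift = 2 * LEVEL_BITS; ; shift -= LEVEL_BITS) {
        uint64_t idx   = (vpn >> shift) & ((1u << LEVEL_BITS) - 1);
        uint64_t entry = level[idx];

        if (!(entry & PTE_VALID)) {
            raise_page_fault(miss_vaddr);   /* unmapped: page-fault path */
            return;
        }
        if (shift == 0) {
            tlb_load_entry(vpn, entry);     /* leaf: install translation */
            return;
        }
        level = (uint64_t *)(uintptr_t)
                (entry & ~(((uint64_t)1 << PAGE_SHIFT) - 1));
    }
}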

Re: Packing Hash Tables

<056ffd21-d834-405b-976e-46c09755f949n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24126&group=comp.arch#24126
 by: robf...@gmail.com - Thu, 10 Mar 2022 15:04 UTC

On Wednesday, March 9, 2022 at 12:37:26 PM UTC-5, BGB wrote:
> On 3/9/2022 10:20 AM, EricP wrote:

> > Note that operating systems such as Linux the hierarchical page table is
> > part of its design. To port it to Power with its hash/inverted page table
> > they kept the hierarchical table managed in memory as before,
> > and move valid PTE entries into and out of the hash table.
> >
> > In effect, the hierarchical table is still the OS page table,
> > and the hash table is treated like a giant software managed TLB.
> > Software could implement, say, a FIFO mechanism to rotate a working set
> > of Present hierarchical PTE's through hash PTE slots,
> > similar to how a HW TLB might rotate through its entries.
> >
> > In this arrangement, Address Space ID's (ASID) for the hash table entries
> > would be very important for efficiency, as without them a process switch
> > would require invalidating all hash table entries just like a real TLB.
> >
Seems kind of kludgy. I am wondering now why this approach was used.

I have coded in the bus interface unit the capability to use either hashed or
hierarchical page tables. I was not so insane as to ignore the popularity of
hierarchical page tables. Can probably use both types at the same time.
Hashed tables are a better way to go for a large address
space in my opinion. They use less memory and may have faster look-ups.

> OK. This gives more context for what I was confused about in my response.
>
> In my case, I use a software managed TLB, with hierarchical page tables,
> and no sort of intermediate stage representation.
>
> TLB miss interrupt occurs, handler walks the page table, and loads the
> resulting entry into the TLB. Basic design for the mechanism was
> more-or-less lifted from SH-4, but with the major difference of going
> from a 64-entry fully associative TLB to a 64x or 256x 4-way TLB.
>
> Arguably, a 6-way or 8-way TLB could be better, but more expensive.
>
>
> I had considered potentially using B-Trees for the upper levels (instead
> of hierarchical nesting) for the 96-bit mode, mostly because as-is this
> leads to a "very sparse" 8-level table, and a B-Tree could be more
> compact (with more traditional nested page tables for the lower 2 or 3
> levels).
>
> Though, there are other (potentially simpler/cheaper) ways to deal with
> this.
>
> This would not really effect the hardware in my case.

Page tables are walked via hardware for Thor. Software only gets involved on a
page fault. While software managing the TLB is probably comparable in
performance and more flexible than hardware, TLB updates with hardware
walking are almost invisible to software.

The TLB can be updated via software, so it’s almost possible to use a software-managed
TLB. I guess if the TLB miss signal were fed into the interrupt system,
then maybe things could have the option of software management.

I get the feeling there are too many choices available in Thor. Maybe a configuration
utility would be better than building in everything.

Wondering what entity to place share counts, access keys, and privilege levels in. I
had these in the PTE but took them out. There is only one set required for each page
of memory.
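
One way to picture the "one set per page of memory" idea above is a side
table indexed by physical page number, kept separate from the PTE; the
field names, widths, and table size below are illustrative assumptions,
and the cost is one extra access once the physical page number is known:

#include <stdint.h>

typedef struct {
    uint16_t share_count;   /* how many mappings refer to this page */
    uint8_t  access_key;
    uint8_t  privilege;
} page_attr_t;

#define NUM_PHYS_PAGES (1u << 20)        /* assumed physical page count */
static page_attr_t page_attrs[NUM_PHYS_PAGES];

/* Second access after translation: the PPN must be known before the
   attributes can be read. */
static inline page_attr_t *attrs_for(uint64_t ppn)
{
    return &page_attrs[ppn & (NUM_PHYS_PAGES - 1)];
}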

Re: Packing Hash Tables

<t0df7v$afv$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24135&group=comp.arch#24135
 by: BGB - Thu, 10 Mar 2022 18:15 UTC

On 3/10/2022 9:04 AM, robf...@gmail.com wrote:
> On Wednesday, March 9, 2022 at 12:37:26 PM UTC-5, BGB wrote:
>> On 3/9/2022 10:20 AM, EricP wrote:
>
>>> Note that operating systems such as Linux the hierarchical page table is
>>> part of its design. To port it to Power with its hash/inverted page table
>>> they kept the hierarchical table managed in memory as before,
>>> and move valid PTE entries into and out of the hash table.
>>>
>>> In effect, the hierarchical table is still the OS page table,
>>> and the hash table is treated like a giant software managed TLB.
>>> Software could implement, say, a FIFO mechanism to rotate a working set
>>> of Present hierarchical PTE's through hash PTE slots,
>>> similar to how a HW TLB might rotate through its entries.
>>>
>>> In this arrangement, Address Space ID's (ASID) for the hash table entries
>>> would be very important for efficiency, as without them a process switch
>>> would require invalidating all hash table entries just like a real TLB.
>>>
> Seems kind of kludgy. I am wondering why now this approach was used.
>
> I have coded in the bus interface unit the capability to use either hashed or
> hierarchical page tables. I was not so insane as to ignore the popularity of
> hierarchical page tables. Can probably use both types at the same time.
> Hashed tables are a better way to go for a large address
> space in my opinion. They use less memory and may have faster look-ups.
>

Hashed tables could also make sense as an alternative to B-Trees.

A B-Tree lookup is likely to be relatively complicated and expensive if
compared to a hash lookup.

It is possible that I could also combine a hashed top-level with
2-levels of nested page table (optionally collapsing down the upper 6
levels into a single page).
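
A rough sketch of how that hybrid might look in C: hash the collapsed upper
VA bits to a top-level bucket, then finish with an ordinary two-level walk.
Page size, level widths, bucket count, and the use of plain pointers are
all assumptions for illustration:

#include <stdint.h>

#define PAGE_SHIFT  14
#define LEVEL_BITS  10
#define TOP_BUCKETS 1024

typedef struct {
    uint64_t   upper_tag;   /* the collapsed upper address bits */
    uint64_t **dir;         /* second-level directory, or NULL */
} top_bucket_t;

static top_bucket_t top[TOP_BUCKETS];

/* Returns a pointer to the leaf PTE slot, or NULL on any miss. */
static uint64_t *lookup_hybrid(uint64_t vaddr)
{
    uint64_t vpn   = vaddr >> PAGE_SHIFT;
    uint64_t upper = vpn >> (2 * LEVEL_BITS);
    top_bucket_t *b = &top[upper % TOP_BUCKETS];    /* hashed top level */

    if (b->dir == NULL || b->upper_tag != upper)
        return NULL;                                /* miss in hashed level */

    uint64_t *leaf_table = b->dir[(vpn >> LEVEL_BITS) & ((1u << LEVEL_BITS) - 1)];
    if (leaf_table == NULL)
        return NULL;
    return &leaf_table[vpn & ((1u << LEVEL_BITS) - 1)];   /* leaf PTE slot */
}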

>
>> OK. This gives more context for what I was confused about in my response.
>>
>> In my case, I use a software managed TLB, with hierarchical page tables,
>> and no sort of intermediate stage representation.
>>
>> TLB miss interrupt occurs, handler walks the page table, and loads the
>> resulting entry into the TLB. Basic design for the mechanism was
>> more-or-less lifted from SH-4, but with the major difference of going
>> from a 64-entry fully associative TLB to a 64x or 256x 4-way TLB.
>>
>> Arguably, a 6-way or 8-way TLB could be better, but more expensive.
>>
>>
>> I had considered potentially using B-Trees for the upper levels (instead
>> of hierarchical nesting) for the 96-bit mode, mostly because as-is this
>> leads to a "very sparse" 8-level table, and a B-Tree could be more
>> compact (with more traditional nested page tables for the lower 2 or 3
>> levels).
>>
>> Though, there are other (potentially simpler/cheaper) ways to deal with
>> this.
>>
>> This would not really effect the hardware in my case.
>
> Page tables are walked via hardware for Thor. Software only gets involved on a
> page fault. While software managing the TLB is probably comparable in
> performance and more flexible than hardware, TLB updates with hardware
> walking are almost invisible to software.
>

This was one annoyance with BJX2 with the use of software managed TLB:
Getting the software TLB miss interrupts stable was a major source of
annoyance, as there were a whole lot of "little things" that can go
wrong in these mechanisms.

The L1 caches needing to be able to stop what they are doing, then
quietly transfer control to an ISR, then resume normal operation without
any adverse side effects, was a bit of a pain to debug.

Some of the cases were infrequent, so one might have to get Doom or Quake
launched (in a simulation) before the crash happens, where firing these
up inside a Verilog simulation is not exactly a fast process (the time
issue is also a major reason why I am using LZ compressed binaries and
WADs).

As noted, the MMU is temporarily disabled within ISRs, partly because
there is no way for an ISR to deal with a TLB miss (unlike some CPU
architectures, ISRs are not re-entrant in my case).

Similarly, the TLB miss handler would have to resolve page-fault
scenarios (for virtual memory), without access to the main filesystem
driver (it has to bypass the driver and access the SDCard directly), but
does also mean that a page fault (to virtual memory) during some other
SDcard IO request would be "potentially fatal".

....

Nevermind whether or not it makes sense to put a pagefile on an SDcard,
there is nowhere else to put it in this case.

What to do with a page-fault to an unmapped address (eg: not backed to
the page file) is currently an unresolved issue, but this is more a
software thing than a hardware thing (in a real OS, would probably
terminate the process or similar here; for the time being, the only real
solution is to trigger a breakpoint).

I may need to consider adding preemptive multitasking, as it
looks like most "real OS" stuff would depend on this (cooperative
multitasking doesn't scale particularly well, and leaves no good
way to deal with "OK, this program has gone into an infinite loop or
crashed").

The debugging was getting bad enough that I almost started to debate
whether it would be worthwhile to add hardware page walking and not have
to deal with this as much (but would still need a Page Fault handler, so
it is not entirely avoidable).

Sorta spent most of February beating on trying to debug stuff related to
the L1 Caches and their interaction with the TLB Miss ISR, partly because
"poking at something" in the L1 D$ set off an unholy bees' nest of bugs.

Likewise, much of my past debugging effort here was more focused on
"quick and dirty" fixes which typically don't really resolve the
underlying issues; but, trying to figure out "why" something is going
wrong is often a lot harder than adding "make it work" hacks (and then
notice the bug hasn't gone away, it has merely moved to a new location).

I still can't say (with much confidence) that the TLB is now stable, but
in my testing at the moment, it appears to work.

Though, this debugging effort wasn't helped by Verilator also having
the annoying behavior that adding and removing $display statements can
sometimes change the behavior of the logic...

> The TLB can be updated via software, its almost possible then to use a software
> managed TLB. I guess if the TLB miss signal were fed into the interrupt system,
> then maybe things could have the option of software management.
>

Could be.

In my case, TLB miss triggers an interrupt, and the ISR deals with it.

In the ISR, there is a 'TEA' register that holds the address of whatever
caused the interrupt (target address for TLB miss), and an 'EXSR'
register where the low bits encode which interrupt had occurred (as a
16-bit number divided into multiple bit fields).
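
Schematically, an ISR entry along those lines might dispatch on an
EXSR-style cause register and pass the TEA-style address to the handler;
the read_tea()/read_exsr() accessors, the field split, and the cause codes
below are invented for illustration and are not the actual BJX2 encoding:

#include <stdint.h>

extern uint64_t read_tea(void);    /* faulting/target address (assumed) */
extern uint16_t read_exsr(void);   /* 16-bit cause, bit fields (assumed) */
extern void tlb_miss_handler(uint64_t vaddr, uint64_t *root);
extern uint64_t *current_page_table_root;

#define EXSR_CLASS(x)   (((x) >> 12) & 0xF)   /* illustrative field split */
#define EXSR_DETAIL(x)  ((x) & 0x0FFF)

void isr_entry(void)
{
    uint16_t exsr = read_exsr();
    uint64_t tea  = read_tea();

    switch (EXSR_CLASS(exsr)) {
    case 0x1:                       /* say: TLB miss */
        tlb_miss_handler(tea, current_page_table_root);
        break;
    default:
        /* other causes: page fault, breakpoint, ... */
        (void)EXSR_DETAIL(exsr);
        break;
    }
}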

> I get the feeling there are too many choices available in Thor. Maybe a configuration
> utility would be better than building in everything.
>
> Wondering what entity to place share counts, access keys, and privilege levels in. I
> had these in the PTE but took them out. There is only one set required for each page
> of memory.
>

I have this stuff in the PTEs/TLBEs.

There is also an ACL Cache with an ACL Miss interrupt as well.

The software side of things isn't really developed all that well yet, as
most of the time thus far had gone more into trying to get things stable
enough that the address translation and TLB miss interrupts weren't
prone to crash stuff.

I seem to have this part stable now, so now it should be possible to
start making forward progress in this area.

Re: Packing Hash Tables

<3ca0fe64-f159-4a25-b8a3-cfe5ba7d0093n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24141&group=comp.arch#24141
 by: robf...@gmail.com - Fri, 11 Mar 2022 01:58 UTC

On Thursday, March 10, 2022 at 1:15:31 PM UTC-5, BGB wrote:
> On 3/10/2022 9:04 AM, robf...@gmail.com wrote:
> > On Wednesday, March 9, 2022 at 12:37:26 PM UTC-5, BGB wrote:
> >> On 3/9/2022 10:20 AM, EricP wrote:
> >
> >>> Note that operating systems such as Linux the hierarchical page table is
> >>> part of its design. To port it to Power with its hash/inverted page table
> >>> they kept the hierarchical table managed in memory as before,
> >>> and move valid PTE entries into and out of the hash table.
> >>>
> >>> In effect, the hierarchical table is still the OS page table,
> >>> and the hash table is treated like a giant software managed TLB.
> >>> Software could implement, say, a FIFO mechanism to rotate a working set
> >>> of Present hierarchical PTE's through hash PTE slots,
> >>> similar to how a HW TLB might rotate through its entries.
> >>>
> >>> In this arrangement, Address Space ID's (ASID) for the hash table entries
> >>> would be very important for efficiency, as without them a process switch
> >>> would require invalidating all hash table entries just like a real TLB.
> >>>
> > Seems kind of kludgy. I am wondering why now this approach was used.
> >
> > I have coded in the bus interface unit the capability to use either hashed or
> > hierarchical page tables. I was not so insane as to ignore the popularity of
> > hierarchical page tables. Can probably use both types at the same time.
> > Hashed tables are a better way to go for a large address
> > space in my opinion. They use less memory and may have faster look-ups.
> >
> Hashed tables could also make sense as an alternative to B-Trees.
>
> A B-Tree lookup is likely to be relatively complicated and expensive if
> compared to a hash lookup.
>
> It is possible that I could also combine a hashed top-level with
> 2-levels of nested page table (optionally collapsing down the upper 6
> levels into a single page).
> >
> >> OK. This gives more context for what I was confused about in my response.
> >>
> >> In my case, I use a software managed TLB, with hierarchical page tables,
> >> and no sort of intermediate stage representation.
> >>
> >> TLB miss interrupt occurs, handler walks the page table, and loads the
> >> resulting entry into the TLB. Basic design for the mechanism was
> >> more-or-less lifted from SH-4, but with the major difference of going
> >> from a 64-entry fully associative TLB to a 64x or 256x 4-way TLB.
> >>
> >> Arguably, a 6-way or 8-way TLB could be better, but more expensive.
> >>
> >>
> >> I had considered potentially using B-Trees for the upper levels (instead
> >> of hierarchical nesting) for the 96-bit mode, mostly because as-is this
> >> leads to a "very sparse" 8-level table, and a B-Tree could be more
> >> compact (with more traditional nested page tables for the lower 2 or 3
> >> levels).
> >>
> >> Though, there are other (potentially simpler/cheaper) ways to deal with
> >> this.
> >>
> >> This would not really effect the hardware in my case.
> >
> > Page tables are walked via hardware for Thor. Software only gets involved on a
> > page fault. While software managing the TLB is probably comparable in
> > performance and more flexible than hardware, TLB updates with hardware
> > walking are almost invisible to software.
> >
> This was one annoyance with BJX2 with the use of software managed TLB:
> Getting the software TLB miss interrupts stable was a major source of
> annoyance, as there were a whole lot of "little things" that can go
> wrong in these mechanisms.
>
> The L1 caches needing to be able to stop what they are doing, then
> quietly transfer control to an ISR, then resume normal operation without
> any adverse side effects, was a bit of a pain to debug.
>
> Some of the cases were infrequent, so one might to get Doom or Quake
> launched (in a simulation) before the crash happens, where firing these
> up inside a Verilog simulation is not exactly a fast process (the time
> issue is also a major reason why I am using LZ compressed binaries and
> WADs).
>
>
> As noted, the MMU is temporarily disabled within ISRs, partly because
> there is no way for an ISR to deal with a TLB miss (unlike some CPU
> architectures, ISRs are not re-entrant in my case).
>
>
> Similarly, the TLB miss handler would have to resolve page-fault
> scenarios (for virtual memory), without access to the main filesystem
> driver (it has to bypass the driver and access the SDCard directly), but
> does also mean that a page fault (to virtual memory) during some other
> SDcard IO request would be "potentially fatal".
>
> ...
>
> Nevermind whether or not it makes sense to put a pagefile on an SDcard,
> there is nowhere else to put it in this case.
>
>
> What to do with a page-fault to an unmapped address (eg: not backed to
> the page file) is currently an unresolved issue, but this is more a
> software thing than a hardware thing (in a real OS, would probably
> terminate the process or similar here; for the time being, the only real
> solution is to trigger a breakpoint).
>
> I may need to consider going and adding preemptive multitasking, as it
> looks like most "real OS" stuff would depend on this (cooperative
> multitasking doesn't scale particularly well, or leave any real good
> solution for "OK, this program has gone into an infinite loop or
> crashed", which basically leaves no real solution otherwise).
>
>
>
> The debugging was getting bad enough that I almost started to debate
> whether it would be worthwhile to add hardware page walking and not have
> to deal with this as much (but would still need a Page Fault handler, so
> it is not entirely avoidable).
>
>
> Sorta spent most of February beating on trying to debug stuff related to
> the L1 Caches and their interaction with the TLB Miss ISR, partly as
> "poking at something" in the L1 D$ set off an unholy bees nest of bugs.
>
> Likewise, much of my past debugging effort here was more focused on
> "quick and dirty" fixes which typically don't really resolve the
> underlying issues; but, trying to figure out "why" something is going
> wrong is often a lot harder than adding "make it work" hacks (and then
> notice the bug hasn't gone away, it has merely moved to a new location).
>
>
> I still can't say (with much confidence) that the TLB is now stable, but
> in my testing at the moment, it appears to work.
>
> Though, this debugging effort wasn't helped with Verilator also having
> the annoying behavior that adding and removing $display statements can
> sometimes change the behavior of the logic...
> > The TLB can be updated via software, its almost possible then to use a software
> > managed TLB. I guess if the TLB miss signal were fed into the interrupt system,
> > then maybe things could have the option of software management.
> >
> Could be.
>
> In my case, TLB miss triggers an interrupt, and the ISR deals with it.
>
> In the ISR, there is a 'TEA' register that holds the address of whatever
> caused the interrupt (target address for TLB miss), and an 'EXSR'
> register where the low bits encode which interrupt had occurred (as a
> 16-bit number divided into multiple bit fields).
> > I get the feeling there are too many choices available in Thor. Maybe a configuration
> > utility would be better than building in everything.
> >
> > Wondering what entity to place share counts, access keys, and privilege levels in. I
> > had these in the PTE but took them out. There is only one set required for each page
> > of memory.
> >
> I have this stuff in the PTEs/TLBEs.
>
> There is also an ACL Cache with an ACL Miss interrupt as well.
>
> The software side of things isn't really developed all that well yet, as
> most of the time thus far had gone more into trying to get things stable
> enough that the address translation and TLB miss interrupts weren't
> prone to crash stuff.
>
> I seem to have this part stable now, so now it should be possible to
> start making forward progress in this area.

For Thor, the TLBE / PTE is 128 bits in size, with no room to spare even without the
share counts, etc. included in it. A larger PTE would really complicate things.
The TLBE value can be loaded or stored from / to the TLB using a single
128-bit wide register. Anything wider means using a pair of registers. I could
maybe reduce the size of the page number fields by a few bits. The fields are 48 bits.
I had the thought that since there only needs to be one copy of the information it
could be stored in another entity. But this means potentially adding to the memory
access time since another table is accessed after the physical page number is
known.
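
As a back-of-the-envelope picture of that packing: two 48-bit page-number
fields leave only 32 bits of a 128-bit TLBE for everything else. Which bits
hold which attributes below is an illustrative assumption, not Thor's
actual layout:

#include <stdint.h>

typedef struct {
    uint64_t lo;   /* [47:0] virtual page number,  [63:48] attribute bits */
    uint64_t hi;   /* [47:0] physical page number, [63:48] attribute bits */
} tlbe128_t;       /* two 64-bit halves = one 128-bit register */

#define TLBE_VPN(e)  ((e).lo & 0xFFFFFFFFFFFFull)
#define TLBE_PPN(e)  ((e).hi & 0xFFFFFFFFFFFFull)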


Re: Packing Hash Tables

<t0ennl$ald$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24143&group=comp.arch#24143
 by: BGB - Fri, 11 Mar 2022 05:46 UTC

On 3/10/2022 7:58 PM, robf...@gmail.com wrote:
> On Thursday, March 10, 2022 at 1:15:31 PM UTC-5, BGB wrote:
>> On 3/10/2022 9:04 AM, robf...@gmail.com wrote:
>>> On Wednesday, March 9, 2022 at 12:37:26 PM UTC-5, BGB wrote:
>>>> On 3/9/2022 10:20 AM, EricP wrote:
>>>
>>>>> Note that operating systems such as Linux the hierarchical page table is
>>>>> part of its design. To port it to Power with its hash/inverted page table
>>>>> they kept the hierarchical table managed in memory as before,
>>>>> and move valid PTE entries into and out of the hash table.
>>>>>
>>>>> In effect, the hierarchical table is still the OS page table,
>>>>> and the hash table is treated like a giant software managed TLB.
>>>>> Software could implement, say, a FIFO mechanism to rotate a working set
>>>>> of Present hierarchical PTE's through hash PTE slots,
>>>>> similar to how a HW TLB might rotate through its entries.
>>>>>
>>>>> In this arrangement, Address Space ID's (ASID) for the hash table entries
>>>>> would be very important for efficiency, as without them a process switch
>>>>> would require invalidating all hash table entries just like a real TLB.
>>>>>
>>> Seems kind of kludgy. I am wondering why now this approach was used.
>>>
>>> I have coded in the bus interface unit the capability to use either hashed or
>>> hierarchical page tables. I was not so insane as to ignore the popularity of
>>> hierarchical page tables. Can probably use both types at the same time.
>>> Hashed tables are a better way to go for a large address
>>> space in my opinion. They use less memory and may have faster look-ups.
>>>
>> Hashed tables could also make sense as an alternative to B-Trees.
>>
>> A B-Tree lookup is likely to be relatively complicated and expensive if
>> compared to a hash lookup.
>>
>> It is possible that I could also combine a hashed top-level with
>> 2-levels of nested page table (optionally collapsing down the upper 6
>> levels into a single page).
>>>
>>>> OK. This gives more context for what I was confused about in my response.
>>>>
>>>> In my case, I use a software managed TLB, with hierarchical page tables,
>>>> and no sort of intermediate stage representation.
>>>>
>>>> TLB miss interrupt occurs, handler walks the page table, and loads the
>>>> resulting entry into the TLB. Basic design for the mechanism was
>>>> more-or-less lifted from SH-4, but with the major difference of going
>>>> from a 64-entry fully associative TLB to a 64x or 256x 4-way TLB.
>>>>
>>>> Arguably, a 6-way or 8-way TLB could be better, but more expensive.
>>>>
>>>>
>>>> I had considered potentially using B-Trees for the upper levels (instead
>>>> of hierarchical nesting) for the 96-bit mode, mostly because as-is this
>>>> leads to a "very sparse" 8-level table, and a B-Tree could be more
>>>> compact (with more traditional nested page tables for the lower 2 or 3
>>>> levels).
>>>>
>>>> Though, there are other (potentially simpler/cheaper) ways to deal with
>>>> this.
>>>>
>>>> This would not really effect the hardware in my case.
>>>
>>> Page tables are walked via hardware for Thor. Software only gets involved on a
>>> page fault. While software managing the TLB is probably comparable in
>>> performance and more flexible than hardware, TLB updates with hardware
>>> walking are almost invisible to software.
>>>
>> This was one annoyance with BJX2 with the use of software managed TLB:
>> Getting the software TLB miss interrupts stable was a major source of
>> annoyance, as there were a whole lot of "little things" that can go
>> wrong in these mechanisms.
>>
>> The L1 caches needing to be able to stop what they are doing, then
>> quietly transfer control to an ISR, then resume normal operation without
>> any adverse side effects, was a bit of a pain to debug.
>>
>> Some of the cases were infrequent, so one might to get Doom or Quake
>> launched (in a simulation) before the crash happens, where firing these
>> up inside a Verilog simulation is not exactly a fast process (the time
>> issue is also a major reason why I am using LZ compressed binaries and
>> WADs).
>>
>>
>> As noted, the MMU is temporarily disabled within ISRs, partly because
>> there is no way for an ISR to deal with a TLB miss (unlike some CPU
>> architectures, ISRs are not re-entrant in my case).
>>
>>
>> Similarly, the TLB miss handler would have to resolve page-fault
>> scenarios (for virtual memory), without access to the main filesystem
>> driver (it has to bypass the driver and access the SDCard directly), but
>> does also mean that a page fault (to virtual memory) during some other
>> SDcard IO request would be "potentially fatal".
>>
>> ...
>>
>> Nevermind whether or not it makes sense to put a pagefile on an SDcard,
>> there is nowhere else to put it in this case.
>>
>>
>> What to do with a page-fault to an unmapped address (eg: not backed to
>> the page file) is currently an unresolved issue, but this is more a
>> software thing than a hardware thing (in a real OS, would probably
>> terminate the process or similar here; for the time being, the only real
>> solution is to trigger a breakpoint).
>>
>> I may need to consider going and adding preemptive multitasking, as it
>> looks like most "real OS" stuff would depend on this (cooperative
>> multitasking doesn't scale particularly well, or leave any real good
>> solution for "OK, this program has gone into an infinite loop or
>> crashed", which basically leaves no real solution otherwise).
>>
>>
>>
>> The debugging was getting bad enough that I almost started to debate
>> whether it would be worthwhile to add hardware page walking and not have
>> to deal with this as much (but would still need a Page Fault handler, so
>> it is not entirely avoidable).
>>
>>
>> Sorta spent most of February beating on trying to debug stuff related to
>> the L1 Caches and their interaction with the TLB Miss ISR, partly as
>> "poking at something" in the L1 D$ set off an unholy bees nest of bugs.
>>
>> Likewise, much of my past debugging effort here was more focused on
>> "quick and dirty" fixes which typically don't really resolve the
>> underlying issues; but, trying to figure out "why" something is going
>> wrong is often a lot harder than adding "make it work" hacks (and then
>> notice the bug hasn't gone away, it has merely moved to a new location).
>>
>>
>> I still can't say (with much confidence) that the TLB is now stable, but
>> in my testing at the moment, it appears to work.
>>
>> Though, this debugging effort wasn't helped with Verilator also having
>> the annoying behavior that adding and removing $display statements can
>> sometimes change the behavior of the logic...
>>> The TLB can be updated via software, its almost possible then to use a software
>>> managed TLB. I guess if the TLB miss signal were fed into the interrupt system,
>>> then maybe things could have the option of software management.
>>>
>> Could be.
>>
>> In my case, TLB miss triggers an interrupt, and the ISR deals with it.
>>
>> In the ISR, there is a 'TEA' register that holds the address of whatever
>> caused the interrupt (target address for TLB miss), and an 'EXSR'
>> register where the low bits encode which interrupt had occurred (as a
>> 16-bit number divided into multiple bit fields).
>>> I get the feeling there are too many choices available in Thor. Maybe a configuration
>>> utility would be better than building in everything.
>>>
>>> Wondering what entity to place share counts, access keys, and privilege levels in. I
>>> had these in the PTE but took them out. There is only one set required for each page
>>> of memory.
>>>
>> I have this stuff in the PTEs/TLBEs.
>>
>> There is also an ACL Cache with an ACL Miss interrupt as well.
>>
>> The software side of things isn't really developed all that well yet, as
>> most of the time thus far had gone more into trying to get things stable
>> enough that the address translation and TLB miss interrupts weren't
>> prone to crash stuff.
>>
>> I seem to have this part stable now, so now it should be possible to
>> start making forward progress in this area.
>
> For Thor, the TLBE / PTE is 128-bits in size with no room to spare without having the
> share-counts, etc. included in it. A larger PTE would really complicate things.
> The TLBE value can be loaded or stored from / to the TLB using a single
> 128-bit wide register. Anything wider means using a pair of registers. I could
> maybe reduce the size of the page number fields a few bits. The fields are 48-bit.
> I had the thought that since there only needs to be one copy of the information it
> could be stored in another entity. But this means potentially adding to the memory
> access time since another table is accessed after the physical page number is
> known.
>


Re: Packing Hash Tables

<c88890ab-633e-4be5-8efc-e0bcb8a48be9n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24194&group=comp.arch#24194
 by: robf...@gmail.com - Sun, 13 Mar 2022 02:30 UTC

On Friday, March 11, 2022 at 12:46:33 AM UTC-5, BGB wrote:
> On 3/10/2022 7:58 PM, robf...@gmail.com wrote:
> > On Thursday, March 10, 2022 at 1:15:31 PM UTC-5, BGB wrote:
> >> On 3/10/2022 9:04 AM, robf...@gmail.com wrote:
> >>> On Wednesday, March 9, 2022 at 12:37:26 PM UTC-5, BGB wrote:
> >>>> On 3/9/2022 10:20 AM, EricP wrote:
> >>>
> >>>>> Note that operating systems such as Linux the hierarchical page table is
> >>>>> part of its design. To port it to Power with its hash/inverted page table
> >>>>> they kept the hierarchical table managed in memory as before,
> >>>>> and move valid PTE entries into and out of the hash table.
> >>>>>
> >>>>> In effect, the hierarchical table is still the OS page table,
> >>>>> and the hash table is treated like a giant software managed TLB.
> >>>>> Software could implement, say, a FIFO mechanism to rotate a working set
> >>>>> of Present hierarchical PTE's through hash PTE slots,
> >>>>> similar to how a HW TLB might rotate through its entries.
> >>>>>
> >>>>> In this arrangement, Address Space ID's (ASID) for the hash table entries
> >>>>> would be very important for efficiency, as without them a process switch
> >>>>> would require invalidating all hash table entries just like a real TLB.
> >>>>>
> >>> Seems kind of kludgy. I am wondering why now this approach was used.
> >>>
> >>> I have coded in the bus interface unit the capability to use either hashed or
> >>> hierarchical page tables. I was not so insane as to ignore the popularity of
> >>> hierarchical page tables. Can probably use both types at the same time.
> >>> Hashed tables are a better way to go for a large address
> >>> space in my opinion. They use less memory and may have faster look-ups.
> >>>
> >> Hashed tables could also make sense as an alternative to B-Trees.
> >>
> >> A B-Tree lookup is likely to be relatively complicated and expensive if
> >> compared to a hash lookup.
> >>
> >> It is possible that I could also combine a hashed top-level with
> >> 2-levels of nested page table (optionally collapsing down the upper 6
> >> levels into a single page).
> >>>
> >>>> OK. This gives more context for what I was confused about in my response.
> >>>>
> >>>> In my case, I use a software managed TLB, with hierarchical page tables,
> >>>> and no sort of intermediate stage representation.
> >>>>
> >>>> TLB miss interrupt occurs, handler walks the page table, and loads the
> >>>> resulting entry into the TLB. Basic design for the mechanism was
> >>>> more-or-less lifted from SH-4, but with the major difference of going
> >>>> from a 64-entry fully associative TLB to a 64x or 256x 4-way TLB.
> >>>>
> >>>> Arguably, a 6-way or 8-way TLB could be better, but more expensive.
> >>>>
> >>>>
> >>>> I had considered potentially using B-Trees for the upper levels (instead
> >>>> of hierarchical nesting) for the 96-bit mode, mostly because as-is this
> >>>> leads to a "very sparse" 8-level table, and a B-Tree could be more
> >>>> compact (with more traditional nested page tables for the lower 2 or 3
> >>>> levels).
> >>>>
> >>>> Though, there are other (potentially simpler/cheaper) ways to deal with
> >>>> this.
> >>>>
> >>>> This would not really effect the hardware in my case.
> >>>
> >>> Page tables are walked via hardware for Thor. Software only gets involved on a
> >>> page fault. While software managing the TLB is probably comparable in
> >>> performance and more flexible than hardware, TLB updates with hardware
> >>> walking are almost invisible to software.
> >>>
> >> This was one annoyance with BJX2 with the use of software managed TLB:
> >> Getting the software TLB miss interrupts stable was a major source of
> >> annoyance, as there were a whole lot of "little things" that can go
> >> wrong in these mechanisms.
> >>
> >> The L1 caches needing to be able to stop what they are doing, then
> >> quietly transfer control to an ISR, then resume normal operation without
> >> any adverse side effects, was a bit of a pain to debug.
> >>
> >> Some of the cases were infrequent, so one might to get Doom or Quake
> >> launched (in a simulation) before the crash happens, where firing these
> >> up inside a Verilog simulation is not exactly a fast process (the time
> >> issue is also a major reason why I am using LZ compressed binaries and
> >> WADs).
> >>
> >>
> >> As noted, the MMU is temporarily disabled within ISRs, partly because
> >> there is no way for an ISR to deal with a TLB miss (unlike some CPU
> >> architectures, ISRs are not re-entrant in my case).
> >>
> >>
> >> Similarly, the TLB miss handler would have to resolve page-fault
> >> scenarios (for virtual memory), without access to the main filesystem
> >> driver (it has to bypass the driver and access the SDCard directly), but
> >> does also mean that a page fault (to virtual memory) during some other
> >> SDcard IO request would be "potentially fatal".
> >>
> >> ...
> >>
> >> Nevermind whether or not it makes sense to put a pagefile on an SDcard,
> >> there is nowhere else to put it in this case.
> >>
> >>
> >> What to do with a page-fault to an unmapped address (eg: not backed to
> >> the page file) is currently an unresolved issue, but this is more a
> >> software thing than a hardware thing (in a real OS, would probably
> >> terminate the process or similar here; for the time being, the only real
> >> solution is to trigger a breakpoint).
> >>
> >> I may need to consider going and adding preemptive multitasking, as it
> >> looks like most "real OS" stuff would depend on this (cooperative
> >> multitasking doesn't scale particularly well, or leave any real good
> >> solution for "OK, this program has gone into an infinite loop or
> >> crashed", which basically leaves no real solution otherwise).
> >>
> >>
> >>
> >> The debugging was getting bad enough that I almost started to debate
> >> whether it would be worthwhile to add hardware page walking and not have
> >> to deal with this as much (but would still need a Page Fault handler, so
> >> it is not entirely avoidable).
> >>
> >>
> >> Sorta spent most of February beating on trying to debug stuff related to
> >> the L1 Caches and their interaction with the TLB Miss ISR, partly as
> >> "poking at something" in the L1 D$ set off an unholy bees nest of bugs.
> >>
> >> Likewise, much of my past debugging effort here was more focused on
> >> "quick and dirty" fixes which typically don't really resolve the
> >> underlying issues; but, trying to figure out "why" something is going
> >> wrong is often a lot harder than adding "make it work" hacks (and then
> >> notice the bug hasn't gone away, it has merely moved to a new location).
> >>
> >>
> >> I still can't say (with much confidence) that the TLB is now stable, but
> >> in my testing at the moment, it appears to work.
> >>
> >> Though, this debugging effort wasn't helped with Verilator also having
> >> the annoying behavior that adding and removing $display statements can
> >> sometimes change the behavior of the logic...
> >>> The TLB can be updated via software, its almost possible then to use a software
> >>> managed TLB. I guess if the TLB miss signal were fed into the interrupt system,
> >>> then maybe things could have the option of software management.
> >>>
> >> Could be.
> >>
> >> In my case, TLB miss triggers an interrupt, and the ISR deals with it.
> >>
> >> In the ISR, there is a 'TEA' register that holds the address of whatever
> >> caused the interrupt (target address for TLB miss), and an 'EXSR'
> >> register where the low bits encode which interrupt had occurred (as a
> >> 16-bit number divided into multiple bit fields).
> >>> I get the feeling there are too many choices available in Thor. Maybe a configuration
> >>> utility would be better than building in everything.
> >>>
> >>> Wondering what entity to place share counts, access keys, and privilege levels in. I
> >>> had these in the PTE but took them out. There is only one set required for each page
> >>> of memory.
> >>>
> >> I have this stuff in the PTEs/TLBEs.
> >>
> >> There is also an ACL Cache with an ACL Miss interrupt as well.
> >>
> >> The software side of things isn't really developed all that well yet, as
> >> most of the time thus far had gone more into trying to get things stable
> >> enough that the address translation and TLB miss interrupts weren't
> >> prone to crash stuff.
> >>
> >> I seem to have this part stable now, so now it should be possible to
> >> start making forward progress in this area.
> >
> > For Thor, the TLBE / PTE is 128-bits in size with no room to spare without having the
> > share-counts, etc. included in it. A larger PTE would really complicate things.
> > The TLBE value can be loaded or stored from / to the TLB using a single
> > 128-bit wide register. Anything wider means using a pair of registers. I could
> > maybe reduce the size of the page number fields a few bits. The fields are 48-bit.
> > I had the thought that since there only needs to be one copy of the information it
> > could be stored in another entity. But this means potentially adding to the memory
> > access time since another table is accessed after the physical page number is
> > known.
> >
> I also have 128-bit TLBE's, with fields for 48 bit addresses.
>
> Though the address fields are actually 36 bits, because one can cut off
> the low-order bits for the address (and because the "base form"
> addressing is 48-bit).
>
> While there is a 96-bit virtual-address mode, as noted, it works by
> using TLBE's in pairs to get more bits (at the cost of associativity).
> It mostly just extends the virtual address by 48 bits, with most of the
> rest of the bits being unused.
>
> I ended up implementing the 256-bit TLB load via a "double pumping"
> strategy (load both halves and then the MMU sticks them together).
>
>
>
> Would be a little bit harder with a full 64-bit virtual address space
> though.
>
>
> Copy/Paste
>
> TLBE:
> * (127:112) = VUGID / ACLID
> * (111: 76) = Virtual Address Page
> * ( 75: 64) = KR Access
> * ( 63: 48) = ASID
> * ( 47: 12) = Physical Address Page
> * ( 11: 0) = Page Control bits
>
> VUGID:
> * (15:10) = VGID / VUID(Swap)
> * ( 9: 0) = VUID / VGID(Swap)
> * May be interpreted as a PID or ACLID.
> ** In these cases, VUGID is treated as a single 16-bit unit.
>
> ASID:
> * (15:10) = AGID (0=Shared, Global)
> * ( 9: 0) = APID (0=Shared, Group)
>
> KR Access:
> * (11): Other X
> * (10): Other W
> * ( 9): Other R
> * ( 8): Group X (VUGID Check Only)
> * ( 7): Group W (VUGID Check Only)
> * ( 6): Group R (VUGID Check Only)
> * ( 5): User X (VUGID Check Only)
> * ( 4): User W (VUGID Check Only)
> * ( 3): User R (VUGID Check Only)
> * ( 2): VUGID Mode 2
> * ( 1): VUGID Mode 1
> * ( 0): VUGID Mode 0
>
> The VUGID Mode bits basically encode what sort of check to perform
> (VUGID or ACL style).
>
>
> These include, say:
> Disabled, VUGID, VUGID (Swap), or ACL.
>
> In VUGID mode, it does User/Group/Other checks, Swap switches
> User/Group. ACL: Use ACL (in TLBE), Exact 16-bit match only (ACLE).
> > I do not know much about ACL except that I suspect it can be quite complicated to do
> > with hardware. Is there ACL for individual pages of memory? There can be users or
> > groups of users and system processes all with separate access requirements. I suspect
> > there would be some limits engineered into ACL. For instance, limited numbers of
> > groups and users to keep the reference fields a reasonable size. Can you elaborate on
> > the ACL?
> >
> These are per-page checks.
>
>
> The ACL checking doesn't really add all that much cost or complexity
> compared with the TLB itself.
>
> Main costs are mostly the need to spend TLB bits on it, and the addition
> of a Keyring Register (KRR).
>
> Most of the harder parts of the ACL Check (namely, performing the ACL
> lookups) would be handled in software, so all the hardware really needs
> to do is check whether the keyring entries are in the ACL Cache, and (it
> they hit) what sort of access is allowed to the page in question (the
> access bits are OR'ed together).
>
>
> ACLE:
> * (63:44): Reserved / MBZ
> * (43:32): KR Access Mode
> * (31:16): KRR VUGID / APID
> * (15: 0): TLB VUGID / ACLID
>
> So, one basically does associative checks between the entries in the
> ACLE cache, and the entries in KRR.
>
> Because KRR changes relatively infrequently, the ACLE can be pretty
> small (ideally, it is larger than the KRR though, so 6 or 8 entries
> would be better than 4, but I am using 4 entries for now).
>
>
> The combined result of VUGID/ACL, and Basic checks are combined into a
> single bit-mask saying what access is allowed (in a given operating mode
> and at a given time). The L1 Caches will use these bits during memory
> access.
> > In my PTE eight bits are used for access control to indicate read/write/execute/cache
> > for either user or system. Maybe these bits would be better used to refer to an ACL. If
> > I were to trim four bits off the page number fields that means a 16-bit ACL reference
> > field could be supported.
> OK.
>
> I was able to fit both Basic checks and ACL/VUGID checks.
>
> The Basic checks are used ye olde User/Supervisor and similar (R/W/X)
> and a few bits which effects caching or combinations which are used
> encode alternate special behaviors (Privledged-Execute and
> Execute-As-ACLID).
>
>
> ...
I cannot quite get my head around how the ACL comes into play, but I have provided for
an ACL in the TLBE.


Re: Packing Hash Tables

<5667c422-d8ad-4159-a0f1-4543223eca38n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24195&group=comp.arch#24195

Newsgroups: comp.arch
 by: MitchAlsup - Sun, 13 Mar 2022 02:44 UTC

On Saturday, March 12, 2022 at 8:30:44 PM UTC-6, robf...@gmail.com wrote:
> I cannot quite get my head around how the ACL comes into play, but I have provided for
> an ACL in the TLBE.
>
> I decided to use 160-bit PTEs. I really wanted to support at least a full 76-bit virtual
> address translation and things just would not fit into a 128-bit PTE. For a hashed table
> this works just fine. Six are packed into 1024 bits with 64 bits to spare. However, for a
> hierarchical table 160-bit PTEs are a bit nightmarish.
<
My 66000 has a 63-bit virtual address that translates into a 66-bit universal address {hint:
MMI/O, Configuration, and ROM are not part of the 64-bit DRAM address space}.
Yet I fit each PTP and PTE into one (1) 64-bit container, and I can skip levels in the
tables, so small applications (like cat) can fit in a 23-bit virtual address and have 1
layer of PTPs and 1 layer of PTEs. I even have bits left over......
<
What is the driving force for a 76-bit virtual address ?
<
Secondly, even with a 76-bit virtual address, you should only be needing 90-bits±
for a PTE. Seems to me that the structure of your PTE needs to be rethought.
<
So, what is the structure of your PTE ?
<
> Just thinking to myself here. Since 160-bit PTEs do not pack a power of two into a
> memory page, that means either increasing the PTE size to 256 bits, wasting a lot of
> space, or doing a high-speed divide by a constant to calculate the page and entry number
> within a page. How practical is the high-speed divide? 204 PTEs will fit into a 4096-byte
> page. That means dividing the address by 204 instead of 256 to find the page number.
> This divide can be done as a multiply by a constant reciprocal. 204**-1 * 4096 ≈ 20. 12 bits
> was chosen to have a few extra bits of precision for the remainder. So, take the address,
> multiply by 20, then shift right 12 times to get the page number. Multiply by 20 was
<
Notice that multiply by 20 = 4×5, so the only multiply is by 5, then shift up by 2 and down by 12
(or just down by 10).
<
> chosen also because it's relatively painless to do. Then compute the entry number as the
> remainder. Multiply the quotient by 204 and subtract it from the original address to get the
> entry number. Example: for entry #7000, 20*7000 = 140000. 140000 >> 12 = 34, which is the
> page number. 34 * 204 = 6936. 7000-6936 = 64. So, entry 64 in page 34 is the desired
> entry. Multiply by 204 = multiply by 128 + multiply by 64 + multiply by 8 + multiply by 4,
> which is just shifting plus a 4-entry adder. Seems straightforward enough. The question is, should it be done?
<
Seems to me that addressing the first 2 questions should come before this one.
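
A minimal C sketch of the page/entry arithmetic described above, assuming
4096-byte pages and 20-byte (160-bit) PTEs, i.e. 204 entries per page. The
constant division and remainder compile down to the same multiply-by-reciprocal
and shift strength reduction being discussed, and main() reproduces the worked
example (entry #7000 -> page 34, entry 64):

    /* Locate a 160-bit PTE by its linear index: 4096-byte pages, 20-byte
       PTEs, so 204 PTEs per page with 16 bytes of each page left over.
       The compiler strength-reduces the constant / and % into a
       multiply-by-reciprocal plus shifts. */
    #include <stdint.h>
    #include <stdio.h>

    #define PTES_PER_PAGE 204u            /* floor(4096 / 20) */

    static void pte_locate(uint64_t idx, uint64_t *page, uint32_t *entry)
    {
        *page  = idx / PTES_PER_PAGE;
        *entry = (uint32_t)(idx - *page * PTES_PER_PAGE);
    }

    int main(void)
    {
        uint64_t page; uint32_t entry;
        pte_locate(7000, &page, &entry);
        printf("page %llu, entry %u\n", (unsigned long long)page, entry);
        /* prints "page 34, entry 64", matching the example above */
        return 0;
    }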
>
> TLB entries are much larger, 224 bits now. They include the PTE plus a 32-bit access key
> from the access rights table and a 32-bit access counter. The access counter is used by
> the clock algorithm which is implemented in hardware. The TLB entry gets read and written
> with every access, so I figured it may as well be updating a counter at the same time.
> Another hardware state machine takes care of aging the counters.
<
Why is the access key in the PTE?--it should be a part of the page itself.
<
Seems to me you are putting useful things in the wrong places.
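
A rough software model of the access-counter and aging idea in the quoted
paragraph above; the aging policy (periodic halving) and the
pick-the-smallest-counter victim rule are illustrative assumptions, not
details from the post:

    #include <stdint.h>

    #define TLB_WAYS 8                    /* associativity chosen arbitrarily */

    typedef struct {
        uint64_t tag;                     /* stand-in for the VPN/ASID match fields */
        uint32_t access_cnt;              /* 32-bit access counter, as described */
        int      valid;
    } tlb_entry_t;

    static void tlb_touch(tlb_entry_t *e)            /* on every hit */
    {
        if (e->access_cnt != UINT32_MAX)
            e->access_cnt++;                          /* saturating count */
    }

    static void tlb_age(tlb_entry_t set[TLB_WAYS])    /* periodic aging pass */
    {
        for (int i = 0; i < TLB_WAYS; i++)
            set[i].access_cnt >>= 1;                  /* halve, so old use decays */
    }

    static int tlb_pick_victim(const tlb_entry_t set[TLB_WAYS])
    {
        int victim = 0;
        for (int i = 0; i < TLB_WAYS; i++) {
            if (!set[i].valid)
                return i;                             /* a free slot wins outright */
            if (set[i].access_cnt < set[victim].access_cnt)
                victim = i;
        }
        return victim;
    }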


Re: Packing Hash Tables

<6c227477-2d01-4496-b93a-2b9389cc7453n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24196&group=comp.arch#24196

Newsgroups: comp.arch
 by: robf...@gmail.com - Sun, 13 Mar 2022 03:36 UTC



Re: Packing Hash Tables

<3c363fd7-4004-4d99-bea7-7eb3c5cb3aa6n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24222&group=comp.arch#24222

Newsgroups: comp.arch
 by: MitchAlsup - Sun, 13 Mar 2022 16:48 UTC

On Saturday, March 12, 2022 at 9:36:59 PM UTC-6, robf...@gmail.com wrote:
> On Saturday, March 12, 2022 at 9:44:17 PM UTC-5, MitchAlsup wrote:
>
> > What is driving force for 76-bit virtual address ?
> I am not quite sure why 76 bits, but my gut tells me that may not be enough. Emulating quantum
> and molecular computers? IDK. It does allow the memory allocator to just keep travelling forwards.
> The physical address needs to be stored in the PTE too. That and the virtual address take over 100
> bits. Add in the ASID and a few access control bits and there is no room left for the future.
> > So, what is the structure of your PTE ?
> The PTE contains mainly the address translation.
> Structure contains:
> PPN, (52 bits)
> VPN, (64 bits)
> ASID (12 bits)
> Global translation bit
> Valid bit
> Dirty bit
> Page size bit
> Read / Write / Execute / Cacheble bits for system and user modes (8 bits)
> Bounce count (6 bits) (for inverted page table defrag)
> 15 bits unused.
<
My 66000 Page Pointer {either a PTP or a PTE}
<
PPN<63..13>
RWE<12..10>
LVL<9..7>
Where<6..5> {L1, L2, allocated, unCacheable}
used<2>
mod<1>
present<0>
<
LVL = 001 means PTE, otherwise PTP
LVL on the pointer to PTE indicates size of the page
<
A PTE to an unCacheable page uses where<6..5> as the space of access
......{DRAM, MMI/O, Configuration, ROM}, all of which can be 64-bits.
<
ASID comes from a thread control register
My 66000 supports translation indirection so a service provider can access
into the service caller--this gets rid of any need for Global/Application or s/u
<
VPN is used to access PPN so it is stored in TLB but is not part of page pointing.
Pages larger than the smallest use the PPN field to enumerate the number of pages
this super page pointer points at. So, if you have an 8MB translation you can set
it up to contain 1MB if that is what is really needed.
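
A rough sketch in C of decoding such a 64-bit page pointer, purely for illustration (field
positions are taken from the layout above; the struct and helper names are mine, not My 66000's):

#include <stdint.h>
#include <stdbool.h>

static inline uint64_t bits(uint64_t x, unsigned hi, unsigned lo) {
    return (x >> lo) & ((1ULL << (hi - lo + 1)) - 1ULL);
}

typedef struct {
    uint64_t ppn;      /* PPN<63..13>                          */
    unsigned rwe;      /* RWE<12..10>                          */
    unsigned lvl;      /* LVL<9..7>: 001 = PTE, otherwise PTP  */
    unsigned where;    /* Where<6..5>                          */
    bool     used;     /* used<2>                              */
    bool     mod;      /* mod<1>                               */
    bool     present;  /* present<0>                           */
} PagePointer;

static PagePointer decode_page_pointer(uint64_t pp) {
    PagePointer d;
    d.ppn     = bits(pp, 63, 13);
    d.rwe     = (unsigned)bits(pp, 12, 10);
    d.lvl     = (unsigned)bits(pp,  9,  7);
    d.where   = (unsigned)bits(pp,  6,  5);
    d.used    = bits(pp, 2, 2) != 0;
    d.mod     = bits(pp, 1, 1) != 0;
    d.present = bits(pp, 0, 0) != 0;
    return d;
}
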
<
There are 2 tables of translation, one used by Guest OS, the other used by
HyperVisor as a nested page table.
<
> > Why is the access key in the PTE?--it should be a part of the page itself.
<
I don't use a keyed protection scheme. I could work one in, but I would associate
each page with a key, and place the key in the thread control registers.
<
> The access key is not in the PTE. It does appear the TLB entry though along with an access
> counter. I have another table called ART that contains the access key, share count, privilege
> level and few other bits. Bits for which there is only a single copy per page. The ART is managed
> by software. Bits from ART are loaded automatically into the TLBE when needed.
<
> > Seems to me you are putting useful things in the wrong places.
<
> I have been studying MMUs on the web with the hopes of sorting things out.. Most of them do
> not include as much information in the PTE / TLBE or use smaller fields. To me it looks like
> some are legacy designs, inheriting from 32-bit perhaps. I have not found much mentioned
> about what to do with the per page information.
<
Almost everything you find on the web has been damaged by some kind of association
with x86 or other 32-bit MMUs.
<
> My design is using 128-bit registers, mainly to support quad precision 128-bit decimal floating-
> point.
<
How big is your int, your long, and your pointer?

Re: Packing Hash Tables

<t0lbdc$fj6$1@dont-email.me>

 by: BGB - Sun, 13 Mar 2022 17:59 UTC

On 3/12/2022 8:44 PM, MitchAlsup wrote:
> On Saturday, March 12, 2022 at 8:30:44 PM UTC-6, robf...@gmail.com wrote:
>> On Friday, March 11, 2022 at 12:46:33 AM UTC-5, BGB wrote:
>>> On 3/10/2022 7:58 PM, robf...@gmail.com wrote:
>>>> On Thursday, March 10, 2022 at 1:15:31 PM UTC-5, BGB wrote:
>>>>> On 3/10/2022 9:04 AM, robf...@gmail.com wrote:
>>>>>> On Wednesday, March 9, 2022 at 12:37:26 PM UTC-5, BGB wrote:
>>>>>>> On 3/9/2022 10:20 AM, EricP wrote:
>>>>>>
>>>>>>>> Note that operating systems such as Linux the hierarchical page table is
>>>>>>>> part of its design. To port it to Power with its hash/inverted page table
>>>>>>>> they kept the hierarchical table managed in memory as before,
>>>>>>>> and move valid PTE entries into and out of the hash table.
>>>>>>>>
>>>>>>>> In effect, the hierarchical table is still the OS page table,
>>>>>>>> and the hash table is treated like a giant software managed TLB.
>>>>>>>> Software could implement, say, a FIFO mechanism to rotate a working set
>>>>>>>> of Present hierarchical PTE's through hash PTE slots,
>>>>>>>> similar to how a HW TLB might rotate through its entries.
>>>>>>>>
>>>>>>>> In this arrangement, Address Space ID's (ASID) for the hash table entries
>>>>>>>> would be very important for efficiency, as without them a process switch
>>>>>>>> would require invalidating all hash table entries just like a real TLB.
>>>>>>>>
>>>>>> Seems kind of kludgy. I am wondering why now this approach was used.
>>>>>>
>>>>>> I have coded in the bus interface unit the capability to use either hashed or
>>>>>> hierarchical page tables. I was not so insane as to ignore the popularity of
>>>>>> hierarchical page tables. Can probably use both types at the same time.
>>>>>> Hashed tables are a better way to go for a large address
>>>>>> space in my opinion. They use less memory and may have faster look-ups.
>>>>>>
>>>>> Hashed tables could also make sense as an alternative to B-Trees.
>>>>>
>>>>> A B-Tree lookup is likely to be relatively complicated and expensive if
>>>>> compared to a hash lookup.
>>>>>
>>>>> It is possible that I could also combine a hashed top-level with
>>>>> 2-levels of nested page table (optionally collapsing down the upper 6
>>>>> levels into a single page).
>>>>>>
>>>>>>> OK. This gives more context for what I was confused about in my response.
>>>>>>>
>>>>>>> In my case, I use a software managed TLB, with hierarchical page tables,
>>>>>>> and no sort of intermediate stage representation.
>>>>>>>
>>>>>>> TLB miss interrupt occurs, handler walks the page table, and loads the
>>>>>>> resulting entry into the TLB. Basic design for the mechanism was
>>>>>>> more-or-less lifted from SH-4, but with the major difference of going
>>>>>>> from a 64-entry fully associative TLB to a 64x or 256x 4-way TLB.
>>>>>>>
>>>>>>> Arguably, a 6-way or 8-way TLB could be better, but more expensive.
>>>>>>>
>>>>>>>
>>>>>>> I had considered potentially using B-Trees for the upper levels (instead
>>>>>>> of hierarchical nesting) for the 96-bit mode, mostly because as-is this
>>>>>>> leads to a "very sparse" 8-level table, and a B-Tree could be more
>>>>>>> compact (with more traditional nested page tables for the lower 2 or 3
>>>>>>> levels).
>>>>>>>
>>>>>>> Though, there are other (potentially simpler/cheaper) ways to deal with
>>>>>>> this.
>>>>>>>
>>>>>>> This would not really effect the hardware in my case.
>>>>>>
>>>>>> Page tables are walked via hardware for Thor. Software only gets involved on a
>>>>>> page fault. While software managing the TLB is probably comparable in
>>>>>> performance and more flexible than hardware, TLB updates with hardware
>>>>>> walking are almost invisible to software.
>>>>>>
>>>>> This was one annoyance with BJX2 with the use of software managed TLB:
>>>>> Getting the software TLB miss interrupts stable was a major source of
>>>>> annoyance, as there were a whole lot of "little things" that can go
>>>>> wrong in these mechanisms.
>>>>>
>>>>> The L1 caches needing to be able to stop what they are doing, then
>>>>> quietly transfer control to an ISR, then resume normal operation without
>>>>> any adverse side effects, was a bit of a pain to debug.
>>>>>
>>>>> Some of the cases were infrequent, so one might to get Doom or Quake
>>>>> launched (in a simulation) before the crash happens, where firing these
>>>>> up inside a Verilog simulation is not exactly a fast process (the time
>>>>> issue is also a major reason why I am using LZ compressed binaries and
>>>>> WADs).
>>>>>
>>>>>
>>>>> As noted, the MMU is temporarily disabled within ISRs, partly because
>>>>> there is no way for an ISR to deal with a TLB miss (unlike some CPU
>>>>> architectures, ISRs are not re-entrant in my case).
>>>>>
>>>>>
>>>>> Similarly, the TLB miss handler would have to resolve page-fault
>>>>> scenarios (for virtual memory), without access to the main filesystem
>>>>> driver (it has to bypass the driver and access the SDCard directly), but
>>>>> does also mean that a page fault (to virtual memory) during some other
>>>>> SDcard IO request would be "potentially fatal".
>>>>>
>>>>> ...
>>>>>
>>>>> Nevermind whether or not it makes sense to put a pagefile on an SDcard,
>>>>> there is nowhere else to put it in this case.
>>>>>
>>>>>
>>>>> What to do with a page-fault to an unmapped address (eg: not backed to
>>>>> the page file) is currently an unresolved issue, but this is more a
>>>>> software thing than a hardware thing (in a real OS, would probably
>>>>> terminate the process or similar here; for the time being, the only real
>>>>> solution is to trigger a breakpoint).
>>>>>
>>>>> I may need to consider going and adding preemptive multitasking, as it
>>>>> looks like most "real OS" stuff would depend on this (cooperative
>>>>> multitasking doesn't scale particularly well, or leave any real good
>>>>> solution for "OK, this program has gone into an infinite loop or
>>>>> crashed", which basically leaves no real solution otherwise).
>>>>>
>>>>>
>>>>>
>>>>> The debugging was getting bad enough that I almost started to debate
>>>>> whether it would be worthwhile to add hardware page walking and not have
>>>>> to deal with this as much (but would still need a Page Fault handler, so
>>>>> it is not entirely avoidable).
>>>>>
>>>>>
>>>>> Sorta spent most of February beating on trying to debug stuff related to
>>>>> the L1 Caches and their interaction with the TLB Miss ISR, partly as
>>>>> "poking at something" in the L1 D$ set off an unholy bees nest of bugs.
>>>>>
>>>>> Likewise, much of my past debugging effort here was more focused on
>>>>> "quick and dirty" fixes which typically don't really resolve the
>>>>> underlying issues; but, trying to figure out "why" something is going
>>>>> wrong is often a lot harder than adding "make it work" hacks (and then
>>>>> notice the bug hasn't gone away, it has merely moved to a new location).
>>>>>
>>>>>
>>>>> I still can't say (with much confidence) that the TLB is now stable, but
>>>>> in my testing at the moment, it appears to work.
>>>>>
>>>>> Though, this debugging effort wasn't helped with Verilator also having
>>>>> the annoying behavior that adding and removing $display statements can
>>>>> sometimes change the behavior of the logic...
>>>>>> The TLB can be updated via software, its almost possible then to use a software
>>>>>> managed TLB. I guess if the TLB miss signal were fed into the interrupt system,
>>>>>> then maybe things could have the option of software management.
>>>>>>
>>>>> Could be.
>>>>>
>>>>> In my case, TLB miss triggers an interrupt, and the ISR deals with it.
>>>>>
>>>>> In the ISR, there is a 'TEA' register that holds the address of whatever
>>>>> caused the interrupt (target address for TLB miss), and an 'EXSR'
>>>>> register where the low bits encode which interrupt had occurred (as a
>>>>> 16-bit number divided into multiple bit fields).
>>>>>> I get the feeling there are too many choices available in Thor. Maybe a configuration
>>>>>> utility would be better than building in everything.
>>>>>>
>>>>>> Wondering what entity to place share counts, access keys, and privilege levels in. I
>>>>>> had these in the PTE but took them out. There is only one set required for each page
>>>>>> of memory.
>>>>>>
>>>>> I have this stuff in the PTEs/TLBEs.
>>>>>
>>>>> There is also an ACL Cache with an ACL Miss interrupt as well.
>>>>>
>>>>> The software side of things isn't really developed all that well yet, as
>>>>> most of the time thus far had gone more into trying to get things stable
>>>>> enough that the address translation and TLB miss interrupts weren't
>>>>> prone to crash stuff.
>>>>>
>>>>> I seem to have this part stable now, so now it should be possible to
>>>>> start making forward progress in this area.
>>>>
>>>> For Thor, the TLBE / PTE is 128-bits in size with no room to spare without having the
>>>> share-counts, etc. included in it. A larger PTE would really complicate things.
>>>> The TLBE value can be loaded or stored from / to the TLB using a single
>>>> 128-bit wide register. Anything wider means using a pair of registers. I could
>>>> maybe reduce the size of the page number fields a few bits. The fields are 48-bit.
>>>> I had the thought that since there only needs to be one copy of the information it
>>>> could be stored in another entity. But this means potentially adding to the memory
>>>> access time since another table is accessed after the physical page number is
>>>> known.
>>>>
>>> I also have 128-bit TLBE's, with fields for 48 bit addresses.
>>>
>>> Though the address fields are actually 36 bits, because one can cut off
>>> the low-order bits for the address (and because the "base form"
>>> addressing is 48-bit).
>>>
>>> While there is a 96-bit virtual-address mode, as noted, it works by
>>> using TLBE's in pairs to get more bits (at the cost of associativity).
>>> It mostly just extends the virtual address by 48 bits, with most of the
>>> rest of the bits being unused.
>>>
>>> I ended up implementing the 256-bit TLB load via a "double pumping"
>>> strategy (load both halves and then the MMU sticks them together).
>>>
>>>
>>>
>>> Would be a little bit harder with a full 64-bit virtual address space
>>> though.
>>>
>>>
>>> Copy/Paste
>>>
>>> TLBE:
>>> * (127:112) = VUGID / ACLID
>>> * (111: 76) = Virtual Address Page
>>> * ( 75: 64) = KR Access
>>> * ( 63: 48) = ASID
>>> * ( 47: 12) = Physical Address Page
>>> * ( 11: 0) = Page Control bits
>>>
>>> VUGID:
>>> * (15:10) = VGID / VUID(Swap)
>>> * ( 9: 0) = VUID / VGID(Swap)
>>> * May be interpreted as a PID or ACLID.
>>> ** In these cases, VUGID is treated as a single 16-bit unit.
>>>
>>> ASID:
>>> * (15:10) = AGID (0=Shared, Global)
>>> * ( 9: 0) = APID (0=Shared, Group)
>>>
>>> KR Access:
>>> * (11): Other X
>>> * (10): Other W
>>> * ( 9): Other R
>>> * ( 8): Group X (VUGID Check Only)
>>> * ( 7): Group W (VUGID Check Only)
>>> * ( 6): Group R (VUGID Check Only)
>>> * ( 5): User X (VUGID Check Only)
>>> * ( 4): User W (VUGID Check Only)
>>> * ( 3): User R (VUGID Check Only)
>>> * ( 2): VUGID Mode 2
>>> * ( 1): VUGID Mode 1
>>> * ( 0): VUGID Mode 0
>>>
>>> The VUGID Mode bits basically encode what sort of check to perform
>>> (VUGID or ACL style).
>>>
>>>
>>> These include, say:
>>> Disabled, VUGID, VUGID (Swap), or ACL.
>>>
>>> In VUGID mode, it does User/Group/Other checks, Swap switches
>>> User/Group. ACL: Use ACL (in TLBE), Exact 16-bit match only (ACLE).
>>>> I do not know much about ACL except that I suspect it can be quite complicated to do
>>>> with hardware. Is there ACL for individual pages of memory? There can be users or
>>>> groups of users and system processes all with separate access requirements. I suspect
>>>> there would be some limits engineered into ACL. For instance, limited numbers of
>>>> groups and users to keep the reference fields a reasonable size. Can you elaborate on
>>>> the ACL?
>>>>
>>> These are per-page checks.
>>>
>>>
>>> The ACL checking doesn't really add all that much cost or complexity
>>> compared with the TLB itself.
>>>
>>> Main costs are mostly the need to spend TLB bits on it, and the addition
>>> of a Keyring Register (KRR).
>>>
>>> Most of the harder parts of the ACL Check (namely, performing the ACL
>>> lookups) would be handled in software, so all the hardware really needs
>>> to do is check whether the keyring entries are in the ACL Cache, and (it
>>> they hit) what sort of access is allowed to the page in question (the
>>> access bits are OR'ed together).
>>>
>>>
>>> ACLE:
>>> * (63:44): Reserved / MBZ
>>> * (43:32): KR Access Mode
>>> * (31:16): KRR VUGID / APID
>>> * (15: 0): TLB VUGID / ACLID
>>>
>>> So, one basically does associative checks between the entries in the
>>> ACLE cache, and the entries in KRR.
>>>
>>> Because KRR changes relatively infrequently, the ACLE can be pretty
>>> small (ideally, it is larger than the KRR though, so 6 or 8 entries
>>> would be better than 4, but I am using 4 entries for now).
>>>
>>>
>>> The combined result of VUGID/ACL, and Basic checks are combined into a
>>> single bit-mask saying what access is allowed (in a given operating mode
>>> and at a given time). The L1 Caches will use these bits during memory
>>> access.
>>>> In my PTE eight bits are used for access control to indicate read/write/execute/cache
>>>> for either user or system. Maybe these bits would be better used to refer to an ACL. If
>>>> I were to trim four bits off the page number fields that means a 16-bit ACL reference
>>>> field could be supported.
>>> OK.
>>>
>>> I was able to fit both Basic checks and ACL/VUGID checks.
>>>
>>> The Basic checks are used ye olde User/Supervisor and similar (R/W/X)
>>> and a few bits which effects caching or combinations which are used
>>> encode alternate special behaviors (Privledged-Execute and
>>> Execute-As-ACLID).
>>>
>>>
>>> ...
>> I cannot quite get my head around how the ACL comes into play, but I have provided for
>> an ACL in TLBE.
>>
>> I decided to use 160-bit PTEs. I really wanted to support at least a full 76-bit virtual
>> address translation and things just would not fit into a 128-bit PTE. For a hashed table
>> this works just fine. Six are packed into 1024 bit with 64-bits to spare. However, for a
>> hierarchical table 160-bit PTEs are bit nightmarish.
> <
> My 66000 has 63-bit virtual address translates into a 66-bit universal address {hint:
> MMI/O, Configuration, and ROM are not part of the 64-bit DRAM address space}.
> Yet I fit each PTP and PTE into one (1) 64-bit container, and I can skip levels in the
> tables so small applications (like cat) can fit in 23-bit virtual address and have 1
> layer of PTPs and 1 layer of PTEs. I even have bits left over......
> <
> What is driving force for 76-bit virtual address ?
> <
> Secondly, even with a 76-bit virtual address, you should only be needing 90-bits±
> for a PTE. Seems to me that the structure of you PTE needs rethought.
> <
> So, what is the structure of your PTE ?
> <


Re: Packing Hash Tables

<FAqXJ.156801$Gojc.137903@fx99.iad>

 by: EricP - Sun, 13 Mar 2022 18:18 UTC

MitchAlsup wrote:
> On Saturday, March 12, 2022 at 9:36:59 PM UTC-6, robf...@gmail.com wrote:
>> I am not quite sure why 76 bits, but my gut tells me that may not be enough. Emulating quantum
>> and molecular computers? IDK. It does allow the memory allocator to just keep travelling forwards.
>> The physical address needs to be stored in the PTE too. That and the virtual address take over 100
>> bits. Add in the ASID and a few access control bits and there is no room left for the future.
>>> So, what is the structure of your PTE ?
>> The PTE contains mainly the address translation.
>> Structure contains:
>> PPN, (52 bits)
>> VPN, (64 bits)
>> ASID (12 bits)
>> Global translation bit
>> Valid bit
>> Dirty bit
>> Page size bit
>> Read / Write / Execute / Cacheble bits for system and user modes (8 bits)
>> Bounce count (6 bits) (for inverted page table defrag)
>> 15 bits unused.
> <
> My 66000 Page Pointer {either a PTP or a PTE}
> <
> PPN<63..13>
> RWE<12..10>
> LVL<9..7>
> Where<6..5> {L1, L2, allocated, unCacheable}
> used<2>
> mod<1>
> present<0>

You need 3 bits explicitly reserved to OS in a Present pte.
The OS will store status tracking info in those bits.
(In a Not-Present pte all of the other 63 bits are owned by OS
and it stashes all kinds of info in there.)

> <
> LVL = 001 means PTE, otherwise PTP
> LVL on the pointer to PTE indicates size of the page
> <
> A PTE to an unCacheable page uses where<6..5> as the space of access
> ......{DRAM, MMI/O, Configuration, ROM} all which can be 64-bits.
> <
> ASID comes from a thread control register
> My 66000 supports translation indirection so a service provider can access
> into the service caller--this gets rid of any need for Global/Application or s/u
> <
> VPN is used to access PPN so it is stored in TLB but is not part of page pointing.
> Pages larger than smallest, use the PPN field to enumerate the number of pages
> this super page pointer points at. So, if you have a 8MB translation you can set
> it up to contain 1MB if that is what is really needed.
> <
> There are 2 tables of translation, one used by Guest OS, the other used by
> HyperVisor as a nested page table.
> <
>>> Why is the access key in the PTE?--it should be a part of the page itself.
> <
> I don't use a keyed protection scheme. I could work one in, but I would associate
> each page with a key, and place the key in the thread control registers.
> <
>> The access key is not in the PTE. It does appear the TLB entry though along with an access
>> counter. I have another table called ART that contains the access key, share count, privilege
>> level and few other bits. Bits for which there is only a single copy per page. The ART is managed
>> by software. Bits from ART are loaded automatically into the TLBE when needed.
> <
>>> Seems to me you are putting useful things in the wrong places.
> <
>> I have been studying MMUs on the web with the hopes of sorting things out.. Most of them do
>> not include as much information in the PTE / TLBE or use smaller fields. To me it looks like
>> some are legacy designs, inheriting from 32-bit perhaps. I have not found much mentioned
>> about what to do with the per page information.

The hierarchical page tables also have an array of structs indexed by
Physical Page Number (PPN) that stores the OS defined status info.
In Linux this is called "struct page".

Mel Gorman's 2007 book "Understanding the Linux Virtual Memory Manager"
is open access, available on-line in HTML or as PDF's.
https://www.kernel.org/doc/gorman/html/understand
https://www.kernel.org/doc/gorman/pdf

Description of "struct page" descriptor:

2.4 Pages
https://www.kernel.org/doc/gorman/html/understand/understand005.html#toc15
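
For illustration, a minimal sketch of such a per-physical-page descriptor array, indexed by PPN.
The field names here are assumptions for the sake of the example, not Linux's actual struct page:

#include <stdint.h>
#include <stdlib.h>

struct page_info {
    uint32_t flags;       /* OS status bits: locked, dirty, referenced, ...  */
    uint32_t refcount;    /* number of mappings referencing this frame       */
    uint32_t mapping_id;  /* which file / anonymous object backs the frame   */
    uint32_t lru_next;    /* LRU list links, as indices into the same array  */
    uint32_t lru_prev;
};

static struct page_info *page_array;   /* one entry per physical page frame */

static void init_page_array(uint64_t nframes) {
    page_array = calloc(nframes, sizeof *page_array);
}

static struct page_info *ppn_to_page(uint64_t ppn) {
    return &page_array[ppn];
}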

Re: Packing Hash Tables

<t0li73$t01$1@dont-email.me>

 by: BGB - Sun, 13 Mar 2022 19:55 UTC

On 3/13/2022 1:18 PM, EricP wrote:
> MitchAlsup wrote:
>> On Saturday, March 12, 2022 at 9:36:59 PM UTC-6, robf...@gmail.com wrote:
>>> I am not quite sure why 76 bits, but my gut tells me that may not be
>>> enough. Emulating quantum and molecular computers? IDK. It does allow
>>> the memory allocator to just keep travelling forwards. The physical
>>> address needs to be stored in the PTE too. That and the virtual
>>> address take over 100 bits. Add in the ASID and a few access control
>>> bits and there is no room left for the future.
>>>> So, what is the structure of your PTE ?
>>> The PTE contains mainly the address translation. Structure contains:
>>> PPN, (52 bits) VPN, (64 bits) ASID (12 bits) Global translation bit
>>> Valid bit Dirty bit Page size bit Read / Write / Execute / Cacheble
>>> bits for system and user modes (8 bits) Bounce count (6 bits) (for
>>> inverted page table defrag) 15 bits unused.
>> <
>> My 66000 Page Pointer {either a PTP or a PTE}
>> <
>> PPN<63..13>
>> RWE<12..10>
>> LVL<9..7>
>> Where<6..5>     {L1, L2, allocated, unCacheable}
>> used<2>
>> mod<1>
>> present<0>
>
> You need 3 bits explicitly reserved to OS in a Present pte.
> The OS will store status tracking info in those bits.
> (In a Not-Present pte all of the other 63 bits are owned by OS
> and its shashes all kinds of info in there.)
>

I have 4 "User" bits in my case, mostly intended to be used for
implementing virtual memory and similar.

For non-present pages, one may want to encode whether or not the address
bits encode an index into the pagefile, ...

The low 2 bits encode a page mode:
00: Not Present
01: Present, Exclusive
10: High Half (Of an XTLB 2-part TLBE)
11: Present, Shared
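
A trivial sketch of that encoding (values as listed above; the identifier names are illustrative):

#include <stdint.h>

enum page_mode {
    PG_NOT_PRESENT    = 0,  /* 00 */
    PG_PRESENT_EXCL   = 1,  /* 01: Present, Exclusive                */
    PG_HIGH_HALF      = 2,  /* 10: high half of an XTLB 2-part TLBE  */
    PG_PRESENT_SHARED = 3   /* 11: Present, Shared                   */
};

static inline enum page_mode pte_page_mode(uint64_t pte) {
    return (enum page_mode)(pte & 3);   /* low 2 bits of the PTE */
}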

>> <
>> LVL = 001 means PTE, otherwise PTP
>> LVL on the pointer to PTE indicates size of the page
>> <
>> A PTE to an unCacheable page uses where<6..5> as the space of access
>> ......{DRAM, MMI/O, Configuration, ROM} all which can be 64-bits.
>> <
>> ASID comes from a thread control register
>> My 66000 supports translation indirection so a service provider can
>> access
>> into the service caller--this gets rid of any need for
>> Global/Application or s/u
>> <
>> VPN is used to access PPN so it is stored in TLB but is not part of
>> page pointing.
>> Pages larger than smallest, use the PPN field to enumerate the number
>> of pages
>> this super page pointer points at. So, if you have a 8MB translation
>> you can set
>> it up to contain 1MB if that is what is really needed.
>> <
>> There are 2 tables of translation, one used by Guest OS, the other
>> used by
>> HyperVisor as a nested page table.
>> <
>>>> Why is the access key in the PTE?--it should be a part of the page
>>>> itself.
>> <
>> I don't use a keyed protection scheme. I could work one in, but I
>> would associate
>> each page with a key, and place the key in the thread control registers.
>> <
>>> The access key is not in the PTE. It does appear the TLB entry though
>>> along with an access counter. I have another table called ART that
>>> contains the access key, share count, privilege level and few other
>>> bits. Bits for which there is only a single copy per page. The ART is
>>> managed by software. Bits from ART are loaded automatically into the
>>> TLBE when needed.
>> <
>>>> Seems to me you are putting useful things in the wrong places.
>> <
>>> I have been studying MMUs on the web with the hopes of sorting things
>>> out.. Most of them do not include as much information in the PTE /
>>> TLBE or use smaller fields. To me it looks like some are legacy
>>> designs, inheriting from 32-bit perhaps. I have not found much
>>> mentioned about what to do with the per page information.
>
> The hierarchical page tables also have an array of structs indexed by
> Physical Page Number (PPN) that stores the OS defined status info.
> In Linux this called "struct page".
>
> Mel Gorman's 2007 book "Understanding the Linux Virtual Memory Manager"
> is open access, available on-line in HTML or as PDF's.
> https://www.kernel.org/doc/gorman/html/understand
> https://www.kernel.org/doc/gorman/pdf
>
> Description of "struct page" descriptor:
>
> 2.4 Pages
> https://www.kernel.org/doc/gorman/html/understand/understand005.html#toc15
>
>

Hmm, I had merely been tracking (low-level) page status with bitmaps and
rovers...

Things like mmaps were handled instead using sorted arrays assigned to
the virtual address space, which would represent the mmaps in terms of
spans.

Things like pagefile have a struct for each page in the region that can
be mapped into the pagefile.

This struct appears superficially similar to the one in the linked page,
but differs some in that it uses index numbers rather than pointers for
things like hash chaining and LRU.
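
A rough sketch of what such an index-linked per-page struct could look like (field names and
widths are guesses for illustration, not the actual layout):

#include <stdint.h>

#define IDX_NONE 0xFFFFFFFFu    /* "null" value for index links */

struct pf_page {
    uint32_t hash_next;     /* next entry in the same hash chain, by index    */
    uint32_t lru_next;      /* LRU links, also by index rather than pointer   */
    uint32_t lru_prev;
    uint32_t pagefile_blk;  /* backing block in the pagefile, if swapped out  */
    uint16_t flags;         /* present / dirty / pinned, etc.                 */
    uint16_t refcount;
};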

It is possible that another difference is that I am currently only
reserving a fairly modest part of the address space for use by "pagefile
backed memory", whereas Linux throws most of its RAM at this (operating
primarily out of virtual memory).

Though, this latter point may change if I switch programs over to using
virtual memory as their primary backing memory.

I may be reaching the point where I can start experimenting with moving
the userland over to pagefile backed memory. If this works, I could
probably get by with throwing, say, 50 or 75% of the RAM space at this
problem (the remaining RAM being mostly non-pagefile memory, either
physically-mapped or direct-remapping, *).

*: Where the physical address is remapped to a new location, but it
functions more like a conventional allocation, and is not backed by the
pagefile.

The current virtual memory layout is something like:
0000_00000000..0000_00FFFFFF: NULL Region (Unmapped)
0000_01000000..0000_3FFFFFFF: Physical Map (DRAM and similar).
0000_40000000..0000_BFFFFFFF: Low Virtual, Direct Mappings
0000_C0000000..0000_EFFFFFFF: Unused for now
0000_F0000000..0000_FFFFFFFF: Low MMIO Range
0001_00000000..7FFF_FFFFFFFF: High Virtual, Pagefile Backed

Not currently using 96-bit addressing.

So, basic idea being for now that everything using pagefile backed
memory will go into virtual addresses above the 4GB mark.

This would need to be different to support programs built with 32-bit
pointers, but even with a pagefile, these would be reasonably cramped in
a shared address space.

Say:
0000_40000000..0000_7FFFFFFF: Low Virtual, Direct Mappings
0000_80000000..0000_BFFFFFFF: Low Virtual, Pagefile Backed

With all of the processes using 32-bit pointers mostly needing to fit
within a shared 1GB or 2GB of address space, and thus not much better
than in a NOMMU situation.

....

Re: Packing Hash Tables

<f90eee35-433d-47a6-9d9b-bbbc56e80a07n@googlegroups.com>

 by: MitchAlsup - Sun, 13 Mar 2022 20:03 UTC

On Sunday, March 13, 2022 at 1:19:23 PM UTC-5, EricP wrote:
> MitchAlsup wrote:

> The hierarchical page tables also have an array of structs indexed by
> Physical Page Number (PPN) that stores the OS defined status info.
> In Linux this called "struct page".
>
> Mel Gorman's 2007 book "Understanding the Linux Virtual Memory Manager"
> is open access, available on-line in HTML or as PDF's.
> https://www.kernel.org/doc/gorman/html/understand
> https://www.kernel.org/doc/gorman/pdf
>
> Description of "struct page" descriptor:
>
> 2.4 Pages
> https://www.kernel.org/doc/gorman/html/understand/understand005.html#toc15
<
Is there something equivalent to this with
a) 64-bit virtual-physical addresses
b) nested page tables

Re: Packing Hash Tables

<691e9232-da74-43c2-9b07-e6dd2a9fd406n@googlegroups.com>

 by: MitchAlsup - Sun, 13 Mar 2022 20:09 UTC

On Sunday, March 13, 2022 at 1:19:23 PM UTC-5, EricP wrote:
> MitchAlsup wrote:
> > On Saturday, March 12, 2022 at 9:36:59 PM UTC-6, robf...@gmail.com wrote:
> >> I am not quite sure why 76 bits, but my gut tells me that may not be enough. Emulating quantum
> >> and molecular computers? IDK. It does allow the memory allocator to just keep travelling forwards.
> >> The physical address needs to be stored in the PTE too. That and the virtual address take over 100
> >> bits. Add in the ASID and a few access control bits and there is no room left for the future.
> >>> So, what is the structure of your PTE ?
> >> The PTE contains mainly the address translation.
> >> Structure contains:
> >> PPN, (52 bits)
> >> VPN, (64 bits)
> >> ASID (12 bits)
> >> Global translation bit
> >> Valid bit
> >> Dirty bit
> >> Page size bit
> >> Read / Write / Execute / Cacheble bits for system and user modes (8 bits)
> >> Bounce count (6 bits) (for inverted page table defrag)
> >> 15 bits unused.
> > <
> > My 66000 Page Pointer {either a PTP or a PTE}
> > <
> > PPN<63..13>
> > RWE<12..10>
> > LVL<9..7>
> > Where<6..5> {L1, L2, allocated, unCacheable}
> > used<2>
> > mod<1>
> > present<0>
<
> You need 3 bits explicitly reserved to OS in a Present pte.
<
What are these 3-bits used for?
<
> The OS will store status tracking info in those bits.
<
What kind of status/tracking?
<
> (In a Not-Present pte all of the other 63 bits are owned by OS
> and its shashes all kinds of info in there.)
<
Sure
<
> > <
> > LVL = 001 means PTE, otherwise PTP
> > LVL on the pointer to PTE indicates size of the page
> > <
> > A PTE to an unCacheable page uses where<6..5> as the space of access
> > ......{DRAM, MMI/O, Configuration, ROM} all which can be 64-bits.
> > <
> > ASID comes from a thread control register
> > My 66000 supports translation indirection so a service provider can access
> > into the service caller--this gets rid of any need for Global/Application or s/u
> > <
> > VPN is used to access PPN so it is stored in TLB but is not part of page pointing.
> > Pages larger than smallest, use the PPN field to enumerate the number of pages
> > this super page pointer points at. So, if you have a 8MB translation you can set
> > it up to contain 1MB if that is what is really needed.
> > <
> > There are 2 tables of translation, one used by Guest OS, the other used by
> > HyperVisor as a nested page table.
> > <
> >>> Why is the access key in the PTE?--it should be a part of the page itself.
> > <
> > I don't use a keyed protection scheme. I could work one in, but I would associate
> > each page with a key, and place the key in the thread control registers.
> > <
> >> The access key is not in the PTE. It does appear the TLB entry though along with an access
> >> counter. I have another table called ART that contains the access key, share count, privilege
> >> level and few other bits. Bits for which there is only a single copy per page. The ART is managed
> >> by software. Bits from ART are loaded automatically into the TLBE when needed.
> > <
> >>> Seems to me you are putting useful things in the wrong places.
> > <
> >> I have been studying MMUs on the web with the hopes of sorting things out.. Most of them do
> >> not include as much information in the PTE / TLBE or use smaller fields. To me it looks like
> >> some are legacy designs, inheriting from 32-bit perhaps. I have not found much mentioned
> >> about what to do with the per page information.
> The hierarchical page tables also have an array of structs indexed by
> Physical Page Number (PPN) that stores the OS defined status info.
> In Linux this called "struct page".
>
> Mel Gorman's 2007 book "Understanding the Linux Virtual Memory Manager"
> is open access, available on-line in HTML or as PDF's.
> https://www.kernel.org/doc/gorman/html/understand
> https://www.kernel.org/doc/gorman/pdf
>
> Description of "struct page" descriptor:
>
> 2.4 Pages
> https://www.kernel.org/doc/gorman/html/understand/understand005.html#toc15

Re: Packing Hash Tables

<dc768c6b-9ab2-45ed-9236-6543b645c051n@googlegroups.com>

 by: robf...@gmail.com - Sun, 13 Mar 2022 22:19 UTC

On Sunday, March 13, 2022 at 4:09:16 PM UTC-4, MitchAlsup wrote:
> On Sunday, March 13, 2022 at 1:19:23 PM UTC-5, EricP wrote:
> > MitchAlsup wrote:
> > > On Saturday, March 12, 2022 at 9:36:59 PM UTC-6, robf...@gmail.com wrote:
> > >> I am not quite sure why 76 bits, but my gut tells me that may not be enough. Emulating quantum
> > >> and molecular computers? IDK. It does allow the memory allocator to just keep travelling forwards.
> > >> The physical address needs to be stored in the PTE too. That and the virtual address take over 100
> > >> bits. Add in the ASID and a few access control bits and there is no room left for the future.
> > >>> So, what is the structure of your PTE ?
> > >> The PTE contains mainly the address translation.
> > >> Structure contains:
> > >> PPN, (52 bits)
> > >> VPN, (64 bits)
> > >> ASID (12 bits)
> > >> Global translation bit
> > >> Valid bit
> > >> Dirty bit
> > >> Page size bit
> > >> Read / Write / Execute / Cacheble bits for system and user modes (8 bits)
> > >> Bounce count (6 bits) (for inverted page table defrag)
> > >> 15 bits unused.
> > > <
> > > My 66000 Page Pointer {either a PTP or a PTE}
> > > <
> > > PPN<63..13>
> > > RWE<12..10>
> > > LVL<9..7>
> > > Where<6..5> {L1, L2, allocated, unCacheable}
> > > used<2>
> > > mod<1>
> > > present<0>
> <
> > You need 3 bits explicitly reserved to OS in a Present pte.
> <
> What are these 3-bits used for?
> <
> > The OS will store status tracking info in those bits.
> <
> What kind of status/tracking?
> <
> > (In a Not-Present pte all of the other 63 bits are owned by OS
> > and its shashes all kinds of info in there.)
> <
> Sure
> <
> > > <
> > > LVL = 001 means PTE, otherwise PTP
> > > LVL on the pointer to PTE indicates size of the page
> > > <
> > > A PTE to an unCacheable page uses where<6..5> as the space of access
> > > ......{DRAM, MMI/O, Configuration, ROM} all which can be 64-bits.
> > > <
> > > ASID comes from a thread control register
> > > My 66000 supports translation indirection so a service provider can access
> > > into the service caller--this gets rid of any need for Global/Application or s/u
> > > <
> > > VPN is used to access PPN so it is stored in TLB but is not part of page pointing.
> > > Pages larger than smallest, use the PPN field to enumerate the number of pages
> > > this super page pointer points at. So, if you have a 8MB translation you can set
> > > it up to contain 1MB if that is what is really needed.
> > > <
> > > There are 2 tables of translation, one used by Guest OS, the other used by
> > > HyperVisor as a nested page table.
> > > <
> > >>> Why is the access key in the PTE?--it should be a part of the page itself.
> > > <
> > > I don't use a keyed protection scheme. I could work one in, but I would associate
> > > each page with a key, and place the key in the thread control registers.
> > > <
> > >> The access key is not in the PTE. It does appear the TLB entry though along with an access
> > >> counter. I have another table called ART that contains the access key, share count, privilege
> > >> level and few other bits. Bits for which there is only a single copy per page. The ART is managed
> > >> by software. Bits from ART are loaded automatically into the TLBE when needed.
> > > <
> > >>> Seems to me you are putting useful things in the wrong places.
> > > <
> > >> I have been studying MMUs on the web with the hopes of sorting things out.. Most of them do
> > >> not include as much information in the PTE / TLBE or use smaller fields. To me it looks like
> > >> some are legacy designs, inheriting from 32-bit perhaps. I have not found much mentioned
> > >> about what to do with the per page information.
> > The hierarchical page tables also have an array of structs indexed by
> > Physical Page Number (PPN) that stores the OS defined status info.
> > In Linux this called "struct page".
> >
> > Mel Gorman's 2007 book "Understanding the Linux Virtual Memory Manager"
> > is open access, available on-line in HTML or as PDF's.
> > https://www.kernel.org/doc/gorman/html/understand
> > https://www.kernel.org/doc/gorman/pdf
> >
> > Description of "struct page" descriptor:
> >
> > 2.4 Pages
> > https://www.kernel.org/doc/gorman/html/understand/understand005.html#toc15

I had forgotten that hierarchical page tables do not need to store the virtual address / ASID.
Without the VPN things will easily fit into 96 bits. Still stuck with the issue of non-power-of-two
addressing unless the PTE size is increased to 128 bits or bits are trimmed from the PPN.

Storing the level in the entry seems like a good idea. With extra bits available in a page, the
level could be stored there. All PTEs in the page would have the same level. I assume the
level is for debug / integrity. Could also use the bounce count field to store the level.

I had sort of wanted to keep the same PTE format for both types of tables, but I see now it
does not make sense to do this. It would waste too much space in a hierarchical table. PDEs
are only 64-bit since they only contain a reference to a page and a couple of other bits.

>How big is you int, your long, and your pointer ?

An int is 64 bits. A long is 128 bits. And a short is 32 bits. Pointers are 64-bit. The pointer size
may be adjustable. The number of address bits in use can be specified at compile time.

I added the idea of regions to the MMU, which are related to the PMA checks. There may be up
to eight regions defined. Currently only four are used: ROM, IO, scratchpad, and RAM. The
‘where’ field is computed based on the physical address. The physical address is looked up in the
region table, which returns the PMA. The PMA includes memory type and accessibility.
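
A rough sketch of that lookup, assuming a small base/limit table searched linearly (names,
widths, and the search strategy are assumptions for illustration):

#include <stdint.h>

struct region {
    uint64_t base;    /* first physical address covered by the region        */
    uint64_t limit;   /* last physical address covered by the region         */
    uint32_t pma;     /* memory type and accessibility bits (the "PMA")      */
};

static struct region region_table[8];   /* e.g. ROM, IO, scratchpad, RAM */

static int region_lookup(uint64_t paddr, uint32_t *pma_out) {
    for (int i = 0; i < 8; i++) {
        if (paddr >= region_table[i].base && paddr <= region_table[i].limit) {
            *pma_out = region_table[i].pma;   /* hit: return this region's PMA */
            return 1;
        }
    }
    return 0;   /* no region claims this physical address */
}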

Re: Packing Hash Tables

<GuuXJ.159207$aT3.85936@fx09.iad>

 by: EricP - Sun, 13 Mar 2022 22:42 UTC

MitchAlsup wrote:
> On Sunday, March 13, 2022 at 1:19:23 PM UTC-5, EricP wrote:
>> MitchAlsup wrote:
>>> <
>>> My 66000 Page Pointer {either a PTP or a PTE}
>>> <
>>> PPN<63..13>
>>> RWE<12..10>
>>> LVL<9..7>
>>> Where<6..5> {L1, L2, allocated, unCacheable}
>>> used<2>
>>> mod<1>
>>> present<0>
> <
>> You need 3 bits explicitly reserved to OS in a Present pte.
> <
> What are these 3-bits used for?
> <
>> The OS will store status tracking info in those bits.
> <
> What kind of status/tracking?

Intel, in the 32-bit x86 PTE, defined bits 11:9 as ignored.
The x64 4kB 64-bit PTE defines bits 11:9 and 58:52 as ignored.
There are also PTE bits that are defined as Reserved, Must Be Zero.

Linux on x64 defines 4 OS bits, of which 3 are used:
https://01.org/blogs/dave/2020/linux-consumption-x86-page-table-bits

Windows x64 has CopyOnWrite, Prototype and Write in ignored bits.
Write is used for handling certain race conditions in managing the Dirty bit.

Re: Packing Hash Tables

<d28b11ac-d42c-470b-9335-a31014654dc6n@googlegroups.com>

 by: MitchAlsup - Sun, 13 Mar 2022 22:49 UTC

On Sunday, March 13, 2022 at 5:19:40 PM UTC-5, robf...@gmail.com wrote:
> On Sunday, March 13, 2022 at 4:09:16 PM UTC-4, MitchAlsup wrote:
> > On Sunday, March 13, 2022 at 1:19:23 PM UTC-5, EricP wrote:
> > > MitchAlsup wrote:

> > > The hierarchical page tables also have an array of structs indexed by
> > > Physical Page Number (PPN) that stores the OS defined status info.
> > > In Linux this called "struct page".
> > >
> > > Mel Gorman's 2007 book "Understanding the Linux Virtual Memory Manager"
> > > is open access, available on-line in HTML or as PDF's.
> > > https://www.kernel.org/doc/gorman/html/understand
> > > https://www.kernel.org/doc/gorman/pdf
> > >
> > > Description of "struct page" descriptor:
> > >
> > > 2.4 Pages
> > > https://www.kernel.org/doc/gorman/html/understand/understand005.html#toc15
> I had forgotten the hierarchical page tables do not need to store the virtual address / ASID.
> Without the VPN things will easily fit into 96 bits. Still stuck with the issue of non-power of
> two addressing unless the PTE size is increased to 128-bit or bits are trimmed from the PPN.
>
> Storing the level in the entry seems like a good idea. With extra bits available in a page, the
> level could be stored there. All PTEs in the page would have the same level. I assume the
> level is for debug / integrity. Could also use the bounce count field to store the level.
<
Not quite::
<
LVL of the pointer to the page of translation entries indicates the address space covered by each
entry. But you could have LVL = 011 indicating an 8GB page; some entries could point at an
8MB page (LVL = 010) and some could point at an 8KB page (LVL = 001).
<
The important thing an LVL brings is skipping of levels in the translation hierarchy, saving the
number of accesses used to refill the TLB, and the number of pages used for translations.
<
A small program like "cat" can be compiled with a 23-bit address space and use 1 page of translation
overhead; typically with 5 entries: .text, .bss, .data, .stack, .heap.
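<
For the coverage arithmetic implied above (assuming 8 KB pages and 1024 eight-byte entries per page
of translations), each entry at LVL=001 maps 8 KB, at LVL=010 8 MB, and at LVL=011 8 GB, so one page
of LVL=001 entries spans 8 MB and a 23-bit address space does indeed fit in a single page of
translations. A small sketch, purely for illustration:

#include <stdint.h>
#include <stdio.h>

/* Bytes mapped by one entry at a given LVL, assuming 8 KB pages and
 * 1024 entries per translation page. */
static uint64_t bytes_per_entry(unsigned lvl)
{
    return 8192ull << (10 * (lvl - 1));   /* LVL=1: 8 KB, LVL=2: 8 MB, LVL=3: 8 GB */
}

int main(void)
{
    for (unsigned lvl = 1; lvl <= 3; lvl++)
        printf("LVL=%u: %llu bytes per entry\n",
               lvl, (unsigned long long)bytes_per_entry(lvl));
    return 0;
}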
>
> I had sort of wanted to keep the same PTE format for both types of tables, but I see now it
> does not make sense to do this. It would waste too much space in a hierarchical table. PDEs
> are only 64-bit since they only contain a reference to a page and a couple of other bits.
> >How big is your int, your long, and your pointer ?
> An int is 64-bits. A long is 128-bits. And a short is 32-bits. Pointers are 64-bit.
<
How does one access a 16-bit container?
<
> The pointer size
> may be adjustable. Number of address bits in use can be specified for the compile.
>
> I added the idea of regions to the MMU which are related to the PMA checks. There may be up
> to eight regions defined. Currently only four are used, ROM, IO, scratchpad, and RAM. The
<
I stole essentially the same thing from already available fields. When a page is unCacheable
(C=00), the used & modified bits indicate the space {00->DRAM, 01->MMI/O, 10->configuration,
11->ROM}; each of these spaces can be 64 bits in size.
<
In addition, we know:: configuration space is unCacheable and strongly ordered; MMI/O is
strongly ordered per device, and sequentially consistent; ROM is cacheable, unordered,
without coherence traffic, and not writeable.
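<
A minimal decode of that encoding, using the used<2> and mod<1> bit positions from the PTE layout
above (the rest is assumed for illustration):

#include <stdint.h>

enum space { SPACE_DRAM, SPACE_MMIO, SPACE_CONFIG, SPACE_ROM };

/* For a PTE already known to be unCacheable (C=00), the used and
 * modified bits select the address space: 00->DRAM, 01->MMI/O,
 * 10->configuration, 11->ROM. */
static inline enum space uncacheable_space(uint64_t pte)
{
    unsigned used = (pte >> 2) & 1;   /* used<2> */
    unsigned mod  = (pte >> 1) & 1;   /* mod<1>  */
    return (enum space)((used << 1) | mod);
}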
<
> ‘where’ field is computed based on the physical address. The physical address looks up a
> region in the region table and returns the PMA. The PMA includes memory type and accessibility.
<
PMA = Post Male Anxiety ?

Re: Packing Hash Tables

https://www.novabbs.com/devel/article-flat.php?id=24239&group=comp.arch#24239
 by: robf...@gmail.com - Sun, 13 Mar 2022 23:10 UTC

On Sunday, March 13, 2022 at 6:49:36 PM UTC-4, MitchAlsup wrote:

>The important thing an LVL brings is skipping of levels in the translation hierarchy, saving the
>number of accesses used to refill the TLB, and the number of pages used for translations.

I just have the number of table levels specified in the PTBR. The PTBR points to the highest level PD needed.

>How does one access a 16-bit container?

By specifying int:16 as the type. The compiler supports size specs for ints of 8, 16, 32, 64, and 128 bits.
Bytes are also accessible as ‘byte’. I think chars are int:16. I think there is also a bit type to support bit
arrays, but this is experimental.
int:16 my_index;

> PMA = Post Male Anxiety ?

PMA = Physical Memory Attributes, stolen from RISCV docs.
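
To make the region/PMA lookup described above concrete, here is a small sketch; the field names,
the struct layout, and the linear search are assumptions for illustration, not the actual
implementation:

#include <stdint.h>
#include <stddef.h>

#define MAX_REGIONS 8            /* up to eight regions, per the description */

typedef struct {
    uint64_t base;               /* first physical address in the region */
    uint64_t limit;              /* last physical address in the region  */
    uint32_t pma;                /* memory type + accessibility bits     */
} region_t;

static region_t region_table[MAX_REGIONS];  /* ROM, IO, scratchpad, RAM, ... */
static size_t   region_count;

/* Look up the PMA for a physical address; returns 0 if no region matches. */
static uint32_t pma_lookup(uint64_t pa)
{
    for (size_t i = 0; i < region_count; i++)
        if (pa >= region_table[i].base && pa <= region_table[i].limit)
            return region_table[i].pma;
    return 0;
}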

Re: Packing Hash Tables

https://www.novabbs.com/devel/article-flat.php?id=24240&group=comp.arch#24240
 by: robf...@gmail.com - Mon, 14 Mar 2022 12:39 UTC

On Sunday, March 13, 2022 at 7:10:48 PM UTC-4, robf...@gmail.com wrote:
> On Sunday, March 13, 2022 at 6:49:36 PM UTC-4, MitchAlsup wrote:
> > The important thing an LVL brings is skipping of levels in the translation hierarchy, saving the
> > number of accesses used to refill the TLB, and the number of pages used for translations.
> I just have the number of table levels specified in the PTBR. The PTBR points to the highest level PD needed.

Borrowed the level idea for my PT. I was just wondering what happens when a page at the same LVL as the
current one is pointed to? It seems like it could form a linked list or a loop, or is a fault generated? At the
moment I have the code forcing the LVL down a level if it is the same as or higher than the current LVL.

Re: Packing Hash Tables

https://www.novabbs.com/devel/article-flat.php?id=24247&group=comp.arch#24247
 by: EricP - Mon, 14 Mar 2022 14:33 UTC

robf...@gmail.com wrote:
> On Sunday, March 13, 2022 at 7:10:48 PM UTC-4, robf...@gmail.com wrote:
>> On Sunday, March 13, 2022 at 6:49:36 PM UTC-4, MitchAlsup wrote:
>>> The important thing an LVL brings is skipping of levels in the translation hierarchy, saving the
>>> number of accesses used to refill the TLB, and the number of pages used for translations.
>> I just have the number of table levels specified in the PTBR. The PTBR points to the highest level PD needed.
>
> Borrowed the level idea for my PT. I was just wondering what happens when a page at the same LVL as the
> current is pointed to? Seems like it could form a linked list or a loop, or is a fault generated? At the moment
> I have the code forcing the LVL down a level if it is the same or higher than the current LVL.

The OS is responsible for not creating page table loops.
That said, the HW table walker is hard-wired to walk only N levels
and then quit, so it shouldn't break the hardware.
What this would accomplish is a variation of what I describe below,
albeit a somewhat useless variant.

One thing to note about skipping PT levels is that the OS can reuse
interior page table levels to access outer levels as data.

One question that arises is exactly how an OS edits the page tables.
That is, the OS needs to be able to read and write PTEs.
In order to do that, those page table pages (PTPs) need to be
inside the current virtual space, which means there need to be
level-1 PTEs that point to those PTPs as level-0 data.

It turns out there is already just such a set of PTEs:
the level-2 PTPs point to the level-1 PTPs.
So the OS can reuse a PTL-2 PTP as a PTL-1 PTP,
making the level-1 PTPs it points to accessible as level-0 data.

PTL3 -> PTL2a -> PTL1 -> Data
     |-> PTL2b -> PTL1

By adding a PTE to level-2 that points to a different level-2 page,
it makes that page act as a level-1 page, giving read/write access
as data to the level-1 pages that it points to.
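
A minimal sketch of installing such an alias entry; the slot number, the flag bits, and make_pte()
are assumptions for illustration, not any particular OS's code:

#include <stdint.h>

#define PTE_PRESENT  (1ull << 0)
#define PTE_RW       (1ull << 1)          /* assumed writable bit */
#define ALIAS_SLOT   1023                 /* assumed: last slot of the level-2 page */

/* Build a PTE from a page-aligned physical address plus flag bits
 * (a stand-in for whatever the real PTE format requires; 8 KB pages). */
static inline uint64_t make_pte(uint64_t phys, uint64_t flags)
{
    return (phys & ~0x1FFFull) | flags;
}

/* Point one entry of level-2 page ptl2a at another level-2 page, ptl2b.
 * The walker then treats ptl2b as a level-1 page, so the level-1 PTPs
 * that ptl2b points to become readable/writable as ordinary data pages. */
static void install_ptl1_window(uint64_t *ptl2a, uint64_t ptl2b_phys)
{
    ptl2a[ALIAS_SLOT] = make_pte(ptl2b_phys, PTE_PRESENT | PTE_RW);
}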

All of this should be compatible with level skipping.
If level-6 says skip to level-2, and level-2 reuses a PTL2 PTP
as a PTL1 PTP, then the table walker knows it is at level-2
and should only walk 1 more step.
A potential problem with level skipping is if it somehow skipped off the
end of the tree. That could only happen if the OS constructed an illegal
page table, so it could trigger a page fault with a unique error code.
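
A toy software model of the walker's guard: walk at most N levels, require the level to strictly
decrease on every step, and raise a distinct fault if a skip is malformed. The LVL/present/PPN
extraction follows the PTE layout quoted earlier; read_entry() and everything else are assumptions
for illustration:

#include <stdint.h>

enum walk_result { WALK_OK, FAULT_NOT_PRESENT, FAULT_BAD_LEVEL };

/* Stand-in for indexing a page of translations with the VA bits for 'lvl'. */
extern uint64_t read_entry(uint64_t table_pa, uint64_t va, unsigned lvl);

static unsigned ent_lvl(uint64_t e)     { return (e >> 7) & 0x7; }   /* LVL<9..7>   */
static int      ent_present(uint64_t e) { return e & 1; }            /* present<0>  */
static uint64_t ent_ppn(uint64_t e)     { return e & ~0x1FFFull; }   /* PPN<63..13> */

enum walk_result walk(uint64_t root_pa, unsigned root_lvl, uint64_t va,
                      uint64_t *leaf_out)
{
    uint64_t table_pa = root_pa;
    unsigned lvl = root_lvl;

    for (unsigned step = 0; step < root_lvl; step++) {   /* hard bound on steps */
        uint64_t e = read_entry(table_pa, va, lvl);
        if (!ent_present(e))
            return FAULT_NOT_PRESENT;
        if (ent_lvl(e) == 1) {            /* LVL=001: points at an 8K data page */
            *leaf_out = e;
            return WALK_OK;
        }
        if (ent_lvl(e) == 0 || ent_lvl(e) >= lvl)
            return FAULT_BAD_LEVEL;       /* same/higher level: would loop or climb */
        lvl = ent_lvl(e);                 /* legal skip: drop to the indicated level */
        table_pa = ent_ppn(e);
    }
    return FAULT_BAD_LEVEL;               /* ran off the end of the tree */
}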

All of this should be compatible with the bottom-up TLB-miss
table walking optimization, whereby on TLB-miss it walks backwards
up the table levels from level-1 towards level-6,
checking for interior level PTE caching.

I don't know if either Windows or Linux use this table management method.
If I was designing an OS paging subsystem I certainly would.

So what would a PTP that contains a PTE that points to itself do?
The table walker would loop N times, and end up with a virtual
address mapped to that same PTP as data, allowing access to itself.
As I said, not all that useful.

Re: Packing Hash Tables

https://www.novabbs.com/devel/article-flat.php?id=24250&group=comp.arch#24250
 by: EricP - Mon, 14 Mar 2022 17:11 UTC

MitchAlsup wrote:
> On Sunday, March 13, 2022 at 1:19:23 PM UTC-5, EricP wrote:
>> MitchAlsup wrote:
>
>> The hierarchical page tables also have an array of structs indexed by
>> Physical Page Number (PPN) that stores the OS defined status info.
>> In Linux this is called "struct page".
>>
>> Mel Gorman's 2007 book "Understanding the Linux Virtual Memory Manager"
>> is open access, available on-line in HTML or as PDF's.
>> https://www.kernel.org/doc/gorman/html/understand
>> https://www.kernel.org/doc/gorman/pdf
>>
>> Description of "struct page" descriptor:
>>
>> 2.4 Pages
>> https://www.kernel.org/doc/gorman/html/understand/understand005.html#toc15
> <
> Is there something equivalent to this with
> a) 64-bit virtual-physical addresses
> b) nested page tables

There is kernel documentation, but it seems to be API-oriented
rather than architecture- or design-oriented.

For example, I can't find how they map and edit the process page tables.
Or exactly how fork handles page tables.

The Linux Kernel documentation
https://www.kernel.org/doc/html/latest/index.html

Linux Memory Management Documentation
https://www.kernel.org/doc/html/latest/vm/index.html

Linux Virtualization Support
https://www.kernel.org/doc/html/latest/virt/index.html

There is some CPU-specific info, but the stuff I looked at was no help.

CPU Architectures
https://www.kernel.org/doc/html/latest/arch.html
x86-specific Documentation
https://www.kernel.org/doc/html/latest/x86/index.html
