devel / comp.arch / Re: Packing Hash Tables

Subject -- Author
* Packing Hash Tables -- robf...@gmail.com
+* Re: Packing Hash Tables -- BGB
|`- Re: Packing Hash Tables -- robf...@gmail.com
`* Re: Packing Hash Tables -- EricP
 `* Re: Packing Hash Tables -- BGB
  `* Re: Packing Hash Tables -- robf...@gmail.com
   `* Re: Packing Hash Tables -- BGB
    `* Re: Packing Hash Tables -- robf...@gmail.com
     `* Re: Packing Hash Tables -- BGB
      `* Re: Packing Hash Tables -- robf...@gmail.com
       `* Re: Packing Hash Tables -- MitchAlsup
        +* Re: Packing Hash Tables -- robf...@gmail.com
        |`* Re: Packing Hash Tables -- MitchAlsup
        | `* Re: Packing Hash Tables -- EricP
        |  +- Re: Packing Hash Tables -- BGB
        |  +* Re: Packing Hash Tables -- MitchAlsup
        |  |`- Re: Packing Hash Tables -- EricP
        |  `* Re: Packing Hash Tables -- MitchAlsup
        |   +* Re: Packing Hash Tables -- robf...@gmail.com
        |   |`* Re: Packing Hash Tables -- MitchAlsup
        |   | `* Re: Packing Hash Tables -- robf...@gmail.com
        |   |  `* Re: Packing Hash Tables -- robf...@gmail.com
        |   |   +- Re: Packing Hash Tables -- EricP
        |   |   `* Re: Packing Hash Tables -- MitchAlsup
        |   |    `* Re: Packing Hash Tables -- EricP
        |   |     `* Re: Packing Hash Tables -- MitchAlsup
        |   |      +- Re: Packing Hash Tables -- robf...@gmail.com
        |   |      +- Re: Packing Hash Tables -- MitchAlsup
        |   |      +* Re: Packing Hash Tables -- BGB
        |   |      |+* Re: Packing Hash Tables -- MitchAlsup
        |   |      ||`* Re: Packing Hash Tables -- BGB
        |   |      || +* Re: Packing Hash Tables -- robf...@gmail.com
        |   |      || |+* Re: Packing Hash Tables -- Thomas Koenig
        |   |      || ||+* Re: Packing Hash Tables -- robf...@gmail.com
        |   |      || |||`- Re: Packing Hash Tables -- Thomas Koenig
        |   |      || ||`- Re: Packing Hash Tables -- MitchAlsup
        |   |      || |+- Re: Packing Hash Tables -- MitchAlsup
        |   |      || |`- Re: Packing Hash Tables -- MitchAlsup
        |   |      || +* Re: Packing Hash Tables -- robf...@gmail.com
        |   |      || |`* Re: Packing Hash Tables -- BGB
        |   |      || | `* Re: Packing Hash Tables -- aph
        |   |      || |  `* Re: Packing Hash Tables -- BGB
        |   |      || |   `* Re: Packing Hash Tables -- MitchAlsup
        |   |      || |    +* Re: Packing Hash Tables -- robf...@gmail.com
        |   |      || |    |`- Re: Packing Hash Tables -- EricP
        |   |      || |    `* Re: Packing Hash Tables -- BGB
        |   |      || |     `* Re: Packing Hash Tables -- robf...@gmail.com
        |   |      || |      `* Re: Packing Hash Tables -- BGB
        |   |      || |       `- Re: Packing Hash Tables -- Ivan Godard
        |   |      || `- Re: Packing Hash Tables -- robf...@gmail.com
        |   |      |+- Re: Packing Hash Tables -- robf...@gmail.com
        |   |      |`* Re: Packing Hash Tables -- EricP
        |   |      | +- Re: Packing Hash Tables -- robf...@gmail.com
        |   |      | `* Re: Packing Hash Tables -- MitchAlsup
        |   |      |  +* Re: Packing Hash Tables -- EricP
        |   |      |  |`* Re: Packing Hash Tables -- MitchAlsup
        |   |      |  | `* Re: Packing Hash Tables -- EricP
        |   |      |  |  `* Re: Packing Hash Tables -- MitchAlsup
        |   |      |  |   `* Re: Packing Hash Tables -- EricP
        |   |      |  |    +* Re: Packing Hash Tables -- robf...@gmail.com
        |   |      |  |    |`- Re: Packing Hash Tables -- MitchAlsup
        |   |      |  |    `* Re: Packing Hash Tables -- robf...@gmail.com
        |   |      |  |     `* Re: Packing Hash Tables -- BGB
        |   |      |  |      `* Re: Packing Hash Tables -- robf...@gmail.com
        |   |      |  |       +- Re: Packing Hash Tables -- BGB
        |   |      |  |       +- Re: Packing Hash Tables -- MitchAlsup
        |   |      |  |       `- Re: Packing Hash Tables -- robf...@gmail.com
        |   |      |  `* Re: Packing Hash Tables -- Ivan Godard
        |   |      |   `* Re: Packing Hash Tables -- MitchAlsup
        |   |      |    `* Re: Packing Hash Tables -- Thomas Koenig
        |   |      |     `* Re: Packing Hash Tables -- MitchAlsup
        |   |      |      `* Re: Packing Hash Tables -- Thomas Koenig
        |   |      |       `* Re: Packing Hash Tables -- Stephen Fuld
        |   |      |        `- Re: Packing Hash Tables -- EricP
        |   |      `* Re: Packing Hash Tables -- EricP
        |   |       `- Re: Packing Hash Tables -- MitchAlsup
        |   `- Re: Packing Hash Tables -- EricP
        `- Re: Packing Hash Tables -- BGB

Re: Packing Hash Tables

<df9ccc34-5820-453c-8e57-46f2193dc7ccn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24292&group=comp.arch#24292

 by: MitchAlsup - Wed, 16 Mar 2022 19:14 UTC

On Wednesday, March 16, 2022 at 1:56:50 PM UTC-5, Ivan Godard wrote:
> On 3/16/2022 8:57 AM, MitchAlsup wrote:
> > On Tuesday, March 15, 2022 at 9:46:07 PM UTC-5, robf...@gmail.com wrote:
> >> On Tuesday, March 15, 2022 at 10:24:03 PM UTC-4, EricP wrote:
> >>> BGB wrote:
> >>>> On 3/15/2022 1:08 PM, MitchAlsup wrote:
> >>>>> On Tuesday, March 15, 2022 at 12:23:56 PM UTC-5, EricP wrote:
> >>>>>> What about address translate skipping page table levels?
> >>>>> <
> >>>>> Skipping levels is, in my opinion, a necessity. I was part of the
> >>>>> I/OMMU team
> >>>>> at AMD where we put in level skipping as part of its design. I think
> >>>>> CPU tables
> >>>>> and I/OMMU tables should be one and the same, along with the "a
> >>>>> pointer is a
> >>>>> pointer" philosophy. I/O device should get virtual address of
> >>>>> originating thread,
> >>>>> I/O DMA should be translated thought page tables of originating
> >>>>> thread, I/O
> >>>>> completion interrupt should be delivered to Guest OS hosting
> >>>>> originating thread.
> >>>>
> >>>> One drawback of level skipping though, is that one can either ignore
> >>>> bits (effectively multi-mapping the skipped levels), or assumes them to
> >>>> be zero (doesn't work for randomized addresses), neither of which are
> >>>> ideal in a large+sparse ASLR scenario.
> >>> As I mention below, it can assume the missing values are 0's
> >>> or have a PTE bit to specify the value.
> >>>
> >>> A PTE bit allows one to differentiate between whether the
> >>> page is part of a grow-up or grow-down memory section.
> >>> E.g. heap areas expand upward, stack expand downwards.
> >>>>>> Another issue is what the fill values should be that were skipped.
> >>>>>> MMU can assume the the address bits skipped were all 0's,
> >>>>>> or the PTE could have a bit to prescribe the skipped value as 0 or 1.
> >>>>> <
> >>>>> I am following the rules of canonicality AMD uses when skipping levels
> >>>>> in the I/O MMU.
> >>> More feature bits also puts more presure on the PTE which is already full.
> >>
> >>> A PTE bit allows one to differentiate between whether the
> >>> page is part of a grow-up or grow-down memory section.
> >>> E.g. heap areas expand upward, stack expand downwards.
> > <
> >> I wonder if the address could be sign-extended, or multi-bit sign-extended?
> >> One may want to have ROM / IO mapped to virtual addresses that are the
> >> same as physical ones, high in memory.
> > <
> > I put ROM in its own 64-bit physical address space.
> > I put MMI/O in its own 64-bit physical address space.
> > I put configuration in its own 64-bit physical address space.
> > Leaving all 64-bits for real memory (DRAM).
<
> As opposed to mapping virtual to physical and then picking off slices of
> physical with apertures that map physical to device?
<
Yes, My 66000 Transport supports a 66-bit universal address space.
<
There are different consistency and coherence rules for these different
spaces, too. ROM ends up cacheable, but does not use cache coherence
messages--lowering overhead. Configuration space accesses are strongly
ordered, while MMI/O is sequentially consistent per Bus:Device,Function.
<
Also note: memory moves that do not cross a page boundary are ATOMIC
(all sent in 1 payload across the transport). So core[k] can feed device[j]
an entire "read disk[i] sector[m] to VA" command in a single payload, eliminating
another source of kernel locks--and minimizing the need to lean on sequential
consistency for correctness guarantees.
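To make that concrete, here is a minimal sketch in C. All names and fields
are invented for illustration; this is not the actual My 66000 device model.
The point is only that a descriptor which fits inside one page can be handed
to a device with a single memory move, which per the claim above travels as
one atomic transport payload:

#include <stdint.h>
#include <string.h>

struct disk_read_cmd {
    uint64_t opcode;        /* e.g. READ_SECTORS                        */
    uint64_t disk;          /* disk[i]                                  */
    uint64_t sector;        /* sector[m]                                */
    uint64_t count;         /* number of sectors                        */
    uint64_t dest_va;       /* originating thread's virtual address     */
};                          /* 40 bytes: fits in one 64-byte queue slot */

/* 'slot' is assumed to be a 64-byte-aligned slot in the device's command
   queue, so the copy below cannot cross a page boundary. */
static void post_disk_read(struct disk_read_cmd *slot, uint64_t disk,
                           uint64_t sector, uint64_t count, uint64_t dest_va)
{
    struct disk_read_cmd c = { 1 /* READ */, disk, sector, count, dest_va };
    memcpy(slot, &c, sizeof c);   /* ideally lowered to one memory move   */
}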
<
This also minimizes HyperVisor overhead when simulating devices on
behalf of non-paravirtualized Guest OSs. HV gets the entire transaction
at once, instead of building it up MMI/O write by MMI/O write.
>
> The problem with embedding subspaces (which is what apertures recognize)
> into the TLB is that it prevents dynamically adding more spaces. Say you
> have two kinds of RAM, on disjoint pins but both with device addresses
> starting at zero. That's just another aperture, but in the TLB its a new
> chip.
>
> Granted that's not something done by bog-standard usage. I may have been
> perverted by experience with partionable pinsets.
<
It is more about recognizing 'empty' encodings once some other encoding is
recognized, then using those bits for other purposes. Entropy.

Re: Packing Hash Tables

<t0thlc$db1$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=24293&group=comp.arch#24293

 by: Thomas Koenig - Wed, 16 Mar 2022 20:34 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:

> There are different consistency and coherence rules for these different
> spaces, too. ROM ends up cacheable, but does not use cache coherence
> messages--lowering overhead.

Could an operating system also choose to apply this to executables
and shared libraries?

Re: Packing Hash Tables

<aa10821c-28f2-4a7c-b2e8-e82c2f0afd19n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24294&group=comp.arch#24294

 by: MitchAlsup - Wed, 16 Mar 2022 20:42 UTC

On Wednesday, March 16, 2022 at 3:34:55 PM UTC-5, Thomas Koenig wrote:
> MitchAlsup <Mitch...@aol.com> schrieb:
> > There are different consistency and coherence rules for these different
> > spaces, too. ROM ends up cacheable, but does not use cache coherence
> > messages--lowering overhead.
<
> Could an operating system also choose to apply this to executables
> and shared libraries?
<
I am hoping that certain C-library functions are in ROM for exactly these reasons,
under the caveat that ROM is approximately as fast as DRAM.
<
One cannot avoid the coherence traffic in DRAM because HW does not know
that a page will be swapped out to DISK/SSD/over-the-network. With ROM
you actually do know this. {Well, it is possible to swap ROM to disk, but the
ROM still contains the same data, and will continue to contain the same
data even if disk writes over ROM. DRAM does not have this property.}

Re: Packing Hash Tables

<016cc8f6-aa9d-42ae-8d46-ccae157fb572n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24295&group=comp.arch#24295

 by: MitchAlsup - Wed, 16 Mar 2022 20:44 UTC

On Tuesday, March 15, 2022 at 11:33:12 PM UTC-5, robf...@gmail.com wrote:
> On Tuesday, March 15, 2022 at 10:52:20 PM UTC-4, BGB wrote:
> > On 3/15/2022 6:54 PM, MitchAlsup wrote:

> > >>> I started here about 2006.
> I am preferring hash tables over hierarchical tables. They are easier to deal with
> in some ways. They do not need to process in power-of-two addressing so the
> size of entries can be adjusted to suit.
>
This has some minor negative comments about hash tables and
L2 hit rates of cached data (translation tables or otherwise).
<
https://cs.rice.edu/CS/Architecture/docs/barr-isca10.pdf
>
>
> I am wondering though how often pages of memory are shared?
>
> Using a simple hash table where the table index corresponds to the physical
> address four virtual address entries could be supported at the same cost as
> using a complex hash table that stores both the physical and virtual addresses
> and doubles the size of the table to allow for some sharing. But with the simple
> table sharing would be limited to four shares per page. To get more shares of a
> page pages would need to be replicated at a different physical address. But it
> seems like it could work.
>
> Big users of shared memory are probably the OS and other libraries and frameworks.
>
> Making copies might be one way to help with system integrity. If something
> goes amiss with one copy of the OS, other copies are not affected.

Re: Packing Hash Tables

<bdd23dc9-ab32-4e6f-bc37-b6bfe25adac6n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24296&group=comp.arch#24296

 by: robf...@gmail.com - Wed, 16 Mar 2022 23:07 UTC

On Wednesday, March 16, 2022 at 4:44:58 PM UTC-4, MitchAlsup wrote:
> On Tuesday, March 15, 2022 at 11:33:12 PM UTC-5, robf...@gmail.com wrote:
> > On Tuesday, March 15, 2022 at 10:52:20 PM UTC-4, BGB wrote:
> > > On 3/15/2022 6:54 PM, MitchAlsup wrote:
>
> > > >>> I started here about 2006.
> > I am preferring hash tables over hierarchical tables. They are easier to deal with
> > in some ways. They do not need to process in power-of-two addressing so the
> > size of entries can be adjusted to suit.
> >
> This has some minor negative comments about hash tables and
> L2 hit rates of cached data (translation tables or otherwise.)
> <
> https://cs.rice.edu/CS/Architecture/docs/barr-isca10.pdf
> >
> >
> > I am wondering though how often pages of memory are shared?
> >
> > Using a simple hash table where the table index corresponds to the physical
> > address four virtual address entries could be supported at the same cost as
> > using a complex hash table that stores both the physical and virtual addresses
> > and doubles the size of the table to allow for some sharing. But with the simple
> > table sharing would be limited to four shares per page. To get more shares of a
> > page pages would need to be replicated at a different physical address. But it
> > seems like it could work.
> >
> > Big users of shared memory are probably the OS and other libraries and frameworks.
> >
> > Making copies might be one way to help with system integrity. If something
> > goes amiss with one copy of the OS, other copies are not affected.

Thanks for the pointer to the article. Hash tables might be a better choice when memory
consumption is a concern.

My PT now caches PDEs. That saves about four clocks per PDE lookup, plus memory access time.

>Works great if you have 72-bits in a word. But not so great if you have a 64-bit
>PTE mapping from a 64-bit VA to a 64-bit PA.

It can work with as few as two available bits.

I am using 128-bit PTEs, with physical addressing up to 72 bits and virtual addressing up to
106 bits. PDEs (page directory entries) are 64-bit; the address is split 13+13+13+13+13+13+13+16.
I decided to go with 64kB pages now that they can be sectioned off into 1kB chunks.
For a small app requiring < 64kB, the app could be stored in the page table itself so that
the extra space in the table is not wasted. It might need only three PTEs, for code, data,
and stack, and they could all point into the page table page.

Sixty-four 1kB stacks for the threads associated with a process could all be stored in the
same page of memory.
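
For reference, a minimal sketch of a hashed page-table lookup of the general kind being
weighed against hierarchical tables here. The table size, hash function, and entry layout
are illustrative assumptions, not the 128-bit PTE format described above:

#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT  16                /* 64kB pages, as above            */
#define HPT_ENTRIES (1u << 16)        /* table size: an assumption       */

typedef struct {
    uint64_t vpn;                     /* virtual page number mapped here */
    uint64_t asid;                    /* address-space id                */
    uint64_t ppn;                     /* physical page number            */
    uint64_t flags;                   /* bit 0 = valid, permissions, ... */
} hpt_entry;

static hpt_entry hpt[HPT_ENTRIES];

static uint64_t hpt_hash(uint64_t vpn, uint64_t asid)
{
    /* simple mix; a real design would pick this carefully */
    return (vpn ^ (vpn >> 13) ^ (asid * 0x9E3779B97F4A7C15ull)) % HPT_ENTRIES;
}

/* Translate va -> pa for one probe; returns false on a miss
   (fall through to a second probe or take a page fault). */
static bool hpt_translate(uint64_t va, uint64_t asid, uint64_t *pa)
{
    uint64_t vpn = va >> PAGE_SHIFT;
    hpt_entry *e = &hpt[hpt_hash(vpn, asid)];

    if ((e->flags & 1) && e->vpn == vpn && e->asid == asid) {
        *pa = (e->ppn << PAGE_SHIFT) | (va & ((1u << PAGE_SHIFT) - 1));
        return true;
    }
    return false;
}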

Re: Packing Hash Tables

<t0unlv$5vo$1@newsreader4.netcologne.de>

https://www.novabbs.com/devel/article-flat.php?id=24297&group=comp.arch#24297

 by: Thomas Koenig - Thu, 17 Mar 2022 07:23 UTC

MitchAlsup <MitchAlsup@aol.com> schrieb:
> On Wednesday, March 16, 2022 at 3:34:55 PM UTC-5, Thomas Koenig wrote:
>> MitchAlsup <Mitch...@aol.com> schrieb:
>> > There are different consistency and coherence rules for these different
>> > spaces, too. ROM ends up cacheable, but does not use cache coherence
>> > messages--lowering overhead.
><
>> Could an operating system also choose to apply this to executables
>> and shared libraries?
><
> I am hoping that certain C-library functions are in ROM for exactly these reasons.
> under the caveat that ROM is approximately as fast as DRAM.

Unlikely to happen for general purpose computers - what should
happen in the event of a security upgrade? And should this ROM
be for Linux, or for Windows, or for one of the BSD variants?
One exception is probably the BIOS.

> One cannot avoid the coherence traffic in DRAM because HW does not know
> that a page will be swapped out to DISK/SSD/over-the-network.

That is the point I was trying to make. The operating system should
know, and if there were a mechanism by which it could tell the hardware
"I guarantee you that this page will not change", then that could
have advantages. (If the system in question has a BMC, then this
could get more complicated - could My66000 be run under OpenBMC,
for example?)

Re: Packing Hash Tables

<t0vm2k$ltn$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24298&group=comp.arch#24298

 by: Stephen Fuld - Thu, 17 Mar 2022 16:02 UTC

On 3/17/2022 12:23 AM, Thomas Koenig wrote:
> MitchAlsup <MitchAlsup@aol.com> schrieb:
>> On Wednesday, March 16, 2022 at 3:34:55 PM UTC-5, Thomas Koenig wrote:
>>> MitchAlsup <Mitch...@aol.com> schrieb:
>>>> There are different consistency and coherence rules for these different
>>>> spaces, too. ROM ends up cacheable, but does not use cache coherence
>>>> messages--lowering overhead.
>> <
>>> Could an operating system also choose to apply this to executables
>>> and shared libraries?
>> <
>> I am hoping that certain C-library functions are in ROM for exactly these reasons.
>> under the caveat that ROM is approximately as fast as DRAM.
>
> Unlikely to happen for general purpose computers - what should
> happen in the event of a security upgrade? And should this ROM
> be for Linux, or for Windows, or for one of the BSD variants?
> One exception is probably the BIOS.
>
>> One cannot avoid the coherence traffic in DRAM because HW does not know
>> that a page will be swapped out to DISK/SSD/over-the-network.
>
> That is the point I was trying to make. The operating system should
> know, and if there is a mechanism that it could tell the hardware
> "I guarantee you that this page will not change", then that could
> have advantages.

I must be missing something. If the page is marked with no write
permission, e.g. a page containing executable code, there should be no
coherence traffic. If the OS decides to do some I/O into that page to
change it for some reason, then ISTM that the OS is responsible for
notifying all the relevant caches that the relevant data should be
invalidated.

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Packing Hash Tables

<pr-dnczU8cmty6n_nZ2dnUU7-efNnZ2d@supernews.com>

https://www.novabbs.com/devel/article-flat.php?id=24301&group=comp.arch#24301

 by: aph...@littlepinkcloud.invalid - Fri, 18 Mar 2022 09:51 UTC

BGB <cr88192@gmail.com> wrote:
>
> I am not sure whether "in general" a chain-hash or B-Tree would be
> better for a large sparse address space, would need to model this.

Or perhaps a hybrid: do you know about Phil Bagwell's Ideal Hash
Trees? They're great for searching, high density, but the cost of
updating may be too high for your application. They also depend
critically on a fast POPCOUNT instruction. But even if they don't work
for your application, you might find some inspiration here.

https://lampwww.epfl.ch/papers/idealhashtrees.pdf
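
As a rough illustration of the bitmap-plus-POPCOUNT indexing at the core of
these hash trees (a sketch written against no particular implementation,
using the GCC/Clang __builtin_popcount):

#include <stddef.h>
#include <stdint.h>

struct hamt_node {
    uint32_t bitmap;              /* bit i set => child for 5-bit index i */
    struct hamt_node *children[]; /* only the present children, packed    */
};

/* Find the child selected by the next 5 bits of the hash, or NULL. */
static struct hamt_node *hamt_child(const struct hamt_node *n,
                                    uint32_t hash, unsigned shift)
{
    uint32_t idx = (hash >> shift) & 0x1F;   /* 5-bit slice of the hash   */
    uint32_t bit = 1u << idx;

    if (!(n->bitmap & bit))
        return NULL;                         /* slot absent               */

    /* Index = number of present children below this bit; this is the
       step that wants a fast POPCOUNT instruction. */
    return n->children[__builtin_popcount(n->bitmap & (bit - 1))];
}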

Andrew.

Re: Packing Hash Tables

<6J0ZJ.123422$Wdl5.41110@fx44.iad>

https://www.novabbs.com/devel/article-flat.php?id=24303&group=comp.arch#24303

 by: EricP - Fri, 18 Mar 2022 14:31 UTC

Stephen Fuld wrote:
> On 3/17/2022 12:23 AM, Thomas Koenig wrote:
>> MitchAlsup <MitchAlsup@aol.com> schrieb:
>>> On Wednesday, March 16, 2022 at 3:34:55 PM UTC-5, Thomas Koenig wrote:
>>>> MitchAlsup <Mitch...@aol.com> schrieb:
>>>>> There are different consistency and coherence rules for these
>>>>> different
>>>>> spaces, too. ROM ends up cacheable, but does not use cache coherence
>>>>> messages--lowering overhead.
>>> <
>>>> Could an operating system also choose to apply this to executables
>>>> and shared libraries?
>>> <
>>> I am hoping that certain C-library functions are in ROM for exactly
>>> these reasons.
>>> under the caveat that ROM is approximately as fast as DRAM.
>>
>> Unlikely to happen for general purpose computers - what should
>> happen in the event of a security upgrade? And should this ROM
>> be for Linux, or for Windows, or for one of the BSD variants?
>> One exception is probably the BIOS.
>>
>>> One cannot avoid the coherence traffic in DRAM because HW does not know
>>> that a page will be swapped out to DISK/SSD/over-the-network.
>>
>> That is the point I was trying to make. The operating system should
>> know, and if there is a mechanism that it could tell the hardware
>> "I guarantee you that this page will not change", then that could
>> have advantages.
>
> I must be missing something. If the page is marked with no write
> permission, i.e. a page containing executable code, there should be no
> coherence traffic. If the OS decides to do some I/O into that page to
> change it for some reason, then ISTM that the OS is responsible for
> notifying all the relevant caches that the relevant data should be
> invalidated.

A memory address that is marked read-only in an MMU prevents that process
from writing that address. It is not an assertion that this physical
address is unwritable by anyone. Therefore the cache hierarchy cannot
use the read-only protection to optimize how a cache line is managed.

For example in a server that publishes an information table to clients,
a page can be mapped RW in the server and RO in the clients.
The client cache will still receive Invalidates for those RO page lines.

That being said, off hand I can't think of a way that a cache could
make use of a PTE flag that marked a page as "unchangeable".
If it doesn't cache it, then it is the same as other non-cacheable pages.
If it does cache it, then it will just never receive an Invalidate
for those lines and they will eventually fall out.

I suppose it guarantees that such ROM cache evicts can be "silent".
There is no need for a home directory for ROM physical addresses and no
tracking of who has copies, so no one to coordinate with on evicts.
But some coherence protocols already allow for silent Shared evicts.

Re: Packing Hash Tables

<D72ZJ.163258$Tr18.67505@fx42.iad>

https://www.novabbs.com/devel/article-flat.php?id=24306&group=comp.arch#24306

 by: EricP - Fri, 18 Mar 2022 16:07 UTC

MitchAlsup wrote:
> On Wednesday, March 16, 2022 at 12:13:39 PM UTC-5, EricP wrote:
>> MitchAlsup wrote:
>>> On Tuesday, March 15, 2022 at 9:46:07 PM UTC-5, robf...@gmail.com wrote:
>>>>> On 3/15/2022 1:08 PM, MitchAlsup wrote:
>>>>> So, what they need is a quick way of getting an address to the failing PTP/PTE.
>>>> I am reminded of a trace cache. Something similar could be used.
>>>>
>>>> It would be easier if a link back to the parent were in a PTP. Since the failing PTE
>>>> / PTP is known, back links could be followed. I wonder if an empty entry in the
>>>> PTP could be used as a back-link. Is it faster to scan a page for a back-link or use
>>>> other means? If there were extra room in the PTP a link could be added in the
>>>> header.
>>> <
>>> Linux uses struct_page (about 2 cache lines per entry) to keep track of "all the
>>> states a page can get into": so struct_page[PPN] gets you to everything you need
>>> to know about a page in 1 layer of indirection.
>>> <
>>> Since HW makes no claims to the bit patterns when p = 0 (present), PPN of this
>>> entry should be pointing at what this non-present PTP would have been pointing
>>> at were it present.
>> The struct_page table is used for different purposes.
>> It is used for recording info about the status of each physical page.
>> There can be multiple virtual addresses in multiple processes
>> mapped to the same physical page.
>>
>> Page faults are different.
>> The page fault exception stack gets the instruction pointer of
>> the start of the faulting instruction, the faulting address,
>> and some status bits indicating the error, what access was being attempted.
> <
> Page Fault gets attached to thread number (which accesses the faulting IP)
> ....................gets the faulting instruction-specifier in R1

I assume this is the instruction pointer.

> ....................gets the faulting virtual address in R2
> ....................gets the address of fault generating PTE/PTP in R3

Passing the physical address of the faulting PTE to the page fault
handler doesn't harm anything, but it doesn't really help.
Though I've not seen a page fault exception frame that does this.

This misses the critical page fault reason info. The page fault handler
needs to know what the thread was attempting to do when it triggered
the fault. Looking at x86 that is a set of bit flags indicating:

P = Caused by Not-Present PTE, or other protection codes
R/W = Caused by Read, or caused by Write
U/S = Caused by User or Supervisor mode
RSVD = Reserved bit non-zero
I/D = access was instruction fetch, or data access
PK = Protection Key violation
SGX = SGX Software Guard Extensions violation
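
In code form (using the standard x86 error-code bit positions; the helper at
the end is just one example of how a handler might classify the fault):

#include <stdint.h>

#define PF_P     (1u << 0)    /* 0 = not-present, 1 = protection violation */
#define PF_WRITE (1u << 1)    /* 0 = read access,  1 = write access        */
#define PF_USER  (1u << 2)    /* 0 = supervisor,   1 = user mode           */
#define PF_RSVD  (1u << 3)    /* reserved bit set in a paging entry        */
#define PF_INSTR (1u << 4)    /* access was an instruction fetch           */
#define PF_PK    (1u << 5)    /* protection-key violation                  */
#define PF_SGX   (1u << 15)   /* SGX access-control violation              */

/* Example classification: a not-present fault with sane paging entries
   is the ordinary demand-paging case. */
static int is_demand_fault(uint32_t err)
{
    return !(err & PF_P) && !(err & PF_RSVD);
}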

> If the problem is in Guest OS translations Guest OS page fault handler
> receives control. If the problem is in HyperVisor Tables, HV page fault
> handler receives control. In both cases the faulting thread becomes
> a "customer" of the fault handler. This customer thing is a means to
> access customer memory to fix exceptions, and access to customer
> state {to fix page tables,...} from afar using indirection without needing
> to have the customer virtual memory mapped into handler.

Ok, except that is not how OS's currently handle page faults.
They assume each thread receives its own page faults and "fixes itself".
Your approach is more like how Mach OS does things.

>> The page faulting thread's handler acquires a page table mutex
>> and uses the faulting virtual address to walk down the PGT doing
>> the same checks a HW walker would do. For it to access the page table
>> the PTP must (somehow) already be mapped into virtual memory.
> <
> No need to walk the tables you already have a pointer to the fault detecting
> PTP/PTE. Only if you need to go back up the table do you need to walk it.
> But, here, you can LDPT LVL-1 and let HW give you the address to the
> upper level table.

This assumes direct access to memory through a physical address.
Some OSes do have a memory section that maps all of physical memory into
a virtual range indexed by physical page number for exactly this purpose.
However, that takes up a big chunk of kernel virtual space.
Where that is not possible, the kernel dynamically adds a PTE to itself to
map the target physical address, accesses the memory, then unmaps the target address.

Using the faulting virtual address to walk the already self-mapped PTP's
accomplishes the same thing in a portable manner.
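
A sketch of that direct-map idea (DIRECT_MAP_BASE is an assumed constant for
illustration, not a value from any particular OS):

#include <stdint.h>

#define DIRECT_MAP_BASE 0xFFFF800000000000ull  /* assumed kernel constant */

/* All of physical memory is assumed visible at DIRECT_MAP_BASE + pa. */
static inline void *phys_to_virt(uint64_t pa)
{
    return (void *)(uintptr_t)(DIRECT_MAP_BASE + pa);
}

/* With such a map, a handler handed the physical address of the
   faulting PTE could read it directly: */
static inline uint64_t read_pte(uint64_t pte_pa)
{
    return *(volatile uint64_t *)phys_to_virt(pte_pa);
}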

>> There are many situations that can occur.
>> Two threads could have faulted the same page at once.
>> The first fixes the fault, releases the mutex and returns.
>> The second walks the table and finds the fault already fixed
>> and just returns.
> <
> Except you don't have to walk the table.

It's not that big of a burden, and the algorithm is portable.

>> Or maybe it walks the table and finds an interior PTP marked Not-Present.
>> Or maybe the faulting page was Present but marked read-only and
>> this thread attempted a write.
> <
> Anything is possible; you just don't have to walk the table; you receive
> control with a pointer to where the fault detecting memory management
> pointer PTP/PTE resides in the translation tables themselves.
> <
> And you only have to lock the table if you want to change it.
> And when you get the table locked, you have to check that the table has
> not already been updated by some other thread. More test-and-test-and-set
> than just test-and-set.

Ok, I think I see where you are going with this now.
You want to eliminate the software table walk, which eliminates
the need to acquire a page table mutex in some/many situations
and allows the PTE to be fixed "atomically".

The thing is, there are other data structures that are guarded by
that one mutex on "the process address space":
- the process page table
- the virtual section table
- maybe working set tracking tables, if an OS has them.

There are other data structures with their own mutexes:
shared memory sections, page file management, file system.

The page fault handler needs access to all these data structures
in a synchronized manner. So even if you could eliminate the
SW table walk for some situations, that is not going to buy you much.
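
For reference, the test-and-test-and-set pattern mentioned above looks
roughly like this in C11 (a sketch, not any OS's actual lock):

#include <stdatomic.h>

static atomic_int pt_lock;   /* 0 = free, 1 = held */

static void pt_lock_acquire(void)
{
    for (;;) {
        /* test: spin on plain loads, staying in the local cache */
        while (atomic_load_explicit(&pt_lock, memory_order_relaxed) != 0)
            ;
        /* test-and-set: only now attempt the expensive atomic exchange */
        if (atomic_exchange_explicit(&pt_lock, 1, memory_order_acquire) == 0)
            return;
    }
}

static void pt_lock_release(void)
{
    atomic_store_explicit(&pt_lock, 0, memory_order_release);
}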

Re: Packing Hash Tables

<3be6ed80-9ce6-4c38-947b-2a5e77b15060n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24308&group=comp.arch#24308

 by: MitchAlsup - Fri, 18 Mar 2022 19:29 UTC

On Friday, March 18, 2022 at 11:08:08 AM UTC-5, EricP wrote:
> MitchAlsup wrote:
> > On Wednesday, March 16, 2022 at 12:13:39 PM UTC-5, EricP wrote:
> >> MitchAlsup wrote:
> >>> On Tuesday, March 15, 2022 at 9:46:07 PM UTC-5, robf...@gmail.com wrote:
> >>>>> On 3/15/2022 1:08 PM, MitchAlsup wrote:
> >>>>> So, what they need is a quick way of getting an address to the failing PTP/PTE.
> >>>> I am reminded of a trace cache. Something similar could be used.
> >>>>
> >>>> It would be easier if a link back to the parent were in a PTP. Since the failing PTE
> >>>> / PTP is known, back links could be followed. I wonder if an empty entry in the
> >>>> PTP could be used as a back-link. Is it faster to scan a page for a back-link or use
> >>>> other means? If there were extra room in the PTP a link could be added in the
> >>>> header.
> >>> <
> >>> Linux uses struct_page (about 2 cache lines per entry) to keep track of "all the
> >>> states a page can get into": so struct_page[PPN] gets you to everything you need
> >>> to know about a page in 1 layer of indirection.
> >>> <
> >>> Since HW makes no claims to the bit patterns when p = 0 (present), PPN of this
> >>> entry should be pointing at what this non-present PTP would have been pointing
> >>> at were it present.
> >> The struct_page table is used for different purposes.
> >> It is used for recording info about the status of each physical page.
> >> There can be multiple virtual addresses in multiple processes
> >> mapped to the same physical page.
> >>
> >> Page faults are different.
> >> The page fault exception stack gets the instruction pointer of
> >> the start of the faulting instruction, the faulting address,
> >> and some status bits indicating the error, what access was being attempted.
> > <
> > Page Fault gets attached to thread number (which accesses the faulting IP)
> > ....................gets the faulting instruction-specifier in R1
>
> I assume this is the instruction pointer.
<
It is the first 32 bits of the instruction itself. This contains everything but the
constants.
<
>
> > ....................gets the faulting virtual address in R2
> > ....................gets the address of fault generating PTE/PTP in R3
>
> Passing the physical address of the faulting PTE to the page fault
> handler doesn't harm anything but it doesn't really help.
> Though I've not seen a page fault exception frame the does this.
>
> This misses the critical page fault reason info. The page fault handler
> needs to know what the thread was attempting to do when it triggered
> the fault. Looking at x86 that is a set of bit flags indicating:
>
> P = Caused by Not-Present PTE, or other protection codes
> R/W = Caused by Read, or caused by Write
> U/S = Cause by User/Super mode
> RSVD = Reserved bit non-zero
> I/D = access was instruction fetch, or data access
> PK = Protection Key violation
> SGX = SGX Software Guard Extensions violation
<
The given address points at the PT[PE] where the fault was detected.
So, for example, if you were trying to Read, the first PT[PE] that does not have
read permission is where you end up pointing {same for write and execute}.
<
The instruction tells you what the instruction stream was doing.
Inst<31..26> = 001001 & Inst<10..5> = 0000xx -> LD
Inst<31..26> = 001001 & Inst<10..5> = 00x1xx -> ST
Inst<31..26> = 1000xx LD
Inst<31..26> = 1001xx ST
<
So, there is a clever trick: because My 66000 detects all mal-formed instructions
{mal-formed instructions arrive at the Operation exception handler},
the page fault handler can look at Inst<31> and then decide to look at
Inst<28> or Inst<7>, where 0->LD and 1->ST.
So 3 PREDs in a row parse the instruction that caused the page fault.
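
In C terms, the test described above might look like this (a sketch that just
follows the bit patterns quoted in this post; not an official My 66000 decoder):

#include <stdint.h>

/* 'inst' is the first 32 bits of the faulting instruction, as delivered
   in R1 above. Returns 1 for a store, 0 for a load. */
static int fault_was_store(uint32_t inst)
{
    if (inst & (1u << 31))           /* major opcodes 1000xx / 1001xx     */
        return (inst >> 28) & 1;     /* bit 28: 0 -> LD, 1 -> ST          */
    else                             /* the 001001 LD/ST group            */
        return (inst >> 7) & 1;      /* bit 7:  0 -> LD, 1 -> ST          */
}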
<
But you make a good point:: I have another 32 bits in R1 in which I could completely
specify what problem was detected. Let me work on that.
<
My 66000 does not have a u/s mode in the architecture
..................does not have storage keys (yet)
..................does not have software guard extension
..................Keeps RWE separate
<
> > If the problem is in Guest OS translations Guest OS page fault handler
> > receives control. If the problem is in HyperVisor Tables, HV page fault
> > handler receives control. In both cases the faulting thread becomes
> > a "customer" of the fault handler. This customer thing is a means to
> > access customer memory to fix exceptions, and access to customer
> > state {to fix page tables,...} from afar using indirection without needing
> > to have the customer virtual memory mapped into handler.
<
> Ok, except that is not how OS's currently handle page faults.
<
My model allows for applications to use 64-bit virtual address space,
while allowing the HyperVisor to have its 64-bit virtual address space,
and each guest OS to have its 64-bit virtual address space. All separate.
While having 64-bit physical address space for DRAM.
<
> They assume each thread receives its own page faults and "fixes itself".
> Your approach is more like how Mach OS does things.
<
My 66000 model is still that some service provider handles page faults
and when done, returns control to fault initiator. The only real difference
is that the service provider does not have to have its translation tables
map the excepting application.
<
> >> The page faulting thread's handler acquires a page table mutex
> >> and uses the faulting virtual address to walk down the PGT doing
> >> the same checks a HW walker would do. For it to access the page table
> >> the PTP must (somehow) already be mapped into virtual memory.
> > <
> > No need to walk the tables you already have a pointer to the fault detecting
> > PTP/PTE. Only if you need to go back up the table do you need to walk it.
> > But, here, you can LDPT LVL-1 and let HW give you the address to the
> > upper level table.
<
> This assumes direct access to memory through a physical address.
<
No; it assumes a memory reference instruction that kicks the table walker
in the side, and lets the table walker do its job. It's just that the result goes into a
register instead of going into the TLB. So, for example, Guest OS could ask
for Level-3 table entry; table walker has to walk Application table, which
has to walk HyperVisor table for Guest OS, to get a PT[PE] to Guest OS
{never being able to see any HyperVisor state or even know one is present}.
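(Speculatively, the handler-side use might look like the following in C;
ldpt() is a hypothetical wrapper for the LDPT-style operation described
above, not an actual My 66000 intrinsic or signature, and the present-bit
position is assumed:)

  #include <stdint.h>

  /* Hypothetical: ask the HW walker for the PT[PE] at 'level' covering
     'vaddr'; the result lands in a register instead of the TLB, with any
     HyperVisor translation applied underneath, invisibly to the guest. */
  extern uint64_t ldpt(uint64_t vaddr, int level);

  static int l3_entry_present(uint64_t fault_va)
  {
      uint64_t pte = ldpt(fault_va, 3);  /* Guest OS asks for the Level-3 entry */
      return (pte & 1) != 0;             /* assumed: present bit in bit 0       */
  }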
<
> Some OS do have a memory section that maps all of physical memory into
> a virtual range indexed by physical page number for exactly this purpose.
> However that takes up a big chunk of kernel virtual space.
<
Which is why it is better to do this with a PTE at a high level than a Root.
<
> That is not always possible so kernel dynamically adds a PTE to itself to
> map the target physical address, access memory, unmaps the target address.
>
> Using the faulting virtual address to walk the already self-mapped PTP's
> accomplishes the same thing in a portable manner.
<
> >> There are many situations that can occur.
> >> Two threads could have faulted the same page at once.
> >> The first fixes the fault, releases the mutex and returns.
> >> The second walks the table and finds the fault already fixed
> >> and just returns.
> > <
> > Except you don't have to walk the table.
> It's not that big of a burden and the algorithm is portable.
> >> Or maybe it walks the table and finds an interior PTP marked Not-Present.
> >> Or maybe the faulting page was Present but marked read-only and
> >> this thread attempted a write.
> > <
> > Anything is possible; you just don't have to walk the table; you receive
> > control with a pointer to where the fault detecting memory management
> > pointer PTP/PTE resides in the translation tables themselves.
> > <
> > And you only have to lock the table if you want to change it.
> > And when you get the table locked, you have to check that the table has
> > not already been updated by some other thread. More test-and-test-and-set
> > than just test-and-set.
<
> Ok I think I see where you are going with this now.
> You want to eliminate the software table walk, which eliminates
> the need to acquire a page table mutex for some/many situations
> and allow the PTE to be fixed "atomically".
>
> The thing is there are other data structures that are guarded by
> that one mutex on "the process address space"
> - the process page table
> - the virtual section table.
> - maybe working set tracking tables if an OS has them.
<
Either more Mutexes are in order, or you grab the big one in early generation
ports and clean it up in later ports.
>
> There are other data structures with their own mutexes:
> shared memory sections, page file management, file system.
>
> The page fault handler needs access to all these data structures
> in a synchronized manner. So even if you could eliminate the
> SW table walk for some situations, that is not going to buy you much.
<
It prevents a bunch of field extractions,
if the page fault was recent, the tables are going to be found in L2--say
......15 cycles per layer in the page hierarchy,
Handler only needs faulting Root pointer.
It uses the table walk accelerators
......(which are guaranteed to hit on recent fault),
And it saves the flow control overhead incurred with level skipping,
So instead of taking 100+ (1-level) cycles it takes 5-cycles using Guest OS
and HyperVisor tables as they sit in their various memory spaces.
<
But in the grand scheme of things maybe this is not many cycles.


Re: Packing Hash Tables

 by: BGB - Fri, 18 Mar 2022 22:37 UTC

On 3/18/2022 4:51 AM, aph@littlepinkcloud.invalid wrote:
> BGB <cr88192@gmail.com> wrote:
>>
>> I am not sure whether "in general" a chain-hash or B-Tree would be
>> better for a large sparse address space, would need to model this.
>
> Or perhaps a hybrid: do you know about Phil Bagwell's Ideal Hash
> Trees? They're great for searching, high density, but the cost of
> updating may be too high for your application. They also depend
> critically on a fast POPCOUNT instruction. But even if they don't work
> for your application, you might find some inspiration here.
>
> https://lampwww.epfl.ch/papers/idealhashtrees.pdf
>

It is possible, though I don't have a direct analog of this instruction.

I have yet to evaluate performance tradeoffs, but thus far the AVL trees
appear to be the clear winner in terms of memory overhead (for some
simple synthetic tests) among the options I have tested thus far.

So, for example, 256x 1MB at random addresses:
3-level page table: 8MB
8-level page table: 33MB
AVL Tree: 512K
Hybrid AVL (Last Level is Page Table): 4MB

Current suspicion is that an AVL tree walk will be significantly slower
than a page-table walk.

The AVL Tree variants don't really care as much about address space
size, and will take about the same space in both a 48 and 96 bit address
space (unlike traditional page tables).

I have yet to implement and test a B-Tree.
I have also yet to evaluate the code complexity tradeoffs of the B-Tree.

However, the B-Tree is likely to use up less space than the AVL Tree,
but is likely to be slower (Still TBD).

In the hybrid approach, the AVL tree is used to point to the last level
of a page-table, using page-table pages for the "bottom" level. In
theory, this could be space-competitive, but only when the last-level
pages are more than 25% full (doesn't happen with 1MB allocs).

However, these would reduce the number of "last-level" steps.

One issue with the AVL tree implementation is that I haven't yet gotten
the tree rebalancing working correctly, which is a fairly
critical part of making the performance "reasonable" (otherwise,
inserting items in sequential order is prone to significantly
unbalancing the tree).

The AVL code isn't helped though by being implemented via a bunch of
bit-twiddling (obscuring where the logic is going wrong). I could have
used structs, but this would likely be slower.

Balancing is less of an issue with B-Trees, as the design of the
structure keeps the tree balanced. Nodes split and merge laterally,
rather than being inserted below another element and then rotated into
position. However, a B-Tree may need to be able to split the top-level
node, whereas node-splitting doesn't exist for AVL Trees.

I may also need to compare against a chain hash.
It is not clear which would win yet: hash walking is simpler, but chain
lengths are likely to be longer than the log2 lookup distance.

I am not sure yet whether a naive benchmark performed on x86 will be
directly comparable to results on BJX2.

I am testing this at present within a partial mockup of TestKern's
memory management facilities implemented on x86-64.

....

May post more when I do more testing here...

Re: Packing Hash Tables

 by: MitchAlsup - Fri, 18 Mar 2022 22:53 UTC

On Friday, March 18, 2022 at 5:37:57 PM UTC-5, BGB wrote:
> On 3/18/2022 4:51 AM, a...@littlepinkcloud.invalid wrote:
> > BGB <cr8...@gmail.com> wrote:
> >>
> >> I am not sure whether "in general" a chain-hash or B-Tree would be
> >> better for a large sparse address space, would need to model this.
> >
> > Or perhaps a hybrid: do you know about Phil Bagwell's Ideal Hash
> > Trees? They're great for searching, high density, but the cost of
> > updating may be too high for your application. They also depend
> > critically on a fast POPCOUNT instruction. But even if they don't work
> > for your application, you might find some inspiration here.
> >
> > https://lampwww.epfl.ch/papers/idealhashtrees.pdf
> >
> It is possible, though I don't have a direct analog of this instruction.
>
>
> I have yet to evaluate performance tradeoffs, but thus far the AVL trees
> appear to be the clear winner in terms of memory overhead (for some
> simple synthetic tests) among the options I have tested thus far.
>
>
> So, for example, 256x 1MB at random addresses:
> 3-level page table: 8MB
> 8-level page table: 33MB
> AVL Tree: 512K
> Hybrid AVL (Last Level is Page Table): 4MB
>
> Current suspicion is that an AVL tree walk will be significantly slower
> than a page-table walk.
<
At 1 memory ref per level and 1-bit of VA per level, you can bet your bottom
dollar it would be slow. 51-level table when application is "really big". When
application is the size of "cat" you would only be making 3-accesses per walk.
<
With 1MB sized application and 4K pages, 8-level AVL walk, but the size of the
page tables is about 2 pages. With level skipping conventional page tables
the number of pages will be about 5. With level-skipping sized-super-pages
the size of the translation table is about 1 page.

Re: Packing Hash Tables

 by: robf...@gmail.com - Sat, 19 Mar 2022 00:21 UTC


My favorite has got to be hash tables, although they are likely a bit slower than
hierarchical tables if MMU caches are used. They conserve memory space,
which may be important for smaller systems. A small hash table could fit
entirely in a cache. For the FPGA board only 8192 pages need be managed,
256k RAM. It is tempting to dedicate block RAM for this rather than use main
memory and possibly get rid of the TLB. I could then make the RAM wide so
that a PTG could be read in a single cycle. The hash table can likely be entirely
resident in memory.

If page table groups are used, keys that hash to the same value end up in the
same group. This can be used to get some locality of reference. I have my hash
function ignoring an extra pair of low address bits. That means hashes of similar
addresses will hash to the same PTG making it local. I also cache the PTGs in
the MMU so for a small set of address hashes things are coming from the cache.
Entries could be re-arranged in the PTGs to give more frequently used entries
preference so that they are hit first.
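(A rough sketch of what such a hash might look like in C; the page size,
group count, the two ignored VPN bits, and the fold are all illustrative
assumptions, not the actual function:)

  #include <stdint.h>

  #define LOG2_PGSIZE 13      /* assumed page size                   */
  #define NUM_PTGS    1024    /* assumed number of page table groups */

  /* Map a virtual address to a PTG index, dropping two low VPN bits so
     that runs of 4 neighbouring pages land in the same group. */
  static uint32_t ptg_index(uint64_t vaddr)
  {
      uint64_t key = (vaddr >> LOG2_PGSIZE) >> 2;   /* VPN / 4           */
      key ^= key >> 16;                             /* fold in high bits */
      return (uint32_t)(key & (NUM_PTGS - 1));
  }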

I have the page tables located in the physical address space instead of the virtual
address space. Are page tables ever located in the virtual address space? It
seems to me that would require the ability to process nested TLB misses.

I am wondering if ASLR is used for page tables? I am assuming the OS could move
data around to different physical pages to randomize things then update the page
table to point to the new data location.

Re: Packing Hash Tables

 by: BGB - Sat, 19 Mar 2022 02:13 UTC

On 3/18/2022 5:53 PM, MitchAlsup wrote:
> On Friday, March 18, 2022 at 5:37:57 PM UTC-5, BGB wrote:
>> So, for example, 256x 1MB at random addresses:
>> 3-level page table: 8MB
>> 8-level page table: 33MB
>> AVL Tree: 512K
>> Hybrid AVL (Last Level is Page Table): 4MB
>>
>> Current suspicion is that an AVL tree walk will be significantly slower
>> than a page-table walk.
> <
> At 1 memory ref per level and 1-bit of VA per level, you can be your bottom
> dollar it would be slow. 51-level table when application is "really big". When
> application is the sizeof "cat" you would only be making 3-accesses per walk.
> <
> With 1MB sized application and 4K pages, 8-level AVL walk, but the size of the
> page tables is about 2 pages. With level skipping conventional page tables
> the number of pages will be about 5. With level-skipping sized-super-pages
> the size of the translation table is about 1 page.
> <

In this case, it is more about trying to conserve memory than making it
fast. Aggressive ASLR would do bad things to page-table size.

In ongoing tests, direct page table is fastest by a significant margin.

AVL vs Hash seems to break even at ~384MB (24576 pages).
Less than this, Hash is faster;
More than this, AVL is faster.

Both of these seem to be around 10x slower than the 3-level page table.
The 8-level table is around 4x slower than the 3-level table.

At this point:
Hash size is 2048, So ~ 12 steps per hash chain.
Log2 is ~ 14.6 ...

This is slightly curious though as I would have expected the chain-hash
walk to be slightly faster. Either way, seems to follow the (expected)
pattern that whichever results in the shorter path length wins.

The "hybrid" strategy is around twice as fast as the pure AVL, or around
5x slower than the 3-level page table.

The hybrid strategy benefits considerably if one does a smaller number
of larger allocations (whereas neither Hash nor AVL gain much here).

Both the AVL Tree and Chain Hash seem to end up taking approximately the
same amount of memory in this case.

It also seems that the time is directly proportional to the number of
steps needed to get to the target.

Granted, ended up "cheesing it" for the AVL rebalancing, as I didn't
implement the full set of node rotations (2 out of 4), which seems to
result in the rebalancing getting "stuck" in situations where it tries
in vain to endlessly rotate the same nodes (and getting stuck in an
infinite loop).

The cheese strategy being to keep a counter, and if the rebalancing
takes too many steps, it gives up and stops rotating the nodes.

Could try implementing a B-Tree, but at this point I am not super
optimistic about it...

Some of this also opens up the possibility of a Radix-8 or Radix-16
table, which functions (in premise) similar to a conventional page
table, except that each directory only has 8 or 16 entries.

Though, a Radix-16 table would need 24 steps to resolve a 96-bit
address, which implies that (based on the currently seen pattern) this
would likely be slower than the AVL tree.

....

Re: Packing Hash Tables

 by: robf...@gmail.com - Sat, 19 Mar 2022 13:26 UTC

On Friday, March 18, 2022 at 10:13:22 PM UTC-4, BGB wrote:

>At this point:
>Hash size is 2048, So ~ 12 steps per hash chain.
>Log2 is ~ 14.6 ...

That does not sound right. A decent hash usually averages 1.2 steps.
Decimal in the wrong place?
I am just in the process of getting a hash table to work. It is only 3.5x as
many block RAMs as the TLB was. It uses a 256-bit wide port on the CPU side
for loading and storing and a 2048-bit wide port on the memory side to
allow an entire page group to be read in a single cycle. It has the same
performance as a cache. But the block RAM is dedicated in the BIU, bus
interface unit, so it would not work with a multi-core CPU. Keeping the
RAM coherent between cores would be a challenge.

Re: Packing Hash Tables

 by: EricP - Sat, 19 Mar 2022 16:11 UTC

robf...@gmail.com wrote:
>
> I have the page tables located in the physical address space instead of the virtual
> address space. Are page tables ever located in the virtual address space? It
> seems to me that would require the ability to process nested TLB misses.

Yes, VAX had page tables located in virtual space.
I never looked in detail at Alpha/VMS, but since it used a software-managed
TLB it is quite possible they emulated it there too,
as that's probably cheaper than rewriting an OS.
The Alpha 21064 also had a virtually indexed L1 cache which
would have functioned as the equivalent of a large second-level TLB.

Yes, VAX used nested TLB misses. IIRC...
Address space was separated into 4 regions by the 2 upper address bits.
These were process space P0 (user), and P1 (supervisor),
and system space S0 (executive) and S1 (kernel).

Address bits [29:9] were the Virtual Page Number (VPN).
System space S1 was mapped by the system page table, a vector of PTE's
stored in S1, indexed from a *physical* base register using the VPN,
with a count register giving the number of valid PTE's.

Process space P0 and P1 were each mapped by a page table, a vector of
PTE's indexed from a *virtual* base register using the VPN,
with a count register giving the number of valid PTE's.
The virtual addresses in the P0 and P1 base registers are
virtual addresses in S1 space.

Translation extracts the VPN for user code/data and looks up in TLB.
(1) If it is a hit then it gets the physical address.
(2) If it is a TLB-miss then it adds the VPN to the P0-base creating
a system virtual address for the PTE which it looks up in TLB.
(3) If it is a hit then it loads the PTE into TLB and repeats (1).
(4) If it is a TLB-miss then it extracts the VPN from the PTE's
system virtual address and adds it to the S1-base creating
a system physical address for the system PTE,
which it loads to TLB and repeats (3).

And the PTE Present and protection bits are being checked at each step.
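(A condensed C rendering of steps (1)-(4), minus those checks; the helper
names, widths, and bases are assumptions for illustration, not VAX microcode:)

  #include <stdint.h>

  extern int      tlb_lookup(uint32_t va, uint32_t *pa); /* nonzero on a TLB hit   */
  extern void     tlb_fill(uint32_t va, uint32_t pte);   /* install a translation  */
  extern uint32_t read_phys(uint32_t pa);                /* read a PTE from memory */
  extern uint32_t P0_base;  /* *virtual* base of the P0 page table (in system space) */
  extern uint32_t S_base;   /* *physical* base of the system page table             */

  uint32_t translate(uint32_t va)
  {
      uint32_t pa, pte_pa;
      uint32_t vpn = (va >> 9) & 0x1FFFFF;                 /* VA bits 29..9        */

      if (tlb_lookup(va, &pa))                             /* (1) hit              */
          return pa;
      uint32_t pte_va = P0_base + vpn * 4;                 /* (2) PTE's system VA  */
      if (!tlb_lookup(pte_va, &pte_pa)) {                  /* (4) nested miss      */
          uint32_t svpn = (pte_va >> 9) & 0x1FFFFF;
          tlb_fill(pte_va, read_phys(S_base + svpn * 4));  /* load system PTE      */
          tlb_lookup(pte_va, &pte_pa);
      }
      tlb_fill(va, read_phys(pte_pa));                     /* (3) load process PTE */
      tlb_lookup(va, &pa);
      return pa;
  }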

The net effect is a backwards page table walk.
Intel also does a backward table walk but it does so by caching
interior page table nodes in the TLB while walking down the tree.

Because the process and system page table pages are virtually indexed
the OS can choose which process and system page table pages to swap.

A process address space switch can be accomplished by simply
changing the P0-base and P1-base registers to different values.
They will index to different PTE's in the system space.

Re: Packing Hash Tables

 by: BGB - Sat, 19 Mar 2022 19:06 UTC

On 3/19/2022 8:26 AM, robf...@gmail.com wrote:
>
>> At this point:
>> Hash size is 2048, So ~ 12 steps per hash chain.
>> Log2 is ~ 14.6 ...
>
> That does not sound right. A decent hash usually averages 1.2 steps.
> Decimal in the wrong place?

24576 pages, with a 2048-entry chain hash: 24576/2048 = 12.
Both theoretical estimates and measured values appear to roughly agree here.

One could make the starting hash bigger, which would reduce this, say:
24576 pages, with a 4096-entry chain hash: 24576/4096 = 6.

So, one goes through the first stage hash, and then it becomes a linked
list walk. The size of the first hash works as a constant division
factor for the average linked-list length.

This was the measured "break-even" point (according to benchmarks on
x86-64). A smaller number of pages favors the hash, a larger number
favors AVL.

A direct hash would give 1 or 2 steps, however, a direct hash could not
be expanded linearly (resizing a direct hash requires rehashing
everything in the hash table).

So, I went with a chain-hash in this test.

The mystery, though, is why the per-element cost in the hash seems to be
slightly higher than the AVL, when the logic for the hash-list walk is
simpler (simply moves on to the next element, rather than needing to
also compare the address to go down the left or right side).

Both could actually be a little faster, as I wrote the logic to deal
with a full 96-bit address. If the VA space were reduced to 76 or 78
bits, then a simpler/cheaper 64-bit compare could be used.

I was initially going to use 128-bit ops, but then (since I wanted to be
able to test this in MSVC), I ended up switching to 64 bit operations
(and also hacking around the issue that Win10 is using 48-bit VA's which
were breaking the bit-field lengths I was using).

But, as noted, I am using bit-packed structures, and:
struct AvlNode_s {
  AvlNode *right;   //Right Child Node
  AvlNode *left;    //Left Child Node
  u64 pte;          //Page Table Entry (Phys Addr)
  byte level;       //Node Depth
  u64 vaddr_lo;     //Virtual Addr and Pivot
  u64 vaddr_hi;     //Virtual Addr and Pivot
};

Would exceed the current 32-byte size limit.

The Chain-Hash Entry would be more like:
struct HashNode_s {
  HashNode *next;   //Next entry in Hash
  u64 pte;          //Page Table Entry (Phys Addr)
  u64 vaddr_lo;     //Virtual Addr and Pivot
  u64 vaddr_hi;     //Virtual Addr and Pivot
};

This isn't too far off from how it works already...

Hash-list lookup, expressed this way, would be effectively:
cur=pagehash[hidx];
while(cur)
{
  if( (vaddr_hi==cur->vaddr_hi) &&
      (vaddr_lo==cur->vaddr_lo))
    return(cur->pte);
  cur=cur->next;
}
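For comparison, the AVL walk over the AvlNode_s structure above would be
roughly as follows (a sketch only; the actual code is bit-packed rather than
using these structs, and 'avlroot' is just a stand-in name):

cur=avlroot;
while(cur)
{
  if( (vaddr_hi==cur->vaddr_hi) &&
      (vaddr_lo==cur->vaddr_lo))
    return(cur->pte);
  if( (vaddr_hi<cur->vaddr_hi) ||
      ((vaddr_hi==cur->vaddr_hi) && (vaddr_lo<cur->vaddr_lo)))
    cur=cur->left;
  else
    cur=cur->right;
}

The extra compare to pick a side is the part the hash-chain walk avoids.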

> I am just in the process of getting a hash table to work. It is only 3.5x as
> many block rams as the TLB was. Uses a 256-bit wide port on the cpu side
> for loading and storing and a 2048-bit wide port on the memory side to
> allow an entire page group to be read in a single cycle. It has the same
> performance as a cache. But the block RAM is dedicated in the BIU, bus
> interface unit, so it would not work with a multi-core cpu. Keeping the
> RAM coherent between cores would be a challenge.

OK.

As noted, I am using a software managed TLB, where by extension the page
management structures are mostly written in C (with some bits written in
ASM).

....

Re: Packing Hash Tables

 by: Ivan Godard - Sat, 19 Mar 2022 19:43 UTC

On 3/19/2022 12:06 PM, BGB wrote:
> 24576 pages, with a 2048-entry chain hash: 24576/2048 = 12.
> Both theoretical estimates and measured values appear to roughly agree
> here.
>
> One could make the starting hash bigger, which would reduce this, say:
> 24576 pages, with a 4096-entry chain hash: 24576/4096 = 6.
>
> So, one goes through the first stage hash, and then it becomes a linked
> list walk. The size of the first hash working as constant division
> factor for the average linked-list length.
>
>
> This was the measured "break-even" point (according to benchmarks on
> x86-64). A smaller number of pages favors the hash, a larger number
> favors AVL.
>
>
>
> A direct hash would give 1 or 2 steps, however, a direct hash could not
> be expanded linearly (resizing a direct hash requires rehashing
> everything in the hash table).

Rehashing need not be all encompassing.

The scheme I've used has a starting hash table of some smallish size. If
that fills past a trigger level, I create a new empty table twice (or
some factor) the size, linked to the old table. Searches start in the
new table, miss there, and then look in the old table; a hit there moves
the entry into the new table, into the position that was just found to
be empty, so no recalculation of the hash is needed. Soon all the
hotties are in the new table, so the amortized cost is close to one
probe.

Rinse repeat. When the chain of tables gets too long (another trigger) a
background process walks the oldest table, touching all remaining live
entries and hence hoisting them to a newer table. The oldest table is
then delinked. While somewhat sensitive to the distribution of (and
definition of) liveness, three tables seem to be a good chain length in
practice.

Because both new table creation and old table destruction are triggered
background events, there is no global-halt-and-sweep as would be
required for a total rehash. And the mean probe depth is close to one
because all the hot references are on top.
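
For concreteness, a minimal C sketch of the chained-table scheme described
above; the power-of-two sizing, the ~75% grow trigger, and the use of key 0
as the empty marker are illustrative choices of mine, and the background
sweep that retires the oldest table is omitted:

  /* Incremental rehash via a chain of tables, newest first. */
  #include <stdint.h>
  #include <stdlib.h>

  typedef struct HashTab HashTab;
  struct HashTab {
      size_t    nslot;     /* power of two                         */
      size_t    nused;
      uint64_t *key;       /* key 0 means "empty slot"             */
      uint64_t *val;
      HashTab  *older;     /* next (smaller, older) table, or NULL */
  };

  static HashTab *new_table(size_t nslot, HashTab *older)
  {
      HashTab *t = calloc(1, sizeof *t);   /* error checks omitted */
      t->nslot = nslot;
      t->key   = calloc(nslot, sizeof *t->key);
      t->val   = calloc(nslot, sizeof *t->val);
      t->older = older;
      return t;
  }

  /* Linear probe: returns the slot holding k, or the first empty slot. */
  static size_t probe(const HashTab *t, uint64_t k)
  {
      size_t s = (size_t)((k * 0x9E3779B97F4A7C15ull) & (t->nslot - 1));
      while (t->key[s] != 0 && t->key[s] != k)
          s = (s + 1) & (t->nslot - 1);
      return s;
  }

  /* Search the newest table first; on a miss, fall back down the chain.
     A hit in an older table is copied into the empty slot just probed in
     the newest table, so hot entries migrate forward and no hash is
     recomputed. The stale copy left behind is harmless: searches always
     see the newest copy first, and the old table is eventually delinked. */
  static int ht_lookup(HashTab *newest, uint64_t k, uint64_t *out)
  {
      size_t s = probe(newest, k);
      if (newest->key[s] == k) { *out = newest->val[s]; return 1; }
      for (HashTab *t = newest->older; t != NULL; t = t->older) {
          size_t so = probe(t, k);
          if (t->key[so] == k) {
              *out = t->val[so];
              if (newest->nused * 4 < newest->nslot * 3) {  /* room to promote */
                  newest->key[s] = k;
                  newest->val[s] = *out;
                  newest->nused++;
              }
              return 1;
          }
      }
      return 0;
  }

  /* Insert into the newest table; past the trigger, chain a bigger one. */
  static void ht_insert(HashTab **newest, uint64_t k, uint64_t v)
  {
      HashTab *t = *newest;
      if (t->nused * 4 >= t->nslot * 3)
          *newest = t = new_table(t->nslot * 2, t);
      size_t s = probe(t, k);
      if (t->key[s] == 0) t->nused++;
      t->key[s] = k;
      t->val[s] = v;
  }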

Re: Packing Hash Tables

<%UqZJ.22841$4T.21873@fx24.iad>

https://www.novabbs.com/devel/article-flat.php?id=24331&group=comp.arch#24331

From: ThatWoul...@thevillage.com (EricP)
Newsgroups: comp.arch
Subject: Re: Packing Hash Tables
Date: Sat, 19 Mar 2022 16:18:58 -0400
In-Reply-To: <3be6ed80-9ce6-4c38-947b-2a5e77b15060n@googlegroups.com>
 by: EricP - Sat, 19 Mar 2022 20:18 UTC

MitchAlsup wrote:
> On Friday, March 18, 2022 at 11:08:08 AM UTC-5, EricP wrote:
>> MitchAlsup wrote:
>>> <
>>> Page Fault gets attached to thread number (which accesses the faulting IP)
>>> ....................gets the faulting instruction-specifier in R1
>> I assume this is the instruction pointer.
> <
> It is the first 32-bits of the instruction itself. This contains everything but the
> constants.

If it faults during instruction fetch then this instruction specifier
buffer could be empty or partially filled.

It absolutely needs the address of the start of the faulting instruction.
Structured exception handlers and debugger watchpoints can't function
without it.

You can provide these partial instruction bits in addition to the standard
exception details, though as I point out below I don't think anyone
will need them if the page fault syndrome bits are provided.

> <
>>> ....................gets the faulting virtual address in R2
>>> ....................gets the address of fault generating PTE/PTP in R3
>> Passing the physical address of the faulting PTE to the page fault
>> handler doesn't harm anything but it doesn't really help.
>> Though I've not seen a page fault exception frame the does this.
>>
>> This misses the critical page fault reason info. The page fault handler
>> needs to know what the thread was attempting to do when it triggered
>> the fault. Looking at x86 that is a set of bit flags indicating:
>>
>> P = Caused by Not-Present PTE, or other protection codes
>> R/W = Caused by Read, or caused by Write
>> U/S = Cause by User/Super mode
>> RSVD = Reserved bit non-zero
>> I/D = access was instruction fetch, or data access
>> PK = Protection Key violation
>> SGX = SGX Software Guard Extensions violation
> <
> The given address points at the PT[PE] where the fault was detected.
> So, for example if you were trying to Read the first PT[PE] that does not have
> read permission is where you end up pointing {same for write and execute}
> <
> The instruction tells you what the instruction stream was doing.
> Inst<31..26> = 001001 & Inst<10..5> = 0000xx -> LD
> Inst<31..26> = 001001 & Inst<10..5> = 00x1xx -> ST
> Inst<31..26> = 1000xx LD
> Inst<31..26> = 1001xx ST
> <
> So, there is a clever trick, Because My 66000 detects all mal-formed instructions,
> {mal-formed instructions arrive at the Operation exception handler}
> the page fault handler can look at Instruction<31> and decide to look at
> Inst<28> or Inst<7> where 0->LD and 1->ST.
> So 3 PREDs in a row parse the instruction that caused the page fault.
> <
> But you make a good point:: I have another 32-bits in R1 I could completely
> specify what problem was detected. Let me work on that.

The fault syndrome bits should be sufficient.
I don't think anyone wants to have to reverse guesstimate what the
problem was in a page fault handler by parsing instructions.

An emulator might want to parse the faulting instructions but that
is an application specific use for the general page fault mechanism
and would take place in an application specific user mode handler.
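
For reference, here is roughly what such a syndrome word and its consumer
could look like, loosely modeled on the x86 error-code bits quoted above;
the bit positions and the PF_MALFORMED flag are illustrative, not any
architecture's actual encoding:

  /* Illustrative page-fault syndrome word; bit positions are made up. */
  #include <stdint.h>

  #define PF_PRESENT   (1u << 0)  /* 0: not-present PTE, 1: protection   */
  #define PF_WRITE     (1u << 1)  /* 0: read access,     1: write access */
  #define PF_USER      (1u << 2)  /* 0: supervisor mode, 1: user mode    */
  #define PF_RSVD      (1u << 3)  /* reserved PTE bit was set            */
  #define PF_IFETCH    (1u << 4)  /* fault was on an instruction fetch   */
  #define PF_MALFORMED (1u << 5)  /* poorly-formed table entry           */

  /* The handler branches on the syndrome instead of parsing the
     faulting instruction. */
  void page_fault(uint64_t va, uint64_t pte_addr, uint32_t syndrome)
  {
      if (syndrome & PF_MALFORMED) {
          /* corrupt translation tables: fatal */
      } else if (!(syndrome & PF_PRESENT)) {
          /* page not present: fetch from swap or allocate on demand */
      } else if (syndrome & PF_WRITE) {
          /* write to a read-only page: copy-on-write or signal the task */
      } else {
          /* other protection violation */
      }
      (void)va; (void)pte_addr;
  }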

Re: Packing Hash Tables

<c99570ff-5f9e-435c-a98b-62515bf352ffn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24394&group=comp.arch#24394

From: robfi...@gmail.com (robf...@gmail.com)
Newsgroups: comp.arch
Subject: Re: Packing Hash Tables
Date: Mon, 21 Mar 2022 23:28:03 -0700 (PDT)
In-Reply-To: <%UqZJ.22841$4T.21873@fx24.iad>
 by: robf...@gmail.com - Tue, 22 Mar 2022 06:28 UTC

On Saturday, March 19, 2022 at 4:19:12 PM UTC-4, EricP wrote:
> MitchAlsup wrote:
> > On Friday, March 18, 2022 at 11:08:08 AM UTC-5, EricP wrote:
> >> MitchAlsup wrote:
> >>> <
> >>> Page Fault gets attached to thread number (which accesses the faulting IP)
> >>> ....................gets the faulting instruction-specifier in R1
> >> I assume this is the instruction pointer.
> > <
> > It is the first 32-bits of the instruction itself. This contains everything but the
> > constants.
> If it faults during instruction fetch then this instruction specifier
> buffer could be empty or partially filled.
>
> It absolutely needs the address of the start of the faulting instruction.
> Structured exception handlers. Debugger watch points.
> These can't function without it.
>
> You can provide this partial instruction bits in addition to the standard
> exception details. Though as I point out below I don't think anyone
> will need this if the page fault syndrome bits are provided.
> > <
> >>> ....................gets the faulting virtual address in R2
> >>> ....................gets the address of fault generating PTE/PTP in R3
> >> Passing the physical address of the faulting PTE to the page fault
> >> handler doesn't harm anything but it doesn't really help.
> >> Though I've not seen a page fault exception frame the does this.
> >>
> >> This misses the critical page fault reason info. The page fault handler
> >> needs to know what the thread was attempting to do when it triggered
> >> the fault. Looking at x86 that is a set of bit flags indicating:
> >>
> >> P = Caused by Not-Present PTE, or other protection codes
> >> R/W = Caused by Read, or caused by Write
> >> U/S = Cause by User/Super mode
> >> RSVD = Reserved bit non-zero
> >> I/D = access was instruction fetch, or data access
> >> PK = Protection Key violation
> >> SGX = SGX Software Guard Extensions violation
> > <
> > The given address points at the PT[PE] where the fault was detected.
> > So, for example if you were trying to Read the first PT[PE] that does not have
> > read permission is where you end up pointing {same for write and execute}
> > <
> > The instruction tells you what the instruction stream was doing.
> > Inst<31..26> = 001001 & Inst<10..5> = 0000xx -> LD
> > Inst<31..26> = 001001 & Inst<10..5> = 00x1xx -> ST
> > Inst<31..26> = 1000xx LD
> > Inst<31..26> = 1001xx ST
> > <
> > So, there is a clever trick, Because My 66000 detects all mal-formed instructions,
> > {mal-formed instructions arrive at the Operation exception handler}
> > the page fault handler can look at Instruction<31> and decide to look at
> > Inst<28> or Inst<7> where 0->LD and 1->ST.
> > So 3 PREDs in a row parse the instruction that caused the page fault.
> > <
> > But you make a good point:: I have another 32-bits in R1 I could completely
> > specify what problem was detected. Let me work on that.
> The fault syndrome bits should be sufficient.
> I don't think anyone wants to have to reverse guesstimate what the
> problem was in a page fault handler by parsing instructions.
>
> An emulator might want to parse the faulting instructions but that
> is an application specific use for the general page fault mechanism
> and would take place in an application specific user mode handler.

For my core I have both the address of the faulting instruction stored in an exception
instruction pointer register, and the faulting address stored in a bad address register.
The cause of an exception is stored in a cause register. Maybe the cause register
does not have enough information, but it can tell whether the fault was on data or an
instruction, and whether it was a read, write, or execute violation.

Looks like it's back to the drawing board for me. I am planning on using compressed
pages, an idea gleaned from BGB’s posting.
I was using an inverted hash table, but had not considered memory paged out to disk.
Being able to swap out to disk means making the page table larger, meaning it will no
longer fit in block RAM. So, back to using a TLB. I was thinking of increasing the page
table size by a factor of four: ¼ real memory, ¾ disk storage. I gather it's not a good
idea to have too large of a swap file.

The TLBE is 256 bits in size, and about 90% of the bits are used. The PTE could be
smaller, but I am having it mimic the TLBE to keep things simple.
For the hash table, hash groups are used to alleviate the need to walk chains. The hash
chain is walked a group at a time rather than an entry at a time. Groups contain eight
entries that hashed to the same value or were displaced there, so there is seldom a need
to walk more than two or three steps.
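
A rough C sketch of that group-at-a-time walk; the entry layout, group
count, and hash below are stand-ins for the actual 256-bit TLBE/PTE format:

  /* Illustrative hash-group walk: eight PTEs per group, chained by group. */
  #include <stdint.h>
  #include <stddef.h>

  typedef struct {
      uint64_t vpn;        /* virtual page number (tag), 0 = empty       */
      uint64_t ppn;        /* physical page number + flags (simplified)  */
  } Pte;

  typedef struct PtGroup {
      Pte entries[8];           /* same hash value, or displaced here    */
      struct PtGroup *next;     /* next group in this chain, or NULL     */
  } PtGroup;

  #define NGROUPS 1024          /* stand-in table size (power of two)    */
  static PtGroup *pt_hash[NGROUPS];

  static Pte *pt_lookup(uint64_t vpn)
  {
      size_t h = (size_t)((vpn * 0x9E3779B97F4A7C15ull) & (NGROUPS - 1));
      /* Walk group by group: typically only one or two groups are
         touched, so the chain walk stays short even under collisions.   */
      for (PtGroup *g = pt_hash[h]; g != NULL; g = g->next)
          for (int i = 0; i < 8; i++)
              if (g->entries[i].vpn == vpn)
                  return &g->entries[i];
      return NULL;   /* miss: fall back to the TLB-miss / page-fault path */
  }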

Re: Packing Hash Tables

<0bf63080-454f-4405-9f97-c452c82c0e52n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24406&group=comp.arch#24406

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Packing Hash Tables
Date: Tue, 22 Mar 2022 08:07:13 -0700 (PDT)
In-Reply-To: <c99570ff-5f9e-435c-a98b-62515bf352ffn@googlegroups.com>
 by: MitchAlsup - Tue, 22 Mar 2022 15:07 UTC

On Tuesday, March 22, 2022 at 1:28:05 AM UTC-5, robf...@gmail.com wrote:
> On Saturday, March 19, 2022 at 4:19:12 PM UTC-4, EricP wrote:
> > MitchAlsup wrote:
> > > On Friday, March 18, 2022 at 11:08:08 AM UTC-5, EricP wrote:
> > >> MitchAlsup wrote:
> > >>> <
> > >>> Page Fault gets attached to thread number (which accesses the faulting IP)
> > >>> ....................gets the faulting instruction-specifier in R1
> > >> I assume this is the instruction pointer.
> > > <
> > > It is the first 32-bits of the instruction itself. This contains everything but the
> > > constants.
> > If it faults during instruction fetch then this instruction specifier
> > buffer could be empty or partially filled.
> >
> > It absolutely needs the address of the start of the faulting instruction.
> > Structured exception handlers. Debugger watch points.
> > These can't function without it.
> >
> > You can provide this partial instruction bits in addition to the standard
> > exception details. Though as I point out below I don't think anyone
> > will need this if the page fault syndrome bits are provided.
> > > <
> > >>> ....................gets the faulting virtual address in R2
> > >>> ....................gets the address of fault generating PTE/PTP in R3
> > >> Passing the physical address of the faulting PTE to the page fault
> > >> handler doesn't harm anything but it doesn't really help.
> > >> Though I've not seen a page fault exception frame the does this.
> > >>
> > >> This misses the critical page fault reason info. The page fault handler
> > >> needs to know what the thread was attempting to do when it triggered
> > >> the fault. Looking at x86 that is a set of bit flags indicating:
> > >>
> > >> P = Caused by Not-Present PTE, or other protection codes
> > >> R/W = Caused by Read, or caused by Write
> > >> U/S = Cause by User/Super mode
> > >> RSVD = Reserved bit non-zero
> > >> I/D = access was instruction fetch, or data access
> > >> PK = Protection Key violation
> > >> SGX = SGX Software Guard Extensions violation
> > > <
> > > The given address points at the PT[PE] where the fault was detected.
> > > So, for example if you were trying to Read the first PT[PE] that does not have
> > > read permission is where you end up pointing {same for write and execute}
> > > <
> > > The instruction tells you what the instruction stream was doing.
> > > Inst<31..26> = 001001 & Inst<10..5> = 0000xx -> LD
> > > Inst<31..26> = 001001 & Inst<10..5> = 00x1xx -> ST
> > > Inst<31..26> = 1000xx LD
> > > Inst<31..26> = 1001xx ST
> > > <
> > > So, there is a clever trick, Because My 66000 detects all mal-formed instructions,
> > > {mal-formed instructions arrive at the Operation exception handler}
> > > the page fault handler can look at Instruction<31> and decide to look at
> > > Inst<28> or Inst<7> where 0->LD and 1->ST.
> > > So 3 PREDs in a row parse the instruction that caused the page fault.
> > > <
> > > But you make a good point:: I have another 32-bits in R1 I could completely
> > > specify what problem was detected. Let me work on that.
> > The fault syndrome bits should be sufficient.
> > I don't think anyone wants to have to reverse guesstimate what the
> > problem was in a page fault handler by parsing instructions.
> >
> > An emulator might want to parse the faulting instructions but that
> > is an application specific use for the general page fault mechanism
> > and would take place in an application specific user mode handler.
> For my core I have both the address of the faulting instruction stored in an exception
> instruction pointer register, and the faulting address stored in a bad address register.
> The cause of an exception is stored in a cause register. Maybe the cause register
> does not have enough information, but it can tell if the fault was data or instruction,
> or read, write, or execute violations.
<
There is also the case of a poorly-formed table entry.
>
> Looks like its back to the drawing board for me. Planning on using compressed
> pages, an idea gleaned from BGB’s posting.
> I was using an inverted hash table, but had not considered memory paged out to disk.
> Being able to swap out to disk means making the page table larger, meaning it will no
> longer fit in block RAM. So, back to using a TLB. I was thinking of increasing the page
> table size by a factor four, ¼ real memory ¾ disk storage. I gather its not a good idea
> to have too large of a swap file.
>
> The TLBE is 256 bits in size, and about 90% of the bits are used. The PTE could be
> smaller, but I am having it mimic the TLBE to keep things simple.
> For the hash table, hash groups are used to alleviate the need to walk chains. The hash
> chain walks by a group not by a single entry. Groups contain eight entries that hashed to
> the same value or are displaced, so there is seldom a need to walk more than two or
> three steps.

Re: Packing Hash Tables

<06618615-124e-4cf1-a0e9-f9a1f5c66539n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24411&group=comp.arch#24411

From: robfi...@gmail.com (robf...@gmail.com)
Newsgroups: comp.arch
Subject: Re: Packing Hash Tables
Date: Tue, 22 Mar 2022 08:59:45 -0700 (PDT)
In-Reply-To: <0bf63080-454f-4405-9f97-c452c82c0e52n@googlegroups.com>
 by: robf...@gmail.com - Tue, 22 Mar 2022 15:59 UTC

On Tuesday, March 22, 2022 at 11:07:15 AM UTC-4, MitchAlsup wrote:
> On Tuesday, March 22, 2022 at 1:28:05 AM UTC-5, robf...@gmail.com wrote:
> > On Saturday, March 19, 2022 at 4:19:12 PM UTC-4, EricP wrote:
> > > MitchAlsup wrote:
> > > > On Friday, March 18, 2022 at 11:08:08 AM UTC-5, EricP wrote:
> > > >> MitchAlsup wrote:
> > > >>> <
> > > >>> Page Fault gets attached to thread number (which accesses the faulting IP)
> > > >>> ....................gets the faulting instruction-specifier in R1
> > > >> I assume this is the instruction pointer.
> > > > <
> > > > It is the first 32-bits of the instruction itself. This contains everything but the
> > > > constants.
> > > If it faults during instruction fetch then this instruction specifier
> > > buffer could be empty or partially filled.
> > >
> > > It absolutely needs the address of the start of the faulting instruction.
> > > Structured exception handlers. Debugger watch points.
> > > These can't function without it.
> > >
> > > You can provide this partial instruction bits in addition to the standard
> > > exception details. Though as I point out below I don't think anyone
> > > will need this if the page fault syndrome bits are provided.
> > > > <
> > > >>> ....................gets the faulting virtual address in R2
> > > >>> ....................gets the address of fault generating PTE/PTP in R3
> > > >> Passing the physical address of the faulting PTE to the page fault
> > > >> handler doesn't harm anything but it doesn't really help.
> > > >> Though I've not seen a page fault exception frame the does this.
> > > >>
> > > >> This misses the critical page fault reason info. The page fault handler
> > > >> needs to know what the thread was attempting to do when it triggered
> > > >> the fault. Looking at x86 that is a set of bit flags indicating:
> > > >>
> > > >> P = Caused by Not-Present PTE, or other protection codes
> > > >> R/W = Caused by Read, or caused by Write
> > > >> U/S = Cause by User/Super mode
> > > >> RSVD = Reserved bit non-zero
> > > >> I/D = access was instruction fetch, or data access
> > > >> PK = Protection Key violation
> > > >> SGX = SGX Software Guard Extensions violation
> > > > <
> > > > The given address points at the PT[PE] where the fault was detected..
> > > > So, for example if you were trying to Read the first PT[PE] that does not have
> > > > read permission is where you end up pointing {same for write and execute}
> > > > <
> > > > The instruction tells you what the instruction stream was doing.
> > > > Inst<31..26> = 001001 & Inst<10..5> = 0000xx -> LD
> > > > Inst<31..26> = 001001 & Inst<10..5> = 00x1xx -> ST
> > > > Inst<31..26> = 1000xx LD
> > > > Inst<31..26> = 1001xx ST
> > > > <
> > > > So, there is a clever trick, Because My 66000 detects all mal-formed instructions,
> > > > {mal-formed instructions arrive at the Operation exception handler}
> > > > the page fault handler can look at Instruction<31> and decide to look at
> > > > Inst<28> or Inst<7> where 0->LD and 1->ST.
> > > > So 3 PREDs in a row parse the instruction that caused the page fault.
> > > > <
> > > > But you make a good point:: I have another 32-bits in R1 I could completely
> > > > specify what problem was detected. Let me work on that.
> > > The fault syndrome bits should be sufficient.
> > > I don't think anyone wants to have to reverse guesstimate what the
> > > problem was in a page fault handler by parsing instructions.
> > >
> > > An emulator might want to parse the faulting instructions but that
> > > is an application specific use for the general page fault mechanism
> > > and would take place in an application specific user mode handler.
> > For my core I have both the address of the faulting instruction stored in an exception
> > instruction pointer register, and the faulting address stored in a bad address register.
> > The cause of an exception is stored in a cause register. Maybe the cause register
> > does not have enough information, but it can tell if the fault was data or instruction,
> > or read, write, or execute violations.
> <
> There is also the case of a poorly-formed table entry.
> >
> > Looks like its back to the drawing board for me. Planning on using compressed
> > pages, an idea gleaned from BGB’s posting.
> > I was using an inverted hash table, but had not considered memory paged out to disk.
> > Being able to swap out to disk means making the page table larger, meaning it will no
> > longer fit in block RAM. So, back to using a TLB. I was thinking of increasing the page
> > table size by a factor four, ¼ real memory ¾ disk storage. I gather its not a good idea
> > to have too large of a swap file.
> >
> > The TLBE is 256 bits in size, and about 90% of the bits are used. The PTE could be
> > smaller, but I am having it mimic the TLBE to keep things simple.
> > For the hash table, hash groups are used to alleviate the need to walk chains. The hash
> > chain walks by a group not by a single entry. Groups contain eight entries that hashed to
> > the same value or are displaced, so there is seldom a need to walk more than two or
> > three steps.

There are only a couple of things I could think of for poorly formed table entries: the
physical address not being a legal value, or the page mask begin being greater than the
page mask end. Otherwise, the remaining fields could have just about any value. If the
physical address is not legal, a PMA exception will occur when the address is mapped. If
the page mask begin is greater than the end, then there will be no access to the page,
resulting in read, write, or execute violations. I suppose the cause code could be
expanded to add syndrome bits. They may be useful for other units too.

I have the page compression indicators covering 4kB sub-page blocks, while the page
mask begin/end divide the page into 1kB blocks. It may be better to cover the same
range with both fields; 4kB blocks might be used to emulate an older system.
The first half of the PTE is mainly the translation info. The second half is access rights,
key, and compression indicators. The second half of the PTE is loaded from a per-page
page management table; it is info that needs to be in the TLBE, so the PTE is just used to
cache the information. The second half of the PTE could be stuffed with other information
for software use until the access rights info is made valid.

I am assuming the PPN field can be used to store the page-file swap address when the
page is not resident in memory.
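
To make the two checks concrete, here is a small sketch of the kind of
validity test and resident/non-resident decode this implies; the field
names, widths, and MAX_PPN limit are placeholders rather than the actual
256-bit TLBE/PTE layout:

  /* Illustrative decode of the PTE fields discussed above. */
  #include <stdint.h>
  #include <stdbool.h>

  typedef struct {
      uint64_t ppn;          /* physical page number, or page-file swap
                                address when the page is not resident    */
      uint8_t  mask_begin;   /* first accessible 1kB block within page   */
      uint8_t  mask_end;     /* last accessible 1kB block within page    */
      bool     present;      /* page resident in memory                  */
      /* access rights, key, compression indicators, ... omitted         */
  } Pte;

  #define MAX_PPN  0x000FFFFFu   /* stand-in for the top of legal RAM    */

  /* A "poorly formed" entry in the sense above: either the physical
     address is out of range, or the mask range selects nothing.         */
  static bool pte_malformed(const Pte *p)
  {
      return (p->present && p->ppn > MAX_PPN) ||
             (p->mask_begin > p->mask_end);
  }

  /* When present == false, the PPN field holds the page-file address.   */
  static uint64_t pte_swap_address(const Pte *p)
  {
      return p->present ? 0 : p->ppn;
  }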

Re: Packing Hash Tables

<t1dcq3$qst$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=24417&group=comp.arch#24417

From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Packing Hash Tables
Date: Tue, 22 Mar 2022 15:50:09 -0500
In-Reply-To: <06618615-124e-4cf1-a0e9-f9a1f5c66539n@googlegroups.com>
 by: BGB - Tue, 22 Mar 2022 20:50 UTC

On 3/22/2022 10:59 AM, robf...@gmail.com wrote:
> On Tuesday, March 22, 2022 at 11:07:15 AM UTC-4, MitchAlsup wrote:
>> On Tuesday, March 22, 2022 at 1:28:05 AM UTC-5, robf...@gmail.com wrote:
>>> On Saturday, March 19, 2022 at 4:19:12 PM UTC-4, EricP wrote:
>>>> MitchAlsup wrote:
>>>>> On Friday, March 18, 2022 at 11:08:08 AM UTC-5, EricP wrote:
>>>>>> MitchAlsup wrote:
>>>>>>> <
>>>>>>> Page Fault gets attached to thread number (which accesses the faulting IP)
>>>>>>> ....................gets the faulting instruction-specifier in R1
>>>>>> I assume this is the instruction pointer.
>>>>> <
>>>>> It is the first 32-bits of the instruction itself. This contains everything but the
>>>>> constants.
>>>> If it faults during instruction fetch then this instruction specifier
>>>> buffer could be empty or partially filled.
>>>>
>>>> It absolutely needs the address of the start of the faulting instruction.
>>>> Structured exception handlers. Debugger watch points.
>>>> These can't function without it.
>>>>
>>>> You can provide this partial instruction bits in addition to the standard
>>>> exception details. Though as I point out below I don't think anyone
>>>> will need this if the page fault syndrome bits are provided.
>>>>> <
>>>>>>> ....................gets the faulting virtual address in R2
>>>>>>> ....................gets the address of fault generating PTE/PTP in R3
>>>>>> Passing the physical address of the faulting PTE to the page fault
>>>>>> handler doesn't harm anything but it doesn't really help.
>>>>>> Though I've not seen a page fault exception frame the does this.
>>>>>>
>>>>>> This misses the critical page fault reason info. The page fault handler
>>>>>> needs to know what the thread was attempting to do when it triggered
>>>>>> the fault. Looking at x86 that is a set of bit flags indicating:
>>>>>>
>>>>>> P = Caused by Not-Present PTE, or other protection codes
>>>>>> R/W = Caused by Read, or caused by Write
>>>>>> U/S = Cause by User/Super mode
>>>>>> RSVD = Reserved bit non-zero
>>>>>> I/D = access was instruction fetch, or data access
>>>>>> PK = Protection Key violation
>>>>>> SGX = SGX Software Guard Extensions violation
>>>>> <
>>>>> The given address points at the PT[PE] where the fault was detected.
>>>>> So, for example if you were trying to Read the first PT[PE] that does not have
>>>>> read permission is where you end up pointing {same for write and execute}
>>>>> <
>>>>> The instruction tells you what the instruction stream was doing.
>>>>> Inst<31..26> = 001001 & Inst<10..5> = 0000xx -> LD
>>>>> Inst<31..26> = 001001 & Inst<10..5> = 00x1xx -> ST
>>>>> Inst<31..26> = 1000xx LD
>>>>> Inst<31..26> = 1001xx ST
>>>>> <
>>>>> So, there is a clever trick, Because My 66000 detects all mal-formed instructions,
>>>>> {mal-formed instructions arrive at the Operation exception handler}
>>>>> the page fault handler can look at Instruction<31> and decide to look at
>>>>> Inst<28> or Inst<7> where 0->LD and 1->ST.
>>>>> So 3 PREDs in a row parse the instruction that caused the page fault.
>>>>> <
>>>>> But you make a good point:: I have another 32-bits in R1 I could completely
>>>>> specify what problem was detected. Let me work on that.
>>>> The fault syndrome bits should be sufficient.
>>>> I don't think anyone wants to have to reverse guesstimate what the
>>>> problem was in a page fault handler by parsing instructions.
>>>>
>>>> An emulator might want to parse the faulting instructions but that
>>>> is an application specific use for the general page fault mechanism
>>>> and would take place in an application specific user mode handler.
>>> For my core I have both the address of the faulting instruction stored in an exception
>>> instruction pointer register, and the faulting address stored in a bad address register.
>>> The cause of an exception is stored in a cause register. Maybe the cause register
>>> does not have enough information, but it can tell if the fault was data or instruction,
>>> or read, write, or execute violations.
>> <
>> There is also the case of a poorly-formed table entry.
>>>
>>> Looks like its back to the drawing board for me. Planning on using compressed
>>> pages, an idea gleaned from BGB’s posting.
>>> I was using an inverted hash table, but had not considered memory paged out to disk.
>>> Being able to swap out to disk means making the page table larger, meaning it will no
>>> longer fit in block RAM. So, back to using a TLB. I was thinking of increasing the page
>>> table size by a factor four, ¼ real memory ¾ disk storage. I gather its not a good idea
>>> to have too large of a swap file.
>>>

I think the size of a swap file is a matter of interpretation.

In current tests, sizes are:
RAM = 128MB (set to match the Nexys A7 I am using)
SWAP = 256MB or 384MB in emulation and simulation tests.

On my SDcard, IIRC, the swapfile was set up as 2GB (and I didn't
originally think it would take this long before getting this stuff
working).

I am not assuming any particular limit at the moment, apart from the 4GB
size limit implied by FAT32 (at present, this would be 256k pages).

Currently, the swap code sets up a 32MB region of RAM to be used as swap
pages, with the rest (for now) being left for either physical mapping or
direct remapping (where pages are mapped into virtual addresses, but are
not backed by swap).

I initially thought 32MB would be too limiting, but it appears that Doom
and similar can sit pretty easily in this space.

Doom doesn't really generate any swapfile traffic, but Quake uses enough
memory that it does start accessing swap.

One limit for swapfile size would be the relative cost of any management
structures.

So, for example, with a 4GB swapfile, if spending 8B per page to track
page location / compression, this would be an ~ 2MB array, ...
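
As a sanity check on that estimate: with 16K pages, a 4GB swapfile holds
256K pages, and at 8 bytes of state per page that is indeed about 2MB. One
hypothetical layout for such an 8-byte descriptor (not the actual
structure) could be:

  /* Hypothetical 8-byte per-page descriptor for a swapfile with
     optional compression.                                               */
  #include <stdint.h>

  typedef struct {
      uint32_t cell;     /* location within the swapfile, in 1kB cells   */
      uint16_t csize;    /* stored size, bytes/4 (0 = all-zero page)     */
      uint8_t  format;   /* 0 = zeroed, 1 = stored raw, 2 = LZ packed    */
      uint8_t  flags;    /* dirty, pinned, etc.                          */
  } SwapPageDesc;

  /* 4GB swapfile / 16kB pages = 262144 pages; at 8 bytes each the
     tracking array is 2MB, matching the estimate above.                 */
  #define SWAP_PAGES (4ull * 1024 * 1024 * 1024 / (16 * 1024))
  static SwapPageDesc swap_desc[SWAP_PAGES];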

In the current (uncompressed) page-table scheme, page management is
mostly limited to a few bitmaps and similar.

Though, this stuff still doesn't work particularly reliably as of yet
(trying to run programs from swapfile backed memory is still pretty buggy).

There is also some hackery, like needing to make sure that everything in
the FAT driver and similar is using physically mapped memory.

There end up being various functions for memory allocation, with
various properties:
tk_malloc: Generic malloc, may use Swap
tk_malloc_usr: Usermode accessible (kernel), may use Swap
tk_malloc_wxe: Read/Write/Execute, Physically Mapped
tk_malloc_krn: Kernel-Mode memory, Physically Mapped
...

Internally, these map to a "tk_malloc_cat" function, which adds a
"category" argument describing the type of memory one wants (currently
as a "magic number" scheme).

Generally, anything allocated from "userland" (with the exception of WXE
memory) is currently assumed to use Swap.

Anything which may be accessed from an ISR needs to be physically mapped
(ISRs don't use address translation).

....

However, the program loaders still use direct-mapping for the '.text'
sections and similar (for now at least, may change this later). It is
likely though that the '.data'/'.bss' sections could be put into
swap-backed memory (in my ABI, the read-only and read/write sections are
handled independently of each other, *).

*: This is a little hacky as PE/COFF wasn't really designed to be used
this way, but either way.

Because both the FAT driver and the swapfile code may access the SD card,
trying to page stuff in or out from within another IO operation is
potentially catastrophic (though the swapfile code does mostly bypass the
FAT driver; it figures out the LBA range and goes from there).

It doesn't currently deal with any fragmentation in the pagefile; it is
possible I could try to deal with this, but whether doing so is worthwhile
is still to be decided.

Would require either:
Pretranslating and caching the FAT cluster chains;
Hacking into the FAT FS driver internals;
Feeding swapfile operations through normal filesystem IO interfaces.


Re: Packing Hash Tables

<023ca389-bf9d-47a4-a1df-a24e93419a45n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=24461&group=comp.arch#24461

From: robfi...@gmail.com (robf...@gmail.com)
Newsgroups: comp.arch
Subject: Re: Packing Hash Tables
Date: Thu, 24 Mar 2022 22:40:03 -0700 (PDT)
In-Reply-To: <t1dcq3$qst$1@dont-email.me>
 by: robf...@gmail.com - Fri, 25 Mar 2022 05:40 UTC

On Tuesday, March 22, 2022 at 4:50:14 PM UTC-4, BGB wrote:
> On 3/22/2022 10:59 AM, robf...@gmail.com wrote:
> > On Tuesday, March 22, 2022 at 11:07:15 AM UTC-4, MitchAlsup wrote:
> >> On Tuesday, March 22, 2022 at 1:28:05 AM UTC-5, robf...@gmail.com wrote:
> >>> On Saturday, March 19, 2022 at 4:19:12 PM UTC-4, EricP wrote:
> >>>> MitchAlsup wrote:
> >>>>> On Friday, March 18, 2022 at 11:08:08 AM UTC-5, EricP wrote:
> >>>>>> MitchAlsup wrote:
> >>>>>>> <
> >>>>>>> Page Fault gets attached to thread number (which accesses the faulting IP)
> >>>>>>> ....................gets the faulting instruction-specifier in R1
> >>>>>> I assume this is the instruction pointer.
> >>>>> <
> >>>>> It is the first 32-bits of the instruction itself. This contains everything but the
> >>>>> constants.
> >>>> If it faults during instruction fetch then this instruction specifier
> >>>> buffer could be empty or partially filled.
> >>>>
> >>>> It absolutely needs the address of the start of the faulting instruction.
> >>>> Structured exception handlers. Debugger watch points.
> >>>> These can't function without it.
> >>>>
> >>>> You can provide this partial instruction bits in addition to the standard
> >>>> exception details. Though as I point out below I don't think anyone
> >>>> will need this if the page fault syndrome bits are provided.
> >>>>> <
> >>>>>>> ....................gets the faulting virtual address in R2
> >>>>>>> ....................gets the address of fault generating PTE/PTP in R3
> >>>>>> Passing the physical address of the faulting PTE to the page fault
> >>>>>> handler doesn't harm anything but it doesn't really help.
> >>>>>> Though I've not seen a page fault exception frame the does this.
> >>>>>>
> >>>>>> This misses the critical page fault reason info. The page fault handler
> >>>>>> needs to know what the thread was attempting to do when it triggered
> >>>>>> the fault. Looking at x86 that is a set of bit flags indicating:
> >>>>>>
> >>>>>> P = Caused by Not-Present PTE, or other protection codes
> >>>>>> R/W = Caused by Read, or caused by Write
> >>>>>> U/S = Cause by User/Super mode
> >>>>>> RSVD = Reserved bit non-zero
> >>>>>> I/D = access was instruction fetch, or data access
> >>>>>> PK = Protection Key violation
> >>>>>> SGX = SGX Software Guard Extensions violation
> >>>>> <
> >>>>> The given address points at the PT[PE] where the fault was detected..
> >>>>> So, for example if you were trying to Read the first PT[PE] that does not have
> >>>>> read permission is where you end up pointing {same for write and execute}
> >>>>> <
> >>>>> The instruction tells you what the instruction stream was doing.
> >>>>> Inst<31..26> = 001001 & Inst<10..5> = 0000xx -> LD
> >>>>> Inst<31..26> = 001001 & Inst<10..5> = 00x1xx -> ST
> >>>>> Inst<31..26> = 1000xx LD
> >>>>> Inst<31..26> = 1001xx ST
> >>>>> <
> >>>>> So, there is a clever trick, Because My 66000 detects all mal-formed instructions,
> >>>>> {mal-formed instructions arrive at the Operation exception handler}
> >>>>> the page fault handler can look at Instruction<31> and decide to look at
> >>>>> Inst<28> or Inst<7> where 0->LD and 1->ST.
> >>>>> So 3 PREDs in a row parse the instruction that caused the page fault.
> >>>>> <
> >>>>> But you make a good point:: I have another 32-bits in R1 I could completely
> >>>>> specify what problem was detected. Let me work on that.
> >>>> The fault syndrome bits should be sufficient.
> >>>> I don't think anyone wants to have to reverse guesstimate what the
> >>>> problem was in a page fault handler by parsing instructions.
> >>>>
> >>>> An emulator might want to parse the faulting instructions but that
> >>>> is an application specific use for the general page fault mechanism
> >>>> and would take place in an application specific user mode handler.
> >>> For my core I have both the address of the faulting instruction stored in an exception
> >>> instruction pointer register, and the faulting address stored in a bad address register.
> >>> The cause of an exception is stored in a cause register. Maybe the cause register
> >>> does not have enough information, but it can tell if the fault was data or instruction,
> >>> or read, write, or execute violations.
> >> <
> >> There is also the case of a poorly-formed table entry.
> >>>
> >>> Looks like its back to the drawing board for me. Planning on using compressed
> >>> pages, an idea gleaned from BGB’s posting.
> >>> I was using an inverted hash table, but had not considered memory paged out to disk.
> >>> Being able to swap out to disk means making the page table larger, meaning it will no
> >>> longer fit in block RAM. So, back to using a TLB. I was thinking of increasing the page
> >>> table size by a factor four, ¼ real memory ¾ disk storage. I gather its not a good idea
> >>> to have too large of a swap file.
> >>>
> I think the size of a swap file is a matter of interpretation.
>
> In current tests, sizes are:
> RAM = 128MB (set to match the Nexys A7 I am using)
> SWAP = 256MB or 384MB in emulation and simulation tests.
>
>
> On my SDcard, IIRC, the swapfile was set up as 2GB (and I didn't
> originally think it would take this long to before trying to get this
> stuff working).
>
> I am not assuming any particular limit at the moment, apart from the 4GB
> size limit implied by FAT32 (at present, this would be 256k pages).
>
>
> Currently, the swap code sets up a 32MB region of RAM to be used as swap
> pages, with the rest (for now) being left for either physical mapping or
> direct remapping (where pages are mapped into virtual addresses, but are
> not backed by swap).
>
> I initially thought 32MB would be too limiting, but it appears that Doom
> and similar can sit pretty easily in this space.
>
> Doom doesn't really generate any swapfile traffic, but Quake uses enough
> memory that does start accessing swap.
>
> One limit for swapfile size would be the relative cost of any management
> structures.
>
> So, for example, with a 4GB swapfile, if spending 8B per page to track
> page location / compression, this would be an ~ 2MB array, ...
>
> In the current (uncompressed) page-table scheme, page management is
> mostly limited to a few bitmaps and similar.
>
>
>
> Though, this stuff still doesn't work particularly reliably as of yet
> (trying to run programs from swapfile backed memory is still pretty buggy).
>
> There is also some hackery, like needing to make sure that everything in
> the FAT driver and similar is using physically mapped memory.
>
> There are ending up being various functions for memory allocation with
> various properties:
> tk_malloc: Generic malloc, may use Swap
> tk_malloc_usr: Usermode accessible (kernel), may use Swap
> tk_malloc_wxe: Read/Write/Execute, Physically Mapped
> tk_malloc_krn: Kernel-Mode memory, Physically Mapped
> ...
>
> Internally, these map to a "tk_malloc_cat" function, which adds a
> "category" argument describing the type of memory one wants (currently
> as a "magic number" scheme).
>
>
> Generally, anything allocated from "userland" (with the exception of WXE
> memory) is currently assumed to use Swap.
>
> Anything which may be accessed from an ISR needs to be physically mapped
> (ISRs don't use address translation).
>
> ...
>
>
>
> However, the program loaders still use direct-mapping for the '.text'
> sections and similar (for now at least, may change this later). It is
> likely though that the '.data'/'.bss' sections could be put into
> swap-backed memory (in my ABI, the read-only and read/write sections are
> handled independently of each other, *).
>
> *: This is a little hacky as PE/COFF wasn't really designed to be used
> this way, but either way.
>
>
>
> Because both may access the SD card, trying to page stuff in or out from
> within another IO operation is potentially catastrophic (though the
> swapfile code does mostly bypass the FAT driver; it figures out the LBA
> range and goes from there).
>
> It doesn't currently deal with any fragmentation in the pagefile, it is
> possible I could try to deal with this, to be decided if doing so is
> worthwhile.
>
> Would require either:
> Pretranslating and caching the FAT cluster chains;
> Hacking into the FAT FS driver internals;
> Feeding swapfile operations through normal filesystem IO interfaces.
>
> Sending swapfile stuff through the filesystem driver is undesirable, for
> various reasons. For the time being, it is easier to require an
> unfragmented swapfile (which, granted, does require some special care
> when setting up the SDcard).
>
> Would have almost gone the Linux route of requiring a swap partition,
> except I am using Windows and there is no good way to set this up short
> of doing it on a Linux machine, so using a file was the easier option.
>
>
>
> I am at the moment again torn between wanting to post the code to GitHub
> (since it has been a few weeks), and not wanting to post the code when
> stuff is still obviously buggy or broken.
>
>
> I am left to wonder if all this stuff is usually so much of a PITA to
> debug for everyone else, or if I just kinda suck at it...
> >>> The TLBE is 256 bits in size, and about 90% of the bits are used. The PTE could be
> >>> smaller, but I am having it mimic the TLBE to keep things simple.
> >>> For the hash table, hash groups are used to alleviate the need to walk chains. The hash
> >>> chain walks by a group not by a single entry. Groups contain eight entries that hashed to
> >>> the same value or are displaced, so there is seldom a need to walk more than two or
> >>> three steps.
> >
> > There are only a couple of things I could think of for poorly formed table entries. If the
> > physical address is not a legal value, or if the page mask begin is greater than the page
> > mask end. Otherwise, the remaining fields could have just about any value. If the physical
> > address is not legal a PMA exception will occur when the address is mapped. If the page
> > mask begin is greater than the end, then there will be no access to the page resulting in
> > read, write, or execute violations. I suppose the cause code could be expanded to add
> > syndrome bits. They may be useful for other units too.
> >
> > I have the page compression indicators covering a 4kB block sub-page range. While the
> > page mask begin, end divides the page into 1kB blocks. It may be better to cover the
> > same range with both fields. 4kB blocks might be used to emulate an older system.
> > The first half of the PTE is mainly the translation info. The second half is access rights,
> > key and compression indicators. The second half of the PTE is loaded from a per-page
> > page management table. It is info that needs to be in the TLBE so the PTE is just used to
> > cache the information. The second half of the PTE could be stuffed with other information
> > for software use until the access rights info is made valid.
> >
> So, considering something simplistic like "skip zeroed parts of a
> not-entirely-zero" page or something?...
>
>
> I guess this is possible, vs what I was considering of having one of 3
> options:
> Zeroed / Skipped;
> Stored;
> LZ Compressed.
>
> At present (with an uncompressed pagefile), only the Zeroed and Stored
> states exist.
>
>
> Though, I still have a few concerns with my idea:
> Making LZ compression sufficiently fast is still an issue;
> Doing a chunk buffer turns a large number of small stalls into a smaller
> number of large stalls.
>
> It will probably be painfully obvious if the whole system stalls for
> several seconds at a time to write several MB of compressed pagefile
> data to the SD card.
>
>
> Though, looking stuff up, it appears I may not need to write the data
> all at once, rather it may be sufficient to (merely) write to the data
> in a roughly sequential order.
>
> This would imply abandoning the 4MB chunking idea, and instead treating
> the swap as a circular log. This would, however, require some way to
> relocate or skip over existing pages if they are "still live".
>
> I could probably manage this with a cell bitmap (probably with 512B or
> 1K cells), effectively treating the pagefile like a large bitmap-managed
> heap. This could work OK, but its effectiveness (outside of an explicit
> "compactor" mechanism) would depend highly on fragmentation.
>
>
> Assuming 1K cells with a 2GB pagefile, would need 256K at 1b/cell.
> Would need ~ 28b to encode the size and location of a compressed page.
> Assuming there are 128k pages (each 16K), with 4B overhead per page,
> this part costs 512K.
>
> Probably doable...
>
>
> Though, it is possible that (if enough pages are poorly compressible),
> that fragmentation would result in the pagefile turning into an unusable
> state with this design.
>
> Assuming a 1:1 parity between pagefile size and logical page count; this
> will likely only work OK if average compressed page size is below around
> 8K or so. The alternative being to only fill the pagefile to 50% or 75%
> capacity, such that the pagefile still works even if a majority of the
> pages are poorly compressible.
>
> But, at that point, one would be better off using an uncompressed pagefile.
>
>
> Some of my mock-up tests aren't giving as good of compression as I had
> hoped. This idea would work a lot better if pages got an average
> compression ratio of around 75% or so, but the 45%..65% I am seeing in
> my mock-up tests is likely to be borderline at best.
>
>
> From what I can find, most OS's seem to be using uncompressed pagefiles.
> > I am assuming the PPN field can be used to store the page-file swap address when the
> > page is not resident in memory.
> >
> Storing the logical pagefile page-number in non-resident pages is how I
> was doing it.
>
> Address is Virtual:
> Present==1:
> Page number maps to a physical page address;
> Present==0:
> Page Number==0: PTE is Free (PTE==0) or Reserved (PTE!=0)
> Page Number!=0: PTE points at a pagefile page
>
> ...
>
> Pagefile page 0 is currently reserved.
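
For reference, the Present/page-number encoding quoted above decodes to
something like the following; the bit positions are placeholders:

  /* Rough decode of the present/non-resident PTE encoding quoted above. */
  #include <stdint.h>

  #define PTE_PRESENT      (1ull << 0)
  #define PTE_PAGENO(pte)  ((pte) >> 14)        /* placeholder shift      */

  enum pte_state { PTE_MAPPED, PTE_SWAPPED, PTE_FREE, PTE_RESERVED };

  static enum pte_state pte_classify(uint64_t pte)
  {
      if (pte & PTE_PRESENT)
          return PTE_MAPPED;                    /* page no. = physical     */
      if (PTE_PAGENO(pte) != 0)
          return PTE_SWAPPED;                   /* page no. = pagefile page
                                                   (pagefile page 0 is
                                                   reserved)              */
      return (pte == 0) ? PTE_FREE : PTE_RESERVED;
  }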

