devel / comp.arch / Status: Working on MMU and memory protection, VUGID + Keyrings

Subject / Author

* Status: Working on MMU and memory protection, VUGID + Keyrings -- BGB
`* Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- MitchAlsup
 +* Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- Ivan Godard
 |`- Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- MitchAlsup
 `* Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- BGB
  `* Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- MitchAlsup
   `* Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- BGB
    `* Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- robf...@gmail.com
     +- Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- MitchAlsup
     `* Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- BGB
      `* Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- robf...@gmail.com
       `- Re: Status: Working on MMU and memory protection, VUGID + Keyrings -- BGB

Status: Working on MMU and memory protection, VUGID + Keyrings

<semo6q$hrr$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19622&group=comp.arch#19622

From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Status: Working on MMU and memory protection, VUGID + Keyrings
Date: Sat, 7 Aug 2021 14:50:46 -0500
 by: BGB - Sat, 7 Aug 2021 19:50 UTC

So, I guess I can note something:
When I had switched over to the newer "ringbus" design, one side effect
was that the MMU had gotten broken (though I didn't have everything
stable even before this).

I have since gotten around to fixing a lot of this so that the MMU now
works again.

Have gotten a few other things written:
Volatile cache lines, which are auto-evicted after a few clock-cycles;
Logic to enforce Read/Write/Execute on cache-lines;
Inappropriate access will result in a CPU fault.
Decode-time logic to disallow using certain instructions in user-mode;
...

Some features, like memory access protection, are currently only done if
the MMU is enabled. Otherwise with the MMU disabled, the whole address
space is accessible in user-mode.

The restrictions on which instructions and which control-registers are
allowed are enforced in the instruction decoder, which in turn depends on
the mode bit in SR (which is Read-Only in usermode). This does mean that
changes to the mode would require a pipeline flush to take effect, but
given that the main ways of leaving or returning to usermode are via an
Interrupt or the RTE instruction, this is not likely an issue.
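
(Purely as an illustration, the decode-time check amounts to something
like the C model below; the real logic lives in the Verilog decoder, and
the SR_MD bit position and the instr_is_privileged() helper are made-up
names for the sketch.)

  #include <stdint.h>

  #define SR_MD (1ull << 30)   /* supervisor-mode bit; position assumed */

  /* Hypothetical classifier: does this encoding touch KRR, MMU control, etc.? */
  extern int instr_is_privileged(uint32_t instr);

  /* Nonzero => raise a privileged-instruction fault at decode time. */
  int decode_priv_fault(uint32_t instr, uint64_t sr)
  {
      int in_user_mode = ((sr & SR_MD) == 0);
      return in_user_mode && instr_is_privileged(instr);
  }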

Access to the Keyring Register (KRR) is also disallowed in usermode,
since it is assumed that usermode code is untrusted.
Code in supervisor mode will have access to KRR, in which case any use
of keyring checks in supervisor mode would be based on the "honor system".

For kernel code and drivers, having KRR accessible should be OK assuming
non-hostile code.

Could in-theory add something akin to the 4-level protection rings in
x86, but can't really think of any way to protect keyring access in
kernel mode that would allow kernel code to still do its thing without
also being trivially bypassed (thus defeating the whole point of trying
to protect it).

The only way to meaningfully protect the keyring from drivers would also
(effectively) mean running them in user-mode (but maybe add a partial
split between "User Mode" and "Superuser Mode", with the latter mostly
intended for running drivers and similar).

So, by the time there is hostile code running in kernel mode, everything
is basically "already hosed" as it were (and there would be no point in
adding additional protection rings).

As noted, for now I only have two levels:
Supervisor Mode: Can do pretty much anything it wants.
User Mode: Can basically do stuff relevant to user-level code.

Then, Supervisor mode effectively has two sub-modes:
ISR Mode: SR.MD, SR.RB, and SR.BL Set;
Kernel Mode: SR.MD Set, SR.RB and SR.BL Clear.

And user-mode:
User Mode: SR.MD, SR.RB, and SR.BL Clear;
Superuser Mode (?): SR.MD is Clear, SR.RB is Set(?).

Superuser mode means slightly tweaking a few things, and works on the
assumption that one can't be simultaneously in an ISR and also in User
Mode. In terms of operation, it could be a "slightly less restrictive"
version of normal user mode (specifics here are still TBD).

The Keyring system is, as noted:
Each page may have a VUGID pair, and User/Group/Other flags;
Each thread may have a Keyring register, with up to 4 keys
Though, this could be potentially expanded if needed (*1).

So, as noted, access to a page is based on having a key in the keyring
which grants a certain level of access.

*1: It is possible that I could expand the keyring to 8 or 12 keys if
needed (via the use of additional registers).
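
(To make the check concrete, it is roughly the following in C; the field
widths, flag layout, and helper names here are placeholders rather than
the actual TLB encoding.)

  #include <stdint.h>

  #define ACC_R 4
  #define ACC_W 2
  #define ACC_X 1

  typedef struct {
      uint16_t vugid;                   /* page's VUGID pair           */
      uint8_t  uflags, gflags, oflags;  /* User/Group/Other RWX flags  */
  } page_acl_t;

  typedef struct { uint16_t key[4]; } keyring_t;  /* up to 4 keys/thread */

  /* Assumed helpers: split a key or VUGID into its user and group halves. */
  extern uint16_t vuid_of(uint16_t k);
  extern uint16_t vgid_of(uint16_t k);

  int page_access_ok(const page_acl_t *pg, const keyring_t *kr, int want)
  {
      int i, granted = pg->oflags;      /* "Other" rights apply to everyone */
      for (i = 0; i < 4; i++) {
          if (vuid_of(kr->key[i]) == vuid_of(pg->vugid)) granted |= pg->uflags;
          if (vgid_of(kr->key[i]) == vgid_of(pg->vugid)) granted |= pg->gflags;
      }
      return (granted & want) == want;
  }

A read-for-execute check would then be page_access_ok(&pg, &kr, ACC_R | ACC_X),
with the access fault raised whenever it returns 0.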

Another recent change was that the key can now be interpreted as either:
64 VGID, 1024 VUID (Default)
Or:
1024 VGID, 64 VUID (Alternate)

Though, if going "mix and match", it would be necessary to assign pairs
such that there are not unintentional overlaps (such as not allowing the
same number to be used as both a VGID and as a VUID within the other space).

Say, for example, VGIDs are assigned in decrementing order, and VUIDs in
incrementing order, such that there is a low probability of clash (or at
least until the numbering space gets full); though with the special-case
exception that "0/0" is hard-wired as the "VUGID Root".

Another thing I had looked into before was Capability Addressing;
however, what is lacking here is a good way to implement it which is:
Straightforward to implement;
Affordable (for both hardware and software);
Plays well with C;
...

Ideally, whatever mechanism is used should be "mostly invisible" as far
as normal user code is concerned, which limits the use of addressing
modes which are not (or do not appear as) linear pointer-based addressing.

Similarly, enforcing "memory safety" is a non-starter for C, and requiring
the use of some other "memory safe" language is also a bit lame. A
better option would be one where even actively hostile C code can't
break out of its sandbox (while still allowing it free rein over
whatever memory it "owns").

One possible alternative to capabilities (the poor man's version) could
be, rather than having a special-purpose cache/memory for them; The
address space is made "particularly gigantic", and a sort of
cryptographic ASLR is used to make it "nearly impossible" to guess
memory addresses to things one doesn't already have a pointer to.

Say, for example, if stuff is randomly distributed within a 112-bit
address space via a cryptographic RNG, then being able to guess the
address of something one doesn't already hold a pointer to is a bit of
a stretch.

Though, this does have the weakness that, in a high-level sense, this
doesn't offer any additional protection by itself over a traditional
MMU; and does have costs in other areas (eg: pointers and intptr_t would
be 128 bits). Similarly, it would already be "fairly difficult" to guess
addresses with ASLR within a 48 bit address space (and this is not a
problem easily subjected to a brute force search; so in practice the use
of ASLR in a huge 112-bit address space wouldn't necessarily gain all
that much over ASLR in a 48-bit space).

Similarly, there is a lot of C code that would not likely respond well
to suddenly having "void *" and "intptr_t" expand to being 128-bit types
(even if it otherwise handles the 32/64-bit transition well), so as a "C
friendly" option it would still be a little weak.

It would also have the issue of basically making the L1 caches and TLB a
lot more expensive (and would also eat a lot more registers by requiring the
use of paired GPRs for memory addressing, ...).

Well, excluding a further "poor man's" option of only supporting 48-bit
addressing in hardware, and then faking the 112-bit addressing via
runtime calls and software emulation (and probably requiring 128-bit
pointers to be declared like "int __far *obj;" or similar).
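
(For example, a software-level far pointer might be little more than a
struct plus runtime calls, along the lines below; the names and layout
are invented for the sketch.)

  #include <stdint.h>

  typedef struct {
      uint64_t lo;          /* low 64 bits of the 112-bit address       */
      uint64_t hi;          /* remaining high bits (upper bits unused)  */
  } farptr_t;

  /* Hypothetical runtime helpers the compiler would emit calls to for
     "int __far *" accesses; they translate/map the 112-bit address and
     then perform the access through a normal 48-bit pointer. */
  extern uint32_t __rt_far_load32 (farptr_t p);
  extern void     __rt_far_store32(farptr_t p, uint32_t val);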

Granted, within an "actual" capability system, there is still the
problem that there is no good way to grant a capability to one part of a
program without also granting it to the whole program (or, preventing
hostile code from being able to "steal" access to something from a
library, by being able to figure out where the desired capability
descriptor is stored in a library's data sections).

This could at least be hindered by the use of intra-binary ASLR (or
compile-time ASLR), so multiple builds of the same binary would have
different memory layouts. However, untrusted code being able to gain
access to any debug metadata for the library would defeat this. Within
the program loader, it could also make sense to randomize library
indices within the PBO ABI, ...

I have experimented with some of this, but compile-time ASLR is not
enabled by default because this makes debugging harder (some types of
otherwise-deterministic bugs get turned into heisenbugs).

Memory ASLR isn't really used much either, since this is only really
viable when one has working support for virtual-memory, which is still
an ongoing sub-project.

This latter point is at least possible with keyrings, if there is a
mechanism to allow a thread to temporarily gain or lose keys (likely
involving the use of specialized system calls). The system call could
validate a request and then add the key to the keyring, with the keyring
reverting to its prior value once the scope in question exits. Threads
would also have an initial keyring.
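
(From the C side this could look something like the sketch below; the
syscall names are invented for the example, and the "scope" here is
just the function body.)

  #include <string.h>
  #include <stdint.h>

  /* Hypothetical key-grant system calls. */
  extern int  sys_keyring_acquire(uint16_t key);   /* validate request, add key */
  extern void sys_keyring_release(uint16_t key);   /* drop the key again        */

  int copy_into_protected(uint16_t key, void *dst, const void *src, size_t n)
  {
      if (sys_keyring_acquire(key) < 0)
          return -1;                 /* request rejected                        */
      memcpy(dst, src, n);           /* access permitted while the key is held  */
      sys_keyring_release(key);      /* keyring reverts when the scope exits    */
      return 0;
  }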

One likely possibility is that the ability to request a given key would
be assigned to a given binary image, and could potentially be wrapped as
part of the "dllexport" mechanism.

Ideally, one wants to try to limit how much any C code needs to know or
care that such a thing is going on, so by default it would be hidden
away as part of the ABI. In C land, code would then assume that pointers
"just work" ideally without needing to know or care that a function call
has crossed between protection domains.

I am not sure if there is a formal overlap between keys and
capabilities; a key could be considered as a sort of capability applied
over a range of memory objects, rather than a single object. But,
interpreting it this way seems like possibly a little bit of a stretch
(given the mode-of-operation for the keys is a bit different).


Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<a24a111f-a1a0-4542-a8f3-50831c489047n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19624&group=comp.arch#19624

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Status: Working on MMU and memory protection, VUGID + Keyrings
Date: Sat, 7 Aug 2021 14:12:08 -0700 (PDT)
References: <semo6q$hrr$1@dont-email.me>
 by: MitchAlsup - Sat, 7 Aug 2021 21:12 UTC

On Saturday, August 7, 2021 at 2:50:53 PM UTC-5, BGB wrote:
> So, I guess I can note something:
> When I had switched over to the newer "ringbus" design, one side effect
> was that the MMU had gotten broken (though I didn't have everything
> stable even before this).
>
> I have since gotten around to fixing a lot of this so that the MMU now
> works again.
>
> Have gotten a few other things written:
> Volatile cache lines, which are auto-evicted after a few clock-cycles;
> Logic to enforce Read/Write/Execute on cache-lines;
> Inappropriate access will result in a CPU fault.
> Decode-time logic to disallow using certain instructions in user-mode;
> ...
>
> Some features, like memory access protection, are currently only done if
> the MMU is enabled. Otherwise with the MMU disabled, the whole address
> space is accessible in user-mode.
>
> The restrictions on which instructions and which control-registers are
> allowed is enforced in the instruction decoder, which in turn depends on
> the mode bit in SR (which is Read-Only in usermode).
<
Quibble: If the user can "see" the mode bit, then a virtualized OS can detect
that [s]he is being virtualized.
<
> This does mean that
> changes to the mode would require a pipeline flush to take effect,
<
No, you just have to insert the mode bit with each instruction as it enters the
pipe. Then you can change the mode bit every other instruction--should you
desire. This is the means for executing multi-threads where each thread can
be in a different mode.
<
> but
> given that the main way for exiting or returning to usermode would be
> via an Interrupt or RTE instruction, this is not likely an issue.
>
>
> Access to the Keyring Register (KRR) is also disallowed in usermode,
> since it is assumes that usermode code is untrusted.
> Code in supervisor mode will have access to KRR, in which case any use
> of keyring checks in supervisor mode would be based on the "honor system".
>
> For kernel code and drivers, having KRR accessible should be OK assuming
> non-hostile code.
<
Most drivers, and huge swaths of the OS, should not ever need to look at
or modify the keyring. So even if you trust them "a little", you might not
want to trust them completely. Can you think of a driver that does need
access to KRR? Read access? Write access?
>
>
> Could in-theory add something akin to the 4-level protection rings in
> x86, but can't really think of any way to protect keyring access in
> kernel mode that would allow kernel code to still do its thing without
> also being trivially bypassed (thus defeating the whole point of trying
> to protect it).
>
> The only way to meaningfully protect the keyring from drivers would also
> (effectively) mean running them in user-mode (but maybe add a partial
> split between "User Mode" and "Superuser Mode", with the latter mostly
> intended for running drivers and similar).
<
It is reasons like this that both Mill and My 66000 have no privilege--we
just have MMU and each "thing" runs in its own protection domain which
is set up by someone who has the required trust.
>
> So, by the time there is hostile code running in kernel mode, everything
> is basically "already hosed" as it were (and there would be no point in
> adding additional protection rings).
>
Yep, this is why the old systems (x86) have added a secure mode (circa
2006) and are now trying to add another mode (2020) and will be adding
a still 'nother mode circa 2026. The way they are going, there is no end to
the additions...........
>
> As noted, for now I only have two levels:
> Supervisor Mode: Can do pretty much anything it wants.
> User Mode: Can basically do stuff relevant to user-level code.
<
Mill & My 66000:: a thread can only touch things for which the MMU
grant access. There are no privileged instructions or modes or .....
<
(I don't remember about Mill.) But My 66000 allows access to processor
control registers via memory, the location of which is determined by a
PCI-configuration access page. Thus, there are no instructions to do
this stuff and everything goes through the MMU via LD and ST.
>
> Then, Supervisor mode effectively has two sub-modes:
> ISR Mode: SR.MD, SR.RB, and SR.BL Set;
> Kernel Mode: SR.MD Set, SR.RB and SR.BL Clear.
>
> And user-mode:
> User Mode: SR.MD, SR.RB, and SR.BL Clear;
> Superuser Mode (?): SR.MD is Clear, SR.RB is Set(?).
<
modes after modes after modes,................that is the old way of doing things.
>
> Superuser mode means slightly tweaking a few things, and works on the
> assumption that one can't be simultaneously in an ISR and also in User
> Mode. In terms of operation, it could be a "slightly less restrictive"
> version of normal user mode (specifics here are still TBD).
>
>
> The Keyring system is, as noted:
> Each page may have a VUGID pair, and User/Group/Other flags;
> Each thread may have a Keyring register, with up to 4 keys
> Though, this could be potentially expanded if needed (*1).
>
> So, as noted, access to a page is based on having a key in the keyring
> which grants a certain level of access.
>
> *1: It is possible that I could expand the keyring to 8 or 12 keys if
> needed (via the use of additional registers).
>
>
> Another recent change was that the key can now be interpreted as either:
> 64 VGID, 1024 VUID (Default)
> Or:
> 1024 VGID, 64 VUID (Alternate)
>
> Though, if going "mix and match", it would be necessary to assign pairs
> such that there are not unintentional overlaps (such as not allowing the
> same number to be used as both a VGID and as a VUID within the other space).
>
> Say, for example, VGIDs are assigned in decrementing order, and VUIDs in
> incrementing order, such that there is a low probability of clash (or at
> least until the numbering space gets full); though with the special-case
> exception that "0/0" is hard-wired as the "VUGID Root".
>
>
>
> Another thing I had looked into before was Capability Addressing,
> however, lacking with this is a good way to implement it which is:
<
> Straightforward to implement;
> Affordable (for both hardware and software);
> Plays well with C;
<
The plays well with C part is going to require a <near> complete 64-bit VA space.
<
In My 66000 the HoB of the VA selects between Portholes (1) and normal
memory (0). Normal memory is translated in a way similar to what we have
been doing since the days of S/360/67 (page tables). Portholes are a 4
doubleword PTE that contains a root pointer to another address space,
and a base (lowest addressable byte) and limit (highest addressable byte)
and additional access rights (restrictions).
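
(Read literally, such a porthole entry could be pictured as roughly the
struct below; this is a paraphrase of the description, not the actual
My 66000 layout.)

  #include <stdint.h>

  typedef struct {          /* 4 doublewords, per the description above    */
      uint64_t root;        /* root pointer of the foreign address space   */
      uint64_t base;        /* lowest addressable byte                     */
      uint64_t limit;       /* highest addressable byte                    */
      uint64_t rights;      /* additional access rights / restrictions     */
  } porthole_t;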
<
Mill uses the HoB, also, but I forgot the mechanisms.
> ...
>
> Ideally, whatever mechanism is used should be "mostly invisible" as far
> as normal user code is concerned, which limits the use of addressing
> modes which are not (or do not appear as) linear pointer-based addressing.
>
> Similarly, enforcing "memory safety" is a non-start for C, and requiring
> the use of some other "memory safe" language, is also a bit lame. A
> better option would be one where even actively hostile C code can't
> break out of its sandbox (while still allowing it free reign over
> whatever memory it "owns").
<
"owns" or has been lent--like via portholes.
>
>
> One possible alternative to capabilities (the poor man's version) could
> be, rather than having a special-purpose cache/memory for them; The
> address space is made "particularly gigantic", and a sort of
> cryptographic ASLR is used to make it "nearly impossible" to guess
> memory addresses to things one doesn't already have a pointer to.
<
ROOT pointers are the equivalent of the huge ASID.
>
> Say, for example, if stuff is randomly distributed within a 112 bit
> address space via a cryptographic RNG, then the chances of being able to
> guess the address of something one doesn't already have the address of
> is a bit of a stretch.
>
> Though, this does have the weakness that, in a high-level sense, this
> doesn't offer any additional protection by itself over a traditional
> MMU; and does have costs in other areas (eg: pointers and intptr_t would
> be 128 bits). Similarly, it would already be "fairly difficult" to guess
> addresses with ASLR within a 48 bit address space (and this is not a
<
Say you could perform said guess--the MMU should still prevent your guess
from becoming accessible !! There is a porthole protocol, and if it is not
followed precisely, you get no access whatsoever.
<
> problem easily subjected to a brute force search; so in practice the use
> of ASLR in a huge 112-bit address space wouldn't necessarily gain all
> that much over ASLR in a 48-bit space).
<
Don't be subject to the guess! If the porthole is not created and passed
and then installed precisely via the protocol, the attacker gets nothing
(zero, nada, zilch).
>
> Similarly, there is a lot of C code that would not likely respond well
> to suddenly having "void *" and "intptr_t" expand to being 128-bit types
> (even if it otherwise handles the 32/64-bit transition well), so as a "C
> friendly" option it would still be a little weak.
<
I made it work in 64-bit VA, so did Mill, so can you.
>
>
> It would also have the issue of basically making the L1 caches and TLB a
> lot more expensive (also also eat a lot more registers by requiring the
> use of paired GPRs for memory addressing, ...).
>
> Well, excluding a further "poor man's" option of only supporting 48-bit
> addressing in hardware, and then faking the 112-bit addressing via
> runtime calls and software emulation (and probably requiring 128-bit
> pointers to be declared like "int __far *obj;" or similar).
>
>
>
> Granted, within an "actual" capability system, there is still the
> problem that there is no good way to grant a capability to one part of a
> program without also granting it to the whole program (or, preventing
> hostile code from being able to "steal" access to something from a
> library, by being able to figure out where the desired capability
> descriptor is stored in a library's data sections).
<
All capability systems have the ability to downgrade access rights when
passing the capability to another. So should yours. In My 66000 I also
have the ability to instantaneously remove accesses to every capability
pointing into a thread's VA when the thread dies and when it removes
pages from its address space. The capability is translated THROUGH
the owner's page tables.
>
> This could at least be hindered by the use of intra-binary ASLR (or
> compile-time ASLR), so multiple builds of the same binary would have
> different memory layouts. However, untrusted code being able to gain
> access to any debug metadata for the library would defeat this. Within
> the program loader, it could also make sense to randomize library
> indices within the PBO ABI, ...
<
Unnecessary when you get the rest of the mechanics properly defined.
>
>
> I have experimented with some of this, but compile-time ASLR is not
> enabled by default because this makes debugging harder (some types of
> otherwise-deterministic bugs get turned into heisenbugs).
>
> Memory ASLR isn't really used much either, since this is only really
> viable when one has working support for virtual-memory, which is still
> an ongoing sub-project.
>
>
> This latter point is at least possible with keyrings, if there is a
> mechanism to allow a thread to temporarily gain or lose keys (likely
> involving the use of specialized system calls). The system call could
> validate a request and then add the key to the keyring, with the keyring
> reverting to its prior value once the scope in question exits. Threads
> would also have an initial keyring.
>
> One likely possibility is that the ability to request a given key would
> be assigned to a given binary image, and could potentially be wrapped as
> part of the "dllexport" mechanism.
>
> Ideally, one wants to try to limit how much any C code needs to know or
> care that such a thing is going on, so by default it would be hidden
> away as part of the ABI. In C land, code would then assume that pointers
> "just work" ideally without needing to know or care that a function call
> has crossed between protection domains.
<
The only thing a C program needs to know about Portholes is how to
create one (like MMap), and how to take a passed porthole and install/
deinstall it in his VA map. (C++ is better here as constructors and destructors
can do the install and deinstall).
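
(So the user-visible surface could be as small as something like the
following; the names are illustrative only, not an actual My 66000
interface.)

  #include <stddef.h>
  #include <stdint.h>

  typedef uint64_t porthole_handle_t;   /* opaque 64-bit handle            */

  /* Create a porthole over [base, base+len) of the caller's address space. */
  porthole_handle_t porthole_create(void *base, size_t len, unsigned rights);

  /* Install a handle received from elsewhere into the caller's VA map,
     and tear the mapping down again when done. */
  void *porthole_install(porthole_handle_t h);
  void  porthole_deinstall(void *mapping);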
>
>
> I am not sure if there is a formal overlap between keys and
> capabilities; a key could be considered as a sort of capability applied
> over a range of memory objects, rather than a single object. But,
> interpreting it this way seems like possibly a little bit of a stretch
> (given the mode-of-operation for the keys is a bit different).
>
>
> A lot of this still needs a bit more work though...
>
>
> Any thoughts?...


Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<sen0nl$a9f$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19626&group=comp.arch#19626

From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Status: Working on MMU and memory protection, VUGID + Keyrings
Date: Sat, 7 Aug 2021 15:16:20 -0700
References: <semo6q$hrr$1@dont-email.me>
 <a24a111f-a1a0-4542-a8f3-50831c489047n@googlegroups.com>
 by: Ivan Godard - Sat, 7 Aug 2021 22:16 UTC

On 8/7/2021 2:12 PM, MitchAlsup wrote:
> On Saturday, August 7, 2021 at 2:50:53 PM UTC-5, BGB wrote:

<snip>

> In My 66000 the HoB of the VA selects between Portholes (1) and
> normal memory (0). Normal memory is translated in a way similar to
> what we have been doing since the days of S/360/67 (page tables).
> Portholes are a 4 doubleword PTE that contains a root pointer to
> another address space, and a base (lowest addressable byte) and limit
> (highest addressable byte) and addition access rights
> (restrictions). < Mill uses the HoB, also, but I forgot the
> mechanisms.

Nope, Mill uses the HOB for GC support (a feature that still may be
excised after we get a GC running). Instead, we detect portals as a
permission bit in the PLB - R/W/X/P.

<snip>

>> Granted, within an "actual" capability system, there is still the
>> problem that there is no good way to grant a capability to one part
>> of a program without also granting it to the whole program (or,
>> preventing hostile code from being able to "steal" access to
>> something from a library, by being able to figure out where the
>> desired capability descriptor is stored in a library's data
>> sections).
> < All capability systems have the ability to downgrade access rights
> when passing the capability to another. So should yours. In My 66000
> I also have the ability to instantaneously remove accesses to every
> capability pointing into a thread's VA when the thread dies and when
> it removes pages from its address space. The capability is translated
> THROUGH the page tables of the owner's tables.

REVOKE is notoriously difficult to even define, much less implement. It
sounds like you have chosen to use rooted caps, whose life is tied to
the initial creation of the root of subsequently distributed caps. This
is a valid semantic, but it prevents creation of continuation-style caps
that survive their creation, which is another valid semantic.

<snip>

>> Ideally, one wants to try to limit how much any C code needs to
>> know or care that such a thing is going on, so by default it would
>> be hidden away as part of the ABI. In C land, code would then
>> assume that pointers "just work" ideally without needing to know or
>> care that a function call has crossed between protection domains.
> < The only thing a C program needs to know about Portholes is how to
> create one (like MMap), and how to take a passed porthole and
> install/ deinstall it in his VA map. (C++ is better here as
> constructors and destructors can do the install and deinstall).

Here again there's a problem of semantics: it sounds like you are using
argument passing to distribute caps, which works fine in a hierarchical
model. But it doesn't work so well in a collaborating-services model
where there is no hierarchy. Mill uses a "grant" model in which the
provider of a resource determines what is provided, and the receiver
does not need to know that the resource was "special". This lets us use
ordinary pointers in the receiver, instead of physically distinct
capabilities, which we judged was essential for business reasons.

Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<7de18e59-5ae2-4b34-a7ed-51b4e8dcedc6n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19628&group=comp.arch#19628

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Status: Working on MMU and memory protection, VUGID + Keyrings
Date: Sat, 7 Aug 2021 15:55:44 -0700 (PDT)
References: <semo6q$hrr$1@dont-email.me>
 <a24a111f-a1a0-4542-a8f3-50831c489047n@googlegroups.com>
 <sen0nl$a9f$1@dont-email.me>
 by: MitchAlsup - Sat, 7 Aug 2021 22:55 UTC

On Saturday, August 7, 2021 at 5:16:24 PM UTC-5, Ivan Godard wrote:
> On 8/7/2021 2:12 PM, MitchAlsup wrote:
> > On Saturday, August 7, 2021 at 2:50:53 PM UTC-5, BGB wrote:
> <snip>
> > In My 66000 the HoB of the VA selects between Portholes (1) and
> > normal memory (0). Normal memory is translated in a way similar to
> > what we have been doing since the days of S/360/67 (page tables).
> > Portholes are a 4 doubleword PTE that contains a root pointer to
> > another address space, and a base (lowest addressable byte) and limit
> > (highest addressable byte) and addition access rights
> > (restrictions). < Mill uses the HoB, also, but I forgot the
> > mechanisms.
> Nope, Mill uses the HOB for GC support (a feature that still may be
> excised after we get a GC running). Instead, we detect portals as a
> permission bit in the PLB - R/W/X/P.
<
OK
>
> <snip>
> >> Granted, within an "actual" capability system, there is still the
> >> problem that there is no good way to grant a capability to one part
> >> of a program without also granting it to the whole program (or,
> >> preventing hostile code from being able to "steal" access to
> >> something from a library, by being able to figure out where the
> >> desired capability descriptor is stored in a library's data
> >> sections).
> > < All capability systems have the ability to downgrade access rights
> > when passing the capability to another. So should yours. In My 66000
> > I also have the ability to instantaneously remove accesses to every
> > capability pointing into a thread's VA when the thread dies and when
> > it removes pages from its address space. The capability is translated
> > THROUGH the page tables of the owner's tables.
<
> REVOKE is notoriously difficult to even define, much less implement. It
> sounds like you have chosen to use rooted caps, whose life is tied to
> the initial creation of the root of subsequently distributed caps. This
> is a valid semantic, but it prevents creation of continuation-style caps
> that survive their creation, which is another valid semantic.
<
It is not a capability system, it is a porthole system--you can provide
a porthole into your address space to another foreign thread. If
your address space vanishes, the porthole no longer looks into "your"
(now non existent) address space.
>
> <snip>
> >> Ideally, one wants to try to limit how much any C code needs to
> >> know or care that such a thing is going on, so by default it would
> >> be hidden away as part of the ABI. In C land, code would then
> >> assume that pointers "just work" ideally without needing to know or
> >> care that a function call has crossed between protection domains.
> > < The only thing a C program needs to know about Portholes is how to
> > create one (like MMap), and how to take a passed porthole and
> > install/ deinstall it in his VA map. (C++ is better here as
> > constructors and destructors can do the install and deinstall).
<
> Here again there's a problem of semantics: it sounds like you are using
> argument passing to distribute caps, which works fine in a hierarchical
> model. But it doesn't work so well in a collaborating-services model
> where there is no hierarchy.
<
Neither the creator nor the consumer ever gets his hands on the porthole;
they get their hands on a "handle" to the porthole. The handle is
encrypted into a 64-bit pattern which has no direct use to the
consumer other than serving as a name for the porthole manager
to install (deinstall) in his own VA space.
<
< Mill uses a "grant" model in which the
> provider of a resource determines what is provided, and the receiver
> does not need to know that the resource was "special". This lets us use
> ordinary pointers in the receiver, instead of physically distinct
> capabilities, which we judged was essential for business reasons.
<
As you wish.......

Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<sene25$o0d$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19630&group=comp.arch#19630

From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Status: Working on MMU and memory protection, VUGID + Keyrings
Date: Sat, 7 Aug 2021 21:03:45 -0500
References: <semo6q$hrr$1@dont-email.me>
 <a24a111f-a1a0-4542-a8f3-50831c489047n@googlegroups.com>
 by: BGB - Sun, 8 Aug 2021 02:03 UTC

On 8/7/2021 4:12 PM, MitchAlsup wrote:
> On Saturday, August 7, 2021 at 2:50:53 PM UTC-5, BGB wrote:
>> So, I guess I can note something:
>> When I had switched over to the newer "ringbus" design, one side effect
>> was that the MMU had gotten broken (though I didn't have everything
>> stable even before this).
>>
>> I have since gotten around to fixing a lot of this so that the MMU now
>> works again.
>>
>> Have gotten a few other things written:
>> Volatile cache lines, which are auto-evicted after a few clock-cycles;
>> Logic to enforce Read/Write/Execute on cache-lines;
>> Inappropriate access will result in a CPU fault.
>> Decode-time logic to disallow using certain instructions in user-mode;
>> ...
>>
>> Some features, like memory access protection, are currently only done if
>> the MMU is enabled. Otherwise with the MMU disabled, the whole address
>> space is accessible in user-mode.
>>
>> The restrictions on which instructions and which control-registers are
>> allowed is enforced in the instruction decoder, which in turn depends on
>> the mode bit in SR (which is Read-Only in usermode).
> <
> Quibble: If the user can "see" the mode bit, the a virtualized OS can detect
> that [s]he is being virtualized.
> <

Virtualization support is a not-yet-addressed issue...

It seems like it could require a separate set of control registers (or
faking the control registers), and possibly a trap to allow LDTLB
requests to be forwarded through another set of page tables.

Then again, "userland code can't see SR" is an interesting possibility
that I hadn't considered. In most cases, there is pretty much no need to
access SR directly, and the bits that *are* relevant to user-mode
operation are also shadowed in PC(63:48) and LR(63:48).

>> This does mean that
>> changes to the mode would require a pipeline flush to take effect,
> <
> No, you just have to insert the mode bit with each instruction as it enters the
> pipe. Then you can change the mode bit every other instruction--should you
> desire. This is the means for executing multi-threads where each thread can
> be in a different mode.
> <

I suspect that as-is, it may be moot given there is no way to jump
between Usermode and Supervisor mode that doesn't already result in a
pipeline flush.

In the current design for SMT, there are two redundant SRs, so each
thread would operate with its own SR by default (one thread could be in
supervisor mode and the other in usermode).

>> but
>> given that the main way for exiting or returning to usermode would be
>> via an Interrupt or RTE instruction, this is not likely an issue.
>>
>>
>> Access to the Keyring Register (KRR) is also disallowed in usermode,
>> since it is assumes that usermode code is untrusted.
>> Code in supervisor mode will have access to KRR, in which case any use
>> of keyring checks in supervisor mode would be based on the "honor system".
>>
>> For kernel code and drivers, having KRR accessible should be OK assuming
>> non-hostile code.
> <
> Most drivers, and huge swaths of the OS should not ever need to look/modify
> the keyring. So, even if you trust them "a little" you might not want to trust them
> completely. Can you think of a driver that does need access to KRR ?
> Read access ? write access ?

I could restrict it to only being allowed to be modified via ISRs.

However, given that kernel mode code can, as-is, just sorta sidestep the
MMU, protecting the keyring would be of limited effect.

The other option is putting drivers in usermode, but this does imply
making MMIO able to be mapped using the TLB. This could work, but does
imply that I would probably need to remap some hardware addresses such
that SD-SPI, UART, and GPIO are not located in the same page. It may
make sense to give a driver access to GPIO without also giving it access
to the SDcard's SPI interface (which should in-turn mostly be restricted
to only being accessible to the block-device driver).

>>
>>
>> Could in-theory add something akin to the 4-level protection rings in
>> x86, but can't really think of any way to protect keyring access in
>> kernel mode that would allow kernel code to still do its thing without
>> also being trivially bypassed (thus defeating the whole point of trying
>> to protect it).
>>
>> The only way to meaningfully protect the keyring from drivers would also
>> (effectively) mean running them in user-mode (but maybe add a partial
>> split between "User Mode" and "Superuser Mode", with the latter mostly
>> intended for running drivers and similar).
> <
> It is reasons like this that both Mill and My 66000 have no privilege--we
> just have MMU and each "thing" runs in its own protection domain which
> is setup by someone who has the required trust.

Hmm...

>>
>> So, by the time there is hostile code running in kernel mode, everything
>> is basically "already hosed" as it were (and there would be no point in
>> adding additional protection rings).
>>
> Yep, this is why the old systems (x86) have added a secure mode (circa
> 2006) and are now trying to add another mode (2020) and will be adding
> a still 'nother mode circa 2026. The way they ar going there is no end to
> the additions...........

Yeah.

>>
>> As noted, for now I only have two levels:
>> Supervisor Mode: Can do pretty much anything it wants.
>> User Mode: Can basically do stuff relevant to user-level code.
> <
> Mill & My 66000:: a thread can only touch things for which the MMU
> grant access. There are no privileged instructions or modes or .....
> <
> (I don't remember about Mill) But My 66000 allows access to processor
> control registers via memory the location of which is determined by a
> PCI-configuration access page. Thus, there are not instructions to do
> this stuff and everything goes through the MMU via LD and ST.

Hmm...

I have hardware registers and MMIO partly as their own things.

Hardware registers are registers, and MMIO is its own special address
range (and uses a different bus protocol from that of normal memory
accesses, ...).

>>
>> Then, Supervisor mode effectively has two sub-modes:
>> ISR Mode: SR.MD, SR.RB, and SR.BL Set;
>> Kernel Mode: SR.MD Set, SR.RB and SR.BL Clear.
>>
>> And user-mode:
>> User Mode: SR.MD, SR.RB, and SR.BL Clear;
>> Superuser Mode (?): SR.MD is Clear, SR.RB is Set(?).
> <
> modes after modes after modes,................that is the old way of doing things.

There are limits.

I have yet to come up with any meaningful semantic differences between
User Mode and Superuser Mode, so a separate mode may not be worthwhile.

The split between ISR Mode and Kernel Mode is mostly for hardware
reasons. Kernel Mode uses the MMU, whereas ISR Mode temporarily disables
the MMU because this is necessary to be able to implement the TLB-Miss
exception.
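
(In other words, the TLB-miss ISR is a software page walker along these
lines; this is a simplified two-level sketch, with the table format and
the ldtlb()/fault hooks assumed rather than the real ones.)

  #include <stdint.h>

  extern uint64_t *page_dir;                      /* root of current page table */
  extern void ldtlb(uint64_t va, uint64_t pte);   /* install a TLB entry        */
  extern void raise_page_fault(uint64_t va);      /* hand off to the kernel     */

  /* Runs in ISR mode with the MMU off, so these loads are physical. */
  void tlb_miss_isr(uint64_t fault_va)
  {
      uint64_t pde = page_dir[(fault_va >> 22) & 0x3FF];
      if (!(pde & 1)) { raise_page_fault(fault_va); return; }

      uint64_t *pt  = (uint64_t *)(pde & ~0xFFFull);
      uint64_t  pte = pt[(fault_va >> 12) & 0x3FF];

      if (pte & 1)
          ldtlb(fault_va, pte);         /* present: install the translation */
      else
          raise_page_fault(fault_va);   /* not present: page-fault path     */
  }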

Something like a SYSCALL exception effectively needs to initiate a jump
from ISR mode to Kernel Mode.

ISR mode would probably be unnecessary if page walking were done in
hardware.

>>
>> Superuser mode means slightly tweaking a few things, and works on the
>> assumption that one can't be simultaneously in an ISR and also in User
>> Mode. In terms of operation, it could be a "slightly less restrictive"
>> version of normal user mode (specifics here are still TBD).
>>
>>
>> The Keyring system is, as noted:
>> Each page may have a VUGID pair, and User/Group/Other flags;
>> Each thread may have a Keyring register, with up to 4 keys
>> Though, this could be potentially expanded if needed (*1).
>>
>> So, as noted, access to a page is based on having a key in the keyring
>> which grants a certain level of access.
>>
>> *1: It is possible that I could expand the keyring to 8 or 12 keys if
>> needed (via the use of additional registers).
>>
>>
>> Another recent change was that the key can now be interpreted as either:
>> 64 VGID, 1024 VUID (Default)
>> Or:
>> 1024 VGID, 64 VUID (Alternate)
>>
>> Though, if going "mix and match", it would be necessary to assign pairs
>> such that there are not unintentional overlaps (such as not allowing the
>> same number to be used as both a VGID and as a VUID within the other space).
>>
>> Say, for example, VGIDs are assigned in decrementing order, and VUIDs in
>> incrementing order, such that there is a low probability of clash (or at
>> least until the numbering space gets full); though with the special-case
>> exception that "0/0" is hard-wired as the "VUGID Root".
>>
>>
>>
>> Another thing I had looked into before was Capability Addressing,
>> however, lacking with this is a good way to implement it which is:
> <
>> Straightforward to implement;
>> Affordable (for both hardware and software);
>> Plays well with C;
> <
> The plays well with C part is going to require a <near> complete 64-bit VA space.
> <
> In My 66000 the HoB of the VA selects between Portholes (1) and normal
> memory (0). Normal memory is translated in a way similar to what we have
> been doing since the days of S/360/67 (page tables). Portholes are a 4
> doubleword PTE that contains a root pointer to another address space,
> and a base (lowest addressable byte) and limit (highest addressable byte)
> and addition access rights (restrictions).
> <
> Mill uses the HoB, also, but I forgot the mechanisms.


Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<fa2bcead-34a5-4efb-b3bd-ebfdde2a666dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19631&group=comp.arch#19631

From: MitchAl...@aol.com (MitchAlsup)
Newsgroups: comp.arch
Subject: Re: Status: Working on MMU and memory protection, VUGID + Keyrings
Date: Sat, 7 Aug 2021 20:04:27 -0700 (PDT)
References: <semo6q$hrr$1@dont-email.me>
 <a24a111f-a1a0-4542-a8f3-50831c489047n@googlegroups.com>
 <sene25$o0d$1@dont-email.me>
 by: MitchAlsup - Sun, 8 Aug 2021 03:04 UTC

On Saturday, August 7, 2021 at 9:03:53 PM UTC-5, BGB wrote:
> On 8/7/2021 4:12 PM, MitchAlsup wrote:
> > On Saturday, August 7, 2021 at 2:50:53 PM UTC-5, BGB wrote:
> >> So, I guess I can note something:
> >> When I had switched over to the newer "ringbus" design, one side effect
> >> was that the MMU had gotten broken (though I didn't have everything
> >> stable even before this).
> >>
> >> I have since gotten around to fixing a lot of this so that the MMU now
> >> works again.
> >>
> >> Have gotten a few other things written:
> >> Volatile cache lines, which are auto-evicted after a few clock-cycles;
> >> Logic to enforce Read/Write/Execute on cache-lines;
> >> Inappropriate access will result in a CPU fault.
> >> Decode-time logic to disallow using certain instructions in user-mode;
> >> ...
> >>
> >> Some features, like memory access protection, are currently only done if
> >> the MMU is enabled. Otherwise with the MMU disabled, the whole address
> >> space is accessible in user-mode.
> >>
> >> The restrictions on which instructions and which control-registers are
> >> allowed is enforced in the instruction decoder, which in turn depends on
> >> the mode bit in SR (which is Read-Only in usermode).
> > <
> > Quibble: If the user can "see" the mode bit, the a virtualized OS can detect
> > that [s]he is being virtualized.
> > <
> Virtualization support is a not-yet-addressed issue...
>
> It seems like it could require a separate set of control registers (or
> faking the control registers), and possibly a trap to allow LDTLB
> requests to be forwarded through another set of page tables.
>
> Then again, "userland code can't see SR" is an interesting possibility
> that I hadn't considered. In most cases, there is pretty much no need to
> access SR directly, and the bits that *are* relevant to user-mode
> operation are also shadowed in PC(63:48) and LR(63:48).
> >> This does mean that
> >> changes to the mode would require a pipeline flush to take effect,
> > <
> > No, you just have to insert the mode bit with each instruction as it enters the
> > pipe. Then you can change the mode bit every other instruction--should you
> > desire. This is the means for executing multi-threads where each thread can
> > be in a different mode.
> > <
> I suspect that as-is, it may be moot given there is no way to jump
> between Usermode and Supervisor mode that doesn't already result in a
> pipeline flush.
>
> In the current design for SMT, there are two redundant SRs, so each
> thread would operate with its own SR by default (one thread could be in
> supervisor mode and the other in usermode).
> >> but
> >> given that the main way for exiting or returning to usermode would be
> >> via an Interrupt or RTE instruction, this is not likely an issue.
> >>
> >>
> >> Access to the Keyring Register (KRR) is also disallowed in usermode,
> >> since it is assumes that usermode code is untrusted.
> >> Code in supervisor mode will have access to KRR, in which case any use
> >> of keyring checks in supervisor mode would be based on the "honor system".
> >>
> >> For kernel code and drivers, having KRR accessible should be OK assuming
> >> non-hostile code.
> > <
> > Most drivers, and huge swaths of the OS should not ever need to look/modify
> > the keyring. So, even if you trust them "a little" you might not want to trust them
> > completely. Can you think of a driver that does need access to KRR ?
> > Read access ? write access ?
> I could restrict it to only being allowed to be modified via ISRs.
>
> However, given that kernel mode code can, as-is, just sorta sidestep the
> MMU, protecting the keyring would be of limited effect.
>
> The other option is putting drivers in usermode, but this does imply
> making MMIO able to be mapped using the TLB. This could work, but does
> imply that I would probably need to remap some hardware addresses such
> that SD-SPI, UART, and GPIO are not located in the same page. It may
> make sense to give a driver access to GPIO without also giving it access
> to the SDcard's SPI interface (which should in-turn mostly be restricted
> to only being accessible to the block-device driver).
> >>
> >>
> >> Could in-theory add something akin to the 4-level protection rings in
> >> x86, but can't really think of any way to protect keyring access in
> >> kernel mode that would allow kernel code to still do its thing without
> >> also being trivially bypassed (thus defeating the whole point of trying
> >> to protect it).
> >>
> >> The only way to meaningfully protect the keyring from drivers would also
> >> (effectively) mean running them in user-mode (but maybe add a partial
> >> split between "User Mode" and "Superuser Mode", with the latter mostly
> >> intended for running drivers and similar).
> > <
> > It is reasons like this that both Mill and My 66000 have no privilege--we
> > just have MMU and each "thing" runs in its own protection domain which
> > is setup by someone who has the required trust.
> Hmm...
> >>
> >> So, by the time there is hostile code running in kernel mode, everything
> >> is basically "already hosed" as it were (and there would be no point in
> >> adding additional protection rings).
> >>
> > Yep, this is why the old systems (x86) have added a secure mode (circa
> > 2006) and are now trying to add another mode (2020) and will be adding
> > a still 'nother mode circa 2026. The way they ar going there is no end to
> > the additions...........
> Yeah.
> >>
> >> As noted, for now I only have two levels:
> >> Supervisor Mode: Can do pretty much anything it wants.
> >> User Mode: Can basically do stuff relevant to user-level code.
> > <
> > Mill & My 66000:: a thread can only touch things for which the MMU
> > grant access. There are no privileged instructions or modes or .....
> > <
> > (I don't remember about Mill) But My 66000 allows access to processor
> > control registers via memory the location of which is determined by a
> > PCI-configuration access page. Thus, there are not instructions to do
> > this stuff and everything goes through the MMU via LD and ST.
> Hmm...
>
> I have hardware registers and MMIO partly as their own things.
>
> Hardware registers are registers, and MMIO is its own special address
> range (and uses a different bus protocol from that of normal memory
> accesses, ...).
<
Think of it like this:: HW registers are Snooped and Snarfed. They are where they
are, and anyone with access can read or write them--even if the thread is running
in a different CPU way-over-there in a different chip. If you have the address of
the register and permission, you can do with it what you want. You DON'T have
to migrate your thread to the CPU owning the register !!!
> >>
> >> Then, Supervisor mode effectively has two sub-modes:
> >> ISR Mode: SR.MD, SR.RB, and SR.BL Set;
> >> Kernel Mode: SR.MD Set, SR.RB and SR.BL Clear.
> >>
> >> And user-mode:
> >> User Mode: SR.MD, SR.RB, and SR.BL Clear;
> >> Superuser Mode (?): SR.MD is Clear, SR.RB is Set(?).
> > <
> > modes after modes after modes,................that is the old way of doing things.
> There are limits.
>
> I have yet to come up with any meaningful semantic differences between
> User Mode and Superuser Mode, so a separate mode may not be worthwhile.
>
> The split between ISR Mode and Kernel Mode is mostly for hardware
> reasons. Kernel Mode uses the MMU, whereas ISR Mode temporarily disables
> the MMU because this is necessary to be able to implement the TLB-Miss
> exception.
<
I don't even get out of reset (i.e., into BOOT) with the MMU turned off.
>
>
> Something like a SYSCALL exception effectively needs to initiate a jump
> from ISR mode to Kernel Mode.
<
In My 66000, you set the active bit for the kernel, and reset the active bit of
the ISR and you are already there ! If the kernel and the ISR are inside one
mapping table, this is a single operation.
>
> ISR mode would probably be unnecessary if page walking were done in
> hardware.
> >>
> >> Superuser mode means slightly tweaking a few things, and works on the
> >> assumption that one can't be simultaneously in an ISR and also in User
> >> Mode. In terms of operation, it could be a "slightly less restrictive"
> >> version of normal user mode (specifics here are still TBD).
> >>
> >>
> >> The Keyring system is, as noted:
> >> Each page may have a VUGID pair, and User/Group/Other flags;
> >> Each thread may have a Keyring register, with up to 4 keys
> >> Though, this could be potentially expanded if needed (*1).
> >>
> >> So, as noted, access to a page is based on having a key in the keyring
> >> which grants a certain level of access.
> >>
> >> *1: It is possible that I could expand the keyring to 8 or 12 keys if
> >> needed (via the use of additional registers).
> >>
> >>
> >> Another recent change was that the key can now be interpreted as either:
> >> 64 VGID, 1024 VUID (Default)
> >> Or:
> >> 1024 VGID, 64 VUID (Alternate)
> >>
> >> Though, if going "mix and match", it would be necessary to assign pairs
> >> such that there are not unintentional overlaps (such as not allowing the
> >> same number to be used as both a VGID and as a VUID within the other space).
> >>
> >> Say, for example, VGIDs are assigned in decrementing order, and VUIDs in
> >> incrementing order, such that there is a low probability of clash (or at
> >> least until the numbering space gets full); though with the special-case
> >> exception that "0/0" is hard-wired as the "VUGID Root".
> >>
> >>
> >>
> >> Another thing I had looked into before was Capability Addressing,
> >> however, lacking with this is a good way to implement it which is:
> > <
> >> Straightforward to implement;
> >> Affordable (for both hardware and software);
> >> Plays well with C;
> > <
> > The plays well with C part is going to require a <near> complete 64-bit VA space.
> > <
> > In My 66000 the HoB of the VA selects between Portholes (1) and normal
> > memory (0). Normal memory is translated in a way similar to what we have
> > been doing since the days of S/360/67 (page tables). Portholes are a 4
> > doubleword PTE that contains a root pointer to another address space,
> > and a base (lowest addressable byte) and limit (highest addressable byte)
> > and addition access rights (restrictions).
> > <
> > Mill uses the HoB, also, but I forgot the mechanisms.
> Possible.
>
>
> When I was looking into some information about capabilities, it came off
> as basically using a mechanism similar to x86 segmented addressing, with
> each process effectively having its own LDT (and then accessing memory
> via far pointers).
>
> But, in this case, was skimming some papers on the subject from the
> 1980s, and one of the papers was about implementing capability
> addressing on the 80386.
>
>
> But, trying to add something analogous to the x86 GDT/LDT mechanism to
> BJX2 isn't super appealing (or, at least, not directly in hardware;
> however my current ISA design would allow something like a GDT or LDT to
> be implemented in software).
>
> In my case, I wanted something I could more easily bolt onto a more
> conventional page-based MMU and linear address space.
<
Which is where I got (to; too). The Porthole provides the address space
(base and bounds) and a pointer to the root of the address space in which
the mapping tables exist. So an address check (B&B) and then translate
through a root pointer (which is what MMUs do all the time anyway.) In
effect, the top 52 = 64-12 bits of the root ARE the ASID.
> >> ...
> >>
> >> Ideally, whatever mechanism is used should be "mostly invisible" as far
> >> as normal user code is concerned, which limits the use of addressing
> >> modes which are not (or do not appear as) linear pointer-based addressing.
> >>
> >> Similarly, enforcing "memory safety" is a non-starter for C, and requiring
> >> the use of some other "memory safe" language, is also a bit lame. A
> >> better option would be one where even actively hostile C code can't
> >> break out of its sandbox (while still allowing it free rein over
> >> whatever memory it "owns").
> > <
> > "owns" or has been lent--like via portholes.
> Or something.
>
>
> My idea is that for userland code, it would operate with KRR containing
> a VUGID which is effectively its PID value, and would have access to
> pages keyed to its own PID (or, U/G/O = RWX/---/--- ).
<
So, let us postulate a system where you have several dozen unique
tasks, and each pair of tasks has a shared memory portion where those
tasks exchange data but those same tasks cannot see the other tasks
shared memory portions. Seems to me you run out of keys pretty quick.
>
> With the Root keyring as special in that it allows unrestricted access
> to everything.
>
>
> A more flexible mechanism would be to do full ACL checking (Access
> Control Lists), but this uses a slightly different mechanism (some
> details not fully worked out yet).
<
Aside from being slow.......
>
>
> Say, Page is set to ACL_Check rather than using a plain VUGID. In this
> case, when the page is accessed, it checks if the ACLID is present in an
> internal cache (With the ACLID in place of the page's VUGID). If so, the
> access given in the ACL Cache is used.
> Otherwise, it raises an ACL Check exception.
>
> In the current thinking, like with PIDs, the ACLID would be a subtype of
> VUGID, and (in effect), having an ACLID in KRR could be used to grant
> access directly without needing to raise an ACL Check exception. An
> ACLID could be equivalent to a PID, in which case access could be given
> to a list of other PID values.
>
>
> The exception handler is then responsible for checking if the current
> keyring contains a key which satisfies the ACL, and if so adds it to the
> ACL cache and returns control to the program, say:
> ACLID: Which ACL this applies to.
> VUGID: VUGID from the Keyring
> Mode Flags (RWX, Similar format to that of TLB entries).
>
> The ACL Cache would be similar in premise to the TLB, just smaller (*1).
> The ACL's would be treated as secondary though, because ACL-checking
> process is more complicated and expensive than a direct VUGID access check.
>
> All of this stuff is basically controlled per-page in my MMU design.
<
What does this "buy" that well managed page tables do not ?
{Maybe that you can find that someone gave permission to access a
page via MMU tables, but did not set keyring to do final permit ?? !! ?}
>
> In this case, an ACL would basically exist as a NULL-terminated list of
> 64-bit values, with the exception handler then fetching and walking the
> list. Though, still TBD, somewhere there may need to be an array of ACL
> pointers (so the ISR can fetch the ACL associated with a given ACLID).
>
> As with the TLB, there would be a special instruction to load an ACLE
> into the ACL cache.
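
A rough C sketch of how such an ACL_Check handler could walk the
NULL-terminated ACL against the keyring; this is only illustrative, and the
names (acl_table, krr_get_key, ldacl_insn) are hypothetical, not from any
actual BJX2 code:

  #include <stdint.h>

  #define KRR_KEYS 4                     /* keys held in the keyring (KRR)   */

  extern uint64_t *acl_table[];          /* hypothetical: ACLID -> ACL list  */
  extern uint16_t  krr_get_key(int i);   /* hypothetical: read KRR key i     */
  extern void      ldacl_insn(uint64_t); /* hypothetical: the "load ACLE
                                            into the ACL cache" instruction  */

  /* Assume each 64-bit ACL entry packs a VUGID in its low 16 bits, with
     the RWX mode flags elsewhere in the entry.                              */
  static inline uint16_t acle_vugid(uint64_t e) { return (uint16_t)(e & 0xFFFF); }

  /* Returns 0 on success (ACLE cached, access may be retried),
     -1 if no key in the keyring satisfies the ACL.                          */
  int acl_check_handler(uint16_t aclid)
  {
      uint64_t *acl = acl_table[aclid];  /* fetch this ACLID's list          */

      for (; *acl != 0; acl++)           /* NULL (0) terminated list         */
          for (int i = 0; i < KRR_KEYS; i++)
              if (krr_get_key(i) == acle_vugid(*acl)) {
                  ldacl_insn(*acl);      /* add to ACL cache, then return    */
                  return 0;
              }
      return -1;                         /* fault: ACL not satisfied         */
  }
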
>
>
> Then again, one could potentially try to make a case for ACL only
> operation, since in some ways the ACL system would be a little more
> versatile than bare VUGIDs.
>
>
> *1: MMU Operation:
> Fetch TLBEs for Address;
> Check for a Hit;
> Fetch ACLEs for the current TLBE's ACLID;
> Run VUGID and ACLE Checks;
> Reform the memory request with the translated address and mode.
>
> MMU may raise a TLB_Miss exception if no TLBE Matched, or an ACL_Check
> exception if the TLBE is set to ACL checking and no ACLE matched with
> the keys in KRR.
>
> ...
> >>
> >>
> >> One possible alternative to capabilities (the poor man's version) could
> >> be, rather than having a special-purpose cache/memory for them; The
> >> address space is made "particularly gigantic", and a sort of
> >> cryptographic ASLR is used to make it "nearly impossible" to guess
> >> memory addresses to things one doesn't already have a pointer to.
> > <
> > ROOT pointers are the equivalent of the huge ASID.
> OK.
> >>
> >> Say, for example, if stuff is randomly distributed within a 112 bit
> >> address space via a cryptographic RNG, then the chances of being able to
> >> guess the address of something one doesn't already have the address of
> >> is a bit of a stretch.
> >>
> >> Though, this does have the weakness that, in a high-level sense, this
> >> doesn't offer any additional protection by itself over a traditional
> >> MMU; and does have costs in other areas (eg: pointers and intptr_t would
> >> be 128 bits). Similarly, it would already be "fairly difficult" to guess
> >> addresses with ASLR within a 48 bit address space (and this is not a
> > <
> > Say you could perform said guess--the MMU should still prevent your guess
> > from becoming accessible !! There is a porthole protocol, and if it is not
> > followed precisely, you get no access whatsoever.
> > <
> Yeah, this is what will happen with VUGID.
>
> The idea of using cryptographic ASLR is a possible alternative, but
> ultimately has some weaknesses, and the costs associated with dealing
> with a 112-bit address space would likely far exceed those associated
> with doing VUGID and ACL Checks.
>
> Though, I could keep 112 bits on the back burner for the possible
> eventuality that, at some point, 48 bits may not be sufficient.
> >> problem easily subjected to a brute force search; so in practice the use
> >> of ASLR in a huge 112-bit address space wouldn't necessarily gain all
> >> that much over ASLR in a 48-bit space).
> > <
> > Don't be subject to the guess! If the porthole is not created and passed
> > and then installed precisely via the protocol, the attacker gets nothing
> > (zero, nada, zilch).
> OK.
> >>
> >> Similarly, there is a lot of C code that would not likely respond well
> >> to suddenly having "void *" and "intptr_t" expand to being 128-bit types
> >> (even if it otherwise handles the 32/64-bit transition well), so as a "C
> >> friendly" option it would still be a little weak.
> > <
> > I made it work in 64-bit VA, so did Mill, so can you.
> My current scheme has a 48-bit VA, I was mostly musing as to whether a
> giant VA space could be "better" in the sense that then one could
> potentially sidestep the need for access-rights checks.
<
A hobbyist can get away with 48 bits for another decade, but don't
depend on time standing still and assume this can hold forever:
Gordon Bell:: "The biggest mistake an Architect can do is to provide
insufficient address space for future implementations." (Paraphrased).
>
> Assuming a good RNG for the ASLR, it is a lot easier to guess a target
> address with 24 bits of entropy than one with 88 bits.
<
Do not allow a guess to pass through unless everything else has been
setup, by those who have permission, and are watching for "bad things"
and preventing them.
>
>
> But, even with the stronger ASLR, its security would still be inferior
> to "actually doing the checks", and a lot more expensive.
<
Which is an argument not to head down that dark alley at night.........
> >>
> >>
> >> It would also have the issue of basically making the L1 caches and TLB a
> >> lot more expensive (and also eat a lot more registers by requiring the
> >> use of paired GPRs for memory addressing, ...).
> >>
> >> Well, excluding a further "poor man's" option of only supporting 48-bit
> >> addressing in hardware, and then faking the 112-bit addressing via
> >> runtime calls and software emulation (and probably requiring 128-bit
> >> pointers to be declared like "int __far *obj;" or similar).
> >>
> >>
> >>
> >> Granted, within an "actual" capability system, there is still the
> >> problem that there is no good way to grant a capability to one part of a
> >> program without also granting it to the whole program (or, preventing
> >> hostile code from being able to "steal" access to something from a
> >> library, by being able to figure out where the desired capability
> >> descriptor is stored in a library's data sections).
> > <
> > All capability systems have the ability to downgrade access rights when
> > passing the capability to another. So should yours. In My 66000 I also
> > have the ability to instantaneously remove accesses to every capability
> > pointing into a thread's VA when the thread dies and when it removes
> > pages from its address space. The capability is translated THROUGH
> > the page tables of the owner's tables.
<
> May depend on how it is implemented.
>
> If it was via something akin to an x86 LDT, then in premise there isn't
> a good way to allow separate access other than giving each "module" its
> own LDT.
<
One thing I have not talked about is the ability for a thread owning a porthole
to access data in the foreign address space through another porthole ! But
this is a story for a different time.
>
>
> I guess another possibility could be to allow the TLB-Miss handler to do
> something clever via the TLB and ACL cache.
>
> There is nothing in particular that mandates that the memory page be
> pulled from a page-table, and it is possible that the TLB-Miss handler
> could implement something akin to a GDT or LDT instead of a page-table.
>
> Or even, say:
> 0zzz_zzzz_zzzz .. 5zzz_zzzz_zzzz : Page Table
> 6yyy_zzzz_zzzz: GDT-like
> 7yyy_zzzz_zzzz: LDT-like
>
> The memory map for 8..F is more hard-wired, and access to this range is
> not allowed in usermode.
> >>
> >> This could at least be hindered by the use of intra-binary ASLR (or
> >> compile-time ASLR), so multiple builds of the same binary would have
> >> different memory layouts. However, untrusted code being able to gain
> >> access to any debug metadata for the library would defeat this. Within
> >> the program loader, it could also make sense to randomize library
> >> indices within the PBO ABI, ...
> > <
> > Unnecessary when you get the rest of the mechanics properly defined.
<
> Possible, though ASLR does still add more resistance against buffer
> overrun exploits, and is "mostly free".
> >>
> >>
> >> I have experimented with some of this, but compile-time ASLR is not
> >> enabled by default because this makes debugging harder (some types of
> >> otherwise-deterministic bugs get turned into heisenbugs).
> >>
> >> Memory ASLR isn't really used much either, since this is only really
> >> viable when one has working support for virtual-memory, which is still
> >> an ongoing sub-project.
> >>
> >>
> >> This latter point is at least possible with keyrings, if there is a
> >> mechanism to allow a thread to temporarily gain or lose keys (likely
> >> involving the use of specialized system calls). The system call could
> >> validate a request and then add the key to the keyring, with the keyring
> >> reverting to its prior value once the scope in question exits. Threads
> >> would also have an initial keyring.
> >>
> >> One likely possibility is that the ability to request a given key would
> >> be assigned to a given binary image, and could potentially be wrapped as
> >> part of the "dllexport" mechanism.
> >>
> >> Ideally, one wants to try to limit how much any C code needs to know or
> >> care that such a thing is going on, so by default it would be hidden
> >> away as part of the ABI. In C land, code would then assume that pointers
> >> "just work" ideally without needing to know or care that a function call
> >> has crossed between protection domains.
> > <
> > The only thing a C program needs to know about Portholes is how to
> > create one (like MMap), and how to take a passed porthole and install/
> > deinstall it in his VA map. (C++ is better here as constructors and destructors
> > can do the install and deinstall).
> OK.
>
> If it can be mapped to mmap, that is good.
<
It can be mmap ! but I digress......
>
> I guess to rephrase it though, it is partly a question of how to
> associate a given region of code with access to a particular resource.
> In my case, I am doing it with keyrings, which could be either globally
> tied to a given thread, or updated based on the control flow graph (such
> as when calling into an OS library, which may have access to resources
> which are not intended to be shared with the host application).
> >>
> >>
> >> I am not sure if there is a formal overlap between keys and
> >> capabilities; a key could be considered as a sort of capability applied
> >> over a range of memory objects, rather than a single object. But,
> >> interpreting it this way seems like possibly a little bit of a stretch
> >> (given the mode-of-operation for the keys is a bit different).
> >>
> >>
> >> A lot of this still needs a bit more work though...
> >>
> >>
> >> Any thoughts?...


Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<sep1u8$gki$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=19652&group=comp.arch#19652

 by: BGB - Sun, 8 Aug 2021 16:49 UTC

On 8/7/2021 10:04 PM, MitchAlsup wrote:
> On Saturday, August 7, 2021 at 9:03:53 PM UTC-5, BGB wrote:
>> On 8/7/2021 4:12 PM, MitchAlsup wrote:
>>> On Saturday, August 7, 2021 at 2:50:53 PM UTC-5, BGB wrote:
>>>> So, I guess I can note something:
>>>> When I had switched over to the newer "ringbus" design, one side effect
>>>> was that the MMU had gotten broken (though I didn't have everything
>>>> stable even before this).
>>>>
>>>> I have since gotten around to fixing a lot of this so that the MMU now
>>>> works again.
>>>>
>>>> Have gotten a few other things written:
>>>> Volatile cache lines, which are auto-evicted after a few clock-cycles;
>>>> Logic to enforce Read/Write/Execute on cache-lines;
>>>> Inappropriate access will result in a CPU fault.
>>>> Decode-time logic to disallow using certain instructions in user-mode;
>>>> ...
>>>>
>>>> Some features, like memory access protection, are currently only done if
>>>> the MMU is enabled. Otherwise with the MMU disabled, the whole address
>>>> space is accessible in user-mode.
>>>>
>>>> The restrictions on which instructions and which control-registers are
>>>> allowed is enforced in the instruction decoder, which in turn depends on
>>>> the mode bit in SR (which is Read-Only in usermode).
>>> <
>>> Quibble: If the user can "see" the mode bit, then a virtualized OS can detect
>>> that [s]he is being virtualized.
>>> <
>> Virtualization support is a not-yet-addressed issue...
>>
>> It seems like it could require a separate set of control registers (or
>> faking the control registers), and possibly a trap to allow LDTLB
>> requests to be forwarded through another set of page tables.
>>
>> Then again, "userland code can't see SR" is an interesting possibility
>> that I hadn't considered. In most cases, there is pretty much no need to
>> access SR directly, and the bits that *are* relevant to user-mode
>> operation are also shadowed in PC(63:48) and LR(63:48).
>>>> This does mean that
>>>> changes to the mode would require a pipeline flush to take effect,
>>> <
>>> No, you just have to insert the mode bit with each instruction as it enters the
>>> pipe. Then you can change the mode bit every other instruction--should you
>>> desire. This is the means for executing multi-threads where each thread can
>>> be in a different mode.
>>> <
>> I suspect that as-is, it may be moot given there is no way to jump
>> between Usermode and Supervisor mode that doesn't already result in a
>> pipeline flush.
>>
>> In the current design for SMT, there are two redundant SRs, so each
>> thread would operate with its own SR by default (one thread could be in
>> supervisor mode and the other in usermode).
>>>> but
>>>> given that the main way for exiting or returning to usermode would be
>>>> via an Interrupt or RTE instruction, this is not likely an issue.
>>>>
>>>>
>>>> Access to the Keyring Register (KRR) is also disallowed in usermode,
>>>> since it assumes that usermode code is untrusted.
>>>> Code in supervisor mode will have access to KRR, in which case any use
>>>> of keyring checks in supervisor mode would be based on the "honor system".
>>>>
>>>> For kernel code and drivers, having KRR accessible should be OK assuming
>>>> non-hostile code.
>>> <
>>> Most drivers, and huge swaths of the OS should not ever need to look/modify
>>> the keyring. So, even if you trust them "a little" you might not want to trust them
>>> completely. Can you think of a driver that does need access to KRR ?
>>> Read access ? write access ?
>> I could restrict it to only being allowed to be modified via ISRs.
>>
>> However, given that kernel mode code can, as-is, just sorta sidestep the
>> MMU, protecting the keyring would be of limited effect.
>>
>> The other option is putting drivers in usermode, but this does imply
>> making MMIO able to be mapped using the TLB. This could work, but does
>> imply that I would probably need to remap some hardware addresses such
>> that SD-SPI, UART, and GPIO are not located in the same page. It may
>> make sense to give a driver access to GPIO without also giving it access
>> to the SDcard's SPI interface (which should in-turn mostly be restricted
>> to only being accessible to the block-device driver).
>>>>
>>>>
>>>> Could in-theory add something akin to the 4-level protection rings in
>>>> x86, but can't really think of any way to protect keyring access in
>>>> kernel mode that would allow kernel code to still do its thing without
>>>> also being trivially bypassed (thus defeating the whole point of trying
>>>> to protect it).
>>>>
>>>> The only way to meaningfully protect the keyring from drivers would also
>>>> (effectively) mean running them in user-mode (but maybe add a partial
>>>> split between "User Mode" and "Superuser Mode", with the latter mostly
>>>> intended for running drivers and similar).
>>> <
>>> It is reasons like this that both Mill and My 66000 have no privilege--we
>>> just have MMU and each "thing" runs in its own protection domain which
>>> is setup by someone who has the required trust.
>> Hmm...
>>>>
>>>> So, by the time there is hostile code running in kernel mode, everything
>>>> is basically "already hosed" as it were (and there would be no point in
>>>> adding additional protection rings).
>>>>
>>> Yep, this is why the old systems (x86) have added a secure mode (circa
>>> 2006) and are now trying to add another mode (2020) and will be adding
>>> a still 'nother mode circa 2026. The way they are going there is no end to
>>> the additions...........
>> Yeah.
>>>>
>>>> As noted, for now I only have two levels:
>>>> Supervisor Mode: Can do pretty much anything it wants.
>>>> User Mode: Can basically do stuff relevant to user-level code.
>>> <
>>> Mill & My 66000:: a thread can only touch things for which the MMU
>>> grant access. There are no privileged instructions or modes or .....
>>> <
>>> (I don't remember about Mill) But My 66000 allows access to processor
>>> control registers via memory the location of which is determined by a
>>> PCI-configuration access page. Thus, there are not instructions to do
>>> this stuff and everything goes through the MMU via LD and ST.
>> Hmm...
>>
>> I have hardware registers and MMIO partly as their own things.
>>
>> Hardware registers are registers, and MMIO is its own special address
>> range (and uses a different bus protocol from that of normal memory
>> accesses, ...).
> <
> Think of it like this:: HW registers are Snooped and Snarfed. They are where they
> are, and anyone with access can read or write them--even if the thread is running
> in a different CPU way-over-there in a different chip. If you have the address of
> the register and permission, you can do with it what you want. You DON'T have
> to migrate your thread to the CPU owning the register !!!

Hmm, one doesn't usually interact with registers on a different core
though; usually there is some sort of mechanism to throw INT and RESET
signals at another core.
Say, each secondary core boots up, sees that it wasn't the first core to
start, and then goes into an idle state in the ROM BIOS until it is
kicked into life by a special interrupt, at which point it branches into
a boot thunk provided by the first processor.

Well, except that SMT mode will work a little differently on BJX2, in
that it is two threads on a single core, and so effectively adds a bunch
of duplicate registers. I didn't go the route of making each SMT thread
think it was its own core.

>>>>
>>>> Then, Supervisor mode effectively has two sub-modes:
>>>> ISR Mode: SR.MD, SR.RB, and SR.BL Set;
>>>> Kernel Mode: SR.MD Set, SR.RB and SR.BL Clear.
>>>>
>>>> And user-mode:
>>>> User Mode: SR.MD, SR.RB, and SR.BL Clear;
>>>> Superuser Mode (?): SR.MD is Clear, SR.RB is Set(?).
>>> <
>>> modes after modes after modes,................that is the old way of doing things.
>> There are limits.
>>
>> I have yet to come up with any meaningful semantic differences between
>> User Mode and Superuser Mode, so a separate mode may not be worthwhile.
>>
>> The split between ISR Mode and Kernel Mode is mostly for hardware
>> reasons. Kernel Mode uses the MMU, whereas ISR Mode temporarily disables
>> the MMU because this is necessary to be able to implement the TLB-Miss
>> exception.
> <
> I don't even get out of reset (i.e., into BOOT) with the MMU turned off.


Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<2f336fc5-28a6-4522-8271-fa9503a08d2cn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=19704&group=comp.arch#19704

 by: robf...@gmail.com - Tue, 10 Aug 2021 08:39 UTC

I was just working on TLB stuff today, so this topic fits right in. I do not
follow everything here, but it is a great learning opportunity. I am picking
up on some of the concepts.

I have memory translation enabled all the time now for the ANY1 core after
reading about it in this newsgroup. All the I/O devices are mapped into 16kB
pages and are present within a 16MB address range. One issue is that all the
potential I/O cannot be mapped at once or there would be no room left in the
TLB for anything else. Another issue is processing a TLB miss on an I/O access.
This may not be acceptable for some devices.

The TLB is four-way associative with 1024 entries. It also allows three-way
random updates. The fourth way is always specified. That way it can act a bit
like a locked translation. An entry in the fourth way cannot be bumped out
except on purpose. The plan is to reserve room in the fourth way for the TLB
miss handler and its data so that it is always resident. On reset the TLB
entries are all initialized by hardware to enable access to the boot rom and
some data. This takes about 4096 cycles during reset.

In addition to the TLB, base registers are present but there are no bound
registers (yet). Bounds are created by what is mapped into the memory a base
register points to. Accessing an unmapped location relative to a base register
results in a page fault. At the moment there are 2048 base registers so that
apps can have access to a small set of base registers without the need for
swapping base registers during app switches. It is envisioned that base
registers are associated with apps and there would be a limited number of apps
running in the system at once. The base register in use is specified as part
of the virtual address – the upper 12 bits. A 43-bit virtual address is in
use, but this is configurable in eight-bit increments. A 32-bit physical
address is in use. Base register spec is a bit like an ASID and I am trying to
reconcile the concept with memory portholes.

RWX access rights are specified by the base register. Each page of memory has
a 20-bit key associated with it. Access to memory is possible only if the app
has a matching key in its keyring. There is a separate cache for memory keys
which is accessed during the memory access.

Would not accessing control registers through the MMU be slow?
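
As a rough illustration (a C sketch with a hypothetical keyring size and
names, not the actual ANY1 logic), the per-page key check described above
amounts to something like this; the key-of-zero convention follows a later
reply in this thread:

  #include <stdint.h>
  #include <stdbool.h>

  #define KEYRING_SIZE 8                 /* hypothetical keyring capacity    */
  #define KEY_MASK     0xFFFFF           /* keys are 20 bits                 */

  /* Access is possible only if the app holds the page's memory key.         */
  bool page_access_ok(uint32_t page_key, const uint32_t keyring[KEYRING_SIZE])
  {
      page_key &= KEY_MASK;
      if (page_key == 0)
          return true;                   /* key of zero: open to anybody     */
      for (int i = 0; i < KEYRING_SIZE; i++)
          if ((keyring[i] & KEY_MASK) == page_key)
              return true;
      return false;
  }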

Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<4983fe9e-b3d8-4cf8-96d8-2b37f60d1a3cn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=19712&group=comp.arch#19712

 by: MitchAlsup - Tue, 10 Aug 2021 16:52 UTC

On Tuesday, August 10, 2021 at 3:39:20 AM UTC-5, robf...@gmail.com wrote:
> I was just working on TLB stuff today, so this topic fits right in. I do not follow
> everything here, but it is a great learning opportunity. I am picking up on some
> of the concepts.
> I have memory translation enabled all the time now for the ANY1 core after
> reading about it in this newsgroup. All the I/O devices are mapped into 16kB
> pages and are present within a 16MB address range. One issue is that all the
> potential I/O cannot be mapped at once or there would be no room left in the
> TLB for anything else.
<
How often do you think you will have 1000 or more I/O activities in concurrent
operation ? If the OS sets up the mapping, tells the device to "do a page" of I/O
and when the device signals it is done, the OS removes the mapping. This is
basically what SUN-OS did.
<
> Another issue is processing a TLB miss on an I/O access.
<
Using the above strategy, there are no in-flight I/O-TLB misses.
<
> This may not be acceptable for some devices. The TLB is four-way associative
> with 1024 entries. It also allows three-way random updates. The fourth way is
> always specified. That way it can act a bit like a locked translation. An entry in
> the fourth way cannot be bumped out except on purpose. The plan is to reserve
> room in the fourth way for the TLB miss handler and its data so that it is always
> resident.
<
With the above strategy, you can make it direct mapped.
<
> On reset the TLB entries are all initialized by hardware to enable
> access to the boot rom and some data.
<
Words like ROM are capitalized as they stand in for Read Only Memory; it
also makes them easier to see in text.
<
> This takes about 4096 cycles during
> reset. In addition to the TLB, base registers are present but there are no bound
> registers (yet). Bounds are created by what is mapped into the memory a base
> register points to. Accessing an unmapped location relative to a base register
> results in a page fault. At the moment there are 2048 base registers so that
> apps can have access to a small set of base registers without the need for
> swapping base registers during app switches. It is envisioned that base registers
> are associated with apps and there would be a limited number of apps running in
> the system at once. The base register in use is specified as part of the virtual
> address – the upper 12 bits.
<
Seems unnecessary. The I/O devices only see a 16MB virtual address space;
many dumber devices only have 24-bits of address even today--although many
have 32-bits. The I/O device creates an access into that 16MB space and the
I/O TLB maps it to the real physical address space of whatever size is
appropriate.
<
My advice is to just map the I/O space into real memory by the TLB entry--forget
about base and bounds. Although you might want to tag the TLB entry with which
device is allowed to use that mapping.
<
> A 43-bit virtual address is in use, but this is
> configurable in eight-bit increments. A 32-bit physical address is in use. Base
> register spec is a bit like an ASID and I am trying to reconcile the concept with
> memory portholes. RWX access rights are specified by the base register. Each
<
It appears you are using base register to mean ROOT pointer.
<
> page of memory has a 20-bit key associated with it. Access to memory is possible
> only if the app has a matching key in its keyring.
<
Up to this point the whole of the text was about I/O devices and now you switch
back to applications.
<
In My 66000, the threads access the TLB through their root pointer by machine
(hypervisor) and context (OS) layers. I/O devices access their root pointer
through a table setup as <bus:device:function> which indexes a table of root
pointers. The mappings pointed at by the root pointer can be used by the I/O
device as easily as by the thread; but in practice, the OS will only move a
TLB mapping from thread to I/O when an I/O event is active and then remove
it afterwards.
<
> There is a separate cache for
> memory keys which is accessed during the memory access.
> Would not accessing control registers through the MMU be slow?

Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<seufqd$ge4$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=19714&group=comp.arch#19714

 by: BGB - Tue, 10 Aug 2021 18:16 UTC

On 8/10/2021 3:39 AM, robf...@gmail.com wrote:
> I was just working on TLB stuff today, so this topic fits right in. I do not follow
> everything here, but it is a great learning opportunity. I am picking up on some
> of the concepts.
> I have memory translation enabled all the time now for the ANY1 core after
> reading about it in this newsgroup. All the I/O devices are mapped into 16kB
> pages and are present within a 16MB address range.

Mine is either enabled or disabled. The ability to map MMIO via the MMU has
been considered, but not done yet. It would likely mean special "this is
MMIO" cache-lines.

I am also mostly using 16K pages, as my experiments showed this as
"closer to optimal" (at least for the workloads tested) vs either 4K or 64K.

The MMU supports base sizes of 4K/16K/64K, with some larger sizes able
to effectively be emulated.

> One issue is that all the
> potential I/O cannot be mapped at once or there would be no room left in the
> TLB for anything else. Another issue is processing a TLB miss on an I/O access.
> This may not be acceptable for some devices.

My TLB can't fit everything either, and a TLB miss basically means using
an interrupt to deal with it.

> The TLB is four-way associative
> with 1024 entries. It also allows three-way random updates. The fourth way is
> always specified. That way it can act a bit like a locked translation. An entry in
> the fourth way cannot be bumped out except on purpose. The plan is to reserve
> room in the fourth way for the TLB miss handler and its data so that it is always
> resident.

OK. Mine is 4-way, with no fixed entry.
Current TLB sizes are 64x4 and 256x4.

> On reset the TLB entries are all initialized by hardware to enable
> access to the boot rom and some data. This takes about 4096 cycles during
> reset.

No direct equivalent; the MMU is disabled at startup.
The L1 caches don't see much difference: when disabled, the MMU
effectively forwards requests unmodified.

The MMU's job for memory access protections is setting bits in the
request, which are then later interpreted by the L1 cache. Though, it
reduces the more complicated checks (VUGID and ACL) to a simpler bit
pattern for the L1s. This means all of the access mode bits are relative
to what was allowed at the time the associated L1 miss occurred.

The current MMU has a ~4-cycle latency:
Cycle 1: Calculate TLB Index, Classify Request;
Cycle 2: Check for TLB Hit/Miss;
Cycle 3: VUGID and ACL Checking;
Cycle 4: Put output back onto ringbus.
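
A rough C model of those four stages (purely illustrative; the structures,
field names, and the helper vugid_acl_check() are made up here, not lifted
from the actual Verilog):

  #include <stdint.h>

  typedef struct { uint64_t vaddr; int mode; } Req;
  typedef struct { uint64_t vpage, ppage; int valid; } TLBE;

  extern TLBE tlb[256][4];                        /* the 256x4 configuration  */
  extern int  vugid_acl_check(const TLBE *e, int mode);  /* cycle-3 checks    */

  /* Returns 0 and fills paddr on success, -1 for TLB_Miss, -2 for ACL_Check. */
  int mmu_lookup(const Req *rq, uint64_t *paddr)
  {
      /* Cycle 1: calculate TLB index, classify request (16K pages)           */
      uint64_t vpage = rq->vaddr >> 14;
      int idx = (int)(vpage & 255);

      /* Cycle 2: check the four ways for a hit                               */
      TLBE *hit = 0;
      for (int w = 0; w < 4; w++)
          if (tlb[idx][w].valid && tlb[idx][w].vpage == vpage)
              { hit = &tlb[idx][w]; break; }
      if (!hit)
          return -1;                              /* raise TLB_Miss           */

      /* Cycle 3: VUGID and ACL checking                                      */
      if (vugid_acl_check(hit, rq->mode) < 0)
          return -2;                              /* raise ACL_Check          */

      /* Cycle 4: put the translated request back onto the ringbus            */
      *paddr = (hit->ppage << 14) | (rq->vaddr & 0x3FFF);
      return 0;
  }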

> In addition to the TLB, base registers are present but there are no bound
> registers (yet). Bounds are created by what is mapped into the memory a base
> register points to. Accessing an unmapped location relative to a base register
> results in a page fault. At the moment there are 2048 base registers so that
> apps can have access to a small set of base registers without the need for
> swapping base registers during app switches. It is envisioned that base registers
> are associated with apps and there would be a limited number of apps running in
> the system at once.

No bounds-checking registers, only a few specialized base registers per
thread (GBR and TBR).

There is also VBR for interrupts, and TTB/STTB for the page tables, ...
The interrupt mechanism is fairly minimal, mostly saving off a few
registers and doing a computed branch relative to VBR.

Any saving/restoring of scratch registers has to be done manually.

While the original SH-4 and BJX1 ISAs had a "register banking" feature,
I ended up dropping this for BJX2 because:
Requires GPR file to effectively be 50% bigger;
These registers have relatively little other use (by definition);
ISRs are infrequent enough that it doesn't matter much for perf;
...

Though, it does make a few things a little more awkward in that these parts
of the ISR need to work without making any non-reversible changes to any
of the scratch registers.

Could maybe make it less awkward, but it is debatable how much it matters.

> The base register in use is specified as part of the virtual
> address – the upper 12 bits. A 43-bit virtual address is in use, but this is
> configurable in eight-bit increments. A 32-bit physical address is in use. Base
> register spec is a bit like an ASID and I am trying to reconcile the concept with
> memory portholes.

OK. With my old bus, it was 48/32, but I have moved to (effectively)
48/48 with the ringbus, just with most of the L2 ring devices
effectively ignoring bits (47:32) of physical addresses.

The usermode address range effectively has 47 bits, and a few bits could
(in theory) be cut off to use for such a "porthole" space.

Possibly, say:
0000_zzzz_zzzz .. 6FFF_zzzz_zzzz: Page-Table Range
7ppp_zzzz_zzzz: Does something like a GDT or LDT in x86.
Could provide 4096 spaces, each up to 4GB.

Everything in 8000..FFFF is given over to supervisor mode, currently:
8000..BFFF: Supervisor MMU
C000..CFFF: Absolute / Physical (44-bit)
D000..DFFF: Absolute / Physical, Volatile
E000..EFFF: Reserved
F000..FFFF: MMIO Space ('High Range' MMIO)

There is an MMIO Range at:
0000_Fzzz_zzzz ('Low Range' MMIO)
However, it is only enabled in 32-bit mode and/or MMU is disabled (*1).
FFFF_Fzzz_zzzz, Maps to the same area, but allowed in 48b+MMU mode.
Where:
FFFF_zzzz_zzzz, Absolute Address (32-bit)

*1: It is kinda ugly, but exists mostly for sake of allowing binary
compatibility for code built for a core with a 32-bit address space to
work on a core with a 48-bit address space (mostly because 48-bit
addressing is a burden on FPGAs like the XC7S25 or XC7A35T).
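
A small C sketch of a classifier over the ranges above, going by the top 4
bits of the 48-bit address (the enum names are made up, and the low-range
MMIO / 32-bit-mode special cases are ignored here):

  #include <stdint.h>

  typedef enum {
      RGN_PAGETABLE, RGN_GDT_LDT, RGN_SV_MMU, RGN_PHYS,
      RGN_PHYS_VOLATILE, RGN_RESERVED, RGN_MMIO
  } Region;

  Region classify_addr(uint64_t va)
  {
      unsigned top4 = (unsigned)((va >> 44) & 0xF);

      if (top4 <= 0x6) return RGN_PAGETABLE;      /* 0000..6FFF: page tables  */
      if (top4 == 0x7) return RGN_GDT_LDT;        /* 7ppp: GDT/LDT-like       */
      if (top4 <= 0xB) return RGN_SV_MMU;         /* 8000..BFFF               */
      if (top4 == 0xC) return RGN_PHYS;           /* C000..CFFF: physical     */
      if (top4 == 0xD) return RGN_PHYS_VOLATILE;  /* D000..DFFF               */
      if (top4 == 0xE) return RGN_RESERVED;       /* E000..EFFF               */
      return RGN_MMIO;                            /* F000..FFFF: MMIO         */
  }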

> RWX access rights are specified by the base register. Each
> page of memory has a 20-bit key associated with it. Access to memory is possible
> only if the app has a matching key in its keyring. There is a separate cache for
> memory keys which is accessed during the memory access.
> Would not accessing control registers through the MMU be slow?
>

In my case, access control is via the TLBE's, which check against the
keys in the keyring (for either VUGID or ACL style checks).

Currently, the keyring contains 4x 16-bit keys, mostly because 16-bits
divided up nicely (20 or 21 bits would have only allowed 3 keys in the
keyring).

At present, I am also using a 4-way ACL Cache, so any of the 4 KRR keys
may match with any of the 4 keys in the ACL Cache, or match against the
ID in the TLBE, depending on the mode set by the TLBE.
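
Roughly, the match amounts to something like the following C sketch (field
widths are per the above, but the mode-flag handling is simplified away and
the function itself is hypothetical):

  #include <stdint.h>
  #include <stdbool.h>

  /* 4x 16-bit keys in KRR vs. either the TLBE's own ID (VUGID mode) or
     any of the 4 entries currently held in the ACL cache (ACL mode).        */
  bool krr_match(const uint16_t krr[4], uint16_t tlbe_id, bool tlbe_uses_acl,
                 const uint16_t acl_cache[4])
  {
      for (int i = 0; i < 4; i++) {
          if (!tlbe_uses_acl) {
              if (krr[i] == tlbe_id)
                  return true;           /* direct VUGID-style match          */
          } else {
              for (int j = 0; j < 4; j++)
                  if (krr[i] == acl_cache[j])
                      return true;       /* any KRR key vs. any cached ACLE   */
          }
      }
      return false;
  }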

There is also an uncertain feature where KRR values can be
saved/restored through userland via special instructions (only enabled
if a proper key has been loaded via a supervisor mode instruction),
which turn the contents of the KRR into an encoded bit pattern, and
allow reloading this bit pattern via another instruction (there is
another supervisor-mode instruction that allows forging these encoded
KRR keys directly).

A difficulty with this is having an encoding which is both "sufficiently
hard to break" but also can be encoded and decoded directly. The current
encoding is based on XORs and randomized permutation.
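
As a toy example of that general shape (a fixed bit permutation plus XOR over
a 64-bit value; this is not the actual encoding, widths, permutation, or
secret used in the core):

  #include <stdint.h>

  static uint8_t  perm[64];
  static int      perm_ready;
  static const uint64_t secret = 0x5DEECE66DULL;  /* placeholder secret       */

  /* Stand-in for the build-time randomized table: any bijection on 0..63
     works here; an odd multiplier mod 64 gives one cheaply.                  */
  static void init_perm(void)
  {
      for (int i = 0; i < 64; i++)
          perm[i] = (uint8_t)((i * 37 + 11) & 63);
      perm_ready = 1;
  }

  uint64_t krr_encode(uint64_t v)
  {
      if (!perm_ready) init_perm();
      uint64_t p = 0;
      for (int i = 0; i < 64; i++)
          p |= ((v >> i) & 1) << perm[i];         /* scatter bit i            */
      return p ^ secret;
  }

  uint64_t krr_decode(uint64_t e)
  {
      if (!perm_ready) init_perm();
      uint64_t p = e ^ secret, v = 0;
      for (int i = 0; i < 64; i++)
          v |= ((p >> perm[i]) & 1) << i;         /* gather bit i back        */
      return v;
  }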

However, the permutation is fixed (via auto-generated Verilog, the
permutation changes each time the ROM is rebuilt), as dynamic bit
shuffling is kinda "really expensive" in terms of LUTs. The permutation
is kind of a weak point though, since if code happens to know or figure
out the permutation, then the XOR encoding is fairly trivial to break.

This would pose a risk mostly if one were doing mass-produced devices
with identical FPGA bitstreams or similar.

Ideally, one would want dynamic shuffling with a relatively large space
of possible permutations, but this requires a way to do it both cheaply
and quickly, which is a problem.

While a weakly randomized permutation could work against some cases, if
one has an encoded keyring with its known decoded counterpart, then it
becomes trivial to use a brute force approach. Avoiding this would
likely require PID and ACLID assignment to also be randomized, as well
as filling unused spots in the KRR with randomized "garbage keys".

The fallback scenario is, unless this can be made sufficiently secure,
moving between keyrings will require using a system call.

I guess another possible option could be to only allow the relevant
instructions to be used from special "secure" memory pages
(Secure-Execute-Only, or 'SXO' pages). I was already using Execute-Only
(XO) pages for the keyring transfer code, mostly to try to keep the
encoded keys secret (the userland code can't see into these pages, and
any keys are stored in memory with additional randomized obfuscation, to
try to prevent code from "stealing" keys).

So, say, even if the usermode program gets ahold of a key, or manages to
crack its encoding, it still can't do anything with it since it is not
(itself) able to create the 'SXO' pages which would allow the relevant
instructions to be used.

Actually, SXO could also allow a possible "safe" mechanism to
allow transfers between usermode and supervisor mode without going
through an ISR. It is possible that entry points into an SXO page would
also require a special instruction (likely a "special NOP") to mark the
entry point as valid. Trying to branch into the page without first
hitting this "Secure NOP" would itself trigger a fault (in effect, code
within the SXO page would operate in a special "Superuser Mode").


Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<e0548806-db37-4b75-8f70-2de72cb1a12cn@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=19718&group=comp.arch#19718

 by: robf...@gmail.com - Tue, 10 Aug 2021 20:47 UTC

> How often do you think you will have 1000 or more I/O activities in concurrent

Never. But TLB entries are direct mapped within a way. So, the I/O is spread out
so that it does not reuse the same TLB entry. Also, some devices like the text
screen (128kB) or audio buffer use multiple TLB entries. There are only about 50
different I/O devices that could be in the SoC. These are spread out over three
bus bridges. The off-chip bus is also in the I/O address space and has a good
chunk of address space allocated for it.

>Using the above strategy, there are no in-flight I/O-TLB misses.

Okay, I was over-worried about I/O TLB entries being knocked out when new TLB
entries need to be mapped due to an external event. But I guess there is not really
a need to map new TLB entries if I/O is active.

Base register is probably a synonym for ROOT pointer. I am used to the x86
segmentation model or the RISCV base and bounds registers. PowerPC
segmentation is also an influence.

There is a base register or ROOT pointer for each of code, data, and stack,
although they are often set to the same value. The access rights associated
with the ROOT pointers are different. Having separate ROOT pointers for each
type of address allows library functions to use the ROOT pointer of the caller
where applicable (stack or data). There are also ROOT pointers for I/O. One
thing I noticed using ROOT pointers is that global pointer registers in the GP
register file are initialized to and sit at zero (plus the ROOT pointer number
in the high order bits). They basically become placeholders for ROOT pointer
registers.

It is possible to specify an address without using a global pointer register
but it takes up more code space, requiring multiple prefix instructions for
the larger address. Because the ROOT pointer spec is part of the address, it
is easier if the ROOT pointer that will be in use is known when the code is
assembled. Otherwise, a bunch of address fixups may be required when the app
loads. Rather than load from tables, I have these set up in the initialization
for libraries. I know that the TinyBasic app will always use group three of
ROOT pointers. So, these pointers are set up directly with move instructions.
All the addresses in the TinyBasic app reference a global pointer, so other
than the initialization of the global pointers, the ROOT pointers are
invisible.
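
A small C sketch (hypothetical names) of how the ROOT pointer / base register
selector rides in the upper 12 bits of the 43-bit virtual address described
above:

  #include <stdint.h>

  #define VA_BITS  43                  /* 43-bit virtual address             */
  #define SEL_BITS 12                  /* upper bits select the base reg     */
  #define OFF_BITS (VA_BITS - SEL_BITS)

  static inline uint64_t make_va(unsigned base_reg, uint64_t offset)
  {
      return ((uint64_t)base_reg << OFF_BITS) |
             (offset & ((1ULL << OFF_BITS) - 1));
  }

  static inline unsigned va_base_reg(uint64_t va)
  {
      return (unsigned)((va >> OFF_BITS) & ((1u << SEL_BITS) - 1));
  }

  /* e.g. a global pointer sitting at zero within ROOT pointer group 3
     (as with the TinyBasic example above):  uint64_t gp = make_va(3, 0);    */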

> Current TLB sizes are 64x4 and 256x4.

The only reason the TLB has so many entries is that it is a minimum size using FPGA
block-ram resources. Within each way the TLB is direct mapped. I was running out of
LUTs but had extra block rams available. Rather than use ¼ block ram I figured, might
as well use the whole thing.

> In my case, access control is via the TLBE's,

I wanted to avoid putting keys in the TLBE’s as there could then be multiple copies of
the key to manage if more than one mapping to a memory page exists. It would also
use too much bit space in the TLBE. It is also a setup that allows keyed access to be
optional without changing the rest of the MMU.
A key of zero means anybody can access the page.

>A difficulty with this is having an encoding which is both "sufficiently
>hard to break"

I have not planned for key scrambling. Is it really necessary?

Re: Status: Working on MMU and memory protection, VUGID + Keyrings

<sevtbu$f29$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=19726&group=comp.arch#19726

 by: BGB - Wed, 11 Aug 2021 07:13 UTC

On 8/10/2021 3:47 PM, robf...@gmail.com wrote:
>> How often do you think you will have 1000 or more I/O activities in concurrent
>
> Never. But TLB entries are direct mapped within a way. So, the I/O is spread out
> so that it does not reuse the same TLB entry. Also, some devices like the text
> screen (128kB) or audio buffer use multiple TLB entries. There are only about 50
> different I/O devices that could be in the SoC. These are spread out over three
> bus bridges. The off-chip bus is also in the I/O address space and has a good
> chunk of address space allocated for it.
>
>> Using the above strategy, there are no in-flight I/O-TLB misses.
>
> Okay, I was over-worried about I/O TLB entries being knocked out when new TLB
> entries need to be mapped due to an external event. But I guess there is not really
> a need to map new TLB entries if I/O is active.
>
> Base register is probably a synonym for ROOT pointer. I am used to the x86
> segmentation model or the RISCV base and bounds registers. PowerPC
> segmentation is also an influence.
> There is a base register or ROOT pointer for each of code, data, and stack. Although
> they are often set to the same value. The access rights associated with the ROOT
> pointers are different. Having separate ROOT pointers for each type of address allows
> library functions to use the ROOT pointer of the caller where applicable (stack or
> data). There are also ROOT pointers for I/O. One thing I noticed using ROOT pointers
> is that global pointer registers in the GP register file are initialized to and sit at zero
> (plus the ROOT pointer number in the high order bits). They basically become
> placeholders for ROOT pointer registers.
> It is possible to specify an address without using a global pointer register but it takes
> up more code space, requiring multiple prefix instructions for the larger address.
> Because the ROOT pointer spec is part of the address, it is easier if the ROOT pointer
> that will be in use is known when the code is assembled. Otherwise, a bunch of
> address fixups may be required when the app loads. Rather than load from tables,
> I have these setup in the initialization for libraries. I know that the TinyBasic app will
> always use group three of ROOT pointers. So, these pointers are setup directly
> with move instructions. All the addresses in the TinyBasic app reference a global
> pointer, so other than the initialization of the global pointers, the ROOT pointers
> are invisible.
>

As can be noted, my current design has "nothing at all" in common with
x86 segmentation.

There is GBR which a program may use to access globals, which is mostly
managed by the C ABI.

Originally I was doing PC-rel, but this doesn't really work well with
allowing the same binary to be loaded once and reused multiple times. In
effect, GBR is made to more-or-less point to the start of the ".data"
section.

>> Current TLB sizes are 64x4 and 256x4.
>
> The only reason the TLB has so many entries is that it is a minimum size using FPGA
> block-ram resources. Within each way the TLB is direct mapped. I was running out of
> LUTs but had extra block rams available. Rather than use ¼ block ram I figured, might
> as well use the whole thing.
>

In my case, 256x4 maps to Block-RAM, whereas 64x4 uses LUTRAM.
The Block-RAM offers at least a little bit of flexibility in terms of
length and width.

LUTRAM is a little better for timing, but BlockRAM is a little bigger
and uses fewer LUTs.

>> In my case, access control is via the TLBE's,
>
> I wanted to avoid putting keys in the TLBE’s as there could then be multiple copies of
> the key to manage if more than one mapping to a memory page exists. It would also
> use too much bit space in the TLBE. It is also a setup that allows keyed access to be
> optional without changing the rest of the MMU.
> A key of zero means anybody can access the page.
>

I got the space to put keys and similar in, as a side-effect of using a
48 bit address space.

There are various schemes for how it can be encoded in the page-table.
Things like virtual memory mapped to a pagefile, ..., is its own set of
issues.

>> A difficulty with this is having an encoding which is both "sufficiently
>> hard to break"
>
> I have not planned for key scrambling. Is it really necessary?
>

If the keys ever cross paths with usermode, probably...

One doesn't want there to be a path where privilege escalation is
possible, but at the same time, interrupt handling is slow enough to be
preferably avoided.

I have since decided to move these instructions out of usermode though,
now they are limited to "Superuser mode" and "Secure-Execute Pages" (new
feature). This reduces the importance of the encryption slightly,
because breaking the encryption will no longer allow User code to break
out of its sandbox (since it will lack the instructions needed to make
use of its stolen information).

While Superuser mode is still effectively User mode, it can be slightly
less restrictive in a few areas because there will be an implicit
assumption that code running in Superuser Mode is non-hostile (and will
probably not contain code that is trying to break the keys to give
itself more access).
