
Idle: Capability Addressing, Future or Boondoggle

<ssv7ff$9n3$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23180&group=comp.arch#23180
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 16:48:13 -0600
Organization: A noiseless patient Spider
Lines: 121
Message-ID: <ssv7ff$9n3$1@dont-email.me>
 by: BGB - Thu, 27 Jan 2022 22:48 UTC

Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
Morello architecture, which is effectively a modified Aarch64, except:
Expands GPRs to 129 bits (internally);
Integer ISA remains 64-bit;
Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
addresses (the rest of the bits are used as protection flags and bounds
checks);
In effect, all pointers expand to 128 bits in memory;
Code isn't allowed to craft its own pointers (directly) because doing so
would break its memory protection scheme (the 129th bit is stored
separately, seemingly to track which memory locations contain pointers,
and to disallow loading a pointer from memory which has been used for data);
....

From the original thread:
https://www.theregister.com/2022/01/21/arm_morello_testing/
https://www.arm.com/architecture/cpu/morello
https://developer.arm.com/documentation/ddi0606/latest
Also found:
https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst

In effect, it seems like the program can't be allowed to craft or modify
its own pointers directly, as code doing so would break its whole memory
protection scheme. In effect, pointers would need to be derived from
other pointers.

I guess it works, but I am slightly skeptical of this idea:
Making all the pointers twice the size is not ideal for memory footprint;
This would have a potentially significant impact on existing codebases:
  They can no longer assume sizeof(long)==sizeof(void *);
  Code can no longer bit-twiddle their pointers;
  ...

While there are ways one would allow for 64-bit pointers, there isn't a
good way to do so without (also) blowing a massive hole in the whole
security scheme.

It effectively means that any code which relies on NaN tagging, tagged
pointers, or other sorts of pointer hackery, would need to be rewritten
to work on such an architecture.
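
To make the concern concrete, here is a minimal sketch of the kind of
low-bit pointer tagging such code relies on. The helper names (tag_ptr,
ptr_tag, untag_ptr) and the 3-bit tag width are illustrative assumptions,
not taken from any particular codebase. The trick depends on a pointer
surviving a round trip through a plain integer, which is exactly the
assumption the post expects a capability scheme to put at risk.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers: stash a 3-bit tag in the low bits of an
   8-byte-aligned pointer. */
#define TAG_MASK ((uintptr_t)0x7)

static void *tag_ptr(void *p, unsigned tag)
{
    uintptr_t bits = (uintptr_t)p;
    assert((bits & TAG_MASK) == 0);   /* pointer must be 8-byte aligned */
    assert(tag <= TAG_MASK);          /* tag must fit in 3 bits */
    return (void *)(bits | tag);
}

/* Read the tag back out of the low bits. */
static unsigned ptr_tag(void *p)
{
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Clear the tag bits to recover the usable pointer. */
static void *untag_ptr(void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

A tagged pointer must always go through untag_ptr before being
dereferenced; the integer arithmetic in between is the part that would
need auditing on a capability target.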

I slightly prefer my "throw Unix-style file protections and ACLs at RAM"
approach, since this need not have any visible impact on how C operates
(pointers can remain 64 bits, code is free to bit-twiddle its pointers,
....).

Some properties of capabilities could be emulated (albeit imperfectly)
by using more conventional ASLR strategies (would be more effective in
an expanded 96-bit address space, as this could make it computationally
infeasible to "guess" the addresses to things).

Though, admittedly, if I were designing it now, I would likely have
skipped over the UID/GID part of VUGID and gone for using ACL checks
exclusively (the ACL check mechanism is more "general purpose" and also
takes up fewer bits in the TLB and page-table entries due to not needing
to store the access-mode bits directly in the TLBE/PTE).

In other cases, the protections from "true" capabilities become
unnecessary, since an MMU ACL check does not need to care how one
arrived at an address, merely whether or not the executing code is
allowed to access it (also cheaper to implement as well, at least with
the ACL miss logic offloaded to an ISR).

One can also gain implicit bounds checking via the huge expanses of
"nothingness" which would surround most larger memory allocations.

Within 96-bit land, it could make sense to enforce a minimum distance of
32GB between mmaps, as this would in effect make it "technically
impossible" for an out-of-bounds access on one object to collide with
any other object.
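
The post does not say where the 32GB figure comes from. One plausible
reading, which is my assumption and not something stated above, is that
it matches the farthest byte reachable from an array base with an
unsigned 32-bit index and 8-byte elements. A small sketch of that
arithmetic (function names are hypothetical):

```c
#include <assert.h>
#include <stdint.h>

#define GUARD_BYTES (32ull << 30)   /* 32GB gap, figure from the post */

/* Farthest byte offset from an array base that can be touched with an
   unsigned 32-bit index and the given element size (assumed model). */
static uint64_t max_reach(uint64_t elem_size)
{
    assert(elem_size > 0);
    return (uint64_t)UINT32_MAX * elem_size + (elem_size - 1);
}

/* Returns 1 if every such access lands short of the guard distance. */
static int gap_covers(uint64_t elem_size)
{
    return max_reach(elem_size) < GUARD_BYTES;
}
```

Under this model the gap covers elements of up to 8 bytes; 16-byte
elements or 64-bit indices could still reach past it, so the guarantee
is only as strong as the indexing assumptions.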

The "big expanse of nothingness" approach would not offer protection for
normal heap objects or stack allocations, but capabilities will not
offer this either unless there are instructions to compose a smaller
bounds-checked capability from a bigger one (would require more looking
into).

In my case, there were separate (software enforced) schemes for
bounds-checked arrays (as part of the pointer type-tagging scheme):
  0xxx: Pointer, xxx=TypeTag
  2xxx: Bounded Array, xxx=size (0..4095)
  Cxxx: Displaced Array, xxx=Offset (0..4095)
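
As a rough illustration of the "2xxx" case, the sketch below packs a
bounded-array reference into 64 bits. It assumes the 16-bit tag field
sits in bits 63:48 above a 48-bit address (consistent with the XMOV
layout given later in the post), and that the size field counts
elements; both are assumptions, as are all the function names.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_MASK   0x0000FFFFFFFFFFFFull   /* bits 47:0 */
#define TAG_BOUNDED 0x2ull                  /* the "2xxx" case */

/* Build a bounded-array reference: tag in 63:60, size in 59:48. */
static uint64_t make_bounded(uint64_t addr, unsigned size)
{
    assert((addr & ~ADDR_MASK) == 0);   /* address must fit in 48 bits */
    assert(size <= 4095);               /* size must fit in 12 bits */
    return (TAG_BOUNDED << 60) | ((uint64_t)size << 48) | addr;
}

static unsigned ref_tag(uint64_t ref)  { return (unsigned)(ref >> 60); }
static unsigned ref_size(uint64_t ref) { return (unsigned)((ref >> 48) & 0xFFF); }
static uint64_t ref_addr(uint64_t ref) { return ref & ADDR_MASK; }

/* Software bounds check: 1 if index is valid for this reference. */
static int index_ok(uint64_t ref, uint64_t index)
{
    return ref_tag(ref) == TAG_BOUNDED && index < ref_size(ref);
}
```

A compiler or runtime would call index_ok (or inline its equivalent)
before each array access; that extra check is the software-enforced
part the post refers to.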

Similar could be done with XMOV pointers, just with the bounds-check
expanded to 28 bits:
  (127:112): Bounds(27:12)
  (111: 64): Address(95:48)
  ( 63: 60): Tag (0010)
  ( 59: 48): Bounds(11:0)
  ( 47:  0): Address(47:0)
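
The bit layout above can be sketched as a pair of 64-bit words. The
struct and function names here are hypothetical, and splitting the
96-bit address into two 48-bit arguments is just a convenience for the
example.

```c
#include <assert.h>
#include <stdint.h>

/* hi holds bits 127:64, lo holds bits 63:0 of the 128-bit pointer. */
typedef struct { uint64_t hi, lo; } xptr;

#define LOW48 0x0000FFFFFFFFFFFFull

/* addr_hi = Address(95:48), addr_lo = Address(47:0), bounds = 28 bits. */
static xptr xptr_pack(uint64_t addr_hi, uint64_t addr_lo, uint32_t bounds)
{
    xptr p;
    assert((addr_hi & ~LOW48) == 0 && (addr_lo & ~LOW48) == 0);
    assert(bounds < (1u << 28));
    /* (127:112) Bounds(27:12), (111:64) Address(95:48) */
    p.hi = ((uint64_t)(bounds >> 12) << 48) | addr_hi;
    /* (63:60) Tag 0010, (59:48) Bounds(11:0), (47:0) Address(47:0) */
    p.lo = (0x2ull << 60) | ((uint64_t)(bounds & 0xFFF) << 48) | addr_lo;
    return p;
}

/* Reassemble the 28-bit bounds field from its two halves. */
static uint32_t xptr_bounds(xptr p)
{
    return (uint32_t)(((p.hi >> 48) << 12) | ((p.lo >> 48) & 0xFFF));
}
```

Note that the low word has the same shape as the 64-bit bounded-array
form, so the low 12 bounds bits and the tag stay where existing code
would expect them.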

It is possible that, if it really mattered, a case for XMOV instructions
with integrated bounds-checking could be added (to avoid needing to
spend a few extra clock cycles on external bounds checks).

Using out-of-band tag bits to make registers and memory "tamper
resistant" is an interesting idea. There are a few places where this
could be useful.

Main issue is how to do this cost-effectively.
  Option 1: Another specialized L1 cache (undesirable);
  Option 2: Make cache lines several bits larger.
    L2 now needs to deal with it, somehow...

Most ideas I can think of would either require another specialized
cache, or effectively result in non-power-of-2 memory addressing, or both.
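
For a sense of scale, here is a small sketch of how many out-of-band
tag bits would be involved, assuming one tag bit per 128-bit slot as in
the Morello description earlier in the post. The 64-byte line size used
in the note below is my assumption, and the function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CAP_BYTES 16u   /* one 128-bit capability slot */

/* Extra tag bits a cache line would need to carry (Option 2). */
static unsigned tag_bits_per_line(unsigned line_bytes)
{
    assert(line_bytes % CAP_BYTES == 0);
    return line_bytes / CAP_BYTES;
}

/* Bytes of tag storage needed to cover a region of memory. */
static uint64_t tag_store_bytes(uint64_t mem_bytes)
{
    return mem_bytes / CAP_BYTES / 8;
}
```

With 64-byte lines this works out to 4 extra bits per line, and 8MB of
tag storage per 1GB of RAM (a 1/128 overhead). The cost is small in
bits; the awkward part, as noted above, is where those bits live.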

Any thoughts?...

Re: Idle: Capability Addressing, Future or Boondoggle

<00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23181&group=comp.arch#23181
Newsgroups: comp.arch
Date: Thu, 27 Jan 2022 15:40:32 -0800 (PST)
In-Reply-To: <ssv7ff$9n3$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
Message-ID: <00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
From: MitchAl...@aol.com (MitchAlsup)
 by: MitchAlsup - Thu, 27 Jan 2022 23:40 UTC

On Thursday, January 27, 2022 at 4:48:18 PM UTC-6, BGB wrote:
> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
> Morello architecture, which is effectively a modified Aarch64, except:
> Expands GPRs to 129 bits (internally);
> Integer ISA remains 64-bit;
> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
> addresses (the rest of the bits are used as protection flags and bounds
> checks);
> In effect, all pointers expand to 128 bits in memory;
> Code isn't allowed to craft its own pointers (directly) because doing so
> would break its memory protection scheme (the 129th bit is stored
> separately, seemingly to track which memory locations contain pointers,
> and to disallow loading a pointer from memory which has been used for data);
> ...
>
> From the original thread:
> https://www.theregister.com/2022/01/21/arm_morello_testing/
> https://www.arm.com/architecture/cpu/morello
> https://developer.arm.com/documentation/ddi0606/latest
> Also found:
>
> https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst
>
>
> In effect, it seems like the program can't be allowed to craft or modify
> its own pointers directly, as code doing so would break its whole memory
> protection scheme. In effect, pointers would need to be derived from
> other pointers.
>
I am sitting around wondering whether there is an exploit based on having
the OS as the upper (negative) part of the user's address space. My guess
is that there is... so, unless they rewrite Linux, this is not much more than
an experiment to see if anyone will "buy" this.
>
> I guess it works, but I am slightly skeptical of this idea:
> Making all the pointers twice the size is not ideal for memory footprint;
> This would have a potentially significant impact on existing codebases:
> They can no longer assume sizeof(long)==sizeof(void *);
> Code can no longer bit-twiddle their pointers;
<
Which might be a GOOD thing, but I digress.
<
> ...
>
> While there are ways one would allow for 64-bit pointers, there isn't a
> good way to do so without (also) blowing a massive hole in the whole
> security scheme.
<
<
I do not see how to create an array/matrix that consumes 60+ bits of address
space.
>
> It effectively means that any code which relies on NaN tagging, tagged
> pointers, or other sorts of pointer hackery, would need to be rewritten to
> work on such an architecture.
>
No pity from me, here.
>
>
> I slightly prefer my "throw Unix-style file protections and ACLs at RAM"
> approach, since this need not have any visible impact on how C operates
> (pointers can remain 64 bits, code is free to bit-twiddle its pointers,
> ...).
>
> Some properties of capabilities could be emulated (albeit imperfectly)
> by using more conventional ASLR strategies (would be more effective in
> an expanded 96-bit address space, as this could make it computationally
> infeasible to "guess" the addresses to things).
<
My guess is that ROP attacks will still be feasible; I don't see how this
alters Meltdown- or Spectre-like attacks.
>
> Though, admittedly, if I were designing it now, I would likely have
> skipped over the UID/GID part of VUGID and gone for using ACL checks
> exclusively (the ACL check mechanism is more "general purpose" and also
> takes up fewer bits in the TLB and page-table entries due to not needing
> to store the access-mode bits directly in the TLBE/PTE).
>
>
> In other cases, the protections from "true" capabilities become
> unnecessary, since an MMU ACL check does not need to care how one
> arrived at an address, merely whether or not the executing code is
> allowed to access it (also cheaper to implement as well, at least with
> the ACL miss logic offloaded to an ISR).
>
>
> One can also gain implicit bounds checking via the huge expanses of
> "nothingness" which would surround most larger memory allocations.
>
> Within 96-bit land, it could make sense to enforce a minimum distance of
> 32GB between mmaps, as this would in effect make it "technically
> impossible" for an out-of-bounds access on one object to collide with
> any other object.
>
>
> The "big expanse of nothingness" approach would not offer protection for
> normal heap objects or stack allocations, but capabilities will not
> offer this either unless there are instructions to compose a smaller
> bounds-checked capability from a bigger one (would require more looking
> into).
>
>
> In my case, there were separate (software enforced) schemes for
> bounds-checked arrays (as part of the pointer type-tagging scheme):
> 0xxx: Pointer, xxx=TypeTag
> 2xxx: Bounded Array, xxx=size (0..4095)
> Cxxx: Displaced Array, xxx=Offset (0..4095)
>
>
> Similar could be done with XMOV pointers, just with the bounds-check
> expanded to 28 bits:
> (127:112): Bounds(27:12)
> (111: 64): Address(95:48)
> ( 63: 60): Tag (0010)
> ( 59: 48): Bounds(11:0)
> ( 47: 0): Address(47:0)
>
> It is possible that, if it really mattered, a case for XMOV instructions
> with integrated bounds-checking could be added (to avoid needing to
> spend a few extra clock cycles on external bounds checks).
>
>
>
> Using out-of-band tag bits to make registers and memory "tamper
> resistant" is an interesting idea. There are a few places where this
> could be useful.
>
> Main issue is how to do this cost-effectively.
> Option 1: Another specialized L1 cache (undesirable);
> Option 2: Make cache lines several bits larger.
> L2 now needs to deal with it, somehow...
>
> Most ideas I can think of would either require another specialized
> cache, or effectively result in non-power-of-2 memory addressing, or both.
>
>
> Any thoughts?...
<
Will go down in the history books as another attempt at capabilities.

Re: Idle: Capability Addressing, Future or Boondoggle

<jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>

https://www.novabbs.com/devel/article-flat.php?id=23182&group=comp.arch#23182
From: monn...@iro.umontreal.ca (Stefan Monnier)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 19:23:32 -0500
Organization: A noiseless patient Spider
Lines: 30
Message-ID: <jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
References: <ssv7ff$9n3$1@dont-email.me>
 by: Stefan Monnier - Fri, 28 Jan 2022 00:23 UTC

> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM Morello
> architecture, which is effectively a modified Aarch64, except:
> Expands GPRs to 129 bits (internally);
> Integer ISA remains 64-bit;
> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
> addresses (the rest of the bits are used as protection flags and bounds
> checks);
> In effect, all pointers expand to 128 bits in memory;
> Code isn't allowed to craft its own pointers (directly) because doing so
> would break its memory protection scheme (the 129th bit is stored
> separately, seemingly to track which memory locations contain pointers, and
> to disallow loading a pointer from memory which has been used for data);

AFAIK the main issue with such things is what they do about dangling
pointers, i.e. dangling capabilities. Do they rely on a GC that's part
of the "trusted runtime"? or do they disallow deallocation altogether?
or do they rely on address-space randomization to make such dangling
capabilities "harmless" (they'll hopefully only make you crash but are
harder to exploit)?

The designs that keep the actual capabilities in a kind of
separate/secured table (so the untrusted code only handles references to
these capabilities and can thus do anything it wants within its sandbox
without needing any special XX=1bit values) have a much easier time,
since they can much more easily ensure the absence of
references/capabilities before deallocating resources (or they can mark
them as dead).
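
A minimal sketch of that table-based approach, under the assumption
that untrusted code holds only small integer handles and a trusted
layer owns the table. All names (cap_table, cap_grant, cap_revoke,
cap_deref) are hypothetical, and slots are deliberately never reused so
that a stale handle cannot alias a newer object.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_SLOTS 16

/* One entry in the trusted capability table. */
typedef struct {
    void  *base;     /* start of the object */
    size_t length;   /* object size in bytes */
    int    live;     /* cleared on revocation ("marked as dead") */
} cap_entry;

static cap_entry cap_table[TABLE_SLOTS];
static int next_slot = 0;

/* Hand out a handle for an object; returns -1 if the table is full. */
static int cap_grant(void *base, size_t length)
{
    if (next_slot >= TABLE_SLOTS)
        return -1;
    cap_table[next_slot] = (cap_entry){ base, length, 1 };
    return next_slot++;
}

/* Mark the entry dead before the underlying memory is freed. */
static void cap_revoke(int h)
{
    assert(h >= 0 && h < next_slot);
    cap_table[h].live = 0;
}

/* Checked access: NULL for bad, dead, or out-of-bounds handles. */
static void *cap_deref(int h, size_t offset)
{
    if (h < 0 || h >= next_slot)
        return NULL;
    if (!cap_table[h].live || offset >= cap_table[h].length)
        return NULL;
    return (char *)cap_table[h].base + offset;
}
```

Because every access goes through the table, revoking one entry
invalidates every copy of the handle at once, which is the "much easier
time" with dangling references described above. The price is an extra
indirection on each access.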

Stefan

Re: Idle: Capability Addressing, Future or Boondoggle

<ssve9b$hf8$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23183&group=comp.arch#23183
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 16:44:26 -0800
Organization: A noiseless patient Spider
Lines: 146
Message-ID: <ssve9b$hf8$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
In-Reply-To: <ssv7ff$9n3$1@dont-email.me>
 by: Ivan Godard - Fri, 28 Jan 2022 00:44 UTC

On 1/27/2022 2:48 PM, BGB wrote:
> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
> Morello architecture, which is effectively a modified Aarch64, except:
> Expands GPRs to 129 bits (internally);
> Integer ISA remains 64-bit;
> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
> addresses (the rest of the bits are used as protection flags and bounds
> checks);
> In effect, all pointers expand to 128 bits in memory;
> Code isn't allowed to craft its own pointers (directly) because doing so
> would break its memory protection scheme (the 129th bit is stored
> separately, seemingly to track which memory locations contain pointers,
> and to disallow loading a pointer from memory which has been used for
> data);
> ...
>
> From the original thread:
>   https://www.theregister.com/2022/01/21/arm_morello_testing/
>   https://www.arm.com/architecture/cpu/morello
>   https://developer.arm.com/documentation/ddi0606/latest
> Also found:
>
> https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst
>
>
>
> In effect, it seems like the program can't be allowed to craft or modify
> its own pointers directly, as code doing so would break its whole memory
> protection scheme. In effect, pointers would need to be derived from
> other pointers.
>
>
> I guess it works, but I am slightly skeptical of this idea:
> Making all the pointers twice the size is not ideal for memory footprint;
> This would have a potentially significant impact on existing codebases:
>   They can no longer assume sizeof(long)==sizeof(void *);
>   Code can no longer bit-twiddle their pointers;
>   ...
>
> While there are ways one would allow for 64-bit pointers, there isn't a
> good way to do so without (also) blowing a massive hole in the whole
> security scheme.
>
> It effectively means that any code which relies on NaN tagging, tagged
> pointers, or other sorts of pointer hackery, would need to be rewritten to
> work on such an architecture.
>
>
>
> I slightly prefer my "throw Unix-style file protections and ACLs at RAM"
> approach, since this need not have any visible impact on how C operates
> (pointers can remain 64 bits, code is free to bit-twiddle its pointers,
> ...).
>
> Some properties of capabilities could be emulated (albeit imperfectly)
> by using more conventional ASLR strategies (would be more effective in
> an expanded 96-bit address space, as this could make it computationally
> infeasible to "guess" the addresses to things).
>
> Though, admittedly, if I were designing it now, I would likely have
> skipped over the UID/GID part of VUGID and gone for using ACL checks
> exclusively (the ACL check mechanism is more "general purpose" and also
> takes up fewer bits in the TLB and page-table entries due to not needing
> to store the access-mode bits directly in the TLBE/PTE).
>
>
> In other cases, the protections from "true" capabilities become
> unnecessary, since an MMU ACL check does not need to care how one
> arrived at an address, merely whether or not the executing code is
> allowed to access it (also cheaper to implement as well, at least with
> the ACL miss logic offloaded to an ISR).
>
>
> One can also gain implicit bounds checking via the huge expanses of
> "nothingness" which would surround most larger memory allocations.
>
> Within 96-bit land, it could make sense to enforce a minimum distance of
> 32GB between mmaps, as this would in effect make it "technically
> impossible" for an out-of-bounds access on one object to collide with
> any other object.
>
>
> The "big expanse of nothingness" approach would not offer protection for
> normal heap objects or stack allocations, but capabilities will not
> offer this either unless there are instructions to compose a smaller
> bounds-checked capability from a bigger one (would require more looking
> into).
>
>
> In my case, there were separate (software enforced) schemes for
> bounds-checked arrays (as part of the pointer type-tagging scheme):
>   0xxx: Pointer, xxx=TypeTag
>   2xxx: Bounded Array, xxx=size (0..4095)
>   Cxxx: Displaced Array, xxx=Offset (0..4095)
>
>
> Similar could be done with XMOV pointers, just with the bounds-check
> expanded to 28 bits:
>   (127:112): Bounds(27:12)
>   (111: 64): Address(95:48)
>   ( 63: 60): Tag (0010)
>   ( 59: 48): Bounds(11:0)
>   ( 47:  0): Address(47:0)
>
> It is possible that, if it really mattered, a case for XMOV instructions
> with integrated bounds-checking could be added (to avoid needing to
> spend a few extra clock cycles on external bounds checks).
>
>
>
> Using out-of-band tag bits to make registers and memory "tamper
> resistant" is an interesting idea. There are a few places where this
> could be useful.
>
> Main issue is how to do this cost-effectively.
>   Option 1: Another specialized L1 cache (undesirable);
>   Option 2: Make cache lines several bits larger.
>     L2 now needs to deal with it, somehow...
>
> Most ideas I can think of would either require another specialized
> cache, or effectively result in non-power-of-2 memory addressing, or both.
>
>
> Any thoughts?...

Many thoughts.

I'm a great fan of caps systems; I wish Mill could have been a caps
architecture.

The problem is that a move to 128-bit pointers breaks existing code
that assumes what the pointer size is, due to externally defined
interfaces such as data layouts. The market was willing to tolerate such
breaks in the moves 16->32->64 bit addresses because the older, smaller
sizes became hopelessly unusable. But a move to 128? No - people will say
"My code runs fine with 64-bit pointers. These caps thingummies claim to
offer a bunch of new benefits, but I don't yet have to have them, so
I'll look into it - someday."

A secondary issue is that there's an enormous mass of unstructured
monolithic code out there that would need major redesign to use caps
for anything more than bounds checking. Bounds checking is the least of
the gain from caps. But the legacy data format issue is why we went with
a grant system. It helps that we can do bounds checking without needing
expanded data.

Re: Idle: Capability Addressing, Future or Boondoggle

<ssvegu$gop$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23184&group=comp.arch#23184
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 16:48:30 -0800
Organization: A noiseless patient Spider
Lines: 76
Message-ID: <ssvegu$gop$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
In-Reply-To: <00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
 by: Ivan Godard - Fri, 28 Jan 2022 00:48 UTC

On 1/27/2022 3:40 PM, MitchAlsup wrote:
> On Thursday, January 27, 2022 at 4:48:18 PM UTC-6, BGB wrote:
>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
>> Morello architecture, which is effectively a modified Aarch64, except:
>> Expands GPRs to 129 bits (internally);
>> Integer ISA remains 64-bit;
>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
>> addresses (the rest of the bits are used as protection flags and bounds
>> checks);
>> In effect, all pointers expand to 128 bits in memory;
>> Code isn't allowed to craft its own pointers (directly) because doing so
>> would break its memory protection scheme (the 129th bit is stored
>> separately, seemingly to track which memory locations contain pointers,
>> and to disallow loading a pointer from memory which has been used for data);
>> ...
>>
>> From the original thread:
>> https://www.theregister.com/2022/01/21/arm_morello_testing/
>> https://www.arm.com/architecture/cpu/morello
>> https://developer.arm.com/documentation/ddi0606/latest
>> Also found:
>>
>> https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst
>>
>>
>> In effect, it seems like the program can't be allowed to craft or modify
>> its own pointers directly, as code doing so would break its whole memory
>> protection scheme. In effect, pointers would need to be derived from
>> other pointers.
>>
> I am sitting around wondering whether there is an exploit based on having
> the OS as the upper (negative) part of the user's address space. My guess
> is that there is... so, unless they rewrite Linux, this is not much more than
> an experiment to see if anyone will "buy" this.
>>
>> I guess it works, but I am slightly skeptical of this idea:
>> Making all the pointers twice the size is not ideal for memory footprint;
>> This would have a potentially significant impact on existing codebases:
>> They can no longer assume sizeof(long)==sizeof(void *);
>> Code can no longer bit-twiddle their pointers;
> <
> Which might be a GOOD thing, but I digress.
> <
>> ...
>>
>> While there are ways one would allow for 64-bit pointers, there isn't a
>> good way to do so without (also) blowing a massive hole in the whole
>> security scheme.
> <
> <
> I do not see how to create an array/matrix that consumes 60+ bits of address
> space.
>>
>> It effectively means that any code which relies on NaN tagging, tagged
>> pointers, or other sorts of pointer hackery, would need to be rewritten to
>> work on such an architecture.
>>
> No pity from me, here.
>>
>>
>> I slightly prefer my "throw Unix-style file protections and ACLs at RAM"
>> approach, since this need not have any visible impact on how C operates
>> (pointers can remain 64 bits, code is free to bit-twiddle its pointers,
>> ...).
>>
>> Some properties of capabilities could be emulated (albeit imperfectly)
>> by using more conventional ASLR strategies (would be more effective in
>> an expanded 96-bit address space, as this could make it computationally
>> infeasible to "guess" the addresses to things).
> <
> My guess is that ROP attacks will still be feasible; I don't see how this
> alters Meltdown- or Spectre-like attacks.

ROP can be blocked because the return address is a cap and can't be
diddled. Meltdown and Spectre don't depend on addressing and so are
unhindered.

Re: Idle: Capability Addressing, Future or Boondoggle

<ssves4$km1$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23185&group=comp.arch#23185
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 16:54:28 -0800
Organization: A noiseless patient Spider
Lines: 40
Message-ID: <ssves4$km1$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
In-Reply-To: <jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
Content-Language: en-US
 by: Ivan Godard - Fri, 28 Jan 2022 00:54 UTC

On 1/27/2022 4:23 PM, Stefan Monnier wrote:
>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM Morello
>> architecture, which is effectively a modified Aarch64, except:
>> Expands GPRs to 129 bits (internally);
>> Integer ISA remains 64-bit;
>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
>> addresses (the rest of the bits are used as protection flags and bounds
>> checks);
>> In effect, all pointers expand to 128 bits in memory;
>> Code isn't allowed to craft its own pointers (directly) because doing so
>> would break its memory protection scheme (the 129th bit is stored
>> separately, seemingly to track which memory locations contain pointers, and
>> to disallow loading a pointer from memory which has been used for data);
>
> AFAIK the main issue with such things is what they do about dangling
> pointers, i.e. dangling capabilities. Do they rely on a GC that's part
> of the "trusted runtime"? or do they disallow deallocation altogether?
> or do they rely on address-space randomization to make such dangling
> capabilities "harmless" (they'll hopefully only make you crash but are
> harder to exploit)?
>
> The designs that keep the actual capabilities in a kind of
> separate/secured table (so the untrusted code only handles references to
> these capabilities and can thus do anything it wants within its sandbox
> without needing any special XX=1bit values) have a much easier time
> since they can much more easily ensure the absence of
> references/capabilities before deallocating resources, or they can mark
> them as dead).
>
>
> Stefan

Isn't it Terje who says anything can be done with another layer of
indirection?

Revoking permissions (and scavenging the permitted resources) is a very
hard problem in any permission system, including caps - and paging
tables too, for that matter. These days narrow-cast caps (a.k.a. handles)
typically use use-count GC, which in practice works OK for well-behaved
users. Let a DoS attacker onto your system, though...
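
To make the use-count idea concrete, here is a minimal sketch in C, assuming a small fixed-size handle table. Everything in it (slot_t, handle_open, handle_dup, handle_close, the 16-bit generation field, SLOT_COUNT) is a made-up illustration, not the scheme of any particular system. Untrusted code only ever holds the integer handle; the real reference lives in the table, so the table can mark a slot dead once the last use goes away.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT 16u              /* hypothetical table size */

typedef struct {
    void    *resource;              /* NULL while the slot is free */
    uint32_t use_count;             /* outstanding references to this slot */
    uint16_t generation;            /* bumped on release so stale handles die */
} slot_t;

typedef uint32_t handle_t;          /* (generation << 16) | slot index */
#define HANDLE_INVALID 0xFFFFFFFFu

static slot_t table[SLOT_COUNT];

/* Look up the slot behind a handle; NULL if the handle is stale or bogus. */
static slot_t *handle_slot(handle_t h) {
    uint32_t index = h & 0xFFFFu;
    if (index >= SLOT_COUNT) return NULL;
    slot_t *s = &table[index];
    if (s->resource == NULL || s->generation != (uint16_t)(h >> 16)) return NULL;
    return s;
}

/* Grant a new handle for a resource, with an initial use count of 1. */
static handle_t handle_open(void *resource) {
    assert(resource != NULL);
    for (uint32_t i = 0; i < SLOT_COUNT; i++) {
        if (table[i].resource == NULL) {
            table[i].resource = resource;
            table[i].use_count = 1;
            return ((uint32_t)table[i].generation << 16) | i;
        }
    }
    return HANDLE_INVALID;          /* table exhausted */
}

/* Resolve a handle to its resource, or NULL if it is no longer valid. */
static void *handle_deref(handle_t h) {
    slot_t *s = handle_slot(h);
    return s ? s->resource : NULL;
}

/* Take another reference on a live handle; returns 1 on success. */
static int handle_dup(handle_t h) {
    slot_t *s = handle_slot(h);
    if (s == NULL) return 0;
    s->use_count++;
    return 1;
}

/* Drop one reference; the slot is scavenged when the count reaches zero. */
static int handle_close(handle_t h) {
    slot_t *s = handle_slot(h);
    if (s == NULL) return 0;
    if (--s->use_count == 0) {
        s->resource = NULL;
        s->generation++;            /* any copy of the old handle is now dead */
    }
    return 1;
}
```

The weak spot is visible in handle_open: a user who keeps opening or duplicating handles and never closes them can exhaust the table, which is one form of the denial-of-service problem mentioned above.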

Re: Idle: Capability Addressing, Future or Boondoggle

<ssvkuk$m3k$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23186&group=comp.arch#23186
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 20:38:10 -0600
Organization: A noiseless patient Spider
Lines: 250
Message-ID: <ssvkuk$m3k$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 28 Jan 2022 02:38:12 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="173bd9d21c5738ccf991e2b2f387919d";
logging-data="22644"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/hEOaquW4CMljwyamg4fN2"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:J2iYOeqnonSIvbUxC36RJvfFjhw=
In-Reply-To: <00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
Content-Language: en-US
 by: BGB - Fri, 28 Jan 2022 02:38 UTC

On 1/27/2022 5:40 PM, MitchAlsup wrote:
> On Thursday, January 27, 2022 at 4:48:18 PM UTC-6, BGB wrote:
>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
>> Morello architecture, which is effectively a modified Aarch64, except:
>> Expands GPRs to 129 bits (internally);
>> Integer ISA remains 64-bit;
>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
>> addresses (the rest of the bits are used as protection flags and bounds
>> checks);
>> In effect, all pointers expand to 128 bits in memory;
>> Code isn't allowed to craft its own pointers (directly) because doing so
>> would break its memory protection scheme (the 129th bit is stored
>> separately, seemingly to track which memory locations contain pointers,
>> and to disallow loading a pointer from memory which has been used for data);
>> ...
>>
>> From the original thread:
>> https://www.theregister.com/2022/01/21/arm_morello_testing/
>> https://www.arm.com/architecture/cpu/morello
>> https://developer.arm.com/documentation/ddi0606/latest
>> Also found:
>>
>> https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst
>>
>>
>> In effect, it seems like the program can't be allowed to craft or modify
>> its own pointers directly, as code doing so would break its whole memory
>> protection scheme. In effect, pointers would need to be derived from
>> other pointers.
>>
> I am sitting around wondering whether there is an exploit based on having
> the OS as the upper (negative) part of the users address space. My guess
> is that there is......so, unless they rewrite Linux this is not much more than
> an experiment to see if anyone will "buy" this.

Unless there are fairly significant modifications to a lot of existing
code, it is quite possible that this would be dead on arrival.

Like, even if it is still (more or less) the same basic Aarch64 ISA at
its core, needing an extensive porting effort to make the existing
software ecosystem work on it is likely to be a deal-breaker for general
adoption.

This isn't exactly an "everything will work as-is" or even a "recompile
and it's all good" level of a change.

>>
>> I guess it works, but I am slightly skeptical of this idea:
>> Making all the pointers twice the size is not ideal for memory footprint;
>> This would have a potentially significant impact on existing codebases:
>> They can no longer assume sizeof(long)==sizeof(void *);
>> Code can no longer bit-twiddle their pointers;
> <
> Which might be a GOOD thing, but I digress.
> <

This would, however, break a lot of real world code.

>> ...
>>
>> While there are ways one would allow for 64-bit pointers, there isn't a
>> good way to do so without (also) blowing a massive hole in the whole
>> security scheme.
> <
> <
> I do not see how to create an array/matrix that consumes 60+ bits of address
> space.

Yeah, apparently this would not be possible on either Morello, or
"natively" on BJX2.

Morello:
The capability encodings will not allow for it.

BJX2:
Will not fit within a 48 bit quadrant;
Nor within a 33 bit displacement;
But, could be emulated via ALU ops...

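__lea_mega_b: //X4=Base (R4:R5 pair), R6=Disp; result in R2:R3
MOV 0x0000FFFFFFFFFFFF, R16 //mask for the low 48 address bits
MOV 0xFFFF000000000000, R17 //mask for the high 16 bits
AND R4, R16, R18 | AND R4, R17, R2 //R18=low 48 bits; R2=high 16 bits of R4
MOV R5, R3 | ADD R18, R6, R18 //R3=upper word; R18=low 48 bits + Disp
AND R18, R16, R20 | SHAD.Q R18, -48, R21 //R20=new low 48 bits; R21=carry out of bit 47
OR R20, R2 | ADD R21, R3 //recombine low word; add carry into upper word
RTS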

Not ideal, but probably works (could emulate a full 64-bit displacement
within the 96 bit addressing scheme).
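
For anyone who would rather read this as C, a rough sketch of the same arithmetic follows. The xptr_t struct and the lea_mega name are invented for illustration, and the layout (48 address bits plus 16 high bits in the low word, remaining address bits in the high word) is only what the assembly above appears to assume. The precondition in the assert is my reading of the 64-bit add: a displacement close to 2^63 would wrap before the carry is extracted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C view of a 128-bit pointer held in a register pair. */
typedef struct {
    uint64_t lo;   /* bits 47:0 = address(47:0), bits 63:48 = high/tag bits */
    uint64_t hi;   /* upper address bits */
} xptr_t;

#define LOW48 0x0000FFFFFFFFFFFFULL

/* Add a signed 64-bit displacement, carrying out of bit 47 into the high word. */
static xptr_t lea_mega(xptr_t base, int64_t disp) {
    /* low48 + disp must fit in a signed 64-bit value for the carry to be right */
    assert(disp <= INT64_MAX - (int64_t)LOW48);

    uint64_t high16 = base.lo & ~LOW48;
    uint64_t sum = (base.lo & LOW48) + (uint64_t)disp;   /* wraps mod 2^64 */

    /* Arithmetic shift right by 48, written without relying on
       implementation-defined signed shifts. */
    uint64_t carry = (sum >> 63) ? ~(~sum >> 48) : (sum >> 48);

    xptr_t result;
    result.lo = high16 | (sum & LOW48);
    result.hi = base.hi + carry;                         /* negative carry wraps */
    return result;
}
```

A positive displacement that crosses a 48-bit boundary bumps the high word by one; the matching negative displacement brings it back.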

Currently most things assume either an 'int' or 'unsigned int' displacement.

>>
>> It effectively means that any code which relies on NaN tagging, tagged
>> pointers, or other sorts of pointer hackery, would need to be rewritten to
>> work on such an architecture.
>>
> No pity from me, here.

Things like NaN tagging and similar are not exactly uncommon.
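
As a reminder of what this looks like in practice, here is a minimal NaN-boxing sketch in C. The encoding (PTR_TAG, the 48-bit payload) and the function names are assumptions chosen for illustration; real interpreters each pick their own layout. It also assumes user-space addresses fit in 48 bits, which holds on typical x86-64 systems but not everywhere.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef uint64_t value_t;   /* one 64-bit slot: either a double or a boxed pointer */

/* Hypothetical encoding: sign bit + all-ones exponent + quiet bit + one tag bit. */
#define PTR_TAG   0xFFFC000000000000ULL
#define PAYLOAD48 0x0000FFFFFFFFFFFFULL

/* Store a double by copying its bit pattern. */
static value_t box_double(double d) {
    value_t v;
    memcpy(&v, &d, sizeof v);
    return v;
}

static double unbox_double(value_t v) {
    double d;
    memcpy(&d, &v, sizeof d);
    return d;
}

/* A value is a boxed pointer if all tag bits are set. */
static int is_pointer(value_t v) {
    return (v & PTR_TAG) == PTR_TAG;
}

/* Hide a pointer inside the payload of a quiet NaN. */
static value_t box_pointer(void *p) {
    uintptr_t bits = (uintptr_t)p;
    assert((bits & ~PAYLOAD48) == 0);   /* assumes 48-bit addresses */
    return PTR_TAG | (value_t)bits;
}

/* Rebuild the pointer from plain integer bits. */
static void *unbox_pointer(value_t v) {
    assert(is_pointer(v));
    return (void *)(uintptr_t)(v & PAYLOAD48);
}
```

The last function is the problem case: it manufactures a pointer out of 48 bits of integer data, which is exactly the kind of thing a scheme where pointers must be derived from other pointers is meant to forbid.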

>>
>>
>> I slightly prefer my "throw Unix-style file protections and ACLs at RAM"
>> approach, since this need not have any visible impact on how C operates
>> (pointers can remain 64 bits, code is free to bit-twiddle its pointers,
>> ...).
>>
>> Some properties of capabilities could be emulated (albeit imperfectly)
>> by using more conventional ASLR strategies (would be more effective in
>> an expanded 96-bit address space, as this could make it computationally
>> infeasible to "guess" the addresses to things).
> <
> My guess is that ROP attacks will still be feasible; I don't see how this
> alters Meltdown- or Spectre-like attacks.

Yeah, dunno...

Some amount of temporal randomization might also be needed.

Things like stack canaries can at least help protect the return address
from tampering due to buffer overruns or similar.

One other partial protection (now enabled again in my compiler) is that
the layout of code within binaries is randomized on every rebuild. So,
the binary can be loaded both to a randomized address and also with
every function and variable shuffled into a random order. Should be
pretty hard to exploit at least assuming the program is recompiled
periodically.
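
The shuffling step itself is nothing exotic. A minimal sketch, assuming the linker stage has a flat list of symbols to place: a Fisher-Yates shuffle driven by a seed. The names (shuffle_symbols, xorshift64) are illustrative and are not taken from my compiler; the xorshift generator is only there to keep the example reproducible, and a real build would want the seed to come from a proper entropy source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small xorshift64 generator; stands in for a real entropy-seeded RNG. */
static uint64_t xorshift64(uint64_t *state) {
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Fisher-Yates shuffle of the order in which symbols will be laid out. */
static void shuffle_symbols(const char **symbols, size_t count, uint64_t seed) {
    assert(seed != 0);                       /* xorshift state must be nonzero */
    uint64_t state = seed;
    for (size_t i = count; i > 1; i--) {
        /* Pick j in [0, i); the slight modulo bias is ignored in this sketch. */
        size_t j = (size_t)(xorshift64(&state) % i);
        const char *tmp = symbols[i - 1];
        symbols[i - 1] = symbols[j];
        symbols[j] = tmp;
    }
}
```

With a different seed on each rebuild, the same set of functions and variables ends up in a different order, so offsets learned from one build do not carry over to the next.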

ASLR works OK so long as the RNG is sufficiently unpredictable, which in
turn depends on the availability of a decent entropy source.

A few examples:
Detecting "noise" via a microphone or antenna;
Metastable clocks;
The low order bits of a thermometer;
Entropy from an external source, such as the timing and checksums of
encountered network packets (on a network-connected device);
....

Then, say, the CPU sits around mining entropy and running an internal
random number generator, where code can fetch random numbers on request
(such as via CPUID or similar), or periodically reseeding some random
numbers used internally within the core.

Though, granted, some of this is still a little theoretical in my case,
since ASLR depends some on having programs running from virtual memory,
where virtual memory is still pretty experimental and thus far I haven't
been running programs from virtual memory addresses. May need to get
back to working on this part.

>>
>> Though, admittedly, if I were designing it now, I would likely have
>> skipped over the UID/GID part of VUGID and gone for using ACL checks
>> exclusively (the ACL check mechanism is more "general purpose" and also
>> takes up fewer bits in the TLB and page-table entries due to not needing
>> to store the access-mode bits directly in the TLBE/PTE).
>>
>>
>> In other cases, the protections from "true" capabilities become
>> unnecessary, since an MMU ACL check does not need to care how one
>> arrived at an address, merely whether or not the executing code is
>> allowed to access it (also cheaper to implement as well, at least with
>> the ACL miss logic offloaded to an ISR).
>>
>>
>> One can also gain implicit bounds checking via the huge expanses of
>> "nothingness" which would surround most larger memory allocations.
>>
>> Within 96-bit land, it could make sense to enforce a minimum distance of
>> 32GB between mmaps, as this would in effect make it "technically
>> impossible" for an out-of-bounds access on one object to collide with
>> any other object.
>>
>>
>> The "big expanse of nothingness" approach would not offer protection for
>> normal heap objects or stack allocations, but capabilities will not
>> offer this either unless there are instructions to compose a smaller
>> bounds-checked capability from a bigger one (would require more looking
>> into).
>>
>>
>> In my case, there were separate (software enforced) schemes for
>> bounds-checked arrays (as part of the pointer type-tagging scheme):
>> 0xxx: Pointer, xxx=TypeTag
>> 2xxx: Bounded Array, xxx=size (0..4095)
>> Cxxx: Displaced Array, xxx=Offset (0..4095)
>>
>>
>> Similar could be done with XMOV pointers, just with the bounds-check
>> expanded to 28 bits:
>> (127:112): Bounds(27:12)
>> (111: 64): Address(95:48)
>> ( 63: 60): Tag (0010)
>> ( 59: 48): Bounds(11:0)
>> ( 47: 0): Address(47:0)
>>
>> It is possible that, if it really mattered, a case for XMOV instructions
>> with integrated bounds-checking could be added (to avoid needing to
>> spend a few extra clock cycles on external bounds checks).
>>
>>
>>
>> Using out-of-band tag bits to make registers and memory "tamper
>> resistant" is an interesting idea. There are a few places where this
>> could be useful.
>>
>> Main issue is how to do this cost-effectively.
>> Option 1: Another specialized L1 cache (undesirable);
>> Option 2: Make cache lines several bits larger.
>> L2 now needs to deal with it, somehow...
>>
>> Most ideas I can think of would either require another specialized
>> cache, or effectively result in non-power-of-2 memory addressing, or both.
>>
>>
>> Any thoughts?...
> <
> Will go down in the history books as another attempt at capabilities.


Re: Idle: Capability Addressing, Future or Boondoggle

<ssvnji$7gl$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23187&group=comp.arch#23187
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 21:23:29 -0600
Organization: A noiseless patient Spider
Lines: 203
Message-ID: <ssvnji$7gl$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me> <ssve9b$hf8$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 28 Jan 2022 03:23:30 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="173bd9d21c5738ccf991e2b2f387919d";
logging-data="7701"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19BgIytLoMwCCSDBLZ8x3+2"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:xikIk3zDkhl0elDE0glvYFtdVyk=
In-Reply-To: <ssve9b$hf8$1@dont-email.me>
Content-Language: en-US
 by: BGB - Fri, 28 Jan 2022 03:23 UTC

On 1/27/2022 6:44 PM, Ivan Godard wrote:
> On 1/27/2022 2:48 PM, BGB wrote:
>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
>> Morello architecture, which is effectively a modified Aarch64, except:
>> Expands GPRs to 129 bits (internally);
>> Integer ISA remains 64-bit;
>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
>> addresses (the rest of the bits are used as protection flags and
>> bounds checks);
>> In effect, all pointers expand to 128 bits in memory;
>> Code isn't allowed to craft its own pointers (directly) because doing
>> so would break its memory protection scheme (the 129th bit is stored
>> separately, seemingly to track which memory locations contain
>> pointers, and to disallow loading a pointer from memory which has been
>> used for data);
>> ...
>>
>>  From the original thread:
>>    https://www.theregister.com/2022/01/21/arm_morello_testing/
>>    https://www.arm.com/architecture/cpu/morello
>>    https://developer.arm.com/documentation/ddi0606/latest
>> Also found:
>>
>> https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst
>>
>>
>>
>> In effect, it seems like the program can't be allowed to craft or
>> modify its own pointers directly, as code doing so would break its
>> whole memory protection scheme. In effect, pointers would need to be
>> derived from other pointers.
>>
>>
>> I guess it works, but I am slightly skeptical of this idea:
>> Making all the pointers twice the size is not ideal for memory footprint;
>> This would have a potentially significant impact on existing codebases:
>>    They can no longer assume sizeof(long)==sizeof(void *);
>>    Code can no longer bit-twiddle their pointers;
>>    ...
>>
>> While there are ways one would allow for 64-bit pointers, there isn't
>> a good way to do so without (also) blowing a massive hole in the whole
>> security scheme.
>>
>> It effectively means that any code which relies on NaN tagging, tagged
>> pointers, or other sorts of pointer hackery, would need to be rewritten
>> to work on such an architecture.
>>
>>
>>
>> I slightly prefer my "throw Unix-style file protections and ACLs at
>> RAM" approach, since this need not have any visible impact on how C
>> operates (pointers can remain 64 bits, code is free to bit-twiddle its
>> pointers, ...).
>>
>> Some properties of capabilities could be emulated (albeit imperfectly)
>> by using more conventional ASLR strategies (would be more effective in
>> an expanded 96-bit address space, as this could make it
>> computationally infeasible to "guess" the addresses to things).
>>
>> Though, admittedly, if I were designing it now, I would likely have
>> skipped over the UID/GID part of VUGID and gone for using ACL checks
>> exclusively (the ACL check mechanism is more "general purpose" and
>> also takes up fewer bits in the TLB and page-table entries due to not
>> needing to store the access-mode bits directly in the TLBE/PTE).
>>
>>
>> In other cases, the protections from "true" capabilities become
>> unnecessary, since an MMU ACL check does not need to care how one
>> arrived at an address, merely whether or not the executing code is
>> allowed to access it (also cheaper to implement as well, at least with
>> the ACL miss logic offloaded to an ISR).
>>
>>
>> One can also gain implicit bounds checking via the huge expanses of
>> "nothingness" which would surround most larger memory allocations.
>>
>> Within 96-bit land, it could make sense to enforce a minimum distance
>> of 32GB between mmaps, as this would in effect make it "technically
>> impossible" for an out-of-bounds access on one object to collide with
>> any other object.
>>
>>
>> The "big expanse of nothingness" approach would not offer protection
>> for normal heap objects or stack allocations, but capabilities will
>> not offer this either unless there are instructions to compose a
>> smaller bounds-checked capability from a bigger one (would require
>> more looking into).
>>
>>
>> In my case, there were separate (software enforced) schemes for
>> bounds-checked arrays (as part of the pointer type-tagging scheme):
>>    0xxx: Pointer, xxx=TypeTag
>>    2xxx: Bounded Array, xxx=size (0..4095)
>>    Cxxx: Displaced Array, xxx=Offset (0..4095)
>>
>>
>> Similar could be done with XMOV pointers, just with the bounds-check
>> expanded to 28 bits:
>>    (127:112): Bounds(27:12)
>>    (111: 64): Address(95:48)
>>    ( 63: 60): Tag (0010)
>>    ( 59: 48): Bounds(11:0)
>>    ( 47:  0): Address(47:0)
>>
>> It is possible that, if it really mattered, a case for XMOV
>> instructions with integrated bounds-checking could be added (to avoid
>> needing to spend a few extra clock cycles on external bounds checks).
>>
>>
>>
>> Using out-of-band tag bits to make registers and memory "tamper
>> resistant" is an interesting idea. There are a few places where this
>> could be useful.
>>
>> Main issue is how to do this cost-effectively.
>>    Option 1: Another specialized L1 cache (undesirable);
>>    Option 2: Make cache lines several bits larger.
>>      L2 now needs to deal with it, somehow...
>>
>> Most ideas I can think of would either require another specialized
>> cache, or effectively result in non-power-of-2 memory addressing, or
>> both.
>>
>>
>> Any thoughts?...
>
> Many thoughts.
>
> I'm a great fan of caps systems; I wish Mill could have been a caps
> architecture.
>
> The problem is that a move to 128 bit pointers breaks existing codes
> that assume what the pointer size is, due to externally defined
> interfaces such as data layouts. The market was willing to tolerate such
> breaks in the moves 16->32->64 bit addresses because the older, smaller
> sizes became hopelessly unusable. But move to 128? No - people will say
> "My code runs fine with 64-bit pointers. These caps thingummies claim to
> offer a bunch of new benefits, but I don't yet have to have them, so
> I'll look into it - someday."
>

Similar issue.

For BJX2, the default pointer size remains at 64 bits with a 48-bit
logical address (luckily the VUGID/ACL checks don't care about this part).

There is a 96-bit address space via XMOV and "int __huge *addr;" and
similar, but I don't expect this would see much beyond niche usage.

Though, XMOV does have the niche use-case of being able to fake separate
virtual address spaces within a single larger virtual address space.

Say, spawn a thread in a different quadrant, and it may as well be in
its own 48-bit address space. Short of it being able to decipher the
ASLR addresses, it has little chance of being able to figure out the
addresses of other stuff within the larger shared virtual address space.

It will then use 48-bit addresses via 64-bit pointers within its own
quadrant, and can remain unaware of what might exist within the other
quadrants.

Though, there is a potential risk here if the OS kernel is always in, say:
0000_00000000:8000_xxxxxxxx

Or, the root process always in "quadrant zero", ...

This would be somewhat easy to guess, so ideally the OS kernel and
similar would need to itself be ASLR'ed (in addition to protection ring
and VUGID checks and similar).

So, at least, the bigger pointers in my case can give a bigger address
space (unlike Morello / CHERI).

> A secondary issue is that there's an enormous mass of unstructured
> monolithic code out there that would need major redesign to use caps
> for anything more than bounds checking. Bounds checking is the least of
> the gain from caps. But the legacy data format issue is why we went with
> a grant system. It helps that we can do bounds checking without needing
> expanded data.
>

Yeah.

The need for significant redesign is a big concern of mine regarding
capability addressing.

I suspect this is a major advantage of my VUGID/ACL system: it is pretty
much invisible to the application...

You could also bolt it onto ARM, or x64, or RISC-V, and pretty much
nothing would need to notice that anything was different.

Well, apart from the ability to now impose additional security checks on
memory objects within a process, or to spawn threads under more
restrictive keyrings, ...


Re: Idle: Capability Addressing, Future or Boondoggle

<st020a$nce$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23189&group=comp.arch#23189
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: sfu...@alumni.cmu.edu.invalid (Stephen Fuld)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 22:20:55 -0800
Organization: A noiseless patient Spider
Lines: 20
Message-ID: <st020a$nce$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
<ssvkuk$m3k$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 28 Jan 2022 06:20:58 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="76e4e5052de439f53379af31db2a6fa2";
logging-data="23950"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19ChZl4+jMkVoZiCvRLuKrR5nj+umEg2hU="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:mFA4Pa6+Yt/ObUFJUr8yGqTuNaU=
In-Reply-To: <ssvkuk$m3k$1@dont-email.me>
Content-Language: en-US
 by: Stephen Fuld - Fri, 28 Jan 2022 06:20 UTC

On 1/27/2022 6:38 PM, BGB wrote:

snip

> Most historical machines which have tried using capability addressing
> have either died off, or resulted in machines where their later
> descendants abandoned the use of capabilities (eg: System/38 was
> replaced by systems built on the Power ISA, ...).

I thought that when IBM switched from a proprietary processor to a Power
for the AS/400, they had a custom variant chip that had 65 bits and
used that extra bit for the protection scheme.

But I may be misremembering. :-(

--
- Stephen Fuld
(e-mail address disguised to prevent spam)

Re: Idle: Capability Addressing, Future or Boondoggle

<st04b4$1nv$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23190&group=comp.arch#23190
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Thu, 27 Jan 2022 23:00:53 -0800
Organization: A noiseless patient Spider
Lines: 22
Message-ID: <st04b4$1nv$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
<ssvkuk$m3k$1@dont-email.me> <st020a$nce$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 28 Jan 2022 07:00:52 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="35da5404627e0cb291966648ee31e44c";
logging-data="1791"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+8y1N9ZGdNoZGx+70v5WZ0"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:Du1ePonqFkjJXWTys1fsHOw5IYU=
In-Reply-To: <st020a$nce$1@dont-email.me>
Content-Language: en-US
 by: Ivan Godard - Fri, 28 Jan 2022 07:00 UTC

On 1/27/2022 10:20 PM, Stephen Fuld wrote:
> On 1/27/2022 6:38 PM, BGB wrote:
>
> snip
>
>> Most historical machines which have tried using capability addressing
>> have either died off, or resulted in machines where their later
>> descendants abandoned the use of capabilities (eg: System/38 was
>> replaced by systems built on the Power ISA, ...).
>
> I thought that when IBM switched from a proprietary processor to a Power
>  for the AS/400, they had a custom variant chip that had 65 bits and
> used that extra bit for the protection scheme.
>
> But I may be misremembering.  :-(
>
>
>

AS/400 apps continued to run as before, with caps; Power is in effect
the micro-engine. Same as the Unisys systems, where (I think still) the
micro-engine is x86, but the apps still see the original ISA.

Re: Idle: Capability Addressing, Future or Boondoggle

<st0gl0$1oca$1@gioia.aioe.org>

https://www.novabbs.com/devel/article-flat.php?id=23191&group=comp.arch#23191
Path: i2pn2.org!i2pn.org!aioe.org!rd9pRsUZyxkRLAEK7e/Uzw.user.46.165.242.91.POSTED!not-for-mail
From: terje.ma...@tmsw.no (Terje Mathisen)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Fri, 28 Jan 2022 11:31:04 +0100
Organization: Aioe.org NNTP Server
Message-ID: <st0gl0$1oca$1@gioia.aioe.org>
References: <ssv7ff$9n3$1@dont-email.me>
<jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org> <ssves4$km1$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="57738"; posting-host="rd9pRsUZyxkRLAEK7e/Uzw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101
Firefox/68.0 SeaMonkey/2.53.10.2
X-Notice: Filtered by postfilter v. 0.9.2
 by: Terje Mathisen - Fri, 28 Jan 2022 10:31 UTC

Ivan Godard wrote:
> On 1/27/2022 4:23 PM, Stefan Monnier wrote:
>>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
>>> Morello
>>> architecture, which is effectively a modified Aarch64, except:
>>> Expands GPRs to 129 bits (internally);
>>> Integer ISA remains 64-bit;
>>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
>>> addresses (the rest of the bits are used as protection flags and bounds
>>> checks);
>>> In effect, all pointers expand to 128 bits in memory;
>>> Code isn't allowed to craft its own pointers (directly) because doing so
>>> would break its memory protection scheme (the 129th bit is stored
>>> separately, seemingly to track which memory locations contain
>>> pointers, and
>>> to disallow loading a pointer from memory which has been used for data);
>>
>> AFAIK the main issue with such things is what they do about dangling
>> pointers, i.e. dangling capabilities.  Do they rely on a GC that's part
>> of the "trusted runtime"?  or do they disallow deallocation altogether?
>> or do they rely on address-space randomization to make such dangling
>> capabilities "harmless" (they'll hopefully only make you crash but are
>> harder to exploit)?
>>
>> The designs that keep the actual capabilities in a kind of
>> separate/secured table (so the untrusted code only handles references to
>> these capabilities and can thus do anything it wants within its sandbox
>> without needing any special XX=1bit values) have a much easier time
>> since they can much more easily ensure the absence of
>> references/capabilities before deallocating resources, or they can mark
>> them as dead).
>>
>
> Isn't it Terje who says anything can be done with another layer of
> indirection?

I do say that but it was old knowledge long before I first quoted it.
>
> Revoking permissions (and scavenging the permitted resources) is a very
> hard problem in any permission system, including caps - and paging
> tables too, for that matter. These days narrow-cast caps (a.k.a. handles)
> typically use use-count GC, which in practice works OK for well-behaved
> users. Let a DoS attacker onto your system, though...

Handles do work; it can probably be done well within a binary order of
magnitude without opening up lots of DoS opportunities.

OTOH, we have lots of examples, typically related to IO (mmap, direct
IO, remote DMA etc) where performance requirements trump almost
everything else even though we have had handle-based file/stream apis
since forever.

Terje

--
- <Terje.Mathisen at tmsw.no>
"almost all programming can be viewed as an exercise in caching"

Re: Idle: Capability Addressing, Future or Boondoggle

<BBe*1HpFy@news.chiark.greenend.org.uk>

https://www.novabbs.com/devel/article-flat.php?id=23192&group=comp.arch#23192
Path: i2pn2.org!i2pn.org!aioe.org!nntp.terraraq.uk!nntp-feed.chiark.greenend.org.uk!ewrotcd!.POSTED!not-for-mail
From: theom+n...@chiark.greenend.org.uk (Theo Markettos)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: 28 Jan 2022 12:30:15 +0000 (GMT)
Organization: University of Cambridge, England
Lines: 175
Message-ID: <BBe*1HpFy@news.chiark.greenend.org.uk>
References: <ssv7ff$9n3$1@dont-email.me>
NNTP-Posting-Host: chiark.greenend.org.uk
X-Trace: chiark.greenend.org.uk 1643373017 11667 212.13.197.229 (28 Jan 2022 12:30:17 GMT)
X-Complaints-To: abuse@chiark.greenend.org.uk
NNTP-Posting-Date: Fri, 28 Jan 2022 12:30:17 +0000 (UTC)
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/3.16.0-11-amd64 (x86_64))
Originator: theom@chiark.greenend.org.uk ([212.13.197.229])
 by: Theo Markettos - Fri, 28 Jan 2022 12:30 UTC

BGB <cr88192@gmail.com> wrote:
> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
> Morello architecture, which is effectively a modified Aarch64, except:
> Expands GPRs to 129 bits (internally);
> Integer ISA remains 64-bit;
> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
> addresses (the rest of the bits are used as protection flags and bounds
> checks);
> In effect, all pointers expand to 128 bits in memory;
> Code isn't allowed to craft its own pointers (directly) because doing so
> would break its memory protection scheme (the 129th bit is stored
> separately, seemingly to track which memory locations contain pointers,
> and to disallow loading a pointer from memory which has been used for data);
> ...
>
> From the original thread:
> https://www.theregister.com/2022/01/21/arm_morello_testing/
> https://www.arm.com/architecture/cpu/morello
> https://developer.arm.com/documentation/ddi0606/latest
> Also found:
>
> https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst

Declaration of interest: I'm on the CHERI team at the University of
Cambridge, although speaking personally.

Morello is an implementation of the CHERI capability architecture. CHERI is
architecture-neutral, but every implementation needs a certain
amount of localisation for the target architecture. Morello is an
experimental localisation for ARMv8, and a specific implementation in a chip
to evaluate whether that's any good, in a modern CPU.

More CHERI background:
https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-941.pdf
https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf
https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-951.pdf

Therefore, in most cases the discussion is about CHERI, as an abstract
model, rather than Morello, which is an implementation of the model. There
will of course be details about the specific implementation in ARMv8, and
that's one thing the Morello programme aims to find out.

> In effect, it seems like the program can't be allowed to craft or modify
> its own pointers directly, as code doing so would break its whole memory
> protection scheme. In effect, pointers would need to be derived from
> other pointers.
>
>
> I guess it works, but I am slightly skeptical of this idea:
> Making all the pointers twice the size is not ideal for memory footprint;
> This would have a potentially significant impact on existing codebases:
> They can no longer assume sizeof(long)==sizeof(void *);
> Code can no longer bit-twiddle their pointers;
> ...
>
> While there are ways one would allow for 64-bit pointers, there isn't a
> good way to do so without (also) blowing a massive hole in the whole
> security scheme.

First thing to say is that CHERI is optional. You can still use 64 bit
pointers in the usual load/store instructions, and have them checked against
a 'default data capability' in a coarse grained way. In this 'hybrid' mode
you can then choose particular pointers to be capabilities (via source
annotations). That doesn't get you quite so much fine grained protection by
default, but it means you can pay the cost when you want to and not when you
don't.
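
As a rough illustration of the coarse-grained check described above, here is
a plain-C software model. It is only a sketch under stated assumptions: the
struct cap_model, the function ddc_allows() and the permission names are
hypothetical and are not the CHERI or Morello encoding. Real hardware would
apply a check of this kind to each ordinary 64-bit load or store.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits, for illustration only. */
enum { PERM_LOAD = 1u, PERM_STORE = 2u };

/* Hypothetical model of a capability: base, length and permissions.
 * This is NOT the real CHERI/Morello bit layout. */
struct cap_model {
    uint64_t base;
    uint64_t length;
    uint32_t perms;
};

/* Model of a hybrid-mode access: a plain 64-bit address is checked
 * against a single "default data capability" (ddc). */
static bool ddc_allows(const struct cap_model *ddc, uint64_t addr,
                       uint64_t size, uint32_t need)
{
    assert(ddc != NULL);
    if ((ddc->perms & need) != need)
        return false;                 /* missing permission */
    if (addr < ddc->base)
        return false;                 /* below the region */
    uint64_t offset = addr - ddc->base;
    /* Written this way to avoid overflow in offset + size. */
    return size <= ddc->length && offset <= ddc->length - size;
}
```

The point of the model is that one region covers all legacy pointers, so an
overflow from one object into another inside that region is not caught. Only
pointers that are explicitly made capabilities get their own bounds.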

In many cases, code is not massively pointer heavy and so the cost is
relatively small. Some code uses pointers more densely (eg Javascript
runtimes) and that might need more careful design (eg use of hybrid mode).
But for a lot of C you can just compile with every pointer being a capability,
and the cost is in the single-digit percent range.

> It effectively means that any code which relies on NaN tagging, tagged
> pointers, or other sorts of pointer hackery, would need to be rewritten to
> work on such an architecture.

True. It is questionable whether some of those are good ideas, however...
(it seems to be a truth universally acknowledged that, whenever a system is
designed, somebody will come up with cunning wheezes that use it in ways it
was never designed for. Which is fine, until the next system comes along
that doesn't work the same way)

In general the code changes are relatively limited, and often mechanical.
For example in 6 million LOC of KDE, 0.026% needed modification:
https://www.capabilitieslimited.co.uk/pdfs/20210917-capltd-cheri-desktop-report-version1-FINAL.pdf

> I slightly prefer my "throw Unix-style file protections and ACLs at RAM"
> approach, since this need not have any visible impact on how C operates
> (pointers can remain 64 bits, code is free to bit-twiddle its pointers,
> ...).

How would that work within a typical C program? Are you checking those
things with software, or with hardware?

There have been a number of approaches which involve keeping shadow state
out of normal RAM, eg Intel MPX. That's fine if you have a relatively small
amount of shadow state, but doesn't scale very well.

> Some properties of capabilities could be emulated (albeit imperfectly)
> by using more conventional ASLR strategies (would be more effective in
> an expanded 96-bit address space, as this could make it computationally
> infeasible to "guess" the addresses to things).

The problem with ASLR is that it only holds until you have sight of a
pointer. Once you have a pointer, you can trivially compute the ASLR slide.
You can mitigate to some extent by sparse linking, putting each object at a
different ASLR offset, but that loses some of your spatial locality
(thrashing your TLB etc).
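
To make the "compute the ASLR slide" step concrete, here is a minimal sketch.
The offsets are made-up values and the function names are hypothetical. It
assumes the simple case where a whole image is moved by one slide and the
attacker knows the link-time offset of the symbol whose address leaked.

```c
#include <assert.h>
#include <stdint.h>

/* Link-time (unslid) offsets of two symbols in the same image.
 * The values are invented for illustration. */
#define OFF_LEAKED_FN  0x0000000000001a40ULL
#define OFF_TARGET_FN  0x0000000000007c10ULL

/* Recover the slide from one leaked runtime address whose
 * link-time offset is known. */
static uint64_t aslr_slide(uint64_t leaked_runtime, uint64_t link_offset)
{
    assert(leaked_runtime >= link_offset);
    return leaked_runtime - link_offset;
}

/* With the slide known, every other symbol in the image follows. */
static uint64_t runtime_address(uint64_t slide, uint64_t link_offset)
{
    return slide + link_offset;
}
```

One subtraction and one addition are all it takes, which is why a single
leaked pointer defeats a per-image slide.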

> Though, admittedly, if I were designing it now, I would likely have
> skipped over the UID/GID part of VUGID and gone for using ACL checks
> exclusively (the ACL check mechanism is more "general purpose" and also
> takes up fewer bits in the TLB and page-table entries due to not needing
> to store the access-mode bits directly in the TLBE/PTE).
>
>
> In other cases, the protections from "true" capabilities become
> unnecessary, since an MMU ACL check does not need to care how one
> arrived at an address, merely whether or not the executing code is
> allowed to access it (also cheaper to implement as well, at least with
> the ACL miss logic offloaded to an ISR).

I'm not sure I quite follow. What extra state are you adding to implement
the ACL? In the page table, or elsewhere?

We already have R/W/X permissions of course, are you adding a generic 'ACL
ID' which is then looked up in a shadow list? How do you represent who is
allowed and who is not in the list?

> Within 96-bit land, it could make sense to enforce a minimum distance of
> 32GB between mmaps, as this would in effect make it "technically
> impossible" for an out-of-bounds access on one object to collide with
> any other object.

What's special about 32GB? Why can an attacker not craft a 32GB increase?
(I accept it's harder to do this accidentally)

> Similar could be done with XMOV pointers, just with the bounds-check
> expanded to 28 bits:
> (127:112): Bounds(27:12)
> (111: 64): Address(95:48)
> ( 63: 60): Tag (0010)
> ( 59: 48): Bounds(11:0)
> ( 47: 0): Address(47:0)
>
> It is possible that, if it really mattered, a case for XMOV instructions
> with integrated bounds-checking could be added (to avoid needing to
> spend a few extra clock cycles on external bounds checks).

I'm not familiar with XMOV, but how are you enforcing pointer integrity
here? If I can manipulate pointers, what's stopping me crafting one with
whatever bounds I want?
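
For reference, the quoted 128-bit layout can be written out as a pack/unpack
sketch. The struct and function names are hypothetical, the pointer is
modelled as two 64-bit halves, and the Bounds field is treated as an opaque
28-bit value because the post does not say how it is interpreted.

```c
#include <assert.h>
#include <stdint.h>

#define MASK48 0xFFFFFFFFFFFFULL

/* hi holds bits 127:64, lo holds bits 63:0 of the quoted layout. */
struct xmov_ptr { uint64_t hi, lo; };

/* (127:112) Bounds(27:12)   (111:64) Address(95:48)
 * ( 63: 60) Tag = 0010      ( 59:48) Bounds(11:0)
 * ( 47:  0) Address(47:0) */
static struct xmov_ptr xmov_pack(uint64_t addr_hi48, uint64_t addr_lo48,
                                 uint32_t bounds28)
{
    assert(addr_hi48 <= MASK48 && addr_lo48 <= MASK48);
    assert(bounds28 < (1u << 28));
    struct xmov_ptr p;
    p.hi = ((uint64_t)(bounds28 >> 12) << 48) | addr_hi48;
    p.lo = (0x2ULL << 60) | ((uint64_t)(bounds28 & 0xFFFu) << 48) | addr_lo48;
    return p;
}

/* Reassemble the 28-bit bounds field from its two pieces. */
static uint32_t xmov_bounds(struct xmov_ptr p)
{
    return (uint32_t)(((p.hi >> 48) << 12) | ((p.lo >> 48) & 0xFFFu));
}

/* Extract the 4-bit in-band tag from bits 63:60. */
static unsigned xmov_tag(struct xmov_ptr p)
{
    return (unsigned)(p.lo >> 60);
}
```

Note that xmov_pack() is ordinary integer arithmetic. In this model nothing
prevents software from choosing any bounds value it likes, which is the
integrity question asked above.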

> Using out-of-band tag bits to make registers and memory "tamper
> resistant" is an interesting idea. There are a few places where this
> could be useful.
>
> Main issue is how to do this cost-effectively.
> Option 1: Another specialized L1 cache (undesirable);
> Option 2: Make cache lines several bits larger.
> L2 now needs to deal with it, somehow...
>
> Most ideas I can think of would either require another specialized
> cache, or effectively result in non-power-of-2 memory addressing, or both.

It should be said that CHERI is a point in the space of a tagged memory
architecture. There are a number of others out there, using more tag bits
(up to 64 bits per word). CHERI opts to keep more in-band and only the
single out-of-band tag bit to maintain the in-band integrity. This aims to
keep the costs minimal if you don't use the features. If you're
prepared to pay a higher cost per memory word (more than 1/128 memory
overhead) then there are other things you can do.
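
The 1/128 figure follows from one tag bit per 128-bit granule. A small
sketch of the arithmetic, with a hypothetical helper name:

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of tag storage needed for mem_bytes of tagged memory,
 * assuming one tag bit per 128-bit (16-byte) granule. */
static uint64_t tag_bytes(uint64_t mem_bytes)
{
    assert(mem_bytes % 16 == 0);
    uint64_t tag_bits = mem_bytes / 16;   /* one bit per granule */
    return (tag_bits + 7) / 8;            /* round up to whole bytes */
}
```

For example, 8 GiB of memory needs 64 MiB of tag bits under this assumption.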

Theo

Re: Idle: Capability Addressing, Future or Boondoggle

<BBe*dKpFy@news.chiark.greenend.org.uk>

https://www.novabbs.com/devel/article-flat.php?id=23193&group=comp.arch#23193

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!news.szaf.org!nntp-feed.chiark.greenend.org.uk!ewrotcd!.POSTED!not-for-mail
From: theom+n...@chiark.greenend.org.uk (Theo Markettos)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: 28 Jan 2022 12:39:43 +0000 (GMT)
Organization: University of Cambridge, England
Lines: 28
Message-ID: <BBe*dKpFy@news.chiark.greenend.org.uk>
References: <ssv7ff$9n3$1@dont-email.me> <jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
NNTP-Posting-Host: chiark.greenend.org.uk
X-Trace: chiark.greenend.org.uk 1643373585 11667 212.13.197.229 (28 Jan 2022 12:39:45 GMT)
X-Complaints-To: abuse@chiark.greenend.org.uk
NNTP-Posting-Date: Fri, 28 Jan 2022 12:39:45 +0000 (UTC)
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/3.16.0-11-amd64 (x86_64))
Originator: theom@chiark.greenend.org.uk ([212.13.197.229])
 by: Theo Markettos - Fri, 28 Jan 2022 12:39 UTC

Stefan Monnier <monnier@iro.umontreal.ca> wrote:
> AFAIK the main issue with such things is what they do about dangling
> pointers, i.e. dangling capabilities. Do they rely on a GC that's part
> of the "trusted runtime"? or do they disallow deallocation altogether?
> or do they rely on address-space randomization to make such dangling
> capabilities "harmless" (they'll hopefully only make you crash but are
> harder to exploit)?

There are several approaches. One is a GC-like sweep to zap dangling
capabilities. Due to the capability tags we know where the capabilities
are, and which pages are allowed to hold capabilities, and we can skip the
others. That makes it a lot more efficient:

https://www.cl.cam.ac.uk/research/security/ctsrd/pdfs/2020oakland-cornucopia.pdf
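
A toy model of such a sweep, to show why the tags help. This is a sketch
only, not the algorithm from the paper linked above: struct slot and
revoke_sweep() are hypothetical names, memory is modelled as an array of
capability-sized slots, and a capability counts as dangling when its base
lies in the freed range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one capability-sized memory slot. The
 * out-of-band tag says whether the slot holds a valid capability. */
struct slot {
    bool     tag;
    uint64_t base;     /* start of the region the capability grants */
    uint64_t length;
};

/* Clear the tag of every capability whose base lies in the freed
 * range [lo, hi). Untagged slots are plain data and are skipped.
 * Returns the number of capabilities revoked. */
static size_t revoke_sweep(struct slot *mem, size_t n,
                           uint64_t lo, uint64_t hi)
{
    assert(lo <= hi);
    size_t revoked = 0;
    for (size_t i = 0; i < n; i++) {
        if (!mem[i].tag)
            continue;                 /* not a capability: ignore */
        if (mem[i].base >= lo && mem[i].base < hi) {
            mem[i].tag = false;       /* now not-a-capability */
            revoked++;
        }
    }
    return revoked;
}
```

Because the tag identifies capabilities exactly, the sweep never has to
guess whether a 64-bit pattern is a pointer, unlike a conservative GC.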

> The designs that keep the actual capabilities in a kind of
> separate/secured table (so the untrusted code only handles references to
> these capabilities and can thus do anything it wants within its sandbox
> without needing any special XX=1bit values) have a much easier time
> since they can much more easily ensure the absence of
> references/capabilities before deallocating resources, or they can mark
> them as dead).

Simply zapping a tag is enough to turn a capability into not-a-capability.
So ensuring the absence of capabilities is easy; making sure you keep the
ones that you're still supposed to have is where the work comes in.

Theo

Re: Idle: Capability Addressing, Future or Boondoggle

<BBe*LNpFy@news.chiark.greenend.org.uk>

https://www.novabbs.com/devel/article-flat.php?id=23194&group=comp.arch#23194

Path: i2pn2.org!i2pn.org!news.niel.me!nntp.terraraq.uk!nntp-feed.chiark.greenend.org.uk!ewrotcd!.POSTED!not-for-mail
From: theom+n...@chiark.greenend.org.uk (Theo Markettos)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: 28 Jan 2022 12:54:47 +0000 (GMT)
Organization: University of Cambridge, England
Lines: 50
Message-ID: <BBe*LNpFy@news.chiark.greenend.org.uk>
References: <ssv7ff$9n3$1@dont-email.me> <ssve9b$hf8$1@dont-email.me>
NNTP-Posting-Host: chiark.greenend.org.uk
X-Trace: chiark.greenend.org.uk 1643374489 31218 212.13.197.229 (28 Jan 2022 12:54:49 GMT)
X-Complaints-To: abuse@chiark.greenend.org.uk
NNTP-Posting-Date: Fri, 28 Jan 2022 12:54:49 +0000 (UTC)
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/3.16.0-11-amd64 (x86_64))
Originator: theom@chiark.greenend.org.uk ([212.13.197.229])
 by: Theo Markettos - Fri, 28 Jan 2022 12:54 UTC

Ivan Godard <ivan@millcomputing.com> wrote:
> The problem is that a move to 128 bit pointers breaks existing codes
> that assume what the pointer size is, due to externally defined
> interfaces such as data layouts. The market was willing to tolerate such
> breaks in the moves 16->32->64 bit addresses because the older, smaller
> sizes became hopelessly unusable. But move to 128? No - people will say
> "My code runs fine with 64-bit pointers. These caps thingummies claim to
> offer a bunch of new benefits, but I don't yet have to have them, so
> I'll look into it - someday."

That turns out to be less of a problem than you might think. For a lot of the codebases
we've looked at, they've already gone through the 32->64 transition in the
recent (well, 15 years) past, and a lot of that cleanup already happened. A
lot of code still runs on both 32 and 64 (eg i386 and x86-64) and the
necessary changes made it pointer length agnostic.

What's more troublesome is people stuffing pointers inside integers -
uint64_t foo = (uint64_t) &bar;
but that's what intptr_t exists for, and it's not a lot of work to make it
this:
intptr_t foo = (intptr_t) &bar;
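
A slightly fuller sketch of the same idea, with a hypothetical helper name.
My understanding is that in CHERI C's pure-capability mode intptr_t is
capability-sized and keeps the pointer's metadata, whereas uint64_t holds
only the address, but treat that as an assumption to check against the
CHERI C documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Round-trip a pointer through intptr_t, the integer type meant for
 * holding pointers. The casts go via void * because that is the
 * conversion the C standard guarantees to round-trip. */
static int *round_trip(int *p)
{
    /* Portability assumption behind the cast. */
    assert(sizeof(intptr_t) >= sizeof(void *));
    intptr_t as_int = (intptr_t)(void *)p;   /* pointer -> integer */
    return (int *)(void *)as_int;            /* integer -> same pointer */
}
```

The same source compiles unchanged on 32-bit and 64-bit targets, since the
code never names a fixed integer width.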

If that's something you're stuffing in a file or putting down a network
socket you're doing it wrong: pointers are not supposed to leak from address
spaces, and with the rise of ASLR it's not very meaningful to do this.

(and of course the same applies to the 32/64 transition)

It's not always plain sailing of course, and there are some things that are
particularly troublesome (eg language runtimes) but much application code
isn't so tricky.

We're mostly looking at open source code, though, which tends to undergo
some kind of modernisation as development proceeds. It's possible there are
some antique closed-source codebases out there that are still targeting 32
bit machines, but they're going to suffer other problems in the modern
world.

> A secondary issue is that there's an enormous mass of unstructured
> monolithic code out there, that would need major redesign to use caps
> for anything more than bounds checking. Bounds checking is the least of
> the gain from caps. But the legacy data format issue is why we went with
> a grant system. It helps that we can do bounds checking without needing
> expanded data.

Compartmentalisation, ie how to chop things up into distrusting pieces, is
indeed a much harder problem and one where there isn't one true answer. The
plan is to give people the tools and see what they can do with them.

Theo

Re: Idle: Capability Addressing, Future or Boondoggle

<st1hj5$adl$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23199&group=comp.arch#23199

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Fri, 28 Jan 2022 11:53:11 -0800
Organization: A noiseless patient Spider
Lines: 20
Message-ID: <st1hj5$adl$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
<BBe*dKpFy@news.chiark.greenend.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 28 Jan 2022 19:53:10 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="35da5404627e0cb291966648ee31e44c";
logging-data="10677"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+acG/OhS3OY/CYW5rkbOQC"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:/bL6ScDZKjCwUlm46WLettLLL4M=
In-Reply-To: <BBe*dKpFy@news.chiark.greenend.org.uk>
Content-Language: en-US
 by: Ivan Godard - Fri, 28 Jan 2022 19:53 UTC

On 1/28/2022 4:39 AM, Theo Markettos wrote:
> Stefan Monnier <monnier@iro.umontreal.ca> wrote:
>> AFAIK the main issue with such things is what they do about dangling
>> pointers, i.e. dangling capabilities. Do they rely on a GC that's part
>> of the "trusted runtime"? or do they disallow deallocation altogether?
>> or do they rely on address-space randomization to make such dangling
>> capabilities "harmless" (they'll hopefully only make you crash but are
>> harder to exploit)?
>
> There are several approaches. One is a GC-like sweep to zap dangling
> capabilities. Due to the capability tags we know where the capabilities
> are, and which pages are allowed to hold capabilities, and we can skip the
> others. That makes it a lot more efficient:

That's why the tagged-architecture Burroughs mainframes had the MSEQ
(Masked Search for Equal) instruction. The mask and comparison were full
width including the tag, so you could pick up any word (51 bits counting
3 bits tag) with a non-data tag. The search engine could run at RAM
speed (ferrite core memory, not DRAM).

Re: Idle: Capability Addressing, Future or Boondoggle

<st1mii$ece$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23201&group=comp.arch#23201

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Fri, 28 Jan 2022 15:18:08 -0600
Organization: A noiseless patient Spider
Lines: 511
Message-ID: <st1mii$ece$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<BBe*1HpFy@news.chiark.greenend.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 28 Jan 2022 21:18:10 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="173bd9d21c5738ccf991e2b2f387919d";
logging-data="14734"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19NzDQ31T++wIJmQdFOAqSH"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:QK1w7HWZH/hwf0kuCqfdPgDohQ4=
In-Reply-To: <BBe*1HpFy@news.chiark.greenend.org.uk>
Content-Language: en-US
 by: BGB - Fri, 28 Jan 2022 21:18 UTC

On 1/28/2022 6:30 AM, Theo Markettos wrote:
> BGB <cr88192@gmail.com> wrote:
>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
>> Morello architecture, which is effectively a modified Aarch64, except:
>> Expands GPRs to 129 bits (internally);
>> Integer ISA remains 64-bit;
>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
>> addresses (the rest of the bits are used as protection flags and bounds
>> checks);
>> In effect, all pointers expand to 128 bits in memory;
>> Code isn't allowed to craft its own pointers (directly) because doing so
>> would break its memory protection scheme (the 129th bit is stored
>> separately, seemingly to track which memory locations contain pointers,
>> and to disallow loading a pointer from memory which has been used for data);
>> ...
>>
>> From the original thread:
>> https://www.theregister.com/2022/01/21/arm_morello_testing/
>> https://www.arm.com/architecture/cpu/morello
>> https://developer.arm.com/documentation/ddi0606/latest
>> Also found:
>>
>> https://github.com/ARM-software/abi-aa/blob/main/aaelf64-morello/aaelf64-morello.rst
>
> Declaration of interest: I'm on the CHERI team at the University of
> Cambridge, although speaking personally.
>
> Morello is an implementation of the CHERI capability architecture. CHERI is
> architecture-neutral, but every implementation needs a certain
> amount of localisation for the target architecture. Morello is an
> experimental localisation for ARMv8, and a specific implementation in a chip
> to evaluate whether that's any good, in a modern CPU.
>
> More CHERI background:
> https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-941.pdf
> https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf
> https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-951.pdf
>
> Therefore, in most cases the discussion is about CHERI, as an abstract
> model, rather than Morello, which is an implementation of the model. There
> will of course be details about the specific implementation in ARMv8, and
> that's one thing the Morello programme aims to find out.
>

OK.

>> In effect, it seems like the program can't be allowed to craft or modify
>> its own pointers directly, as code doing so would break its whole memory
>> protection scheme. In effect, pointers would need to be derived from
>> other pointers.
>>
>>
>> I guess it works, but I am slightly skeptical of this idea:
>> Making all the pointers twice the size is not ideal for memory footprint;
>> This would have a potentially significant impact on existing codebases:
>> They can no longer assume sizeof(long)==sizeof(void *);
>> Code can no longer bit-twiddle their pointers;
>> ...
>>
>> While there are ways one would allow for 64-bit pointers, there isn't a
>> good way to do so without (also) blowing a massive hole in the whole
>> security scheme.
>
> First thing to say is that CHERI is optional. You can still use 64 bit
> pointers in the usual load/store instructions, and have them checked against
> a 'default data capability' in a coarse grained way. In this 'hybrid' mode
> you can then choose particular pointers to be capabilities (via source
> annotations). That doesn't get you quite so much fine grained protection by
> default, but it means you can pay the cost when you want to and not when you
> don't.
>

Admittedly, my skimming of the specs was fairly quick and dirty (mostly
triggered by a thread on comp.lang.c), so I may have missed a few things.

> In many cases, code is not massively pointer heavy and so the cost is
> relatively small. Some code uses pointers more densely (eg Javascript
> runtimes) and that might need more careful design (eg use of hybrid mode).
> But for a lot of C you can just compile with every pointer being a capability,
> and the cost is in the single-digit percent range.
>

Yeah, it is mostly things like JavaScript, LuaJit, etc, where I expect
that many of the issues would come up.

While I could note some problem areas in the Quake engine family, my
ports of these to 64-bits already hacked over a few cases in a way which
"should" tolerate further widening.

Namely, there were a few places where 32-bit addresses were used without
any good way to widen things to 64 bits and not break things (such as
the "progs.dat" VM in Quake 1), I had instead switched to relative
addressing (in Quake 1, treating these addresses as being relative to
the "hunk base", as I had noted that they tended not to point to
anything outside of this area).

It had seemed like one could construct capabilities to things using a
collection of "global capabilities" as a base, but the existence of such
things would not be a good thing for the security model.

>> It effectively means that any code which relies on NaN tagging, tagged
>> pointers, or other sorts of pointer hackery, would need to be rewritten to
>> work on such an architecture.
>
> True. It is questionable whether some of those are good ideas, however...
> (it seems to be a truth universally acknowledged that, whenever a system is
> designed, somebody will come up with cunning wheezes that use it in ways it
> was never designed for. Which is fine, until the next system comes along
> that doesn't work the same way)
>
> In general the code changes are relatively limited, and often mechanical.
> For example in 6 million LOC of KDE, 0.026% needed modification:
> https://www.capabilitieslimited.co.uk/pdfs/20210917-capltd-cheri-desktop-report-version1-FINAL.pdf
>

OK.

>> I slightly prefer my "throw Unix-style file protections and ACLs at RAM"
>> approach, since this need not have any visible impact on how C operates
>> (pointers can remain 64 bits, code is free to bit-twiddle its pointers,
>> ...).
>
> How would that work within a typical C program? Are you checking those
> things with software, or with hardware?
>

The actual ACL enforcement is done in hardware via the MMU.

From the software POV, this part is pretty much exactly the same as
a more traditional "protection ring" MMU.

The main difference is that there is a supervisor-only register (KRR, or
Keyring Register), which generally user-land code is not allowed to
access (and which holds their keys within the VUGID/ACL system).

Basically, the MMU contains several structures:
A TLB, implemented as a larger 4-way set-associative cache.
It holds a mapping between virtual and physical addresses.
It also contains information about memory protection, ...
An ACL cache, which is a much smaller fully-associative cache.
It is currently 4-entry, but 8-entry would likely be better.
It encodes, for a pair of IDs A accessing B, what is allowed.

So, the MMU sees that ACL checking is used for a page (during the TLB
fetch), then checks this against the entries in the ACL cache, checking
for a match with one of the keys held in the active KRR (Keyring
Register) which itself holds up to 4 IDs (same format as the ACLID, *).

If a match is found, it selects one of the appropriate groups of
protection bits (organized based on "User/Group/Other").

These bits are combined with the top-level page bits from the TLB to
determine what level of access is allowed for the page as a whole.

*: In the current implementation, ACLID values are treated as an
unsigned 16-bit number, with 0 being special (0 either disables this
check, or is ignored, depending on its location).
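
The lookup described above can be modelled roughly as follows. This is a
sketch under several assumptions that are not stated in the description: the
struct and function names are hypothetical, "combined" is taken to mean a
bitwise AND with the page bits, and the User/Group/Other selection step is
left out, so each cache entry carries a single permission set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { KRR_KEYS = 4, ACL_ENTRIES = 4 };   /* sizes from the description */
enum { ACL_MISS = -1 };
enum { P_R = 4, P_W = 2, P_X = 1 };       /* hypothetical bit values */

/* One ACL cache entry: what key A may do to pages labelled B. */
struct acl_entry {
    uint16_t aclid;   /* ID attached to the page (B) */
    uint16_t key;     /* ID held in the keyring (A) */
    uint8_t  perms;   /* allowed access for this pair */
};

/* Returns the effective permission bits, or ACL_MISS when no cached
 * entry pairs the page's ACLID with a key in the keyring. A miss is
 * where the described hardware would raise an "ACL Miss" interrupt. */
static int acl_check(const uint16_t krr[KRR_KEYS],
                     const struct acl_entry cache[ACL_ENTRIES],
                     uint16_t page_aclid, uint8_t page_perms)
{
    assert(krr != NULL && cache != NULL);
    if (page_aclid == 0)
        return page_perms;            /* ACLID 0 disables the check */
    for (int e = 0; e < ACL_ENTRIES; e++) {
        if (cache[e].aclid != page_aclid)
            continue;
        for (int k = 0; k < KRR_KEYS; k++) {
            /* Key 0 in the keyring is an empty slot and is ignored. */
            if (krr[k] != 0 && krr[k] == cache[e].key)
                return cache[e].perms & page_perms;
        }
    }
    return ACL_MISS;
}
```

With only four cache entries and four keys, the whole comparison is at most
16 ID matches, which fits the description of a small fully-associative
structure.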

The MMU however neither performs its own page walks nor ACL lookups.

If there isn't a matching page, the CPU generates a TLB Miss interrupt,
and the OS or similar will need to walk the page tables or similar to
resolve the issue (however, since this part is software, other
"non-page-table" structures are also possible).

So, software walks the table, builds the TLB entry, and then hands it
back to the MMU via a "LDTLB" instruction. After this point, the ISR
returns, and the CPU (again) tries to access the page in question (this
may or may not generate interrupts, the process will continue until no
more interrupts are generated).

If there is a mismatch with the ACL cache (entries in the keyring don't
match those in the ACL cache), the MMU will generate an "ACL Miss"
interrupt (very similar to the TLB Miss scenario).

The ACLID is looked up in a structure vaguely resembling another page
table, with leaf entries which point to associative arrays holding
mappings between the current ACLID and KRR keys. The entry matching what
is requested is then fetched from the table, and sent back to the MMU
via an "LDACL" instruction, or if no match is seen, the ISR will
generate a "No Access" entry, and load it into the ACL cache via LDACL.


Re: Idle: Capability Addressing, Future or Boondoggle

<yBe*PKrFy@news.chiark.greenend.org.uk>

https://www.novabbs.com/devel/article-flat.php?id=23202&group=comp.arch#23202

Path: i2pn2.org!i2pn.org!aioe.org!nntp.terraraq.uk!nntp-feed.chiark.greenend.org.uk!ewrotcd!.POSTED!not-for-mail
From: theom+n...@chiark.greenend.org.uk (Theo Markettos)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: 28 Jan 2022 21:48:20 +0000 (GMT)
Organization: University of Cambridge, England
Lines: 18
Message-ID: <yBe*PKrFy@news.chiark.greenend.org.uk>
References: <ssv7ff$9n3$1@dont-email.me> <jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org> <BBe*dKpFy@news.chiark.greenend.org.uk> <st1hj5$adl$1@dont-email.me>
NNTP-Posting-Host: chiark.greenend.org.uk
X-Trace: chiark.greenend.org.uk 1643406502 21203 212.13.197.229 (28 Jan 2022 21:48:22 GMT)
X-Complaints-To: abuse@chiark.greenend.org.uk
NNTP-Posting-Date: Fri, 28 Jan 2022 21:48:22 +0000 (UTC)
User-Agent: tin/1.8.3-20070201 ("Scotasay") (UNIX) (Linux/3.16.0-11-amd64 (x86_64))
Originator: theom@chiark.greenend.org.uk ([212.13.197.229])
 by: Theo Markettos - Fri, 28 Jan 2022 21:48 UTC

Ivan Godard <ivan@millcomputing.com> wrote:
> That's why the tagged-architecture Burroughs mainframes had the MSEQ
> (Masked Search for Equal) instruction. The mask and comparison were full
> width including the tag, so you could pick up any word (51 bits counting
> 3 bits tag) with a non-data tag. The search engine could run at RAM
> speed (ferrite core memory, not DRAM).

Another trick is having a hierarchical tag cache. One bit in an upper level
of the cache refers to (for example) an entire page, whose individual tag
bits are stored in the lowest level. The upper level bit being set
indicates that there's at least one capability present in the page. That
makes it very easy to skip pages which don't contain capabilities when
scanning.
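
A two-level sketch of that idea. The names and sizes are hypothetical (256
slots per page corresponds to 4 KiB pages with 16-byte capabilities), and a
real tag cache would pack these as bits in hardware rather than as C bools.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { SLOTS_PER_PAGE = 256, NUM_PAGES = 8 };   /* illustrative sizes */

struct tag_table {
    bool page_has_cap[NUM_PAGES];               /* upper level: per page */
    bool slot_tag[NUM_PAGES][SLOTS_PER_PAGE];   /* lower level: per slot */
};

/* Storing a capability sets both the slot tag and the page summary. */
static void set_tag(struct tag_table *t, size_t page, size_t slot)
{
    assert(page < NUM_PAGES && slot < SLOTS_PER_PAGE);
    t->slot_tag[page][slot] = true;
    t->page_has_cap[page] = true;
}

/* Count tagged slots, looking at lower-level tags only for pages whose
 * summary is set. *pages_scanned reports how many pages were examined
 * in detail. */
static size_t count_caps(const struct tag_table *t, size_t *pages_scanned)
{
    size_t caps = 0;
    *pages_scanned = 0;
    for (size_t p = 0; p < NUM_PAGES; p++) {
        if (!t->page_has_cap[p])
            continue;                 /* whole page skipped */
        (*pages_scanned)++;
        for (size_t s = 0; s < SLOTS_PER_PAGE; s++)
            caps += t->slot_tag[p][s] ? 1 : 0;
    }
    return caps;
}
```

This sketch never clears a summary once set, so it is conservative: a page
may be scanned needlessly, but a capability is never missed.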

It's also possible to mark pages in the page table as to whether
capabilities can be stored there, which also constrains where they can go.

Theo

Re: Idle: Capability Addressing, Future or Boondoggle

<st27mb$gvk$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23205&group=comp.arch#23205

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Fri, 28 Jan 2022 18:10:18 -0800
Organization: A noiseless patient Spider
Lines: 28
Message-ID: <st27mb$gvk$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
<BBe*dKpFy@news.chiark.greenend.org.uk> <st1hj5$adl$1@dont-email.me>
<yBe*PKrFy@news.chiark.greenend.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 29 Jan 2022 02:10:19 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="96dfe7e9118fb69d3a41b408559493db";
logging-data="17396"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18WNarnj4ClqOuxhp5LgqEV"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:fsBrqaW2z8C5kv3LM+Lr+B1lj5M=
In-Reply-To: <yBe*PKrFy@news.chiark.greenend.org.uk>
Content-Language: en-US
 by: Ivan Godard - Sat, 29 Jan 2022 02:10 UTC

On 1/28/2022 1:48 PM, Theo Markettos wrote:
> Ivan Godard <ivan@millcomputing.com> wrote:
>> That's why the tagged-architecture Burroughs mainframes had the MSEQ
>> (Masked Search for Equal) instruction. The mask and comparison were full
>> width including the tag, so you could pick up any word (51 bits counting
>> 3 bits tag) with a non-data tag. The search engine could run at RAM
>> speed (ferrite core memory, not DRAM).
>
> Another trick is having a hierarchical tag cache. One bit in an upper level
> of the cache refers to (for example) an entire page, whose individual tag
> bits are stored in the lowest level. The upper level bit being set
> indicates that there's at least one capability present in the page. That
> makes it very easy to skip pages which don't contain capabilities when
> scanning.
>
> It's also possible to mark pages in the page table as to whether
> capabilities can be stored there, which also constrains where they can go.
>
> Theo

That approach permits relatively fast search, but adds to the expense of
store and memcpy (etc). As it is common to pass caps (pointers) as
function args and results, the call overhead increases noticeably unless
you have special caching that covers the stack and never writes it back -
but that exposes an attack surface in inter-thread references in multicore.

There doesn't seem to be an easy solution for caps, unless you are
Samsung and can make caps memory as cheap as regular.

Re: Idle: Capability Addressing, Future or Boondoggle

<st3qd7$6n6$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23209&group=comp.arch#23209

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Sat, 29 Jan 2022 10:35:49 -0600
Organization: A noiseless patient Spider
Lines: 29
Message-ID: <st3qd7$6n6$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
<ssvkuk$m3k$1@dont-email.me> <st020a$nce$1@dont-email.me>
<st04b4$1nv$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 29 Jan 2022 16:35:51 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1d21dc40a1333712d24716d567c24071";
logging-data="6886"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+4U8p18kLZJkCu/phT7NdR"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.1
Cancel-Lock: sha1:MufROi9XjLHqIv5/LOh8Qf0iN2k=
In-Reply-To: <st04b4$1nv$1@dont-email.me>
Content-Language: en-US
 by: BGB - Sat, 29 Jan 2022 16:35 UTC

On 1/28/2022 1:00 AM, Ivan Godard wrote:
> On 1/27/2022 10:20 PM, Stephen Fuld wrote:
>> On 1/27/2022 6:38 PM, BGB wrote:
>>
>> snip
>>
>>> Most historical machines which have tried using capability addressing
>>> have either died off, or resulted in machines where their later
>>> descendants abandoned the use of capabilities (eg: System/38 was
>>> replaced by systems built on the Power ISA, ...).
>>
>> I thought that when IBM switched from a proprietary processor to a
>> Power   for the AS/400, they had a custom variant chip that had 65
>> bits and used that extra bit for the protection scheme.
>>
>> But I may be misremembering.  :-(
>>
>>
>>
>
> AS400 apps continued  to run as before, with caps; power is in effect
> the micro-engine. Same as the Unisys systems, where (I think still) the
> micro-engine is x86, but the apps still see the original ISA.

As far as I understood it, the Power ISA was being used to run something
more like an emulator. So, the original capability system was then
implemented in software rather than hardware.

May be wrong on the specifics, as no real first-hand experience with this.

Re: Idle: Capability Addressing, Future or Boondoggle

<st3sog$2hv9$1@gal.iecc.com>

https://www.novabbs.com/devel/article-flat.php?id=23210&group=comp.arch#23210

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!not-for-mail
From: joh...@taugh.com (John Levine)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Sat, 29 Jan 2022 17:16:00 -0000 (UTC)
Organization: Taughannock Networks
Message-ID: <st3sog$2hv9$1@gal.iecc.com>
References: <ssv7ff$9n3$1@dont-email.me> <st020a$nce$1@dont-email.me> <st04b4$1nv$1@dont-email.me> <st3qd7$6n6$1@dont-email.me>
Injection-Date: Sat, 29 Jan 2022 17:16:00 -0000 (UTC)
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="83945"; mail-complaints-to="abuse@iecc.com"
In-Reply-To: <ssv7ff$9n3$1@dont-email.me> <st020a$nce$1@dont-email.me> <st04b4$1nv$1@dont-email.me> <st3qd7$6n6$1@dont-email.me>
Cleverness: some
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: johnl@iecc.com (John Levine)
 by: John Levine - Sat, 29 Jan 2022 17:16 UTC

According to BGB <cr88192@gmail.com>:
>>> I thought that when IBM switched from a proprietary processor to a
>>> Power   for the AS/400, they had a custom variant chip that had 65
>>> bits and used that extra bit for the protection scheme.

>> AS400 apps continued  to run as before, with caps; power is in effect
>> the micro-engine. Same as the Unisys systems, where (I think still) the
>> micro-engine is x86, but the apps still see the original ISA.
>
>As far as I understood it, the Power ISA was being used to run something
>more like an emulator. So, the original capability system was then
>implemented in software rather than hardware.

More or less. The virtual architecture is called
Technology-Independent Machine Interface (TIMI). The first time a
program is run on a machine, the TIMI code is translated into the local
machine code, then the program runs.
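
A minimal sketch of the translate-on-first-run idea, in Python. Everything in it is hypothetical (the `translate` function, the cache, the toy opcodes); it only illustrates caching a native translation the first time a program runs, and says nothing about how IBM's actual translator works.

```python
# Toy model: translate portable code to "native" code on first run, then reuse it.
# All names are hypothetical; this is not the real TIMI format or translator.

translations = {}      # program name -> cached native form
translate_calls = 0    # counts how often translation actually happens


def translate(portable_ops):
    """Pretend to compile portable ops into native ops (here: Python callables)."""
    global translate_calls
    translate_calls += 1
    table = {
        "ADD": lambda acc, n: acc + n,
        "MUL": lambda acc, n: acc * n,
    }
    return [(table[op], arg) for op, arg in portable_ops]


def run(name, portable_ops, start=0):
    """Run a program, translating it only if no cached translation exists."""
    if name not in translations:
        translations[name] = translate(portable_ops)
    acc = start
    for fn, arg in translations[name]:
        acc = fn(acc, arg)
    return acc


program = [("ADD", 2), ("MUL", 5)]
first = run("demo", program)    # translated, then executed
second = run("demo", program)   # cached translation reused
```

Both runs give the same result, but `translate` is only invoked once; the second run goes straight to the cached form.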

It's a very high level operating system that includes a relational database
and other things that would be separate applications on conventional systems.

On Unisys I don't know whether they do instruction at a time emulation, whole
program translation, or something in between like per-routine JIT.

--
Regards,
John Levine, johnl@taugh.com, Primary Perpetrator of "The Internet for Dummies",
Please consider the environment before reading this e-mail. https://jl.ly

Re: Idle: Capability Addressing, Future or Boondoggle

<st43e6$dlo$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23211&group=comp.arch#23211

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Sat, 29 Jan 2022 13:09:56 -0600
Organization: A noiseless patient Spider
Lines: 164
Message-ID: <st43e6$dlo$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 29 Jan 2022 19:09:58 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="1d21dc40a1333712d24716d567c24071";
logging-data="14008"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18MlHOE6poNY11uAlMgihsg"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.1
Cancel-Lock: sha1:Z1HdcZZBrG25d2q6G8HQdy9tHVU=
In-Reply-To: <jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
Content-Language: en-US
 by: BGB - Sat, 29 Jan 2022 19:09 UTC

On 1/27/2022 6:23 PM, Stefan Monnier wrote:
>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM Morello
>> architecture, which is effectively a modified Aarch64, except:
>> Expands GPRs to 129 bits (internally);
>> Integer ISA remains 64-bit;
>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
>> addresses (the rest of the bits are used as protection flags and bounds
>> checks);
>> In effect, all pointers expand to 128 bits in memory;
>> Code isn't allowed to craft its own pointers (directly) because doing so
>> would break its memory protection scheme (the 129th bit is stored
>> separately, seemingly to track which memory locations contain pointers, and
>> to disallow loading a pointer from memory which has been used for data);
>
> AFAIK the main issue with such things is what they do about dangling
> pointers, i.e. dangling capabilities. Do they rely on a GC that's part
> of the "trusted runtime"? or do they disallow deallocation altogether?
> or do they rely on address-space randomization to make such dangling
> capabilities "harmless" (they'll hopefully only make you crash but are
> harder to exploit)?
>

I guess apparently people have worked around some of these issues.

But, yeah, ASLR is potentially also effective, given enough bits.

Many traditional weaknesses of ASLR come from not having enough bits in
the address space, leading to it being possible to guess or brute force
the addresses.

So, say:
  32-bit address space, you can get maybe a few bits of entropy.
    Trivial to brute force if the code doesn't crash first.
    If one has 100s of MB of heap (typical) there is little "void space";
    The randomization is typically "offset jitter";
    RNG based alloc would quickly fragment the space.
    ...
  48-bit space, can get a little more entropy (say, ~ 12 bits or so).
    This can render "slides" mostly impractical;
    Large enough that one can use RNG to allocate address space;
    However, a "clever" algo could still brute-force things;
    ...
  96-bit, significant entropy and voids are possible;
    It can also become infeasible to brute-force addresses.
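
To put rough numbers on the list above: with n bits of entropy, an attacker guessing uniformly (without repeating guesses) needs about half the space on average. A small Python sketch follows. The entropy figures are illustrative assumptions (taking "a few bits" as 4, and, hypothetically, 64 randomizable bits for the 96-bit case), and the guess rate is arbitrary.

```python
def expected_guesses(entropy_bits):
    """Average guesses to hit one uniformly random slot out of 2**bits,
    guessing without repetition: (N + 1) / 2."""
    n = 2 ** entropy_bits
    return (n + 1) / 2


def seconds_to_brute_force(entropy_bits, guesses_per_second):
    """Average time to succeed at a fixed guess rate."""
    return expected_guesses(entropy_bits) / guesses_per_second


# Assumed figures, for illustration only.
RATE = 1000  # hypothetical guesses per second
for label, bits in [("32-bit space", 4), ("48-bit space", 12), ("96-bit space", 64)]:
    secs = seconds_to_brute_force(bits, RATE)
    print(f"{label}: ~{bits} bits of entropy, {secs:.3g} s on average")
```

The point is only the scaling: every extra bit of entropy doubles the average work, which is why the small cases fall to brute force and the large one does not.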

Known layout of things like target binaries or libraries would be a
problem, but as noted elsewhere, randomized shuffling in the compiler
should help here (main weakness would be long-lived "stale" binaries).
In this case, the compiler's RNG needs to be unpredictable though (so
that every build is unique).

If the binaries were compiled via an AOT though, then the AOT could
impose a maximum lifespan (forcing periodic regeneration).

A capability system would not necessarily be immune to attacks on known
library layouts, say:
  Program A uses library B;
  Program A has Read+Exec to B:
    Exec so A can call into it;
    Read so that B can return string literals / etc to A.
  B has a capability to C (secure) in a known location;
  A then "steals" the capability via loading it from this location.
    Granted, less of an issue if Read and ReadCapability are separate;
    Still an issue if ReadCapability is allowed for struct-passing, ...
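
The scenario above can be sketched as a toy model in Python. The permission names, the `Cap` type and the `valid` flag (standing in for a hardware tag bit) are all assumptions for illustration, not any real ISA's encoding. The model assumes that loading a capability without the separate ReadCapability permission yields an untagged, unusable copy.

```python
from dataclasses import dataclass

READ, READ_CAP = 1, 2  # hypothetical permission bits


@dataclass(frozen=True)
class Cap:
    target: str         # what the capability refers to
    perms: int
    valid: bool = True  # stands in for the hardware tag bit


def load(memory, authority, addr):
    """Load a slot using `authority`; strip the tag unless READ_CAP is held."""
    if not (authority.valid and authority.perms & READ):
        raise PermissionError("no read access")
    value = memory[addr]
    if isinstance(value, Cap) and not (authority.perms & READ_CAP):
        return Cap(value.target, value.perms, valid=False)
    return value


# Library B's data: a string literal, and a capability to C at a known slot.
b_memory = {0: "hello", 8: Cap("C", READ)}

a_to_b_plain = Cap("B", READ)             # A may read B's data only
a_to_b_caps = Cap("B", READ | READ_CAP)   # A may also load capabilities

stolen = load(b_memory, a_to_b_caps, 8)    # usable capability to C
blocked = load(b_memory, a_to_b_plain, 8)  # tag stripped, unusable
literal = load(b_memory, a_to_b_plain, 0)  # plain data still readable
```

With only Read, A still gets the string literal but the capability comes back dead; with ReadCapability (as struct-passing might require), the theft goes through.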

This would be much harder if the compiler randomizes B's layout.

Things like ACL checks would also help, since even if one can read the
pointer, unless A has permission to use pointers into C, then the
pointer is "mostly useless".

Though, does imply a need for an A->B call to be able to temporarily add
its own entry into the keyring, and then restore the prior keyring when
returning to A.

How to securely and efficiently do keyring updates (in a general sense)
has thus far remained an unresolved issue. I had considered special
semi-privileged instructions to save/restore/update the keyring. My
previous ideas had involved hardware-based encryption, but this is still
an issue (no real good way to get it both "sufficiently strong" and also
"sufficiently cheap").

Another (more recent, but more limited) strategy (which avoids the need
for an actual keyring update) is to add a virtual key based on the page
that is currently being executed (if the page is set to the appropriate
mode).

So, A has its initial KRR, and calls B. Because execution is inside of
B's pages, it (virtually) adds B's ACLID to the keyring, and this gains
any special accesses that B's ACLID has. On returning from B (or calling
into C), B's virtual keyring entry disappears (though, a call into C may
add C's ACLID to the keyring in its place).

Though, this does mean that C will not gain any implicit access to B's
data via B's ACLID, since B's ACLID will disappear as soon as control
passes from B to C (though, it may retain access to A's data if A's
ACLID is within the thread level KRR).
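
A rough Python sketch of the virtual-key behaviour described above. It models an ACL as the set of ACLIDs allowed to touch some data, which is a simplifying assumption; the function names and the sample data are hypothetical.

```python
def effective_keyring(thread_krr, executing_aclid):
    """Keys in force: the thread-level keyring plus the ACLID of the page
    currently being executed (None if the page is not in that mode)."""
    keys = set(thread_krr)
    if executing_aclid is not None:
        keys.add(executing_aclid)
    return keys


def may_access(data_acl, thread_krr, executing_aclid):
    """True if any key currently in force appears in the data's ACL."""
    return bool(effective_keyring(thread_krr, executing_aclid) & set(data_acl))


thread_krr = {"A"}  # A's ACLID sits in the thread-level keyring
acl = {"a_data": {"A"}, "b_data": {"B"}, "c_data": {"C"}}

# What is reachable while executing inside A's, B's and C's pages respectively.
in_a = {name: may_access(allowed, thread_krr, "A") for name, allowed in acl.items()}
in_b = {name: may_access(allowed, thread_krr, "B") for name, allowed in acl.items()}
in_c = {name: may_access(allowed, thread_krr, "C") for name, allowed in acl.items()}
```

While in B, both A's and B's data are reachable; once control passes to C, B's key is gone but A's (being in the thread keyring) remains.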

This partly works based on the idea that access to an ACLID will not
necessarily have the same privilege as when executing as said ACLID
(where the ACLID functions instead as a VUGID).

Though, my current implementation of this mechanism is a bit of an ugly
hack: It is passed via side-channels between the L1 I$ and MMU, since
this information would have been a bit awkward and ugly to pass via the
ringbus.

Actually, more specifically, it is implemented as a hack by using
"non-interrupting-interrupts" as a signaling mechanism (eg, the pathway
that would normally be used for raising interrupts is being used, in
effect, so that the TLB can send "smoke signals" back to the L1 I$ and
similar about the request it is waiting on).

One potential concern is that, since this is based on where code is
being executed from, rather than how it got there, this does leave the
potential attack surface of being able to jump to unorthodox entry
points within a library (beyond just its exports). ASLR can help at
least (unless hostile code uses pattern matching to search for a
particular machine-code sequence).

Another possibility is to add a level of indirection and only allow
A->B calls via dedicated trampolines (T). So, A has execute to T, T has
execute to B, but A does not have execute to B. This would at least
stop A from being able to decide its own entry points (but would
increase the total number of ACLs in use).
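
As a sketch of the trampoline idea (Python; the export names and addresses are invented, and execute rights are modelled as a simple set of pairs): the caller can only name an export, and it is the trampoline that supplies the actual entry address in B.

```python
# (from, to) pairs that hold execute permission. A has no direct execute to B.
EXEC_ALLOWED = {("A", "T"), ("T", "B")}

# Hypothetical exports of B, as fixed by the trampoline page T.
T_ENTRIES = {"open": 0x1000, "close": 0x1040}


def jump(src, dst):
    """Fault unless `src` holds execute permission on `dst`."""
    if (src, dst) not in EXEC_ALLOWED:
        raise PermissionError(f"{src} may not execute {dst}")


def call_via_trampoline(caller, export):
    """Enter B through T; T, not the caller, picks the address in B."""
    jump(caller, "T")
    entry = T_ENTRIES[export]  # unknown export names raise KeyError
    jump("T", "B")
    return entry


entry = call_via_trampoline("A", "open")

try:
    jump("A", "B")             # direct jump to an arbitrary point in B
    direct_ok = True
except PermissionError:
    direct_ok = False
```

The cost noted above shows up here as the extra `("A", "T")` and `("T", "B")` entries: one more protection domain per library boundary.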

> The designs that keep the actual capabilities in a kind of
> separate/secured table (so the untrusted code only handles references to
> these capabilities and can thus do anything it wants within its sandbox
> without needing any special XX=1bit values) have a much easier time
> since they can much more easily ensure the absence of
> references/capabilities before deallocating resources, or they can mark
> them as dead).
>

I had considered schemes like this, but didn't do much because (at a
high level) there was no obvious difference between what one could do
via a handle+offset scheme, and what one could do via the existing MMU.

For example, the TLB-Miss handler could easily see the address falls
within the range assigned to a descriptor table, and pull information
from this table rather than using the page table (and, in effect, the
use of a page table itself is more of a convention than an architectural
requirement with a software-managed TLB).

So, no special hardware support needed, could fake something like an x86
style GDT or LDT via the MMU, provided the limitation that the base and
limit are page aligned (well, and probably that one flushes the TLB when
updating or revoking these descriptors).
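
A minimal sketch of such a TLB-miss handler, in Python for readability. The page size, the address window per descriptor, the region base and the table contents are all assumptions made up for the example; the only point is that the handler picks between a descriptor table and a page table based on the faulting address.

```python
PAGE = 4096
SEG_SPAN = 1 << 20                       # address window per descriptor (assumed)
REGION_BASE = 0x4000_0000                # start of descriptor-mapped range (assumed)
REGION_END = REGION_BASE + 16 * SEG_SPAN

# handle -> (physical base, limit in bytes, perms); base and limit page aligned
descriptors = {
    0: (0x0010_0000, 2 * PAGE, "rw"),
    1: (0x0020_0000, 1 * PAGE, "r"),
}
page_table = {0x10: (0x80, "rwx")}       # vpn -> (ppn, perms)


class Fault(Exception):
    """Raised when no valid translation exists."""


def tlb_miss(vaddr):
    """Return the (vpn, ppn, perms) entry to load into the TLB, or raise Fault."""
    vpn = vaddr // PAGE
    if REGION_BASE <= vaddr < REGION_END:
        # Handle+offset style address: consult the descriptor table.
        handle, offset = divmod(vaddr - REGION_BASE, SEG_SPAN)
        if handle not in descriptors:
            raise Fault("no such descriptor")
        phys_base, limit, perms = descriptors[handle]
        if offset >= limit:
            raise Fault("beyond segment limit")
        return vpn, (phys_base + offset) // PAGE, perms
    if vpn in page_table:
        # Ordinary address: fall back to the conventional page table.
        ppn, perms = page_table[vpn]
        return vpn, ppn, perms
    raise Fault("page fault")
```

Revoking a descriptor in this model means deleting its table entry and flushing any TLB entries already loaded from it, which is the flush caveat mentioned above.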

Re: Idle: Capability Addressing, Future or Boondoggle

<st479f$8ip$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23212&group=comp.arch#23212

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Sat, 29 Jan 2022 12:15:43 -0800
Organization: A noiseless patient Spider
Lines: 33
Message-ID: <st479f$8ip$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<00b0ef24-81ec-449e-ac4e-603005cf81aan@googlegroups.com>
<ssvkuk$m3k$1@dont-email.me> <st020a$nce$1@dont-email.me>
<st04b4$1nv$1@dont-email.me> <st3qd7$6n6$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sat, 29 Jan 2022 20:15:43 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="96dfe7e9118fb69d3a41b408559493db";
logging-data="8793"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+1K4wMyZzT+Aqw0/uEJZC1"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:cpZ79ZVFpU0PXGWIqCFXVbSPyJk=
In-Reply-To: <st3qd7$6n6$1@dont-email.me>
Content-Language: en-US
 by: Ivan Godard - Sat, 29 Jan 2022 20:15 UTC

On 1/29/2022 8:35 AM, BGB wrote:
> On 1/28/2022 1:00 AM, Ivan Godard wrote:
>> On 1/27/2022 10:20 PM, Stephen Fuld wrote:
>>> On 1/27/2022 6:38 PM, BGB wrote:
>>>
>>> snip
>>>
>>>> Most historical machines which have tried using capability
>>>> addressing have either died off, or resulted in machines where their
>>>> later descendants abandoned the use of capabilities (eg: System/38
>>>> was replaced by systems built on the Power ISA, ...).
>>>
>>> I thought that when IBM switched from a proprietary processor to a
>>> Power   for the AS/400, they had a custom variant chip that had 65
>>> bits and used that extra bit for the protection scheme.
>>>
>>> But I may be misremembering.  :-(
>>>
>>>
>>>
>>
>> AS400 apps continued  to run as before, with caps; power is in effect
>> the micro-engine. Same as the Unisys systems, where (I think still)
>> the micro-engine is x86, but the apps still see the original ISA.
>
> As far as I understood it, the Power ISA was being used to run something
> more like an emulator. So, the original capability system was then
> implemented in software rather than hardware.
>
> May be wrong on the specifics, as no real first-hand experience with this.

Yes, exactly. It's only a matter of viewpoint whether you see such a
system as an emulator or as a micro-architecture.

Re: Idle: Capability Addressing, Future or Boondoggle

<e645afd8-86bf-460e-a6c7-020902d84440n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23213&group=comp.arch#23213

X-Received: by 2002:a37:a905:: with SMTP id s5mr9208787qke.111.1643489452623;
Sat, 29 Jan 2022 12:50:52 -0800 (PST)
X-Received: by 2002:a05:6820:514:: with SMTP id m20mr6175667ooj.28.1643489452333;
Sat, 29 Jan 2022 12:50:52 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 29 Jan 2022 12:50:52 -0800 (PST)
In-Reply-To: <st43e6$dlo$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2600:1700:291:29f0:4d01:ec69:d614:8a56;
posting-account=H_G_JQkAAADS6onOMb-dqvUozKse7mcM
NNTP-Posting-Host: 2600:1700:291:29f0:4d01:ec69:d614:8a56
References: <ssv7ff$9n3$1@dont-email.me> <jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
<st43e6$dlo$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <e645afd8-86bf-460e-a6c7-020902d84440n@googlegroups.com>
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
From: MitchAl...@aol.com (MitchAlsup)
Injection-Date: Sat, 29 Jan 2022 20:50:52 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 175
 by: MitchAlsup - Sat, 29 Jan 2022 20:50 UTC

On Saturday, January 29, 2022 at 1:10:01 PM UTC-6, BGB wrote:
> On 1/27/2022 6:23 PM, Stefan Monnier wrote:
> >> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM Morello
> >> architecture, which is effectively a modified Aarch64, except:
> >> Expands GPRs to 129 bits (internally);
> >> Integer ISA remains 64-bit;
> >> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
> >> addresses (the rest of the bits are used as protection flags and bounds
> >> checks);
> >> In effect, all pointers expand to 128 bits in memory;
> >> Code isn't allowed to craft its own pointers (directly) because doing so
> >> would break its memory protection scheme (the 129th bit is stored
> >> separately, seemingly to track which memory locations contain pointers, and
> >> to disallow loading a pointer from memory which has been used for data);
> >
> > AFAIK the main issue with such things is what they do about dangling
> > pointers, i.e. dangling capabilities. Do they rely on a GC that's part
> > of the "trusted runtime"? or do they disallow deallocation altogether?
> > or do they rely on address-space randomization to make such dangling
> > capabilities "harmless" (they'll hopefully only make you crash but are
> > harder to exploit)?
> >
> I guess apparently people have worked around some of these issues.
>
> But, yeah, ASLR is potentially also effective, given enough bits.
>
I just read up on ASLR and found:
http://www.cs.ucr.edu/~nael/pubs/micro16.pdf
<
Another attack surface that is not present in My 66000 implementations
under consideration.....
The small implementation 1.3-wide does not need/use branch prediction
The large implementation 6.0-wide does not use a BTB -- it has something
that smells a lot like a BTB but is connected to ~300 other bits and is fully
tag compared. Only a hashed vector of BC are used and these do not
predict taken or untaken, but agree and disagree.
>
> Many traditional weaknesses of ASLR come from not having enough bits in
> the address space, leading to it being possible to guess or brute force
> the addresses.
<
Needing a BTB for performance and not checking all of the address bits
in a tag or hashing the address bits into a complete tag opens the door
to exploits. Microarchitectural state becomes visible via a high precision
timer by allowing for the measurement of instruction delay.
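
A rough illustration of the partial-tag point, in Python. The bit widths and addresses are invented for the example and are not taken from any real core (nor from My 66000); it only shows two branch addresses that differ in high bits colliding under a partial tag and staying distinct under a full one.

```python
INDEX_BITS = 9   # assumed BTB index width
TAG_BITS = 8     # assumed partial tag width


def btb_key_partial(addr):
    """Index plus a truncated tag: address bits above them are ignored."""
    index = addr & ((1 << INDEX_BITS) - 1)
    tag = (addr >> INDEX_BITS) & ((1 << TAG_BITS) - 1)
    return index, tag


def btb_key_full(addr):
    """Index plus every remaining address bit as the tag."""
    index = addr & ((1 << INDEX_BITS) - 1)
    return index, addr >> INDEX_BITS


victim = 0x7F12_3456_7000           # hypothetical randomized branch address
attacker = victim ^ (1 << 40)       # same low bits, different high bits

collides_partial = btb_key_partial(victim) == btb_key_partial(attacker)
collides_full = btb_key_full(victim) == btb_key_full(attacker)
```

With the partial key, the attacker's branch can hit the victim's entry, and a timing difference then leaks which low address bits the victim uses; with the full key, the two never match.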
>
> So, say:
> 32-bit address space, you can get maybe a few bits of entropy.
> Trivial to brute force if the code doesn't crash first.
> If one has 100s of MB of heap (typical) there is little "void space"
> The randomization is typically "offset jitter";
> RNG based alloc would quickly fragment the space.
> ...
> 48-bit space, can get a little more entropy (say, ~ 12 bits or so).
> This can render "slides" mostly impractical;
> Large enough that one can use RNG to allocate address space;
> However, a "clever" algo could still brute-force things;
> ...
> 96-bit, significant entropy and voids are possible;
> It can also become infeasible to brute-force addresses.
>
>
> Known layout of things like target binaries or libraries would be a
> problem, but as noted elsewhere, randomized shuffling in the compiler
> should help here (main weakness would be long-lived "stale" binaries).
> In this case, the compiler's RNG needs to be unpredictable though (so
> that every build is unique).
>
> If the binaries were compiled via an AOT though, then the AOT could
> impose a maximum lifespan (forcing periodic regeneration).
>
>
> A capability system would not necessarily be immune to attacks on known
> library layouts, say:
> Program A uses library B;
> Program A has Read+Exec to B
> Exec so A can call into it;
> Read so that B can return string literals / etc to A.
> B has a capability to C (secure) in a known location;
> A then "steals" the capability via loading it from this location.
> Granted, less of an issue if Read and ReadCapability are separate;
> Still an issue if ReadCapability is allowed for struct-passing, ...
>
> This would be much harder if the compiler randomizes B's layout.
>
Since the CAP system remains attackable, what does it actually buy in
terms of security?
>
>
> Things like ACL checks would also help, since even if one can read the
> pointer, unless A has permission to use pointers into C, then the
> pointer is "mostly useless".
>
>
> Though, does imply a need for an A->B call to be able to temporarily add
> its own entry into the keyring, and then restore the prior keyring when
> returning to A.
>
> How to securely and efficiently do keyring updates (in a general sense)
> has thus far remained an unresolved issue. I had considered special
> semi-privileged instructions to save/restore/update the keyring. My
> previous ideas had involved hardware-based encryption, but this is still
> an issue (no real good way to get it both "sufficiently strong" and also
> "sufficiently cheap").
>
>
>
> Another (more recent, but more limited) strategy (which avoids the need
> for an actual keyring update) is to add a virtual key based on the page
> that is currently being executed (if the page is set to the appropriate
> mode).
>
> So, A has its initial KRR, and calls B. Because execution is inside of
> B's pages, it (virtually) adds B's ACLID to the keyring, and this gains
> any special accesses that B's ACLID has. On returning from B (or calling
> into C), B's virtual keyring entry disappears (though, a call into C may
> add C's ACLID to the keyring in its place).
>
> Though, this does mean that C will not gain any implicit access to B's
> data via B's ACLID, since B's ACLID will disappear as soon as control
> passes from B to C (though, it may retain access to A's data if A's
> ACLID is within the thread level KRR).
>
>
> This partly works based on the idea that access to an ACLID will not
> necessarily have the same privilege as when executing as said ACLID
> (where the ACLID functions instead as a VUGID).
>
> Though, my current implementation of this mechanism is a bit of an ugly
> hack: It is passed via side-channels between the L1 I$ and MMU, since
> this information would have been a bit awkward and ugly to pass via the
> ringbus.
>
>
> Actually, more specifically, it is implemented as a hack by using
> "non-interrupting-interrupts" as a signaling mechanism (eg, the pathway
> that would normally be used for raising interrupts is being used, in
> effect, so that the TLB can send "smoke signals" back to the L1 I$ and
> similar about the request it is waiting on).
>
>
> One potential concern is that, since this is based on where code is
> being executed from, rather than how it got there, this does leave the
> potential attack surface of being able to jump to unorthodox entry
> points within a library (beyond just its exports). ASLR can help at
> least (unless hostile code uses pattern matching to search for a
> particular machine-code sequence).
>
> Another possibility being to add a level of indirection and only allow
> A->B calls via dedicated trampolines (T). So, A has execute to T, T has
> execute to B, but A does not have execute to B). This would at least
> stop A from being able to decide its own entry points (but, would
> increase the total number of ACLs in use).
> > The designs that keep the actual capabilities in a kind of
> > separate/secured table (so the untrusted code only handles references to
> > these capabilities and can thus do anything it wants within its sandbox
> > without needing any special XX=1bit values) have a much easier time
> > since they can much more easily ensure the absence of
> > references/capabilities before deallocating resources, or they can mark
> > them as dead).
> >
> I had considered schemes like this, but didn't do much because (at a
> high level) there was no obvious difference between what one could do
> via a handle+offset scheme, and what one could do via the existing MMU.
>
>
> For example, the TLB-Miss handler could easily see the address falls
> within the range assigned to a descriptor table, and pull information
> from this table rather than using the page table (and, in effect, the
> use of a page table itself is more of a convention than an architectural
> requirement with a software-managed TLB).
>
> So, no special hardware support needed, could fake something like an x86
> style GDT or LDT via the MMU, provided the limitation that the base and
> limit are page aligned (well, and probably that one flushes the TLB when
> updating or revoking these descriptors).


Re: Idle: Capability Addressing, Future or Boondoggle

<7b614821-c818-420a-9445-b004a1479bafn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=23214&group=comp.arch#23214

X-Received: by 2002:a05:622a:153:: with SMTP id v19mr10476911qtw.323.1643492704571;
Sat, 29 Jan 2022 13:45:04 -0800 (PST)
X-Received: by 2002:a05:6808:2003:: with SMTP id q3mr2746791oiw.133.1643492704298;
Sat, 29 Jan 2022 13:45:04 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.arch
Date: Sat, 29 Jan 2022 13:45:04 -0800 (PST)
In-Reply-To: <st0gl0$1oca$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=2a0d:6fc2:55b0:ca00:25ca:a4d3:eff:246;
posting-account=ow8VOgoAAAAfiGNvoH__Y4ADRwQF1hZW
NNTP-Posting-Host: 2a0d:6fc2:55b0:ca00:25ca:a4d3:eff:246
References: <ssv7ff$9n3$1@dont-email.me> <jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org>
<ssves4$km1$1@dont-email.me> <st0gl0$1oca$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <7b614821-c818-420a-9445-b004a1479bafn@googlegroups.com>
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
From: already5...@yahoo.com (Michael S)
Injection-Date: Sat, 29 Jan 2022 21:45:04 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 58
 by: Michael S - Sat, 29 Jan 2022 21:45 UTC

On Friday, January 28, 2022 at 12:31:00 PM UTC+2, Terje Mathisen wrote:
> Ivan Godard wrote:
> > On 1/27/2022 4:23 PM, Stefan Monnier wrote:
> >>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM
> >>> Morello
> >>> architecture, which is effectively a modified Aarch64, except:
> >>> Expands GPRs to 129 bits (internally);
> >>> Integer ISA remains 64-bit;
> >>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
> >>> addresses (the rest of the bits are used as protection flags and bounds
> >>> checks);
> >>> In effect, all pointers expand to 128 bits in memory;
> >>> Code isn't allowed to craft its own pointers (directly) because doing so
> >>> would break its memory protection scheme (the 129th bit is stored
> >>> separately, seemingly to track which memory locations contain
> >>> pointers, and
> >>> to disallow loading a pointer from memory which has been used for data);
> >>
> >> AFAIK the main issue with such things is what they do about dangling
> >> pointers, i.e. dangling capabilities. Do they rely on a GC that's part
> >> of the "trusted runtime"? or do they disallow deallocation altogether?
> >> or do they rely on address-space randomization to make such dangling
> >> capabilities "harmless" (they'll hopefully only make you crash but are
> >> harder to exploit)?
> >>
> >> The designs that keep the actual capabilities in a kind of
> >> separate/secured table (so the untrusted code only handles references to
> >> these capabilities and can thus do anything it wants within its sandbox
> >> without needing any special XX=1bit values) have a much easier time
> >> since they can much more easily ensure the absence of
> >> references/capabilities before deallocating resources, or they can mark
> >> them as dead).
> >>
> >
> > Isn't it Terje who says anything can be done with another layer of
> > indirection?
> I do say that but it was old knowledge long before I first quoted it.

Most often attributed to Dr. David Wheeler.

> >
> > Revoke of permissions (and scavenge of permitted resources) is a very
> > hard problem in any permission system, including caps - and paging
> > tables too for that matter. These days narrow-cast caps (a.k.a handles)
> > typically use usecount GC. Which in practice works OK for well-behaved
> > users. Let a DoS attacker on your system though...
> Handles do work; it can probably be done well within a binary order of
> magnitude without opening up lots of DOS opportunities.
>
> OTOH, we have lots of examples, typically related to IO (mmap, direct
> IO, remote DMA etc) where performance requirements trump almost
> everything else even though we have had handle-based file/stream apis
> since forever.
>
> Terje
>
> --
> - <Terje.Mathisen at tmsw.no>
> "almost all programming can be viewed as an exercise in caching"

Re: Idle: Capability Addressing, Future or Boondoggle

<st4csn$hca$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=23215&group=comp.arch#23215

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: iva...@millcomputing.com (Ivan Godard)
Newsgroups: comp.arch
Subject: Re: Idle: Capability Addressing, Future or Boondoggle
Date: Sat, 29 Jan 2022 13:51:18 -0800
Organization: A noiseless patient Spider
Lines: 108
Message-ID: <st4csn$hca$1@dont-email.me>
References: <ssv7ff$9n3$1@dont-email.me>
<jwv5yq4u2zp.fsf-monnier+comp.arch@gnu.org> <st43e6$dlo$1@dont-email.me>
<e645afd8-86bf-460e-a6c7-020902d84440n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 29 Jan 2022 21:51:20 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="96dfe7e9118fb69d3a41b408559493db";
logging-data="17802"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX188sfHXJpkehzqtrZD1s2YG"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.5.0
Cancel-Lock: sha1:E4wI8B/KYxgsKHi8hMNsmXZhw1M=
In-Reply-To: <e645afd8-86bf-460e-a6c7-020902d84440n@googlegroups.com>
Content-Language: en-US
 by: Ivan Godard - Sat, 29 Jan 2022 21:51 UTC

On 1/29/2022 12:50 PM, MitchAlsup wrote:
> On Saturday, January 29, 2022 at 1:10:01 PM UTC-6, BGB wrote:
>> On 1/27/2022 6:23 PM, Stefan Monnier wrote:
>>>> Well, elsewhere (on "comp.lang.c"), a topic came up about the ARM Morello
>>>> architecture, which is effectively a modified Aarch64, except:
>>>> Expands GPRs to 129 bits (internally);
>>>> Integer ISA remains 64-bit;
>>>> Uses 128-bit "capabilities" as pointers, which support 56 or 64 bit
>>>> addresses (the rest of the bits are used as protection flags and bounds
>>>> checks);
>>>> In effect, all pointers expand to 128 bits in memory;
>>>> Code isn't allowed to craft its own pointers (directly) because doing so
>>>> would break its memory protection scheme (the 129th bit is stored
>>>> separately, seemingly to track which memory locations contain pointers, and
>>>> to disallow loading a pointer from memory which has been used for data);
>>>
>>> AFAIK the main issue with such things is what they do about dangling
>>> pointers, i.e. dangling capabilities. Do they rely on a GC that's part
>>> of the "trusted runtime"? or do they disallow deallocation altogether?
>>> or do they rely on address-space randomization to make such dangling
>>> capabilities "harmless" (they'll hopefully only make you crash but are
>>> harder to exploit)?
>>>
>> I guess apparently people have worked around some of these issues.
>>
>> But, yeah, ASLR is potentially also effective, given enough bits.
>>
> I just read up on ASLR and found:
> http://www.cs.ucr.edu/~nael/pubs/micro16.pdf
> <
> Another attack surface that is not present in My 66000 implementations
> under consideration.....
> The small implementation 1.3-wide does not need/use branch prediction
> The large implementation 6.0-wide does not use a BTB -- it has something
> that smells a lot like a BTB but is connected to ~300 other bits and is fully
> tag compared. Only a hashed vector of BC are used and these do not
> predict taken or untaken, but agree and disagree.
>>
>> Many traditional weaknesses of ASLR come from not having enough bits in
>> the address space, leading to it being possible to guess or brute force
>> the addresses.
> <
> Needing a BTB for performance and not checking all of the address bits
> in a tag or hashing the address bits into a complete tag opens the door
> to exploits. Microarchitectural state becomes visible via a high precision
> timer by allowing for the measurement of instruction delay.
>>
>> So, say:
>> 32-bit address space, you can get maybe a few bits of entropy.
>> Trivial to brute force if the code doesn't crash first.
>> If one has 100s of MB of heap (typical) there is little "void space"
>> The randomization is typically "offset jitter";
>> RNG based alloc would quickly fragment the space.
>> ...
>> 48-bit space, can get a little more entropy (say, ~ 12 bits or so).
>> This can render "slides" mostly impractical;
>> Large enough that one can use RNG to allocate address space;
>> However, a "clever" algo could still brute-force things;
>> ...
>> 96-bit, significant entropy and voids are possible;
>> It can also become infeasible to brute-force addresses.
>>
>>
>> Known layout of things like target binaries or libraries would be a
>> problem, but as noted elsewhere, randomized shuffling in the compiler
>> should help here (main weakness would be long-lived "stale" binaries).
>> In this case, the compiler's RNG needs to be unpredictable though (so
>> that every build is unique).
>>
>> If the binaries were compiled via an AOT though, then the AOT could
>> impose a maximum lifespan (forcing periodic regeneration).
>>
>>
>> A capability system would not necessarily be immune to attacks on known
>> library layouts, say:
>> Program A uses library B;
>> Program A has Read+Exec to B
>> Exec so A can call into it;
>> Read so that B can return string literals / etc to A.
>> B has a capability to C (secure) in a known location;
>> A then "steals" the capability via loading it from this location.
>> Granted, less of an issue if Read and ReadCapability are separate;
>> Still an issue if ReadCapability is allowed for struct-passing, ...
>>
>> This would be much harder if the compiler randomizes B's layout.
>>
> Since the CAP system remains attackable--what does it actually buy in
> terms of security ?

There are many attack vectors, some of which caps do not address and
some of which *no* security system can address: nothing is proof against
a pretty girl and a suitcase of money. However, there are many vectors
that caps *are* proof against and that what passes for "security" in
conventional systems is not.

Posters here have been considering systems that address bounds checks and
dangling pointers. These are only a part of a true caps system, which
deals with names: in caps, anything you can name has an unforgeable
name, and you can only use such a name for its proper function. For
example, in caps a function return address is a cap, and the only things
you can do with it are return through it and discard it, and you can't
return through anything else; sic transit ROP.
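
As a rough illustration of the return-address point, here is a toy Python
model. Everything in it (ReturnCap, call, _MINT_KEY) is a hypothetical name
made up for this sketch, and Python cannot actually enforce unforgeability
the way tagged hardware does; the sketch only shows the shape of the
interface: the cap can be returned through once or discarded, and nothing
else.

```python
_MINT_KEY = object()  # assumption: stands in for the hardware's sole right to mint caps


class ReturnCap:
    """Toy return-address capability: opaque, single-use."""

    def __init__(self, resume, key):
        # Only the call machinery below is supposed to hold _MINT_KEY.
        if key is not _MINT_KEY:
            raise PermissionError("return caps cannot be forged")
        self._resume = resume
        self._live = True

    def return_through(self, value):
        # The only useful operation: return to the caller, exactly once.
        if not self._live:
            raise PermissionError("return cap already used or discarded")
        self._live = False
        return self._resume(value)

    def discard(self):
        # The only other permitted operation: throw the cap away.
        self._live = False


def call(callee, arg):
    """Model of a call instruction: mints the return cap and passes it to the callee."""
    result = []
    cap = ReturnCap(result.append, _MINT_KEY)
    callee(arg, cap)
    return result[0] if result else None


def double(x, ret):
    ret.return_through(2 * x)
```

A callee that overwrites its "return address" with some other value has
nothing it can return through, and a cap saved for later is dead once it
has been used, which is the property that defeats ROP-style reuse in this
model.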

If you don't capify all names then you haven't got a caps system, and
what you didn't do will leave attack holes. In the OP example, there's
third-party visibility because no way is described to do cap
boxing. The example is a good illustration of why boxing is needed,
though I prefer the simpler example of a numeric integration library.
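
To make the OP example concrete, here is a toy Python model of tagged
memory in which plain Read and ReadCapability are separate permissions.
The names (Cap, TaggedMemory, the "read" / "read_cap" / "write" permission
strings) and the addresses are assumptions invented for this sketch, not
Morello's actual encoding, and it does not implement boxing. It only shows
the "steal" succeeding or failing depending on whether A's view of B
carries the capability-load right.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Cap:
    base: int
    length: int
    perms: frozenset   # subset of {"read", "read_cap", "write"}
    tag: bool = True   # validity tag; stand-in for the out-of-band tag bit


class TaggedMemory:
    def __init__(self, size):
        self.words = [0] * size

    def _check(self, via, addr, perm):
        # Every access needs a valid (tagged) cap with the right permission and bounds.
        in_bounds = via.base <= addr < via.base + via.length
        if not (via.tag and perm in via.perms and in_bounds):
            raise PermissionError(f"{perm} at {addr} not authorized")

    def store(self, via, addr, value):
        self._check(via, addr, "write")
        self.words[addr] = value

    def load(self, via, addr):
        self._check(via, addr, "read")
        value = self.words[addr]
        # Without "read_cap" the bits come back but the tag is stripped,
        # so the result cannot be used as authority.
        if isinstance(value, Cap) and "read_cap" not in via.perms:
            value = replace(value, tag=False)
        return value


mem = TaggedMemory(64)
root = Cap(0, 64, frozenset({"read", "write", "read_cap"}))
c_cap = Cap(32, 16, frozenset({"read"}))   # B's capability to C
mem.store(root, 40, 1234)                   # a secret inside C
mem.store(root, 4, c_cap)                   # B keeps the cap at a known location

a_data_only = Cap(0, 16, frozenset({"read"}))
a_with_caps = Cap(0, 16, frozenset({"read", "read_cap"}))
```

With a_data_only, mem.load(a_data_only, 4) returns an untagged copy that
raises PermissionError when used; with a_with_caps the same load hands A a
working capability to C, which is the leak described in the quoted example.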
