novaBBS - comp.lang.forth - Hardening Defined Words

<tcklr5$3jiko$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19560&group=comp.lang.forth#19560

Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: krishna....@ccreweb.org (Krishna Myneni)
Newsgroups: comp.lang.forth
Subject: Hardening Defined Words
Date: Fri, 5 Aug 2022 22:06:11 -0500
Organization: A noiseless patient Spider
Lines: 123
Message-ID: <tcklr5$3jiko$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 6 Aug 2022 03:06:13 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="830e69c26277ffb2dc09d3f1a0b27f88";
logging-data="3787416"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+R3QgMZCHP2hk13Ii0KO+V"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
Thunderbird/91.11.0
Cancel-Lock: sha1:laS5a6zeL8yFDrMTd29Mj8NyRMY=
Content-Language: en-US

Summary: for some non-native Forth systems, it should be possible to
relocate the compiled code of a colon definition into memory that can
be marked read-only, to protect it against corruption. For this to be
feasible, the run-time xt for a word should have at least one level of
indirection to the code being executed by the virtual machine.

Consider the following ordinary colon definition in kForth, an indirect
threaded code Forth interpreter/compiler:

: foo 0 ;

' foo execute . \ same as typing FOO
0 ok

see foo
565403F101F0 #0
565403F101F9 RET
ok

Now, let's do some bad things to FOO.

0 ' foo ! \ store a zero at the execution address for FOO
foo
Segmentation fault (core dumped)
$

Start kForth again and define FOO as above.

' foo a@ execute-bc . \ execute the byte code for FOO

This gives the same result as executing FOO, so one may infer that the
xt for FOO is the address of a cell holding a pointer to FOO's compiled
byte code (hence the A@ before EXECUTE-BC). The byte code is the code
executed by kForth's virtual machine.
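
To illustrate why that one level of indirection matters for relocation,
here is a minimal model in standard Forth. It is not kForth's internal
mechanism; HANDLE, CODE-A, CODE-B and RUN are hypothetical names chosen
for the sketch. Callers always go through a single cell, so pointing them
at a relocated copy of the code means updating that one cell only.

```forth
\ Hypothetical model in standard Forth, not kForth internals.
: code-a ( -- n ) 0 ;          \ stands in for the original compiled code
: code-b ( -- n ) 0 ;          \ stands in for a relocated copy of that code
create handle  ' code-a ,      \ the one cell every caller goes through
: run ( -- n ) handle @ execute ;

run .                          \ prints 0
' code-b handle !              \ "relocate": only the handle cell is updated
run .                          \ prints 0, now via the copy
```

Without the handle cell, every compiled reference to the old code
address would have to be found and patched.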

Now, let's define a word BAR and demonstrate that we can modify the byte
code for BAR directly from the Forth interpreter.

: bar 10 0 do i . loop ;

see bar
560DD5DABDA0 #10
560DD5DABDA9 #0
560DD5DABDB2 >R
560DD5DABDB3 >R
560DD5DABDB4 IP>R
560DD5DABDB5 I
560DD5DABDB6 .
560DD5DABDB7 LOOP
560DD5DABDB8 RET
ok

To see the actual byte code of BAR,

' bar a@ 32 dump

560DD5DABDA0 : 49 0A 00 00 00 00 00 00 00 49 00 00 00 00 00 00 I........I......
560DD5DABDB0 : 00 00 DC DC DE 69 2E E9 EE 00 00 00 00 00 00 00 .....i..........

( the RET instruction for the virtual machine is byte EE ).

Now, we may corrupt the byte code, for example, by changing the loop
count to 5, instead of 10:

5 ' bar a@ 1+ !

Now, when BAR is executed, it will output "0 1 2 3 4 ok"
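
The patch works because, going by the dump, the operand of the first
instruction starts one byte past the opcode byte 49 and is stored low
byte first. A small model in standard Forth makes the layout explicit.
BC and OPERAND-LOW are hypothetical names, and the buffer only imitates
the first nine bytes of the dump. Unlike the `!` used above, which
rewrites all eight operand bytes, this sketch patches the low byte with
C! so that it stays within standard Forth's alignment rules.

```forth
\ Model of the first instruction of BAR as shown in the dump:
\ opcode byte 49, then a 64-bit little-endian operand (assumed layout).
create bc
hex  49 c,  0A c,  00 c, 00 c, 00 c, 00 c, 00 c, 00 c, 00 c,  decimal

: operand-low ( -- c-addr ) bc 1+ ;   \ low-order byte of the operand

operand-low c@ .      \ prints 10
5 operand-low c!      \ patch only the low byte
operand-low c@ .      \ prints 5
```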

It is possible to use mmap and mprotect system calls (or equivalents
under Windows) to relocate the byte code to a new memory region and mark
that memory region as read-only, thereby avoiding this type of
corruption. It is relatively simple to do this from Forth itself,
although the details are obviously system-dependent. In this way, we
can, in principle, protect the executed code for a colon definition.
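
As a rough sketch of the mmap/mprotect step, here is one way it might
look in Gforth, chosen only because its C interface (c-library,
c-function) is documented; kForth would go through its own system-call
words instead. Assumptions: Linux on x86-64 (the numeric values of the
PROT_ and MAP_ flags), a 4096-byte page size, and a C compiler available
to Gforth at run time. ALLOC-PAGE and HARDEN are hypothetical names. The
sketch only copies bytes into a page and makes that page read-only;
repointing a word at the copy is system-specific and not shown.

```forth
\ Gforth-only sketch; flag values are the Linux x86-64 ones (assumption).
c-library protdemo
\c #include <sys/mman.h>
c-function mmap     mmap     a n n n n n -- a
c-function mprotect mprotect a n n -- n
end-c-library

1 constant PROT_READ
2 constant PROT_WRITE
hex 22 decimal constant MAP_PRIVATE|ANONYMOUS  \ MAP_PRIVATE (2) or MAP_ANONYMOUS (hex 20)
4096 constant page-size                        \ assumed page size

: alloc-page ( -- a )   \ one anonymous read/write page
  0 page-size PROT_READ PROT_WRITE or MAP_PRIVATE|ANONYMOUS -1 0 mmap
  dup -1 = abort" mmap failed" ;

: harden ( src u -- a ) \ copy u bytes to a fresh page, then make it read-only
  dup page-size u> abort" region larger than one page"
  alloc-page >r
  r@ swap move
  r@ page-size PROT_READ mprotect abort" mprotect failed"
  r> ;

s" hello" harden 5 type   \ reading the protected copy is still allowed
```

After HARDEN returns, a store into the returned page is expected to
fault rather than silently change the bytes.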

It's important to note that this scheme cannot protect the dictionary
structure for the word itself from being overwritten. Protecting the
dictionary headers for colon definitions would require a significant
change in architecture, but it's not out of the question.

Although I used kForth as the example system since I'm familiar with its
internals, other systems may be able to do the same. I don't know the
internals of Gforth, but one can see that at least one level of
indirection appears to be involved in going from the xt to the executed
code, e.g., in Gforth,

see execute
Code execute
404AB9: mov $50[r13],r15
404ABD: mov rdx,[r14]
404AC0: add r14,$08
404AC4: mov rcx,-$10[rdx]
404AC8: jmp ecx
end-code

Here, the assembly code gives us the hint that r14 is the data stack
pointer, with the TOS (top of stack) at [r14], and there seems to be one
level of indirection from the xt on top of the stack to the code which
is subsequently executed. The code pointed to by xt can be overwritten,
e.g., in Gforth,

: bar 10 0 do i . loop ; ok
bar 0 1 2 3 4 5 6 7 8 9 ok

0 ' bar @ ! ok
bar
*the terminal*:3:1: error: Stack underflow

I don't know enough about Gforth internals to say whether relocating
the code for BAR to a region that can be protected as read-only is
possible. Perhaps one of the Gforth developers can say definitively.

--
Krishna Myneni
