devel / comp.lang.c / Re: A Famous Security Bug

Re: A Famous Security Bug

<87le66xq72.fsf@nosuchdomain.example.com>

https://www.novabbs.com/devel/article-flat.php?id=34764&group=comp.lang.c#34764

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Keith.S....@gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 08:54:09 -0700
Organization: None to speak of
Lines: 35
Message-ID: <87le66xq72.fsf@nosuchdomain.example.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
<87sf0fxsm0.fsf@nosuchdomain.example.com>
<20240325014203.000048f7@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Date: Mon, 25 Mar 2024 16:54:09 +0100 (CET)
Injection-Info: dont-email.me; posting-host="fc8b2ed6dfef3a0d834742b2f7b293ba";
logging-data="1247388"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18d7qncv20UiwEiSyEiH+Bx"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:npND5gRr7Eox8xGQQU9ky2Odu54=
sha1:gxCtdvkMoyH+GmlL7G4R3cJjZC4=
 by: Keith Thompson - Mon, 25 Mar 2024 15:54 UTC

Michael S <already5chosen@yahoo.com> writes:
> On Sun, 24 Mar 2024 13:49:43 -0700
> Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
>> bart <bc@freeuk.com> writes:
>> [...]
>> > But what people want are the conveniences and familiarity of a HLL,
>> > without the bloody-mindedness of an optimising C compiler.
>> [...]
>>
>> Exactly which people want that?
>>
>> The evidence suggests that, while some people undoubtedly want that
>> (and it's a perfectly legitimate desire), there isn't enough demand
>> to induce anyone to actually produce such a thing and for it to catch
>> on.
>
> Such things are produced all the time. And yes, they fail to catch on.
> The most recent [half-hearted] attempt that didn't realize yet that it
> has no chance is called zig.

Does Zig have those characteristics because its language definition says
so, or because there's a single implementation that happens to work that
way? I took a quick look at the documentation and didn't see anything
definitive.

>> Developers have had decades to define and implement the kind of
>> language you're talking about. Why haven't they?
>>
>
> Because C is juggernaut?

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Medtronic
void Void(void) { Void(); } /* The recursive call of the void */

Re: A Famous Security Bug

<uts7e0$1686i$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34765&group=comp.lang.c#34765

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc...@freeuk.com (bart)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 16:06:24 +0000
Organization: A noiseless patient Spider
Lines: 90
Message-ID: <uts7e0$1686i$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 25 Mar 2024 17:06:24 +0100 (CET)
Injection-Info: dont-email.me; posting-host="f53bde5462ef908e46a536c53c557cbe";
logging-data="1253586"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/1dAGYjcxakQWGIfK+iBbE"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:PGOB7pedxsFapSTnpoRqnuWWtHU=
In-Reply-To: <utrqgp$12v02$1@dont-email.me>
Content-Language: en-GB
 by: bart - Mon, 25 Mar 2024 16:06 UTC

On 25/03/2024 12:26, David Brown wrote:
> On 25/03/2024 12:16, Michael S wrote:
>> On Sun, 24 Mar 2024 23:43:32 +0100
>> David Brown <david.brown@hesbynett.no> wrote:
>>>
>>> I could be  wrong here, of course.
>>>
>>
>> It seems, you are.
>>
>
> It happens - and it was not unexpected here, as I said.  I don't have
> all these compilers installed to test.
>
> But it would be helpful if you had a /little/ more information.  If you
> don't know why some compilers generate binaries that have memory mapped
> at 0x400000, and others do not, fair enough.  I am curious, but it's not
> at all important.
>

In the PE EXE format, the default image load base is specified in a
special header in the file:

  Magic:            20B
  Link version:     1.0
  Code size:        512 200
  Idata size:       1024 400
  Zdata size:       512
  Entry point:      4096 1000 in data:0
  Code base:        4096
  Image base:       4194304 400000
  Section align:    4096

By convention it is at 0x40'0000 (I've no idea why).

More recently, dynamic loading, regardless of what it says in the PE
header, has become popular with linkers. So, while there is still a
fixed value in the Image Base field, which might be 0x140000000, it gets
loaded at some random address, usually in high memory above 2GB.

I don't know what's responsible for that, but presumably the OS must be
in on the act.

To make this possible, both for loading above 2GB, and for loading at an
address not known by the linker, the code inside the EXE must be
position-independent, and have relocation info for any absolute 64-bit
static addresses. 32-bit static addresses won't work.

If I take this C program:

    #include <stdio.h>
    int main(void) {
        printf("%p\n", (void *)main);  /* %p expects a void * argument */
    }

This shows 0000000000401000 when compiled with mcc or tcc, or
0000000000401020 with lccwin32 (the exact address of 'main' relative to
the image base will vary). With DMC (32 bits) it's 0040210. All load at
0x400000.

With gcc, it shows: 00007ff6e63a1591.

Dynamic loading can be disabled by passing --disable-dynamicbase to ld,
then it might show something like 0000000140001000, which corresponds to
the default Image Base field in the EXE header.

Not dynamic, but still high.

(My compilers, both for C and M, did not generate code suitable for
high-loading until a few months ago. That didn't matter since the EXEs
loaded at the fixed 0x400000 address. But it can matter for DLL files
and will do for OBJ files, since the latter would need to use an
external linker.

So if I do this with a mix of mcc and gcc:

  C:\c>mcc test -c
  Compiling test.c to test.obj

  C:\c>gcc test.obj

  C:\c>a
  00007FF613311540

I get the same high-loaded address. I don't think that Tiny C has that
support yet for high-loading code.)

To summarise: the high-loading is not directly to do with compilers, but
the program that generates the EXE. But the compiler does need to
generate code that could be loaded high if needed.
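
For anyone who wants to look at that Image Base field directly, here is a rough sketch in C. It assumes the whole EXE has already been read into a buffer; pe_image_base and the rd16/rd32/rd64 helpers are made-up names for illustration, and the field offsets are the ones given in the published PE/COFF layout (0x10B marks a PE32 optional header, 0x20B a PE32+ one). It is not how any of the compilers mentioned above do it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Little-endian field readers; PE headers are stored little-endian. */
static uint16_t rd16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

static uint64_t rd64(const unsigned char *p)
{
    return (uint64_t)rd32(p) | ((uint64_t)rd32(p + 4) << 32);
}

/* Hypothetical helper: stores the preferred image base in *base and
   returns 0, or returns -1 if buf does not look like a PE image. */
int pe_image_base(const unsigned char *buf, size_t len, uint64_t *base)
{
    if (len < 0x40 || buf[0] != 'M' || buf[1] != 'Z')
        return -1;

    uint32_t pe = rd32(buf + 0x3C);   /* e_lfanew: offset of "PE\0\0" */

    /* Need the signature (4), the COFF header (20) and the first
       32 bytes of the optional header. */
    if (pe > len || len - pe < 4 + 20 + 32)
        return -1;
    if (memcmp(buf + pe, "PE\0\0", 4) != 0)
        return -1;

    const unsigned char *opt = buf + pe + 24;   /* optional header */
    uint16_t magic = rd16(opt);

    if (magic == 0x20B) {             /* PE32+: 8-byte ImageBase at +24 */
        *base = rd64(opt + 24);
        return 0;
    }
    if (magic == 0x10B) {             /* PE32: 4-byte ImageBase at +28 */
        *base = rd32(opt + 28);
        return 0;
    }
    return -1;
}
```

Calling it on the contents of an EXE like the one dumped above (Magic 20B) should give back the value shown there as "Image base".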

Re: A Famous Security Bug

<uts9br$16nq5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34766&group=comp.lang.c#34766

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc...@freeuk.com (bart)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 16:39:23 +0000
Organization: A noiseless patient Spider
Lines: 49
Message-ID: <uts9br$16nq5$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me> <20240325161117.00002318@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 25 Mar 2024 17:39:23 +0100 (CET)
Injection-Info: dont-email.me; posting-host="f53bde5462ef908e46a536c53c557cbe";
logging-data="1269573"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19f48nYUvGtBykuMpeMqLy8"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:JHYu16MwlSx4+1OwYEe6PIX2KGM=
In-Reply-To: <20240325161117.00002318@yahoo.com>
Content-Language: en-GB
 by: bart - Mon, 25 Mar 2024 16:39 UTC

On 25/03/2024 13:11, Michael S wrote:
> On Mon, 25 Mar 2024 13:26:01 +0100
> David Brown <david.brown@hesbynett.no> wrote:
>
>> On 25/03/2024 12:16, Michael S wrote:
>>> On Sun, 24 Mar 2024 23:43:32 +0100
>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>
>>>> I could be wrong here, of course.
>>>>
>>>
>>> It seems, you are.
>>>
>>
>> It happens - and it was not unexpected here, as I said. I don't have
>> all these compilers installed to test.
>>
>> But it would be helpful if you had a /little/ more information. If
>> you don't know why some compilers generate binaries that have memory
>> mapped at 0x400000, and others do not, fair enough. I am curious,
>> but it's not at all important.
>>
>
> I am not an expert, but it does not look like the problem is directly
> related to compiler or linker. All 32-bit Windows compilers/linkers,
> including gcc, clang and MSVC, by default put symbol ___ImageBase at
> address 4 MB. However loader relocates it to wherever it wants,
> typically much higher.
> I don't know for sure why loader does it to images generated by gcc,
> clang and MSVC and does not do it to images generated by lccwin and
> others, but I have an educated guess: most likely, these other compilers
> link by default with an option similar to Microsoft's /Fixed
> https://learn.microsoft.com/en-us/cpp/build/reference/fixed-fixed-base-address?view=msvc-170

It's all up to the options written to the EXE file headers.

By setting the same options (plus generating base-reloc tables, plus
ensuring the code can run above 2GB), I can get the EXEs written by my
two compilers (for C and for my language) to be loaded at a high address
too.

My compilers don't use a linker.

Some of those options are normally used only for DLLs; they would need
to be set for EXEs too.

This was just an experiment; I will try adding it as a formal option to
each compiler.
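
The post does not say which header options are meant. As a sketch only, and assuming they are the usual ASLR-related bits from the PE/COFF specification, a check could look like the following. The enum names and pe_can_rebase are invented for the example; the numeric values are those of IMAGE_FILE_RELOCS_STRIPPED, IMAGE_DLLCHARACTERISTICS_HIGH_ENTROPY_VA and IMAGE_DLLCHARACTERISTICS_DYNAMIC_BASE.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values from the PE/COFF specification, under local names. */
enum {
    RELOCS_STRIPPED = 0x0001,  /* COFF header Characteristics */
    HIGH_ENTROPY_VA = 0x0020,  /* optional header DllCharacteristics */
    DYNAMIC_BASE    = 0x0040   /* optional header DllCharacteristics */
};

/* Hypothetical helper: true if the header flags allow the loader to
   place the image somewhere other than its ImageBase. Both conditions
   are needed: the image opts in, and its base relocations are present. */
bool pe_can_rebase(uint16_t characteristics, uint16_t dll_characteristics)
{
    bool has_relocs = (characteristics & RELOCS_STRIPPED) == 0;
    bool opted_in   = (dll_characteristics & DYNAMIC_BASE) != 0;
    return has_relocs && opted_in;
}

/* Hypothetical helper: true if a 64-bit image additionally allows
   placement anywhere in the 64-bit address space. */
bool pe_allows_high_entropy(uint16_t dll_characteristics)
{
    return (dll_characteristics & HIGH_ENTROPY_VA) != 0;
}
```

The two 16-bit values would come from the COFF header and the optional header of the EXE; an OS policy can still override what the file asks for, so this only reports what the headers say.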

Re: A Famous Security Bug

<20240325195118.0000333a@yahoo.com>

https://www.novabbs.com/devel/article-flat.php?id=34767&group=comp.lang.c#34767

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5...@yahoo.com (Michael S)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 18:51:18 +0200
Organization: A noiseless patient Spider
Lines: 63
Message-ID: <20240325195118.0000333a@yahoo.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me>
<utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me>
<utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me>
<utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me>
<utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me>
<20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me>
<uts7e0$1686i$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Injection-Date: Mon, 25 Mar 2024 17:51:27 +0100 (CET)
Injection-Info: dont-email.me; posting-host="ff58747de83365f3f96c18f69047f6c9";
logging-data="1264193"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/Xh3GLlj9IIi03DlqoRXgPIqQCMzshGpc="
Cancel-Lock: sha1:sANgYBBK/ILq+EmVmMywHyjR16Q=
X-Newsreader: Claws Mail 3.19.1 (GTK+ 2.24.33; x86_64-w64-mingw32)
 by: Michael S - Mon, 25 Mar 2024 16:51 UTC

On Mon, 25 Mar 2024 16:06:24 +0000
bart <bc@freeuk.com> wrote:

> On 25/03/2024 12:26, David Brown wrote:
> > On 25/03/2024 12:16, Michael S wrote:
> >> On Sun, 24 Mar 2024 23:43:32 +0100
> >> David Brown <david.brown@hesbynett.no> wrote:
> >>>
> >>> I could be  wrong here, of course.
> >>>
> >>
> >> It seems, you are.
> >>
> >
> > It happens - and it was not unexpected here, as I said.  I don't
> > have all these compilers installed to test.
> >
> > But it would be helpful if you had a /little/ more information.  If
> > you don't know why some compilers generate binaries that have
> > memory mapped at 0x400000, and others do not, fair enough.  I am
> > curious, but it's not at all important.
> >
>
> In the PE EXE format, the default image load base is specified in a
> special header in the file:
>
> Magic: 20B
> Link version: 1.0
> Code size: 512 200
> Idata size: 1024 400
> Zdata size: 512
> Entry point: 4096 1000 in data:0
> Code base: 4096
> Image base: 4194304 400000
> Section align: 4096
>
> By convention it is at 0x40'0000 (I've no idea why).
>
> More recently, dynamic loading, regardless of what it says in the PE
> header, has become popular with linkers. So, while there is still a
> fixed value in the Image Base field, which might be 0x140000000, it
> gets loaded at some random address, usually in high memory above 2GB.
>
> I don't know what's responsible for that, but presumably the OS must
> be in on the act.
>
> To make this possible, both for loading above 2GB, and for loading at
> an address not known by the linker, the code inside the EXE must be
> position-independent, and have relocation info for any absolute
> 64-bit static addresses. 32-bit static addresses won't work.
>

I don't understand why you say that the EXE must be position-independent.
I never learned the PE format in depth (and learned only the absolute
minimum of ELF, just enough to be able to load images in a simple embedded
scenario), but my impression always was that a PE EXE contains plenty of
relocation info for the loader, so the loader can modify (I think
professional argot uses the word 'fix') non-PIC code at load time to run
at any chosen position.
Am I wrong about it?
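
To make the load-time fixing concrete, here is a minimal sketch in C of applying one block of PE base relocations to an image already copied into memory. It assumes a little-endian host (as on x86/x64) and handles only the HIGHLOW (32-bit) and DIR64 (64-bit) entry types; apply_reloc_block and its parameters are hypothetical names, and a real loader does considerably more checking.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Entry types from the PE/COFF base relocation table. */
enum { REL_ABSOLUTE = 0, REL_HIGHLOW = 3, REL_DIR64 = 10 };

/* Hypothetical helper. image/image_len: the mapped image.
   block/avail: one relocation block and the bytes available from it.
   delta: actual load address minus the ImageBase in the header.
   Returns the block size consumed, or 0 if the block is malformed
   or uses an entry type this sketch does not handle. */
size_t apply_reloc_block(unsigned char *image, size_t image_len,
                         const unsigned char *block, size_t avail,
                         uint64_t delta)
{
    uint32_t page_rva, block_size;

    if (avail < 8)
        return 0;
    memcpy(&page_rva, block, 4);        /* RVA of the 4 KB page */
    memcpy(&block_size, block + 4, 4);  /* header plus entries  */
    if (block_size < 8 || block_size > avail)
        return 0;

    for (size_t i = 8; i + 2 <= block_size; i += 2) {
        uint16_t entry;
        memcpy(&entry, block + i, 2);

        unsigned type = entry >> 12;                     /* top 4 bits  */
        size_t at = (size_t)page_rva + (entry & 0x0FFFu); /* low 12 bits */

        if (type == REL_ABSOLUTE)
            continue;                   /* padding entry, nothing to do */

        if (type == REL_DIR64 && at <= image_len && image_len - at >= 8) {
            uint64_t v;
            memcpy(&v, image + at, 8);
            v += delta;                 /* patch a 64-bit absolute address */
            memcpy(image + at, &v, 8);
        } else if (type == REL_HIGHLOW && at <= image_len && image_len - at >= 4) {
            uint32_t v;
            memcpy(&v, image + at, 4);
            v += (uint32_t)delta;       /* patch a 32-bit absolute address */
            memcpy(image + at, &v, 4);
        } else {
            return 0;
        }
    }
    return block_size;
}
```

A loader would walk the whole relocation directory by calling this repeatedly, advancing by the returned size each time. Code with no relocation entries for its absolute addresses cannot be moved this way.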

Re: A Famous Security Bug

<xoiMN.162476$46Te.38731@fx38.iad>

https://www.novabbs.com/devel/article-flat.php?id=34769&group=comp.lang.c#34769

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!feeder1-2.proxad.net!proxad.net!feeder1-1.proxad.net!193.141.40.65.MISMATCH!npeer.as286.net!npeer-ng0.as286.net!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer01.ams4!peer.am4.highwinds-media.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx38.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: sco...@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: A Famous Security Bug
Newsgroups: comp.lang.c
References: <bug-20240320191736@ram.dialup.fu-berlin.de> <20240321211306.779b21d126e122556c34a346@gmail.moc> <utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me> <utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me> <utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me> <utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me> <87sf0fxsm0.fsf@nosuchdomain.example.com> <utqbo0$kvt3$1@dont-email.me> <20240325023947.00006752@yahoo.com> <utqmip$na54$1@dont-email.me>
Lines: 51
Message-ID: <xoiMN.162476$46Te.38731@fx38.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Mon, 25 Mar 2024 17:21:33 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 25 Mar 2024 17:21:33 GMT
X-Received-Bytes: 3132
 by: Scott Lurndal - Mon, 25 Mar 2024 17:21 UTC

bart <bc@freeuk.com> writes:
>On 24/03/2024 23:39, Michael S wrote:
>> On Sun, 24 Mar 2024 23:07:44 +0000
>> bart <bc@freeuk.com> wrote:
>>
>>> On 24/03/2024 20:49, Keith Thompson wrote:
>>>> bart <bc@freeuk.com> writes:
>>>> [...]
>>>>> But what people want are the conveniences and familiarity of a HLL,
>>>>> without the bloody-mindedness of an optimising C compiler.
>>>> [...]
>>>>
>>>> Exactly which people want that?
>>>>
>>>> The evidence suggests that, while some people undoubtedly want that
>>>> (and it's a perfectly legitimate desire), there isn't enough demand
>>>> to induce anyone to actually produce such a thing and for it to
>>>> catch on. Developers have had decades to define and implement the
>>>> kind of language you're talking about. Why haven't they?
>>>>
>>> Perhaps many settle for using C but using a lesser C compiler or one
>>> with optimisation turned off.
>>>
>>
>> What is "lesser C compiler"?
>> Something like IAR ? Yes, people use it.
>> Something like TI? People use it when they have no other choice.
>> 20 years ago there were Diab Data, Keil and a few others. I haven't
>> heard about them lately.
>> Microchip, I'd guess, still has its own compilers for many of their
>> families, but that's because they have to. "Bigger" compilers don't want
>> to support these chips.
>> On the opposite edge of the scale, IBM has compilers for their mainframes
>> and for POWER/AIX. The former are used widely. The latter are quickly
>> losing to "bigger" compilers running on the same platform.
>
>> As to tcc, mcc, lccwin etc... those are only used by hobbyists.
>
>AFAIK lccwin can be used commercially.

Which sidesteps the assertion that it is only used by
hobbyists.

>And I would recommend tcc especially for transpiled code. Because it can
>process it very quickly, but also because the code should already be
>verified so it doesn't need deep analysis.

Yes, you probably would. I wouldn't.

<snip bart complaints>

Re: A Famous Security Bug

<jriMN.162477$46Te.50791@fx38.iad>

https://www.novabbs.com/devel/article-flat.php?id=34770&group=comp.lang.c#34770

Path: i2pn2.org!i2pn.org!usenet.blueworldhosting.com!diablo1.usenet.blueworldhosting.com!peer01.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx38.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: sco...@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: A Famous Security Bug
Newsgroups: comp.lang.c
References: <bug-20240320191736@ram.dialup.fu-berlin.de> <20240321211306.779b21d126e122556c34a346@gmail.moc> <utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me> <utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me> <utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me> <utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me> <87sf0fxsm0.fsf@nosuchdomain.example.com> <utqbo0$kvt3$1@dont-email.me> <20240325023947.00006752@yahoo.com> <utre2c$102ht$1@dont-email.me>
Lines: 10
Message-ID: <jriMN.162477$46Te.50791@fx38.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Mon, 25 Mar 2024 17:24:31 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 25 Mar 2024 17:24:31 GMT
X-Received-Bytes: 1434
 by: Scott Lurndal - Mon, 25 Mar 2024 17:24 UTC

David Brown <david.brown@hesbynett.no> writes:
>On 25/03/2024 00:39, Michael S wrote:

>I tried out Diab Data for the 68k some 25 years ago. It was /way/
>better than anything else around, but outside our budget at the time.

We used them for our 88k-based systems in the early '90s; they were,
as you say, way better than anything else at the time (Moto
was shipping a version of PCC, and gcc was rather primitive).

Re: A Famous Security Bug

<utseh7$181cd$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34771&group=comp.lang.c#34771

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 19:07:35 +0100
Organization: A noiseless patient Spider
Lines: 95
Message-ID: <utseh7$181cd$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me> <uts7e0$1686i$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 25 Mar 2024 19:07:36 +0100 (CET)
Injection-Info: dont-email.me; posting-host="8bb1363eb201d723c16a92a0d69da8a9";
logging-data="1312141"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX199sdb8q2U5r6AdFZeMRA3FubFkpvs2Ib8="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:jNXWfRIY5DO/1hoLu//8GKyTcyQ=
In-Reply-To: <uts7e0$1686i$1@dont-email.me>
Content-Language: en-GB
 by: David Brown - Mon, 25 Mar 2024 18:07 UTC

On 25/03/2024 17:06, bart wrote:
> On 25/03/2024 12:26, David Brown wrote:
>> On 25/03/2024 12:16, Michael S wrote:
>>> On Sun, 24 Mar 2024 23:43:32 +0100
>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>
>>>> I could be  wrong here, of course.
>>>>
>>>
>>> It seems, you are.
>>>
>>
>> It happens - and it was not unexpected here, as I said.  I don't have
>> all these compilers installed to test.
>>
>> But it would be helpful if you had a /little/ more information.  If
>> you don't know why some compilers generate binaries that have memory
>> mapped at 0x400000, and others do not, fair enough.  I am curious, but
>> it's not at all important.
>>
>
> In the PE EXE format, the default image load base is specified in a
> special header in the file:
>
>   Magic:            20B
>   Link version:     1.0
>   Code size:        512 200
>   Idata size:       1024 400
>   Zdata size:       512
>   Entry point:      4096 1000 in data:0
>   Code base:        4096
>   Image base:       4194304 400000
>   Section align:    4096
>
> By convention it is at 0x40'0000 (I've no idea why).
>
> More recently, dynamic loading, regardless of what it says in the PE
> header, has become popular with linkers. So, while there is still a
> fixed value in the Image Base field, which might be 0x140000000, it gets
> loaded at some random address, usually in high memory above 2GB.
>
> I don't know what's responsible for that, but presumably the OS must be
> in on the act.
>
> To make this possible, both for loading above 2GB, and for loading at an
> address not known by the linker, the code inside the EXE must be
> position-independent, and have relocation info for any absolute 64-bit
> static addresses. 32-bit static addresses won't work.
>
> If I take this C program:
>
>     #include <stdio.h>
>     int main(void) {
>         printf("%p\n", main);
>     }
>
> This shows 0000000000401000 when compiled with mcc or tcc, or
> 0000000000401020 with lccwin32 (the exact address of 'main' relative to
> the image base will vary). With DMC (32 bits) it's 0040210. All load at
> 0x400000.
>
> With gcc, it shows: 00007ff6e63a1591.
>
> Dynamic loading can be disabled by passing --disable-dynamicbase to ld,
> then it might show something like 0000000140001000, which corresponds to
> the default Image Base field in the EXE header.
>
> Not dynamic, but still high.
>
> (My compilers, both for C and M, did not generate code suitable for
> high-loading until a few months ago. That didn't matter since the EXEs
> loaded at the fixed 0x400000 address. But it can matter for DLL files
> and will do for OBJ files, since the latter would need to use an
> external linker.
>
> So if I do this with a mix of mcc and gcc:
>
>   C:\c>mcc test -c
>   Compiling test.c to test.obj
>
>   C:\c>gcc test.obj
>
>   C:\c>a
>   00007FF613311540
>
> I get the same high-loaded address. I don't think that Tiny C has that
> support yet for high-loading code.)
>
> To summarise: the high-loading is not directly to do with compilers, but
> the program that generates the EXE. But the compiler does need to
> generate code that could be loaded high if needed.

Thanks for that explanation - it fills in some blanks in my understanding.

Re: A Famous Security Bug

<utsemf$18477$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34772&group=comp.lang.c#34772

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc...@freeuk.com (bart)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 18:10:23 +0000
Organization: A noiseless patient Spider
Lines: 110
Message-ID: <utsemf$18477$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me> <uts7e0$1686i$1@dont-email.me>
<20240325195118.0000333a@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 25 Mar 2024 19:10:23 +0100 (CET)
Injection-Info: dont-email.me; posting-host="f53bde5462ef908e46a536c53c557cbe";
logging-data="1315047"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18s8KakKqSq/MZ65mkjK8XE"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:c9Tfr1GuzvZePaTvvJZ36VErocE=
In-Reply-To: <20240325195118.0000333a@yahoo.com>
Content-Language: en-GB
 by: bart - Mon, 25 Mar 2024 18:10 UTC

On 25/03/2024 16:51, Michael S wrote:
> On Mon, 25 Mar 2024 16:06:24 +0000
> bart <bc@freeuk.com> wrote:
>
>> On 25/03/2024 12:26, David Brown wrote:
>>> On 25/03/2024 12:16, Michael S wrote:
>>>> On Sun, 24 Mar 2024 23:43:32 +0100
>>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>>
>>>>> I could be  wrong here, of course.
>>>>>
>>>>
>>>> It seems, you are.
>>>>
>>>
>>> It happens - and it was not unexpected here, as I said.  I don't
>>> have all these compilers installed to test.
>>>
>>> But it would be helpful if you had a /little/ more information.  If
>>> you don't know why some compilers generate binaries that have
>>> memory mapped at 0x400000, and others do not, fair enough.  I am
>>> curious, but it's not at all important.
>>>
>>
>> In the PE EXE format, the default image load base is specified in a
>> special header in the file:
>>
>> Magic: 20B
>> Link version: 1.0
>> Code size: 512 200
>> Idata size: 1024 400
>> Zdata size: 512
>> Entry point: 4096 1000 in data:0
>> Code base: 4096
>> Image base: 4194304 400000
>> Section align: 4096
>>
>> By convention it is at 0x40'0000 (I've no idea why).
>>
>> More recently, dynamic loading, regardless of what it says in the PE
>> header, has become popular with linkers. So, while there is still a
>> fixed value in the Image Base file, which might be 0x140000000, it
>> gets loaded at some random address, usually in high memory above 2GB.
>>
>> I don't know what's responsible for that, but presumably the OS must
>> be in on the act.
>>
>> To make this possible, both for loading above 2GB, and for loading at
>> an address not known by the linker, the code inside the EXE must be
>> position-independent, and have relocation info for any absolute
>> 64-bit static addresses. 32-bit static addresses won't work.
>>
>
> I don't understand why you say that EXE must be position-independent.
> I never learned PE format in depth (and learned only absolute minimum of
> elf, just enough to be able to load images in simple embedded
> scenario), but my impression always was that PE EXE contains plenty of
> relocation info for a loader, so it (loader) can modify (I think
> professional argot uses the word 'fix') non-PIC at load time to run at
> any chosen position.
> Am I wrong about it?

A PE EXE designed to run only at the image base given won't be
position-independent, so it can't be moved anywhere else.

There isn't enough info to make it possible, especially before
position-independent addressing modes for x64 came along (that is, using
an offset from the RIP instruction pointer instead of 32-bit absolute addresses).

Take this C program:

    int abc;
    int* ptr = &abc;

    int main(void) {
        int x;
        x = abc;
    }

Some of the assembly generated is this:

    abc:   resb 4

    ptr:   dq abc
    ...
           mov eax, [abc]

That last reference is an absolute 32-bit address; for example, abc might
have address 0x00403000 when the program is loaded at 0x400000.

If the program is instead loaded at 0x78230000, there is no reloc info
to tell the loader that that particular 32-bit value, plus the 64-bit field
initialising ptr, must be adjusted (to 0x78233000 in this example).

RIP-relative addressing (I think sometimes called PIC) can fix that
second reference, the one in the mov instruction:

           mov eax, [rip:abc]

But it only works for code, not data; the initialisation of ptr is still absolute.

When a DLL is generated instead, it will often need to be moved (to avoid
multiple DLLs all being based at the same address). In that case,
base-relocation tables are needed: a list of addresses that contain a
field that needs relocating, and what type and size of reloc is needed.

The same info is needed for an EXE if it contains flags saying that the EXE
could be loaded at an arbitrary address.
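
As an illustration only (this is not from the post above), here is a minimal C sketch of what a loader might do with such a base-relocation table. The names struct reloc and apply_relocs, and the flat table layout, are hypothetical; real PE files pack their base relocations per page in a different format. The idea is simply to add (actual base - preferred base) to every listed field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified relocation record: where the field lives and
   how wide it is. This is not the on-disk PE layout. */
enum reloc_size { RELOC_ABS32 = 4, RELOC_ABS64 = 8 };

struct reloc {
    size_t offset;         /* offset of the field from the start of the image */
    enum reloc_size size;  /* width of the absolute address stored there */
};

/* Add (actual_base - preferred_base) to every field listed in the table.
   Returns 0 on success, or -1 if an entry lies outside the image or a
   32-bit field cannot hold the adjusted address. */
int apply_relocs(uint8_t *image, size_t image_len,
                 const struct reloc *relocs, size_t count,
                 uint64_t preferred_base, uint64_t actual_base)
{
    assert(image != NULL && relocs != NULL);

    /* Unsigned subtraction wraps modulo 2^64, so this also handles
       moving the image to a lower address. */
    uint64_t delta = actual_base - preferred_base;

    for (size_t i = 0; i < count; i++) {
        size_t off = relocs[i].offset;
        size_t width = (size_t)relocs[i].size;

        if (off > image_len || image_len - off < width)
            return -1;

        if (relocs[i].size == RELOC_ABS32) {
            uint32_t v;
            memcpy(&v, image + off, sizeof v);
            uint64_t fixed = (uint64_t)v + delta;
            if (fixed > UINT32_MAX)
                return -1;  /* a 32-bit static address cannot reach this far */
            v = (uint32_t)fixed;
            memcpy(image + off, &v, sizeof v);
        } else {
            uint64_t v;
            memcpy(&v, image + off, sizeof v);
            v += delta;
            memcpy(image + off, &v, sizeof v);
        }
    }
    return 0;
}
```

With the numbers from the example, a field holding 0x00403000 in an image linked for 0x400000 and loaded at 0x78230000 becomes 0x78233000. Asking the same sketch to move a 32-bit field to an image base of 0x140000000 fails, which is the "32-bit static addresses won't work" case.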

Re: A Famous Security Bug

<utsl74$19la6$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34776&group=comp.lang.c#34776

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 21:01:40 +0100
Organization: A noiseless patient Spider
Lines: 137
Message-ID: <utsl74$19la6$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me> <uts7e0$1686i$1@dont-email.me>
<20240325195118.0000333a@yahoo.com> <utsemf$18477$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 25 Mar 2024 21:01:40 +0100 (CET)
Injection-Info: dont-email.me; posting-host="8bb1363eb201d723c16a92a0d69da8a9";
logging-data="1365318"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/kT71pJsnR2Fb32cmEF/bpOtJxOUEksRQ="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:BjR8iOOosoGTP6u4xnRSXI5S1dM=
Content-Language: en-GB
In-Reply-To: <utsemf$18477$1@dont-email.me>
 by: David Brown - Mon, 25 Mar 2024 20:01 UTC

On 25/03/2024 19:10, bart wrote:
> On 25/03/2024 16:51, Michael S wrote:
>> On Mon, 25 Mar 2024 16:06:24 +0000
>> bart <bc@freeuk.com> wrote:
>>
>>> On 25/03/2024 12:26, David Brown wrote:
>>>> On 25/03/2024 12:16, Michael S wrote:
>>>>> On Sun, 24 Mar 2024 23:43:32 +0100
>>>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>>>
>>>>>> I could be  wrong here, of course.
>>>>>
>>>>> It seems, you are.
>>>>
>>>> It happens - and it was not unexpected here, as I said.  I don't
>>>> have all these compilers installed to test.
>>>>
>>>> But it would be helpful if you had a /little/ more information.  If
>>>> you don't know why some compilers generate binaries that have
>>>> memory mapped at 0x400000, and others do not, fair enough.  I am
>>>> curious, but it's not at all important.
>>>
>>> In the PE EXE format, the default image load base is specified in a
>>> special header in the file:
>>>
>>>     Magic:            20B
>>>     Link version:     1.0
>>>     Code size:        512 200
>>>     Idata size:       1024 400
>>>     Zdata size:       512
>>>     Entry point:      4096 1000 in data:0
>>>     Code base:        4096
>>>     Image base:       4194304 400000
>>>     Section align:    4096
>>>
>>> By convention it is at 0x40'0000 (I've no idea why).
>>>
>>> More recently, dynamic loading, regardless of what it says in the PE
>>> header, has become popular with linkers. So, while there is still a
>>> fixed value in the Image Base file, which might be 0x140000000, it
>>> gets loaded at some random address, usually in high memory above 2GB.
>>>
>>> I don't know what's responsible for that, but presumably the OS must
>>> be in on the act.
>>>
>>> To make this possible, both for loading above 2GB, and for loading at
>>> an address not known by the linker, the code inside the EXE must be
>>> position-independent, and have relocation info for any absolute
>>> 64-bit static addresses. 32-bit static addresses won't work.
>>>
>>
>> I don't understand why you say that EXE must be position-independent.
>> I never learned PE format in depth (and learned only absolute minimum of
>> elf, just enough to be able to load images in simple embedded
>> scenario), but my impression always was that PE EXE contains plenty of
>> relocation info for a loader, so it (loader) can modify (I think
>> professional argot uses the word 'fix') non-PIC at load time to run at
>> any chosen position.
>> Am I wrong about it?
>
>
> A PE EXE designed to run only at the image base given won't be
> position-independent, so it can't be moved anywhere else.
>
> There isn't enough info to make it possible, especially before
> position-independent addressing modes for x64 came along (that is, using
> offset to the RIP instruction pointer instead of 32-bit absolute addresses).
>
> Take this C program:
>
>    int abc;
>    int* ptr = &abc;
>
>    int main(void) {
>        int x;
>        x = abc;
>    }
>
> Some of the assembly generated is this:
>
>    abc:   resb 4
>
>    ptr:   dq abc
>    ...
>           mov eax, [abc]
>
> That last reference is an absolute 32-bit address, for example it might
> have address 0x00403000 when loaded at 0x400000.
>
> If the program is instead loaded at 0x78230000, there is no reloc info
> to tell it that that particular 32-bit value, plus the 64-bit field
> initialising ptr, must be adjusted.
>
> RIP-relative addressing (I think sometimes called PIC), can fix that
> second reference:
>
>           mov eax, [rip:abc]
>
> But it only works for code, not data; that initialisation is still
> absolute.
>
> When a DLL is generated instead, those will need to be moved (to avoid
> multiple DLLs all based at the same address). In that case,
> base-relocation tables are needed: a list of addresses that contain a
> field that needs relocating, and what type and size of reloc is needed.
>
> The same info is needed for EXE if it contains flags saying that the EXE
> could be loaded at an arbitrary address.
>

I have a few comments about this. One is that PIC is "Position
Independent Code", while PID is "Position Independent Data". Enabling
PIC on a compiler may imply PID as well, or they may be independent.
This can all cause significant run-time costs as access to non-local
data and functions has at least one extra layer of indirection - though
doing it via a register like RIP reduces that overhead quite a bit.

An alternative method is to have the linker generate a file that
contains the executable before the final linking, and a link relocation
table. This is similar to a linkable object file - a reference to the
address of the variable "abc" would be replaced by 0x00000000 in the
machine code, and an entry in the relocation table would say "fill in
the address of abc at position Y from the start of the code section".

Then the program is loaded into memory by a link-loader that fills these
blank fields, just the same way as a static linker does when generating
the image. This complicates the loading mechanism and makes the program
slower to start, but once loaded the code runs faster than the
position-independent version.
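
To make the link-loader idea concrete, here is a small C sketch (illustrative only, not any real loader's code). The types struct symbol and struct fixup and the function link_load are hypothetical names; the sketch assumes every fixup is a 64-bit absolute address stored in the host's byte order. Each table entry says "fill in the address of this symbol at this position from the start of the code section", and the loader patches the blank field accordingly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical tables: where the loader decided each symbol lives, and
   which blank fields in the code section refer to which symbol. */
struct symbol { const char *name; uint64_t address; };
struct fixup  { const char *name; size_t position; };

/* Patch every blank 64-bit field with the address of its symbol.
   Returns the number of fixups applied, or -1 on an unknown symbol or a
   position that does not fit inside the code section. */
int link_load(uint8_t *code, size_t code_len,
              const struct fixup *fixups, size_t nfixups,
              const struct symbol *syms, size_t nsyms)
{
    assert(code != NULL && fixups != NULL && syms != NULL);

    for (size_t i = 0; i < nfixups; i++) {
        const struct symbol *found = NULL;

        /* A linear search is enough for a sketch. */
        for (size_t j = 0; j < nsyms; j++) {
            if (strcmp(syms[j].name, fixups[i].name) == 0) {
                found = &syms[j];
                break;
            }
        }
        if (found == NULL)
            return -1;

        size_t pos = fixups[i].position;
        if (pos > code_len || code_len - pos < sizeof found->address)
            return -1;

        /* Overwrite the 0x00000000... placeholder with the real address. */
        memcpy(code + pos, &found->address, sizeof found->address);
    }
    return (int)nfixups;
}
```

The difference from a base-relocation table is that nothing is added to an existing value; the field is blank until load time, exactly as it would be in a linkable object file.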

There are other ways to do things, possibly combinations of these.

I believe the COFF format, which is the base for Windows executable
formats, supports such relocation tables. That does not mean that they
are supported or used on Windows, of course. You know more about what
is actually used in PE format files than I do.

Re: A Famous Security Bug

<v7lMN.723882$p%Mb.643562@fx15.iad>

https://www.novabbs.com/devel/article-flat.php?id=34778&group=comp.lang.c#34778

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!news.uni-stuttgart.de!npeer.as286.net!npeer-ng0.as286.net!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx15.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: sco...@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: A Famous Security Bug
Newsgroups: comp.lang.c
References: <bug-20240320191736@ram.dialup.fu-berlin.de> <utktul$35ng8$1@dont-email.me> <utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me> <utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me> <utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me> <utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com> <utrqgp$12v02$1@dont-email.me> <uts7e0$1686i$1@dont-email.me> <20240325195118.0000333a@yahoo.com> <utsemf$18477$1@dont-email.me> <utsl74$19la6$1@dont-email.me>
Lines: 14
Message-ID: <v7lMN.723882$p%Mb.643562@fx15.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Mon, 25 Mar 2024 20:28:11 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Mon, 25 Mar 2024 20:28:11 GMT
X-Received-Bytes: 1639
 by: Scott Lurndal - Mon, 25 Mar 2024 20:28 UTC

David Brown <david.brown@hesbynett.no> writes:
>On 25/03/2024 19:10, bart wrote:

>
>I believe the COFF format, which is the base for Windows executable
>formats, supports such relocation tables. That does not mean that they
>are supported or used on Windows, of course. You know more about what
>is actually used in PE format files than I do.

COFF supports relocation entries at link time, but not at run time. Run-time
linking (and, if necessary, relocation) was a feature of the ELF
format that superseded COFF in SVR4.

Re: A Famous Security Bug

<20240326000501.00007d6d@yahoo.com>

https://www.novabbs.com/devel/article-flat.php?id=34779&group=comp.lang.c#34779

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5...@yahoo.com (Michael S)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 23:05:01 +0200
Organization: A noiseless patient Spider
Lines: 128
Message-ID: <20240326000501.00007d6d@yahoo.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me>
<utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me>
<utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me>
<utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me>
<utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me>
<20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me>
<uts7e0$1686i$1@dont-email.me>
<20240325195118.0000333a@yahoo.com>
<utsemf$18477$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Injection-Date: Mon, 25 Mar 2024 22:05:03 +0100 (CET)
Injection-Info: dont-email.me; posting-host="34267a0c37a5dddcaf40eac287db4722";
logging-data="1388293"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/cBD4VCEwRDVSULmMuk7/nyKjPGD8TKcs="
Cancel-Lock: sha1:9FfjAkzHwwx5+xBr3N93CwEJHRk=
X-Newsreader: Claws Mail 4.1.1 (GTK 3.24.34; x86_64-w64-mingw32)
 by: Michael S - Mon, 25 Mar 2024 21:05 UTC

On Mon, 25 Mar 2024 18:10:23 +0000
bart <bc@freeuk.com> wrote:

> On 25/03/2024 16:51, Michael S wrote:
> > On Mon, 25 Mar 2024 16:06:24 +0000
> > bart <bc@freeuk.com> wrote:
> >
> >> On 25/03/2024 12:26, David Brown wrote:
> >>> On 25/03/2024 12:16, Michael S wrote:
> >>>> On Sun, 24 Mar 2024 23:43:32 +0100
> >>>> David Brown <david.brown@hesbynett.no> wrote:
> >>>>>
> >>>>> I could be  wrong here, of course.
> >>>>>
> >>>>
> >>>> It seems, you are.
> >>>>
> >>>
> >>> It happens - and it was not unexpected here, as I said.  I don't
> >>> have all these compilers installed to test.
> >>>
> >>> But it would be helpful if you had a /little/ more information.
> >>> If you don't know why some compilers generate binaries that have
> >>> memory mapped at 0x400000, and others do not, fair enough.  I am
> >>> curious, but it's not at all important.
> >>>
> >>
> >> In the PE EXE format, the default image load base is specified in a
> >> special header in the file:
> >>
> >> Magic: 20B
> >> Link version: 1.0
> >> Code size: 512 200
> >> Idata size: 1024 400
> >> Zdata size: 512
> >> Entry point: 4096 1000 in data:0
> >> Code base: 4096
> >> Image base: 4194304 400000
> >> Section align: 4096
> >>
> >> By convention it is at 0x40'0000 (I've no idea why).
> >>
> >> More recently, dynamic loading, regardless of what it says in the
> >> PE header, has become popular with linkers. So, while there is
> >> still a fixed value in the Image Base file, which might be
> >> 0x140000000, it gets loaded at some random address, usually in
> >> high memory above 2GB.
> >>
> >> I don't know what's responsible for that, but presumably the OS
> >> must be in on the act.
> >>
> >> To make this possible, both for loading above 2GB, and for loading
> >> at an address not known by the linker, the code inside the EXE
> >> must be position-independent, and have relocation info for any
> >> absolute 64-bit static addresses. 32-bit static addresses won't
> >> work.
> >
> > I don't understand why you say that EXE must be
> > position-independent. I never learned PE format in depth (and
> > learned only absolute minimum of elf, just enough to be able to
> > load images in simple embedded scenario), but my impression always
> > was that PE EXE contains plenty of relocation info for a loader, so
> > it (loader) can modify (I think professional argot uses the word
> > 'fix') non-PIC at load time to run at any chosen position.
> > Am I wrong about it?
>
>
> A PE EXE designed to run only at the image base given won't be
> position-independent, so it can't be moved anywhere else.
>
> There isn't enough info to make it possible, especially before
> position-independent addressing modes for x64 came along (that is,
> using offset to the RIP instruction pointer instead of 32-bit absolute
> addresses).
>
> Take this C program:
>
> int abc;
> int* ptr = &abc;
>
> int main(void) {
> int x;
> x = abc;
> }
>
> Some of the assembly generated is this:
>
> abc: resb 4
>
> ptr: dq abc
> ...
> mov eax, [abc]
>
> That last reference is an absolute 32-bit address, for example it
> might have address 0x00403000 when loaded at 0x400000.
>
> If the program is instead loaded at 0x78230000, there is no reloc
> info to tell it that that particular 32-bit value, plus the 64-bit
> field initialising ptr, must be adjusted.
>
> RIP-relative addressing (I think sometimes called PIC), can fix that
> second reference:
>
> mov eax, [rip:abc]
>
> But it only works for code, not data; that initialisation is still
> absolute.
>
> When a DLL is generated instead, those will need to be moved (to
> avoid multiple DLLs all based at the same address). In that case,
> base-relocation tables are needed: a list of addresses that contain a
> field that needs relocating, and what type and size of reloc is
> needed.
>
> The same info is needed for EXE if it contains flags saying that the
> EXE could be loaded at an arbitrary address.
>

Your explanation exactly matches what I was imagining.
The technology for relocating non-PIC code is already here, in the file
format definitions and in the OS loader code. The linker, or the part of
the compiler that serves the role of linker, can decide not to generate
the required tables. Operating in such a mode gives small benefits in EXE
size and load time, but IMHO nowadays it should be used
rarely, only in special situations, rather than serve as the default of the
tool.

Re: A Famous Security Bug

<utsq47$1atlm$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34780&group=comp.lang.c#34780

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc...@freeuk.com (bart)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Mon, 25 Mar 2024 21:25:27 +0000
Organization: A noiseless patient Spider
Lines: 155
Message-ID: <utsq47$1atlm$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me> <uts7e0$1686i$1@dont-email.me>
<20240325195118.0000333a@yahoo.com> <utsemf$18477$1@dont-email.me>
<20240326000501.00007d6d@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 25 Mar 2024 22:25:27 +0100 (CET)
Injection-Info: dont-email.me; posting-host="f53bde5462ef908e46a536c53c557cbe";
logging-data="1406646"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/kyhRtGDHXUPcLfdo5ku/E"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:Uh/y6QL9s9ArjD+9e4q2v66Sflk=
In-Reply-To: <20240326000501.00007d6d@yahoo.com>
Content-Language: en-GB
 by: bart - Mon, 25 Mar 2024 21:25 UTC

On 25/03/2024 21:05, Michael S wrote:
> On Mon, 25 Mar 2024 18:10:23 +0000
> bart <bc@freeuk.com> wrote:
>
>> On 25/03/2024 16:51, Michael S wrote:
>>> On Mon, 25 Mar 2024 16:06:24 +0000
>>> bart <bc@freeuk.com> wrote:
>>>
>>>> On 25/03/2024 12:26, David Brown wrote:
>>>>> On 25/03/2024 12:16, Michael S wrote:
>>>>>> On Sun, 24 Mar 2024 23:43:32 +0100
>>>>>> David Brown <david.brown@hesbynett.no> wrote:
>>>>>>>
>>>>>>> I could be  wrong here, of course.
>>>>>>>
>>>>>>
>>>>>> It seems, you are.
>>>>>>
>>>>>
>>>>> It happens - and it was not unexpected here, as I said.  I don't
>>>>> have all these compilers installed to test.
>>>>>
>>>>> But it would be helpful if you had a /little/ more information.
>>>>> If you don't know why some compilers generate binaries that have
>>>>> memory mapped at 0x400000, and others do not, fair enough.  I am
>>>>> curious, but it's not at all important.
>>>>>
>>>>
>>>> In the PE EXE format, the default image load base is specified in a
>>>> special header in the file:
>>>>
>>>> Magic: 20B
>>>> Link version: 1.0
>>>> Code size: 512 200
>>>> Idata size: 1024 400
>>>> Zdata size: 512
>>>> Entry point: 4096 1000 in data:0
>>>> Code base: 4096
>>>> Image base: 4194304 400000
>>>> Section align: 4096
>>>>
>>>> By convention it is at 0x40'0000 (I've no idea why).
>>>>
>>>> More recently, dynamic loading, regardless of what it says in the
>>>> PE header, has become popular with linkers. So, while there is
>>>> still a fixed value in the Image Base file, which might be
>>>> 0x140000000, it gets loaded at some random address, usually in
>>>> high memory above 2GB.
>>>>
>>>> I don't know what's responsible for that, but presumably the OS
>>>> must be in on the act.
>>>>
>>>> To make this possible, both for loading above 2GB, and for loading
>>>> at an address not known by the linker, the code inside the EXE
>>>> must be position-independent, and have relocation info for any
>>>> absolute 64-bit static addresses. 32-bit static addresses won't
>>>> work.
>>>
>>> I don't understand why you say that EXE must be
>>> position-independent. I never learned PE format in depth (and
>>> learned only absolute minimum of elf, just enough to be able to
>>> load images in simple embedded scenario), but my impression always
>>> was that PE EXE contains plenty of relocation info for a loader, so
>>> it (loader) can modify (I think professional argot uses the word
>>> 'fix') non-PIC at load time to run at any chosen position.
>>> Am I wrong about it?
>>
>>
>> A PE EXE designed to run only at the image base given won't be
>> position-independent, so it can't be moved anywhere else.
>>
>> There isn't enough info to make it possible, especially before
>> position-independent addressing modes for x64 came along (that is,
>> using offset to the RIP instruction pointer instead of 32-bit absolute
>> addresses).
>>
>> Take this C program:
>>
>> int abc;
>> int* ptr = &abc;
>>
>> int main(void) {
>> int x;
>> x = abc;
>> }
>>
>> Some of the assembly generated is this:
>>
>> abc: resb 4
>>
>> ptr: dq abc
>> ...
>> mov eax, [abc]
>>
>> That last reference is an absolute 32-bit address, for example it
>> might have address 0x00403000 when loaded at 0x400000.
>>
>> If the program is instead loaded at 0x78230000, there is no reloc
>> info to tell it that that particular 32-bit value, plus the 64-bit
>> field initialising ptr, must be adjusted.
>>
>> RIP-relative addressing (I think sometimes called PIC), can fix that
>> second reference:
>>
>> mov eax, [rip:abc]
>>
>> But it only works for code, not data; that initialisation is still
>> absolute.
>>
>> When a DLL is generated instead, those will need to be moved (to
>> avoid multiple DLLs all based at the same address). In that case,
>> base-relocation tables are needed: a list of addresses that contain a
>> field that needs relocating, and what type and size of reloc is
>> needed.
>>
>> The same info is needed for EXE if it contains flags saying that the
>> EXE could be loaded at an arbitrary address.
>>
>
> Your explanation exactly matches what I was imagining.
> The technology for relocation of non-PIC code is already here, in file
> format definitions and in OS loader code. The linker or the part of
> compiler that serves the role of linker can decide to not generate
> required tables. Operation in such mode will have small benefits in EXE
> size and in quicker load time, but IMHO nowadays it should be used
> rarely, only in special situations rather than serve as a default of the
> tool.

There are two aspects to be considered:

* Relocating a program to a different address below 2GB

* Relocating a program to any address including above 2GB

The first can be accommodated with tables derived from the reloc info of
object files.

But the second requires compiler cooperation in generating code that
will work above 2GB.

Part of that can be done with RIP-relative address modes as I touched
on. But not all; RIP-relative won't work here:

     movsx rbx, dword [i]
     mov rax, [rbx*8 + abc]

where the address is combined with an index register. This requires something like:

     lea rcx, [rip:abc]        # or mov rcx, abc (64-bit abs addr)
     mov rax, [rbx*8 + rcx]

This is specific to x64, but other processors will have their own issues,
like ARM64, which doesn't even have the 32-bit displacement used with rip
here.
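
A rough way to see the 2GB limit, as a C sketch (the function names are made up for illustration; this is not compiler or loader code). On x64, the 32-bit displacement in an operand like [rbx*8 + abc] is sign-extended to 64 bits, so an absolute address can only be encoded that way if it lies in the bottom 2GB (or the very top 2GB) of the address space. A RIP-relative operand instead needs the distance from the next instruction to the target to fit in a signed 32-bit value.

```c
#include <assert.h>
#include <stdint.h>

/* Can addr be encoded as a sign-extended 32-bit absolute displacement? */
int fits_abs_disp32(uint64_t addr)
{
    return addr <= 0x7FFFFFFFull || addr >= 0xFFFFFFFF80000000ull;
}

/* Can target be reached RIP-relative from the instruction that ends at
   next_ip? The difference is computed modulo 2^64, so it works in both
   directions. */
int fits_rip_rel32(uint64_t next_ip, uint64_t target)
{
    uint64_t diff = target - next_ip;
    return diff <= 0x7FFFFFFFull || diff >= 0xFFFFFFFF80000000ull;
}

/* The signed displacement to encode; only valid when fits_rip_rel32()
   holds. Written without relying on out-of-range signed conversions. */
int32_t rip_rel32(uint64_t next_ip, uint64_t target)
{
    assert(fits_rip_rel32(next_ip, target));
    if (target >= next_ip)
        return (int32_t)(target - next_ip);
    return -(int32_t)(next_ip - target - 1) - 1;
}
```

With an image at 0x400000 or 0x78230000, abc at offset 0x3000 passes fits_abs_disp32. With an image base of 0x140000000 it does not, while fits_rip_rel32 still holds for code in the same image. That is why the indexed access needs the extra lea (or a 64-bit absolute mov) to get the address into a register first.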

Re: A Famous Security Bug

<20240326023103.00004ea0@yahoo.com>

https://www.novabbs.com/devel/article-flat.php?id=34782&group=comp.lang.c#34782

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: already5...@yahoo.com (Michael S)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Tue, 26 Mar 2024 01:31:03 +0200
Organization: A noiseless patient Spider
Lines: 166
Message-ID: <20240326023103.00004ea0@yahoo.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me>
<utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me>
<utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me>
<utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me>
<utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me>
<20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me>
<uts7e0$1686i$1@dont-email.me>
<20240325195118.0000333a@yahoo.com>
<utsemf$18477$1@dont-email.me>
<20240326000501.00007d6d@yahoo.com>
<utsq47$1atlm$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable
Injection-Date: Tue, 26 Mar 2024 00:31:06 +0100 (CET)
Injection-Info: dont-email.me; posting-host="fdc93945bf4086afcea95294ad40c436";
logging-data="1445476"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18HV0cA+SMYGuUTLwJtTTfjZENREPl1RNA="
Cancel-Lock: sha1:/WshBP4uV2ddl5KUyJF8kUYtfzA=
X-Newsreader: Claws Mail 4.1.1 (GTK 3.24.34; x86_64-w64-mingw32)
 by: Michael S - Mon, 25 Mar 2024 23:31 UTC

On Mon, 25 Mar 2024 21:25:27 +0000
bart <bc@freeuk.com> wrote:

> On 25/03/2024 21:05, Michael S wrote:
> > On Mon, 25 Mar 2024 18:10:23 +0000
> > bart <bc@freeuk.com> wrote:
> >
> >> On 25/03/2024 16:51, Michael S wrote:
> >>> On Mon, 25 Mar 2024 16:06:24 +0000
> >>> bart <bc@freeuk.com> wrote:
> >>>
> >>>> On 25/03/2024 12:26, David Brown wrote:
> >>>>> On 25/03/2024 12:16, Michael S wrote:
> >>>>>> On Sun, 24 Mar 2024 23:43:32 +0100
> >>>>>> David Brown <david.brown@hesbynett.no> wrote:
> >>>>>>>
> >>>>>>> I could be  wrong here, of course.
> >>>>>>>
> >>>>>>
> >>>>>> It seems, you are.
> >>>>>>
> >>>>>
> >>>>> It happens - and it was not unexpected here, as I said.  I don't
> >>>>> have all these compilers installed to test.
> >>>>>
> >>>>> But it would be helpful if you had a /little/ more information.
> >>>>> If you don't know why some compilers generate binaries that have
> >>>>> memory mapped at 0x400000, and others do not, fair enough.  I am
> >>>>> curious, but it's not at all important.
> >>>>>
> >>>>
> >>>> In the PE EXE format, the default image load base is specified
> >>>> in a special header in the file:
> >>>>
> >>>> Magic: 20B
> >>>> Link version: 1.0
> >>>> Code size: 512 200
> >>>> Idata size: 1024 400
> >>>> Zdata size: 512
> >>>> Entry point: 4096 1000 in data:0
> >>>> Code base: 4096
> >>>> Image base: 4194304 400000
> >>>> Section align: 4096
> >>>>
> >>>> By convention it is at 0x40'0000 (I've no idea why).
> >>>>
> >>>> More recently, dynamic loading, regardless of what it says in the
> >>>> PE header, has become popular with linkers. So, while there is
> >>>> still a fixed value in the Image Base file, which might be
> >>>> 0x140000000, it gets loaded at some random address, usually in
> >>>> high memory above 2GB.
> >>>>
> >>>> I don't know what's responsible for that, but presumably the OS
> >>>> must be in on the act.
> >>>>
> >>>> To make this possible, both for loading above 2GB, and for
> >>>> loading at an address not known by the linker, the code inside
> >>>> the EXE must be position-independent, and have relocation info
> >>>> for any absolute 64-bit static addresses. 32-bit static
> >>>> addresses won't work.
> >>>
> >>> I don't understand why you say that EXE must be
> >>> position-independent. I never learned PE format in depth (and
> >>> learned only absolute minimum of elf, just enough to be able to
> >>> load images in simple embedded scenario), but my impression always
> >>> was that PE EXE contains plenty of relocation info for a loader,
> >>> so it (loader) can modify (I think professional argot uses the
> >>> word 'fix') non-PIC at load time to run at any chosen position.
> >>> Am I wrong about it?
> >>
> >>
> >> A PE EXE designed to run only at the image base given won't be
> >> position-independent, so it can't be moved anywhere else.
> >>
> >> There isn't enough info to make it possible, especially before
> >> position-independent addressing modes for x64 came along (that is,
> >> using offset to the RIP instruction pointer instead of 32-bit
> >> absolute addresses).
> >>
> >> Take this C program:
> >>
> >> int abc;
> >> int* ptr = &abc;
> >>
> >> int main(void) {
> >> int x;
> >> x = abc;
> >> }
> >>
> >> Some of the assembly generated is this:
> >>
> >> abc: resb 4
> >>
> >> ptr: dq abc
> >> ...
> >> mov eax, [abc]
> >>
> >> That last reference is an absolute 32-bit address, for example it
> >> might have address 0x00403000 when loaded at 0x400000.
> >>
> >> If the program is instead loaded at 0x78230000, there is no reloc
> >> info to tell it that that particular 32-bit value, plus the 64-bit
> >> field initialising ptr, must be adjusted.
> >>
> >> RIP-relative addressing (I think sometimes called PIC), can fix
> >> that second reference:
> >>
> >> mov eax, [rip:abc]
> >>
> >> But it only works for code, not data; that initialisation is still
> >> absolute.
> >>
> >> When a DLL is generated instead, those will need to be moved (to
> >> avoid multiple DLLs all based at the same address). In that case,
> >> base-relocation tables are needed: a list of addresses that
> >> contain a field that needs relocating, and what type and size of
> >> reloc is needed.
> >>
> >> The same info is needed for EXE if it contains flags saying that
> >> the EXE could be loaded at an arbitrary address.
> >>
> >
> > Your explanation exactly matches what I was imagining.
> > The technology for relocation of non-PIC code is already here, in
> > file format definitions and in OS loader code. The linker or the
> > part of compiler that serves the role of linker can decide to not
> > generate required tables. Operation in such mode will have small
> > benefits in EXE size and in quicker load time, but IMHO nowadays it
> > should be used rarely, only in special situations rather than serve
> > as a default of the tool.
>
> There are two aspects to be considered:
>
> * Relocating a program to a different address below 2GB
>
> * Relocating a program to any address including above 2GB
>
> The first can be accommodated with tables derived from the reloc info
> of object files.
>
> But the second requires compiler cooperation in generating code that
> will work above 2GB.
>
> Part of that can be done with RIP-relative address modes as I touched
> on. But not all; RIP-relative won't work here:
>
> movsx rax, dword [i]
> mov rax, [rbx*8 + abc]
>
> where the address works with registers. This requires something like:
>
> lea rcx, [rip:abc] # or mov rcx, abc (64-bit abs addr)
> mov rax, [rbx*8 + rcx]
>
> This is specific to x64, but other processors will have their issues.
> > Like ARM64 which doesn't even have the 32-bit displacement used with
> rip here.
>

You mean that when the compiler knows the program is loaded at a low
address and the combined data segments are relatively small, it can
use zero-extended or sign-extended 32-bit literals to form 64-bit
addresses of static/global objects?
I see how relocating such a program is a problem in 64-bit mode, but I
still fail to see how a similar problem could arise in 32-bit mode.
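
To make the base-relocation idea from the quoted text concrete, here is a minimal sketch in C. It is a toy model, not the real PE .reloc format: the "image" is just a byte array, and the relocation table is assumed to be a plain list of offsets of 32-bit absolute-address fields. The names apply_relocs32 and read_field32 are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy model only: not the real PE base-relocation format.
   Each table entry is the offset, within the image, of a 32-bit field
   that holds an absolute address computed for the preferred base. */
static void apply_relocs32(unsigned char *image,
                           const size_t *reloc_offsets, size_t nrelocs,
                           uint32_t preferred_base, uint32_t actual_base)
{
    uint32_t delta = actual_base - preferred_base; /* wraps modulo 2^32 */
    for (size_t i = 0; i < nrelocs; ++i) {
        uint32_t field;
        /* memcpy avoids alignment and aliasing problems */
        memcpy(&field, image + reloc_offsets[i], sizeof field);
        field += delta;
        memcpy(image + reloc_offsets[i], &field, sizeof field);
    }
}

/* Helper for experiments: read back a 32-bit field from the image. */
static uint32_t read_field32(const unsigned char *image, size_t offset)
{
    uint32_t field;
    memcpy(&field, image + offset, sizeof field);
    return field;
}
```

With the addresses used in the quoted example, a field holding 0x00403000 in an image linked for 0x400000 becomes 0x78233000 when the image is loaded at 0x78230000. A field with no table entry is left untouched, which is exactly the failure mode described when the linker omits the table.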

Re: A Famous Security Bug

<utt567$1dair$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34783&group=comp.lang.c#34783
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc...@freeuk.com (bart)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Tue, 26 Mar 2024 00:34:14 +0000
Organization: A noiseless patient Spider
Lines: 73
Message-ID: <utt567$1dair$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<utpenn$dtnq$1@dont-email.me> <utq0gh$i9hm$1@dont-email.me>
<utqaak$kfuv$2@dont-email.me> <20240325141628.00006170@yahoo.com>
<utrqgp$12v02$1@dont-email.me> <uts7e0$1686i$1@dont-email.me>
<20240325195118.0000333a@yahoo.com> <utsemf$18477$1@dont-email.me>
<20240326000501.00007d6d@yahoo.com> <utsq47$1atlm$1@dont-email.me>
<20240326023103.00004ea0@yahoo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 26 Mar 2024 01:34:15 +0100 (CET)
Injection-Info: dont-email.me; posting-host="7b32b1f0d0f59a306dc9e204152d2f32";
logging-data="1485403"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/x1dSkta66zBNPuTb4IpcA"
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:2Me7X53TngAT9tKRp3qLn2TaF3A=
In-Reply-To: <20240326023103.00004ea0@yahoo.com>
Content-Language: en-GB
 by: bart - Tue, 26 Mar 2024 00:34 UTC

On 25/03/2024 23:31, Michael S wrote:
> On Mon, 25 Mar 2024 21:25:27 +0000
> bart <bc@freeuk.com> wrote:

>>> Your explanation exactly matches what I was imagining.
>>> The technology for relocation of non-PIC code is already here, in
>>> file format definitions and in OS loader code. The linker or the
>>> part of compiler that serves the role of linker can decide to not
>>> generate required tables. Operation in such mode will have small
>>> benefits in EXE size and in quicker load time, but IMHO nowadays it
>>> should be used rarely, only in special situations rather than serve
>>> as a default of the tool.
>>
>> There are two aspects to be considered:
>>
>> * Relocating a program to a different address below 2GB
>>
>> * Relocating a program to any address including above 2GB
>>
>> The first can be accommodated with tables derived from the reloc info
>> of object files.
>>
>> But the second requires compiler cooperation in generating code that
>> will work above 2GB.
>>
>> Part of that can be done with RIP-relative address modes as I touched
>> on. But not all; RIP-relative won't work here:
>>
>> movsx rax, dword [i]
>> mov rax, [rbx*8 + abc]
>>
>> where the address works with registers. This requires something like:
>>
>> lea rcx, [rip:abc] # or mov rcx, abc (64-bit abs addr)
>> mov rax, [rbx*8 + rcx]
>>
>> This is specific to x64, but other processors will have their issues.
>> Like ARM64 which doesn't even have the 32-bit displacement used with
>> rip here.
>>
>
> You mean that when the compiler knows the program is loaded at a low
> address and the combined data segments are relatively small, it can
> use zero-extended or sign-extended 32-bit literals to form 64-bit
> addresses of static/global objects?
> I see how relocating such a program is a problem in 64-bit mode, but I
> still fail to see how a similar problem could arise in 32-bit mode.
>

At 32 bits the problems of high-loading disappear, as programs and data
need to fit into 2GB.

Some problems with relocating remain. RIP-relative can't be used, as I
believe that works only in 64-bit mode.

What remains are the base-relocations, which had in the past only been
needed when generating dynamic libraries like DLLs. They just weren't a
thing for EXE.

This then reduces to whether the C toolset will generate the right EXE.
Either it does or doesn't, but you can always choose a different compiler.

All I can tell you is that of the suite of 5 compilers I've tried, 4 of
them, including in 32-bit mode if supported, don't generate an EXE that
will be loaded at an arbitrary address. Only gcc will do that.

The same goes for Clang run at rextester.com: that doesn't load high
(but it could also be an old version).

Maybe some don't think it's that important. But it's not as
straightforward as you seem to think. Yes, it might have been a bit
simpler with 32 bits, but it wasn't trendy then, and not many still
use 32 bits.
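
One rough way to observe whether a given toolchain produces an image that is loaded at varying addresses is to print the address of a global and run the program several times. The sketch below assumes uintptr_t is available (it is optional in ISO C); the function names are hypothetical. Whether the address actually changes between runs depends on the linker options and on the operating system's settings, so an unchanged address is only a hint, not proof.

```c
#include <stdint.h>
#include <stdio.h>

int abc;  /* a global, as in the example earlier in the thread */

/* The pointer-to-integer conversion is implementation-defined;
   the value is only meant for comparing one run against another. */
uintptr_t address_of_abc(void)
{
    return (uintptr_t)(void *)&abc;
}

/* Print the address; call this from main and compare across runs. */
void report_load_address(void)
{
    printf("&abc = %p\n", (void *)&abc);
}
```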

Re: A Famous Security Bug

<memset-20240327112617@ram.dialup.fu-berlin.de>

https://www.novabbs.com/devel/article-flat.php?id=34795&group=comp.lang.c#34795
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram...@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: 27 Mar 2024 10:30:31 GMT
Organization: Stefan Ram
Lines: 51
Expires: 1 Feb 2025 11:59:58 GMT
Message-ID: <memset-20240327112617@ram.dialup.fu-berlin.de>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de 2xy0MSElB/eAzEUSrzAAEwXR9Eouaf0razg//0OkTHqzYe
Cancel-Lock: sha1:9Idlx8p+kBcizxdOsLwo1LTO+pY= sha256:qeubUFGwyrnjY7Cu29h7lrpenYWV13I1CbUNAU6eE2o=
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
Distribution through any means other than regular usenet
channels is forbidden. It is forbidden to publish this
article in the Web, to change URIs of this article into links,
and to transfer the body without this notice, but quotations
of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
services to mirror the article in the web. But the article may
be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US
Accept-Language: de-DE-1901, en-US, it, fr-FR
 by: Stefan Ram - Wed, 27 Mar 2024 10:30 UTC

ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
>A "famous security bug":
>void f( void )
>{ char buffer[ MAX ];
>/* . . . */
>memset( buffer, 0, sizeof( buffer )); }
>. Can you see what the bug is?
>(I have already read the answer; I post it as a pastime.)

I was reading this under the heading "State postconditions".
It was suggested that the code should rather be:

void f()
{ char buffer[MAX];
/* . . . */
memset( buffer, 0, sizeof( buffer ));
Ensures( buffer[ 0 ]== 0 ); }

("Ensures" states a postcondition). Now, according to the text
I read, the compiler cannot eliminate the "memset" anymore.

Here are some thoughts of mine on this:

With the "buffer[ 0 ]== 0", I wonder, as per the "as if rule",
whether the compiler would still be permitted to replace
the memset by just "buffer[ 0 ]= 0;".

So, what would be a bit more "paranoid" would then be:

for( size_t i = 0; i < sizeof( buffer ); ++i )
Ensures( buffer[ i ]== 0 );

or,

i = mylib_random( sizeof( buffer ));
Ensures( buffer[ i ]== 0 );

. How could one implement "Ensures" in C? The first thing that
comes to mind is a call to "assert" of course.

But I am also reminded of an "escape" function that Chandler Carruth
mentioned in one talk. IIRC, it was something along the lines of

static void escape( volatile void * p )
{ asm volatile( "" : : "g"(p) : "memory" ); }

(which might not be standard C). Now, if you call "escape( buffer )"
at the end of the definition of the function "f" above, the compiler
knows that the contents of buffer has become visible to the outside
world, so that the effects of the "memset" operation become visible
externally, which means that the "memset" call cannot be elided.
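
Putting the pieces of this post together gives roughly the following sketch. It relies on GNU-style extended asm, so it assumes GCC or Clang and is not standard C. The value of MAX and the strcpy standing in for the elided code are placeholders, not part of the original.

```c
#include <string.h>

#define MAX 64  /* hypothetical size; the post leaves MAX unspecified */

/* GNU C extension: an empty asm statement that takes the pointer as an
   input and declares a "memory" clobber, so the compiler has to assume
   the buffer contents may be read at this point. */
static void escape(volatile void *p)
{
    __asm__ volatile("" : : "g"(p) : "memory");
}

int f(void)
{
    char buffer[MAX];
    strcpy(buffer, "secret");          /* stands in for the elided code */
    memset(buffer, 0, sizeof buffer);
    escape(buffer);                    /* the stores are now treated as observed */
    return buffer[0];                  /* 0 after the wipe */
}
```

Whether the memset survives can only be confirmed by inspecting the generated code for the compiler and options in use.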

Re: A Famous Security Bug

<0-20240327113225@ram.dialup.fu-berlin.de>

https://www.novabbs.com/devel/article-flat.php?id=34796&group=comp.lang.c#34796
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: ram...@zedat.fu-berlin.de (Stefan Ram)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: 27 Mar 2024 10:35:27 GMT
Organization: Stefan Ram
Lines: 12
Expires: 1 Feb 2025 11:59:58 GMT
Message-ID: <0-20240327113225@ram.dialup.fu-berlin.de>
References: <bug-20240320191736@ram.dialup.fu-berlin.de> <memset-20240327112617@ram.dialup.fu-berlin.de>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
X-Trace: news.uni-berlin.de zLTz1xmPlLUJ6ilp7ArhEwsDDJXl/aPDFVwGz9rUkTNcxM
Cancel-Lock: sha1:jTwV7RbyUmhYeI+p9eK/gTS+m98= sha256:8Pk+MYLKog2TVPYFoWDOkfMubHy4kD1EWL2KIviPbRQ=
X-Copyright: (C) Copyright 2024 Stefan Ram. All rights reserved.
Distribution through any means other than regular usenet
channels is forbidden. It is forbidden to publish this
article in the Web, to change URIs of this article into links,
and to transfer the body without this notice, but quotations
of parts in other Usenet posts are allowed.
X-No-Archive: Yes
Archive: no
X-No-Archive-Readme: "X-No-Archive" is set, because this prevents some
services to mirror the article in the web. But the article may
be kept on a Usenet archive server with only NNTP access.
X-No-Html: yes
Content-Language: en-US
Accept-Language: de-DE-1901, en-US, it, fr-FR
 by: Stefan Ram - Wed, 27 Mar 2024 10:35 UTC

ram@zedat.fu-berlin.de (Stefan Ram) wrote or quoted:
>void f()
>{ char buffer[MAX];
> /* . . . */
> memset( buffer, 0, sizeof( buffer ));
> Ensures( buffer[ 0 ]== 0 ); }

Oh, and now I see a potential bug in this:
"buffer[ 0 ]" assumes that MAX > 0.

(ISO C forbids "char buffer[ 0 ];", but the code
might be used on some nonstandard implementation.)
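
If the concern is a nonstandard implementation that accepts a zero-length array, one option is to reject MAX <= 0 at translation time. This is a small sketch assuming a C11 compiler; the value of MAX and the function name wipe_and_check are hypothetical.

```c
#include <string.h>

#define MAX 16  /* hypothetical value; the post leaves MAX unspecified */

/* C11: fail the build if MAX is not positive, so that buffer[0] below
   always names an element of the array. */
_Static_assert(MAX > 0, "MAX must be positive");

int wipe_and_check(void)
{
    char buffer[MAX];
    memset(buffer, 0, sizeof buffer);
    return buffer[0] == 0;  /* the postcondition from the post */
}
```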

Re: A Famous Security Bug

<wwvedbw6i9o.fsf@LkoBDZeT.terraraq.uk>

https://www.novabbs.com/devel/article-flat.php?id=34798&group=comp.lang.c#34798
Path: i2pn2.org!i2pn.org!news.nntp4.net!nntp.terraraq.uk!.POSTED.tunnel.sfere.anjou.terraraq.org.uk!not-for-mail
From: inva...@invalid.invalid (Richard Kettlewell)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Wed, 27 Mar 2024 11:12:03 +0000
Organization: terraraq NNTP server
Message-ID: <wwvedbw6i9o.fsf@LkoBDZeT.terraraq.uk>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<memset-20240327112617@ram.dialup.fu-berlin.de>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: innmantic.terraraq.uk; posting-host="tunnel.sfere.anjou.terraraq.org.uk:172.17.207.6";
logging-data="51469"; mail-complaints-to="usenet@innmantic.terraraq.uk"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Cancel-Lock: sha1:PLewEUqv6uOnT69wYs4BLjS+y00=
X-Face: h[Hh-7npe<<b4/eW[]sat,I3O`t8A`(ej.H!F4\8|;ih)`7{@:A~/j1}gTt4e7-n*F?.Rl^
F<\{jehn7.KrO{!7=:(@J~]<.[{>v9!1<qZY,{EJxg6?Er4Y7Ng2\Ft>Z&W?r\c.!4DXH5PWpga"ha
+r0NzP?vnz:e/knOY)PI-
X-Boydie: NO
 by: Richard Kettlewell - Wed, 27 Mar 2024 11:12 UTC

ram@zedat.fu-berlin.de (Stefan Ram) writes:
> i = mylib_random( sizeof( buffer ));
> Ensures( buffer[ i ]== 0 );
>
> . How could one implement "Ensures" in C? The first thing that
> comes to mind is a call to "assert" of course.

The assert gets compiled out too.

> But I am also reminded of an "escape" function that Chandler Carruth
> mentioned in one talk. IIRC, it was something along the lines of
>
> static void escape( volatile void * p )
> { asm volatile( "" : : "g"(p) : "memory" ); }
>
> (which might not be standard C). Now, if you call "escape( buffer )"
> at the end of the definition of the function "f" above, the compiler
> knows that the contents of buffer has become visible to the outside
> world, so that the effects of the "memset" operation become visible
> externally, which means that the "memset" call cannot be elided.

Indeed it’s not standard C, but variants of it are a common strategy on
compilers that support it.

The flaw is that any data from the target buffer that’s been copied into
registers or other temporary storage isn’t erased. How much that matters
is situational. In principle C23’s memset_explicit could address this.
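
Where C23's memset_explicit is not available, a commonly seen fallback writes the zeros through a volatile-qualified pointer. The sketch below uses a hypothetical helper name, secure_zero. It assumes the compiler keeps volatile accesses even when the underlying object is not itself volatile, which mainstream compilers do in practice but which is debated as a strict guarantee. It shares the flaw described above: copies held in registers or temporaries are not touched.

```c
#include <stddef.h>

/* Hypothetical helper, not a standard function. Each store goes through
   a volatile lvalue, so the compiler is expected not to drop it. */
void secure_zero(void *p, size_t n)
{
    volatile unsigned char *vp = p;
    while (n--)
        *vp++ = 0;
}
```

A call such as secure_zero(buffer, sizeof buffer) would replace the memset at the end of f.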

--
https://www.greenend.org.uk/rjk/

Re: A Famous Security Bug

<20240327121437.309@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=34802&group=comp.lang.c#34802
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 433-929-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Wed, 27 Mar 2024 21:06:12 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 73
Message-ID: <20240327121437.309@kylheku.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com> <uthirj$29aoc$1@dont-email.me>
<20240321092738.111@kylheku.com> <87a5mr1ffp.fsf@nosuchdomain.example.com>
<20240322083648.539@kylheku.com> <87le6az0s8.fsf@nosuchdomain.example.com>
<20240322094449.555@kylheku.com> <87cyrmyvnv.fsf@nosuchdomain.example.com>
<20240322123323.805@kylheku.com> <utmst2$3n7mv$2@dont-email.me>
<20240323090700.848@kylheku.com> <utnt30$3v0ck$1@dont-email.me>
<20240323182314.725@kylheku.com> <utp9ct$cmur$1@dont-email.me>
<20240324083718.507@kylheku.com> <utpk90$f8q6$1@dont-email.me>
Injection-Date: Wed, 27 Mar 2024 21:06:12 +0100 (CET)
Injection-Info: dont-email.me; posting-host="8f377292846874240a48d49262348abc";
logging-data="3234759"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+e9qk7ubTNVBR9cgoMOu1I92prV+0jRXU="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:FNA8Ae8i1AhpN4GqO/fXvq8ai8I=
 by: Kaz Kylheku - Wed, 27 Mar 2024 21:06 UTC

On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:
> On 24/03/2024 17:02, Kaz Kylheku wrote:
>> On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:
>>> On 24/03/2024 06:50, Kaz Kylheku wrote:
>>>> (So why bother looking.) I mean,
>>>> the absolute baseline requirement any LTO implementor strives toward is
>>>> no change in observable behavior in a strictly conforming program, which
>>>> would be a showstopper.
>>>>
>>>
>>> Yes.
>>>
>>> I don't believe anyone - except you - has said anything otherwise. A C
>>> implementation is conforming if and only if it takes any correct C
>>> source code and generates a program image that always has correct
>>> observable behaviour when no undefined behaviour is executed. There are
>>> no extra imaginary requirements to be conforming, such as not being
>>> allowed to use extra information while compiling translation units.
>>
>> But the requirement isn't imaginary. The "least requirements"
>> paragraph doesn't mean that all other requirements are imaginary;
>> most of them are necessary to describe the language so that we know
>> how to find the observable behavior.
>>
>
> The text is not imaginary - your reading between the lines /is/. There
> is no rule in the C standards stopping the compiler from using
> additional information or knowledge about other parts of the program.

Sure there is; just not in a way that speaks to the formal notion of
conformance. The text is there, and a user and implementor can use
that as a touchstone for agreeing on something outside of conformance.

>> In safety critical coding, we might want to conduct a code review of
>> the disassembly of an object file (does it correctly implement the
>> intent we believe to be expressed in the source), and then retain that
>> exact file until it needs to be recompiled.
>
> Sure. And for that reason, some developers in that field will not use
> LTO. I personally don't make much use of LTO because it makes software
> a pain to debug.

So, in that situation, your requirement can be articulated in a way that
refers to the descriptions in ISO C. You're having your translation
units semantically analyzed according to the abstract separation between
phase 7 and 8 (which is not required to be followed for conformance).

We can identify the LTO switch in the compiler as hinging around
whether the abstract semantics is followed or not. (We just can't tell
using observable behavior.)

This seems like a good thing.

>> We just may not confuse that conformance (private contract between
>> implementor and user) with ISO C conformance, as I have.
>> Sorry about that!
>>
>
> Are you saying that after dozens of posts back and forth where you made
> claims about non-conformity of C compilers handling of C code in
> comp.lang.c, with heavy references to the C standards which define the
> term "conformity", you are now saying that you were not talking about C
> standard conformity?

Certainly not! I was wrongly talking about that one and only
conformance.

Once again, sorry about that.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca

Re: A Famous Security Bug

<20240328122327.3895243496305c8cc8a9d063@g{oogle}mail.com>

https://www.novabbs.com/devel/article-flat.php?id=34804&group=comp.lang.c#34804
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: anton....@g{oogle}mail.com (Anton Shepelev)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Thu, 28 Mar 2024 12:23:27 +0300
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <20240328122327.3895243496305c8cc8a9d063@g{oogle}mail.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<20240321131621.321@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 28 Mar 2024 09:23:31 +0100 (CET)
Injection-Info: dont-email.me; posting-host="b2af8e000113f14e0de58c4b02eef262";
logging-data="3673424"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/6lgA4cq6T/wjC+Upwitv3L0wYot1K37g="
Cancel-Lock: sha1:lV2kqi8vbgmvQ8pftTA6PGqGqBA=
X-Newsreader: Sylpheed 3.7.0 (GTK+ 2.24.30; i686-pc-mingw32)
 by: Anton Shepelev - Thu, 28 Mar 2024 09:23 UTC

Kaz Kylheku:

> If C compilers warned about every piece of dead code that
> is eliminated, you'd be up to your ears in diagnostics all
> day.

Is so much dead code a defect in the source or a benign
consequence of a well-pondered decision?

> If you do want the code deleted, that doesn't always mean
> you can do it yourself. What gets eliminated can be target
> dependent:
>
> switch (sizeof (long)) {
> case 4: ...
> case 8: ..
> }

The case above is IMHO best handled by conditional
compilation, even though more work may be required to
dispatch on a type size in the preprocessor.

> Because memset is part of the C language, the compiler
> knows exactly what effect it has (that it's equivalent to
> setting all the bytes to zero, like a sequence of
> assignments).

Yes, it is an instance of a special case relying upon hard-
coded information. Why not, however, let the programmer
eliminate this dead code from his code, if it /is/ dead?
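
The preprocessor has no sizeof, so the conditional-compilation route suggested above has to dispatch on the range of long from <limits.h> instead. A minimal sketch, with the hypothetical names LONG_BITS and long_bits, assuming long has no padding bits so that its range determines its width:

```c
#include <limits.h>

/* Dispatch on the range of long, since #if cannot evaluate sizeof. */
#if LONG_MAX == 2147483647L
#  define LONG_BITS 32
#elif LONG_MAX == 9223372036854775807L
#  define LONG_BITS 64
#else
#  error "unexpected range for long"
#endif

int long_bits(void)
{
    return LONG_BITS;
}
```

Only the branch selected by the preprocessor is ever seen by the compiler proper, so no dead code is left for the optimizer to remove.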

--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments

Re: A Famous Security Bug

<uu3ekk$3g8b3$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34805&group=comp.lang.c#34805
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jameskuy...@alumni.caltech.edu (James Kuyper)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Thu, 28 Mar 2024 05:52:20 -0400
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <uu3ekk$3g8b3$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 28 Mar 2024 09:52:20 +0100 (CET)
Injection-Info: dont-email.me; posting-host="2ceea09d85f2c48031948d5e55584ec0";
logging-data="3678563"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/Cb7TQ/xkVYHjLZz8o4aDaTWPyohH22F4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:8sGBqNjhsVz53pOwvnZJu54yZ+U=
In-Reply-To: <utme8b$3jtip$1@dont-email.me>
Content-Language: en-US
 by: James Kuyper - Thu, 28 Mar 2024 09:52 UTC

On Sat, 23 Mar 2024 11:26:03 +0000
bart <bc@freeuk.com> wrote:

> On 23/03/2024 07:26, James Kuyper wrote:
> > bart <bc@freeuk.com> writes:
> >> On 22/03/2024 17:14, James Kuyper wrote:
> > [...]
> >>> If you want to tell a system not only what a program must do, but
> >>> also how it must do it, you need to use a lower-level language
> >>> than C.
> >>
> >> Which one?
> >
> > That's up to you. The point is, C is NOT that language.
>
> I'm asking which /mainstream/ HLL is lower level than C. So
> specifically ruling out assembly.

I don't know of any, and said nothing to suggest that there is one. I'm
only pointing out that if that's important to you, you must either find
such a language, or create it (as you seem to already be doing). If, as
you imply, there's no such mainstream HLL, that implies that there's not
enough people sharing your preferences to make such an HLL popular
enough to qualify as mainstream.
I certainly don't care how my programs achieve their observable
behavior, and I'm only too happy to let machine-language experts use
their specialized expertise to create compilers which achieve that
behavior in whatever way is best for the target system. I have no desire
to spend my time acquiring that expertise.

Re: A Famous Security Bug

<uu3fu0$3gmba$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34806&group=comp.lang.c#34806
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: jameskuy...@alumni.caltech.edu (James Kuyper)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Thu, 28 Mar 2024 06:14:24 -0400
Organization: A noiseless patient Spider
Lines: 59
Message-ID: <uu3fu0$3gmba$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<utkea9$31sr2$1@dont-email.me> <utktul$35ng8$1@dont-email.me>
<utm06k$3glqc$1@dont-email.me> <utme8b$3jtip$1@dont-email.me>
<utn1a0$3ogob$1@dont-email.me> <utnh5m$3sdhk$1@dont-email.me>
<20240324185353.00002395@yahoo.com> <utpt4f$ha61$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 28 Mar 2024 10:14:24 +0100 (CET)
Injection-Info: dont-email.me; posting-host="2ceea09d85f2c48031948d5e55584ec0";
logging-data="3692906"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19VmawCkPK/Bb2uGkPn0L1ZB0qpHpW/wnQ="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:V7PqRm4v9Xw8OcCyiiIvkGLaCv4=
In-Reply-To: <utpt4f$ha61$1@dont-email.me>
Content-Language: en-US
 by: James Kuyper - Thu, 28 Mar 2024 10:14 UTC

On 24/03/2024 19:58, bart wrote:
> On 24/03/2024 15:53, Michael S wrote:

>> #include <stdio.h>
>> #include <stddef.h>
>>
>> int main(void)
>> {
>>    char* p0 = (char*)((size_t)main & -(size_t)0x10000);
>>    printf("%c%c\n", p0[0], p0[1]);
>>    return 0;
>> }
>>
>>
>> That would work for small programs. Not necessarily for bigger
>> programs.
>>
>
> I'm not sure how that works. Are EXE images always loaded at multiple of
> 64KB? I suppose on larger programs it could search backwards 64KB at a
> time (although it could also hit on a rogue 'MZ' in program data).
>
> My point however was whether C considered that p0[0] access UB because
> it doesn't point into any C data object.

Here's what the standard says about (size_t)main:
"... the result is implementation-defined. If the result cannot be
represented in the integer type, the behavior is undefined. The result
need not be in the range of values of any integer type."

Here's what the standard says about the conversion to char*:
"the result is implementation-defined, might not be correctly aligned,
might not point to an entity of the referenced type, and might produce
an indeterminate representation when stored into an object."

Alignment cannot be an issue with char*, but the other two problems
remain. In particular, I think you're assuming that, when converted back
to a pointer, the result will point at the nearest 0x10000-byte boundary
at or below main in memory. There's no such guarantee.

p0[0] is defined as *(p0+0). As a result, the relevant wording occurs in
the description of the unary * operator.

"If the operand points to a function, the result is a function
designator; if it points to an object, the result is an lvalue
designating the object."

Here's the most fundamental problem: there's no guarantee that p0
points at a C object. There's a very good chance, if the code does what
you're hoping it will do, that it points inside a function. As a result,
the following applies:

"If an invalid value has been assigned to the pointer, the behavior of
the unary * operator is undefined."

So, an implementation is free to define the behavior of such code so
that it does what you want - but the C standard doesn't even come close
to mandating that it do so.
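
For contrast, here is the one pointer/integer round trip the standard does guarantee, provided the optional type uintptr_t exists: a valid pointer to void converted to uintptr_t and back compares equal to the original. The function name round_trip_ok is hypothetical. Nothing is promised once the integer has been modified (for example by masking, as in the quoted program), and nothing is promised for function pointers such as main.

```c
#include <stdint.h>

static int object;

/* void* -> uintptr_t -> void* must compare equal to the original
   pointer; this holds for object pointers, with an unmodified integer. */
int round_trip_ok(void)
{
    void *p = &object;
    uintptr_t u = (uintptr_t)p;
    void *q = (void *)u;
    return q == p;
}
```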

Re: A Famous Security Bug

<BVeNN.99577$24ld.88300@fx07.iad>

https://www.novabbs.com/devel/article-flat.php?id=34814&group=comp.lang.c#34814
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!feeder1-2.proxad.net!proxad.net!feeder1-1.proxad.net!193.141.40.65.MISMATCH!npeer.as286.net!npeer-ng0.as286.net!peer03.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx07.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: sco...@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: A Famous Security Bug
Newsgroups: comp.lang.c
References: <bug-20240320191736@ram.dialup.fu-berlin.de> <20240320114218.151@kylheku.com> <20240321211306.779b21d126e122556c34a346@gmail.moc> <20240321131621.321@kylheku.com> <20240328122327.3895243496305c8cc8a9d063@g{oogle}mail.com>
Lines: 26
Message-ID: <BVeNN.99577$24ld.88300@fx07.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Thu, 28 Mar 2024 14:12:49 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Thu, 28 Mar 2024 14:12:49 GMT
X-Received-Bytes: 1617
 by: Scott Lurndal - Thu, 28 Mar 2024 14:12 UTC

Anton Shepelev <anton.txt@g{oogle}mail.com> writes:
>Kaz Kylheku:
>
>> If C compilers warned about every piece of dead code that
>> is eliminated, you'd be up to your ears in diagnostics all
>> day.
>
>Is so much dead code a defect in the source or a benign
>consequence of a well-pondered decision?
>
>> If you do want the code deleted, that doesn't always mean
>> you can do it yourself. What gets eliminated can be target
>> dependent:
>>
>> switch (sizeof (long)) {
>> case 4: ...
>> case 8: ..
>> }
>
>The case above is IMHO best handled by conditional
>compilation, even though more work may be required to
>dispatch on a type size in the preprocessor.

And it would be ugly and prone to breakage. Let
the compiler optimization pass handle it.
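
A filled-in version of the switch under discussion might look like the sketch below; the function name long_model and the returned strings are placeholders. Since sizeof (long) is an integer constant expression, an optimizing compiler would normally fold the switch and keep only the matching case, though that is a quality-of-implementation matter rather than a requirement.

```c
#include <stddef.h>

/* The controlling expression is a compile-time constant, so the cases
   that do not match are dead code on any given target. */
const char *long_model(void)
{
    switch (sizeof (long)) {
    case 4:  return "32-bit long";
    case 8:  return "64-bit long";
    default: return "other";
    }
}
```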

Re: A Famous Security Bug

<uu4bl1$3nrfa$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=34823&group=comp.lang.c#34823
Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Thu, 28 Mar 2024 19:07:28 +0100
Organization: A noiseless patient Spider
Lines: 133
Message-ID: <uu4bl1$3nrfa$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com> <uthirj$29aoc$1@dont-email.me>
<20240321092738.111@kylheku.com> <87a5mr1ffp.fsf@nosuchdomain.example.com>
<20240322083648.539@kylheku.com> <87le6az0s8.fsf@nosuchdomain.example.com>
<20240322094449.555@kylheku.com> <87cyrmyvnv.fsf@nosuchdomain.example.com>
<20240322123323.805@kylheku.com> <utmst2$3n7mv$2@dont-email.me>
<20240323090700.848@kylheku.com> <utnt30$3v0ck$1@dont-email.me>
<20240323182314.725@kylheku.com> <utp9ct$cmur$1@dont-email.me>
<20240324083718.507@kylheku.com> <utpk90$f8q6$1@dont-email.me>
<20240327121437.309@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 28 Mar 2024 18:07:29 +0100 (CET)
Injection-Info: dont-email.me; posting-host="0a08a487bba2a74163287f88a6183244";
logging-data="3927530"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18zSkl2TsFow1W5/igho1TNqCzXIsjR/C0="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:5O65bjaI8v8kRZkw0Q2HPGxjwsU=
Content-Language: en-GB
In-Reply-To: <20240327121437.309@kylheku.com>
 by: David Brown - Thu, 28 Mar 2024 18:07 UTC

On 27/03/2024 22:06, Kaz Kylheku wrote:
> On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:
>> On 24/03/2024 17:02, Kaz Kylheku wrote:
>>> On 2024-03-24, David Brown <david.brown@hesbynett.no> wrote:
>>>> On 24/03/2024 06:50, Kaz Kylheku wrote:
>>>>> (So why bother looking.) I mean,
>>>>> the absolute baseline requirement any LTO implementor strives toward is
>>>>> no change in observable behavior in a strictly conforming program, which
>>>>> would be a showstopper.
>>>>>
>>>>
>>>> Yes.
>>>>
>>>> I don't believe anyone - except you - has said anything otherwise. A C
>>>> implementation is conforming if and only if it takes any correct C
>>>> source code and generates a program image that always has correct
>>>> observable behaviour when no undefined behaviour is executed. There are
>>>> no extra imaginary requirements to be conforming, such as not being
>>>> allowed to use extra information while compiling translation units.
>>>
>>> But the requirement isn't imaginary. The "least requirements"
>>> paragraph doesn't mean that all other requirements are imaginary;
>>> most of them are necessary to describe the language so that we know
>>> how to find the observable behavior.
>>>
>>
>> The text is not imaginary - your reading between the lines /is/. There
>> is no rule in the C standards stopping the compiler from using
>> additional information or knowledge about other parts of the program.
>
> Sure there is; just not in a way that speaks to the formal notion of
> conformance. The text is there, and a user and implementor can use
> that as a touchstone for agreeing on something outside of conformance.
>

Users and implementers can agree on requirements that are outside the
requirements of the standards - that is certainly true. A user will
require many things of a compiler that are not in the standard - the
system it runs on, its speed, its cost, the quality of the error
messages, and countless other things.

Those are not mentioned in the C standards, but are without doubt
important to users.

However, you can't claim that the C standards have implications for
things that are not related to conformance to the C standards! And you
can't claim that violating something that is based
on /your/ requirements outside of the C standards makes a compiler
non-conforming in the context of the C standards.

You are free to say that /you/ require a particular behaviour from your
compiler, and that LTO violates conformity with /your/ requirements.
And you can happily use a reference to the C standards to help explain
your additional requirements. You just don't get to say that the C
standards make requirements that they don't contain.

>>> In safety critical coding, we might want to conduct a code review of
>>> the disassembly of an object file (does it correctly implement the
>>> intent we believe to be expressed in the source), and then retain that
>>> exact file until it needs to be recompiled.
>>
>> Sure. And for that reason, some developers in that field will not use
>> LTO. I personally don't make much use of LTO because it makes software
>> a pain to debug.
>
> So, in that situation, your requirement can be articulated in a way that
> refers to the descriptions in ISO C.

No, not remotely.

My requirements for debugging are not covered in the C standards in any
way. Currently, enabling LTO in gcc makes the generated code hard to
single-step through - it is often very difficult to see which
assembly instructions match up with which piece of source code. I
fine-tune other optimisation flags too in order to give a better balance
(for my own personal definition of "better") between code efficiency and
ease of debugging. I do not suspect gcc LTO of generating incorrect or
non-conforming code.

When you choose not to enable LTO, you are making exactly the same kind
of decision (except you do so for testability, rather than debuggability).

> You're having your translation
> units semantically analyzed according to the abstract separation between
> phase 7 and 8 (which is not required to be followed for conformance).
>

That is completely irrelevant to me. What /is/ relevant is that code
is not moved around too much, so it is easier to follow when
single-stepping or doing other debugging. I may also disable inlining
and other inter-procedural optimisations within units - something that
clearly has no relevance to conformity.

> We can identify the LTO switch in the compiler as hinging around
> whether the abstract semantics is followed or not. (Just we can't tell
> using observable behavior.)

No, we can't. LTO is a fully valid, conforming optimisation that does not
affect the abstract semantics of the language in any way.

But it might affect other requirements outside of the C standards and
their definition of the semantics of the language.

>
> This seems like a good thing.

It's a good thing that people get the choice to balance different
requirements beyond the C standards. (And they even get some options
that affect conformity, because that is not always important to all users.)

>
>>> We just may not confuse that conformance (private contract between
>>> implementor and user) with ISO C conformance, as I have.
>>> Sorry about that!
>>>
>>
>> Are you saying that after dozens of posts back and forth where you made
>> claims about non-conformity of C compilers' handling of C code in
>> comp.lang.c, with heavy references to the C standards which define the
>> term "conformity", you were not talking about C standard conformity?
>
> Certainly not! I was wrongly talking about that one and only
> conformance.
>
> Once again, sorry about that.
>

OK. Let's try to be clear - "conformance" on its own, in c.l.c., means
conformity to the C standards. If you or I are talking about conforming
to a different set of requirements, we need to be explicit about it.

Re: A Famous Security Bug

<86le5byfdn.fsf@linuxsc.com>

https://www.novabbs.com/devel/article-flat.php?id=35121&group=comp.lang.c#35121

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17...@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Wed, 17 Apr 2024 12:10:28 -0700
Organization: A noiseless patient Spider
Lines: 24
Message-ID: <86le5byfdn.fsf@linuxsc.com>
References: <bug-20240320191736@ram.dialup.fu-berlin.de> <20240320114218.151@kylheku.com> <20240321211306.779b21d126e122556c34a346@gmail.moc> <20240321131621.321@kylheku.com> <utk1k9$2uojo$1@dont-email.me> <20240322083037.20@kylheku.com> <utkgd2$32aj7$1@dont-email.me> <wwva5mpwbh0.fsf@LkoBDZeT.terraraq.uk> <86o7b3k283.fsf@linuxsc.com> <utppal$gh3s$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Date: Wed, 17 Apr 2024 21:10:31 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="6d92fdb377d11e51f4f17eafe8d18365";
logging-data="1864605"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18KdKqbm3t8rZP66Wh0kw7eDdW4mSQF61g="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:avaR4NP/8x0lFBs14NuaxQ7xL6A=
sha1:a9aeSjBrDk/zx5PyhVwIWbQ/qlg=
 by: Tim Rentsch - Wed, 17 Apr 2024 19:10 UTC

Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

> On 24/03/2024 16:45, Tim Rentsch wrote:
>
>> The C standard means what the ISO C group thinks it means.
>> They are the ultimate and sole authority. Any discussion about what
>> the C standard requires that ignores that or pretends otherwise is
>> a meaningless exercise.
>
> An intentionalist.

That is a misunderstanding of what I said.

> But when a text has come about by a process of argument, negotiation
> and compromise and votes, is that position so easy to defend as it
> might appear to be for a simpler text?

It's not a position, it's an observation. The ISO C committee is
the recognized authority for judgment about the meaning of the C
standard. Whatever discussion may have gone into writing the
document is irrelevant; all that matters is that the ISO C
group went through the approved ISO process, and hence the world
at large defers to their view as being authoritative on the
question of how to read the text of the standard.

Re: A Famous Security Bug

<uvql43$25k0u$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=35124&group=comp.lang.c#35124

Path: i2pn2.org!i2pn.org!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: A Famous Security Bug
Date: Thu, 18 Apr 2024 10:20:19 +0200
Organization: A noiseless patient Spider
Lines: 51
Message-ID: <uvql43$25k0u$1@dont-email.me>
References: <bug-20240320191736@ram.dialup.fu-berlin.de>
<20240320114218.151@kylheku.com>
<20240321211306.779b21d126e122556c34a346@gmail.moc>
<20240321131621.321@kylheku.com> <utk1k9$2uojo$1@dont-email.me>
<20240322083037.20@kylheku.com> <utkgd2$32aj7$1@dont-email.me>
<wwva5mpwbh0.fsf@LkoBDZeT.terraraq.uk> <86o7b3k283.fsf@linuxsc.com>
<utppal$gh3s$1@dont-email.me> <86le5byfdn.fsf@linuxsc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Thu, 18 Apr 2024 10:20:20 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a0080b8d6f1ecc9cfec5050a64ed5ac7";
logging-data="2281502"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+r+B/pXfhIF4oFvDQl+q7JaQ3e5Xojkus="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:0pTlQwTAVDszyAP/g1BU3iSpkFM=
Content-Language: en-GB
In-Reply-To: <86le5byfdn.fsf@linuxsc.com>
 by: David Brown - Thu, 18 Apr 2024 08:20 UTC

On 17/04/2024 21:10, Tim Rentsch wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>
>> On 24/03/2024 16:45, Tim Rentsch wrote:
>>
>>> The C standard means what the ISO C group thinks it means.
>>> They are the ultimate and sole authority. Any discussion about what
>>> the C standard requires that ignores that or pretends otherwise is
>>> a meaningless exercise.
>>
>> An intentionalist.
>
> That is a misunderstanding of what I said.
>
>> But when a text has come about by a process of argument, negotiation
>> and compromise and votes, is that position so easy to defend as it
>> might appear to be for a simpler text?
>
> It's not a position, it's an observation. The ISO C committee is
> the recognized authority for judgment about the meaning of the C
> standard. Whatever discussion may have gone into writing the
> document is irrelevant; all that matters is that the ISO C
> group went through the approved ISO process, and hence the world
> at large defers to their view as being authoritative on the
> question of how to read the text of the standard.

You can't have it both ways.

One interpretation is that the /text/ of the standard is the be-all and
end-all of "the C standard", in which case what the ISO C group thinks
is irrelevant. It is only the written word that matters.

The other is that it is the beliefs and intentions of the ISO C group,
as the C authority, that define "the C standard", in which case the
written standard is just an approximate summary of how they define the
language. Any other published writings or discussions, such as
rationale documents, WG documents, Jens Gustedt's Blog, C compilers and
libraries written by committee members, etc., are relevant to
understanding the group's interpretation of and meaning behind the standard.

You can't claim that /only/ the text matters and also that /only/ the
committee's judgement matters.

I think most people would say that the text of the C standard is
authoritative, not the committee or their opinions, judgements, thoughts
or interpretations. If the text does not match their intentions, or is
- in their opinion - misunderstood by others, then it is their job to
revise or update the standard document.
