Arrays and pointer arithmetic

From: tr.17...@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.std.c
Subject: Arrays and pointer arithmetic
Date: Mon, 28 Feb 2022 12:07:33 -0800
Organization: A noiseless patient Spider
Lines: 131
Message-ID: <867d9ehi0q.fsf@linuxsc.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii

This posting is prompted by a discussion in comp.lang.c++ about
arrays and pointer arithmetic. An excerpt from that discussion
is given below, to provide context for anyone who did not see the
recent comp.lang.c++ discussion.

Consider three aspects of behavior in C:

* reading a union member after storing into a different
member

* sequencing rules for expressions

* implications for code reordering around a volatile access

Here are some notes about these areas.

For reading a union member after a different member has been
written, C90 says the value is implementation defined. C99 does
not say that, and explains what appears to be a different rule in
a non-normative footnote. Yet meeting notes from the ISO C
website indicate that the C99 description is meant to convey the
same semantics as the C90 description (or vice versa).

For sequencing within a single expression, there was a famous
debate about whether (for C90 and C99) an assignment such as
'a[a[0]] = 4;', where a[0] initially has the value 0, has defined
behavior or undefined behavior. A straightforward reading of the
C90/C99 text suggests it was undefined. In C11, the description
of sequencing rules was revised, and under the C11 description
the behavior is, pretty unambiguously, well defined. Yet there
is no mention of the C11 sequencing rules constituting a change
from the C90/C99 rules; apparently the C11 description was meant
to be, at best, a clarification, but without any change to what
the semantics are.

For code reordering around a volatile access, it's easy to draw
the conclusion that the C standard allows no movement (i.e., for
purposes of optimization) of any earlier or later reads or writes
across the volatile access expression. Yet discussion with
committee members definitely indicates that some such code
movement is allowed, despite what the C standard text would
plainly indicate.

Another example has to do with type rules for printf() arguments.
If there is a printf() call such as

printf( "%u", 7 );

is the behavior defined or undefined? There are reasonable
positions both pro and con. How are we to understand which view
better represents the judgment of the committee members? (I take
it as given that a judgment from the ISO C committee constitutes
the ultimate authority as to what the C standard either requires
or allows.) Incidentally, in the recent draft N2731, there is
new wording that answers this question in favor of the behavior
being defined, not undefined.

How are we to make sense of these apparent incongruities? All of
these cases can be understood using a single explanation: members
of the ISO C committee have a mental model for how the language
is supposed to behave in each case, and what is written in the
ISO C standard is meant to reflect those models, but sometimes the
writing falls short. When it does, the model prevails, because
as far as the members' view is concerned, "the truth" is what the
model says, not what the words say.

The description of semantic rules for pointer arithmetic talks
about situations where "the expression P points [...] to an
element of an array object [...]", but it isn't always clear which
"array object" is meant (in particular in the presence of
allocated memory).

My understanding of what C allows for pointer arithmetic is as
follows. What matters is where the pointer value in question
originally came from. If the original pointer value pointed to
an element of an array (with suitable language to handle the case
of pointing one past the last element of the array), further use
of that pointer value (e.g., by means of casting) is allowed to
access all the memory occupied by the array containing the
element that the original pointer value pointed to. Thus in the
example below the address &foo points to a single-element array
that coincides with all of the memory occupied by foo, and thus
it may access (after the cast) all of the int elements of the
two-dimensional array.

Evidence for this mental model, and for committee members holding
it, can be seen in various official ISO C writings on their
website, when the "provenance" of pointer values is discussed.
My understanding of what C allows here is based partly or perhaps
mostly on those written discussions.

When I say below "an argument could be made...", it doesn't mean
that I feel unsure about my own understanding. What it does mean
is that someone reading just the text in the C standard, and
nothing else, might very well reach a different conclusion. My
comment is meant to acknowledge that such positions may exist,
even though I myself don't find them persuasive.

I hope this explanation clarifies both what I meant and why I
have reached the conclusions that I have.

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
> [edited for brevity]
>
>>> If we have this code fragment
>>>
>>> int foo[10][20];
>>> extern void set_elements( int *, size_t, int );
>>>
>>> set_elements( (int*) &foo, 10*20, -1 );
>>>
>>> an argument could be made that set_elements() cannot use pointer
>>> arithmetic (including that implied by use of []) on its first
>>> argument other than to access between foo[0][0] and foo[0][19] (or
>>> to construct a pointer to foo[0][20]). [...]
>>
>> [...] It's what direct additions and subtractions are permitted
>> for any given pointer that I no longer feel sure about. Your "a
>> case could be made" suggests you are not entirely sure either,
>> though it does suggest you consider that case is a stretch.
>
> The implied question here has a somewhat longish answer. I'll
> get to it when I can. Also, as it seems we have drifted rather
> far from C++, comp.std.c is I think a better place to continue.

(This concludes the quoted excerpt.)
