
Subject                                       Author
* Union C++ standard                          Hans-Peter Diettrich
+- Re: Union C++ standard                     Kaz Kylheku
+- Re: Union C++ standard                     gah4
`* Re: Union C++ standard                     David Brown
 `* Re: Union C++ standard                    Derek Jones
  `* Re: Union C++ standard                   David Brown
   +* Re: Union C++ standard                  Derek Jones
   |`* Re: Union C++ standard                 David Brown
   | `* Re: Union C++ standard                Derek Jones
   |  +* Re: Union C++ standard               George Neuner
   |  |`- Re: Union C++ standard terminology  Derek Jones
   |  `- Re: Union C++ standard               David Brown
   `* Re: Union C++ standard                  Kaz Kylheku
    `- Re: Union C++ standard                 Keith Thompson

Union C++ standard

<21-11-004@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=133&group=comp.compilers#133
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: DrDiettr...@netscape.net (Hans-Peter Diettrich)
Newsgroups: comp.compilers
Subject: Union C++ standard
Date: Thu, 25 Nov 2021 11:11:04 +0100
Organization: Compilers Central
Lines: 12
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-004@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="44177"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 26 Nov 2021 12:31:56 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Hans-Peter Diettrich - Thu, 25 Nov 2021 10:11 UTC

Can somebody explain why the access to members of a union is "undefined"
except for the most recently written member?

What can be undefined in a union of data types of the same type size and
alignment? Any member written will result in a unique bit/byte pattern
in memory, whose reading may not make sense in a different type but
undoubtedly is well defined.

DoDi
[I think it's undefined in a standards sense. In any individual
implementation the result is predictable, but it's not portable. -John]

Re: Union C++ standard

<21-11-006@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=135&group=comp.compilers#135
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: 480-992-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Fri, 26 Nov 2021 18:06:37 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 78
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-006@comp.compilers>
References: <21-11-004@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="54566"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 26 Nov 2021 13:26:59 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Kaz Kylheku - Fri, 26 Nov 2021 18:06 UTC

On 2021-11-25, Hans-Peter Diettrich <DrDiettrich1@netscape.net> wrote:
> Can somebody explain why the access to members of a union is "undefined"
> except for the most recently written member?

I don't think that is true; if two members of a union foo, x and y, have
the same type, then it's possible to write to foo.x and then read foo.y.

> What can be undefined in a union of data types of the same type size and
> alignment?

The representation. Same size and alignment are not sufficient
determiners of type.

For instance, you may find that int and float are of the same size on
the compiler you're using.

If the language does not define what it means to access a float object
through an int lvalue, that allows aggressive optimizations based on
the assumption that type aliasing is absent in the program.

For instance suppose that you have

struct s {
    float *pflo;
    int ival;
};

and a (nonsensical example) function working with a struct s *ptr
parameter:

int fun(struct s *ptr)
{
    ptr->ival++;
    *ptr->pflo = 0;
    return ptr->ival;
}

Under the assumption that objects of different types are not aliased
by the program, the compiler can emit code which:

1. reads ptr->ival
2. stores the incremented value back into ptr->ival
3. stores 0.0 through *ptr->pflo
4. returns the previously incremented value.

Now suppose that aliasing is allowed among any types, like int and
float. The compiler has no idea what ptr->pflo points to. The
caller could easily have set it like this:

ptr->pflo = (float *) &ptr->ival;

So if that is allowed, we cannot emit the code like above. We
must do this:

1. read ptr->ival
2. store the incremented value back into ptr->ival
3. store 0.0 through *ptr->pflo
4. NEW: re-read ptr->ival in case it was changed by 3.
5. return the re-read value.
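
The two sequences above differ only when *ptr->pflo overlaps ptr->ival. A minimal sketch of the non-aliased case, assuming C99 or later; call_without_aliasing is a hypothetical helper and not part of the original post:

```c
struct s {
    float *pflo;
    int ival;
};

/* The function from the post: increment ival, store through pflo,
   then return ival. */
int fun(struct s *ptr)
{
    ptr->ival++;
    *ptr->pflo = 0;
    return ptr->ival;
}

/* Hypothetical helper: pflo points at a separate float object, so no
   int object is ever accessed through a float lvalue. */
int call_without_aliasing(int start)
{
    float separate = 1.5f;
    struct s obj = { &separate, start };
    return fun(&obj);
}
```

Here both instruction sequences return start + 1, which is why the compiler may pick the shorter one. GCC and Clang accept -fno-strict-aliasing for code that needs the conservative re-read behaviour.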

Now that's just one problem. The other is the problem that writing a
value as one type and reading as another, if required to be defined in
terms of bits or whatever, is going to be entirely nonportable
nonetheless. The language standard cannot define it completely to the
point that you can rely on the value being the same when the program is
ported. At best the standard could say that it's implementation-defined
behavior to read through a differently-typed union member.
Implementation-defined is basically "almost-undefined, except the
situation must be documented by the implementor and cannot blow up".
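
One commonly used way to look at the bits without reading a differently-typed member is to copy the object representation with memcpy. A sketch, assuming float is 32 bits wide; float_bits is a hypothetical name:

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Copy the object representation of a float into a same-sized
   unsigned integer and return it. */
uint32_t float_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```

The copy itself is well defined, but the value that comes back still depends on the implementation's float representation. With IEEE 754 binary32, float_bits(1.0f) is 0x3F800000.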

If certain behavior of unions is valuable to the users of a compiler,
they can always negotiate that with their compiler vendor; the
standard doesn't have to be involved in everything that is defined
between the implementor and programmer.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: Union C++ standard

<21-11-007@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=136&group=comp.compilers#136
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gah...@u.washington.edu (gah4)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Fri, 26 Nov 2021 12:16:23 -0800 (PST)
Organization: Compilers Central
Lines: 32
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-007@comp.compilers>
References: <21-11-004@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="43064"; mail-complaints-to="abuse@iecc.com"
Keywords: C, history, standards, comment
Posted-Date: 26 Nov 2021 21:15:22 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-004@comp.compilers>
 by: gah4 - Fri, 26 Nov 2021 20:16 UTC

On Friday, November 26, 2021 at 9:32:00 AM UTC-8, Hans-Peter Diettrich wrote:
> Can somebody explain why the access to members of a union is "undefined"
> except for the most recently written member?

> What can be undefined in a union of data types of the same type size and
> alignment? Any member written will result in a unique bit/byte pattern
> in memory, whose reading may not make sense in a different type but
> undoubtedly is well defined.

In addition to the previously mentioned reasons, which I agree with,
there used to be (maybe still are) machines that tag memory with the
type stored. That makes it very difficult to access memory as bits of
a different type. Some language standards are written to allow for
those machines.

Many languages originated before IEEE floating point, where there
was no expectation that floating point values would agree between
different machines. Even more, some have "trap" values in floating
point, such that one can't reference some values. (VAX has a trap
value for negative zero.) Since the language can't control all this,
it is made undefined. (But otherwise, machine dependent.)

JVM, while not using tags, is defined such that programs don't
do that. The verifier is supposed to catch attempts, even if not
executed, to access memory the wrong way. Among others,
that allows for programs to be endian independent. (long takes
twice as much memory as int, but by refusing such access,
programs can't detect that, and so work on all hardware.)

(Not that it is likely that there will be a C++ compiler for JVM.)
[I think the Unisys Libra series may still run tagged Burroughs
architecture code, but if so I doubt there was ever a C compiler. -John]

Re: Union C++ standard

<21-11-008@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=137&group=comp.compilers#137
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Sat, 27 Nov 2021 16:59:36 +0100
Organization: A noiseless patient Spider
Lines: 24
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-008@comp.compilers>
References: <21-11-004@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="50781"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 27 Nov 2021 14:29:18 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-004@comp.compilers>
Content-Language: en-GB
 by: David Brown - Sat, 27 Nov 2021 15:59 UTC

On 25/11/2021 11:11, Hans-Peter Diettrich wrote:
> Can somebody explain why the access to members of a union is "undefined"
> except for the most recently written member?
>
> What can be undefined in a union of data types of the same type size and
> alignment? Any member written will result in a unique bit/byte pattern
> in memory, whose reading may not make sense in a different type but
> undoubtedly is well defined.
>
> DoDi
> [I think it's undefined in a standards sense.  In any individual
> implementation the result is predictable, but it's not portable. -John]
>

In C++, objects of a class typically have some kind of invariant which
is established by the constructor, and kept consistent when accessed via
its public methods. Messing with the underlying data representation
directly is going to risk losing that - it means you are accessing data
without going through the proper defined interface (the public or
protected methods and members).

In C, type-punning via unions is allowed (i.e., fully defined behaviour
in the standards), but not in C++ where the language is expected to
enforce higher-level aspects of the data.

Re: Union C++ standard

<21-11-009@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=138&group=comp.compilers#138
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: der...@NOSPAM-knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Sun, 28 Nov 2021 12:51:14 +0000
Organization: Compilers Central
Lines: 16
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-009@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="13223"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 28 Nov 2021 12:13:12 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-008@comp.compilers>
Content-Language: en-US
 by: Derek Jones - Sun, 28 Nov 2021 12:51 UTC

David,

> In C, type-punning via unions is allowed (i.e., fully defined behaviour

That is not true. Writing into one member and then reading from
another member is undefined behavior.

There is a special dispensation for what is known as a
common initial sequence:
sentence 1029
http://c0x.shape-of-code.com/6.5.2.3.html

> in the standards), but not in C++ where the language is expected to
> enforce higher-level aspects of the data.

This is a meaningless statement.

Re: Union C++ standard

<21-11-010@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=139&group=comp.compilers#139
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Sun, 28 Nov 2021 19:00:00 +0100
Organization: A noiseless patient Spider
Lines: 79
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-010@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="34846"; mail-complaints-to="abuse@iecc.com"
Keywords: C, types, comment
Posted-Date: 28 Nov 2021 14:42:42 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-009@comp.compilers>
Content-Language: en-GB
 by: David Brown - Sun, 28 Nov 2021 18:00 UTC

On 28/11/2021 13:51, Derek Jones wrote:
> David,
>
>> In C, type-punning via unions is allowed (i.e., fully defined behaviour
>
> That is not true.  Writing into one member and then reading from
> another member is undefined behavior.

No, it is correct. It would be helpful if you looked at the full
published standards, or (as most people do, since they are free) the
final pre-publishing drafts. In particular, they contain the footnotes
that appear to be missing in the format you linked here. Footnotes are
not part of the normative text, but are added for clarification. (Your
reference also misses the standard paragraph numbering, and it is
outdated - not that this particular issue has changed since C was
standardised.)

So, the relevant paragraph is 6.5.2.3p3:

"""
A postfix expression followed by the . operator and an identifier
designates a member of a structure or union object. The value is that of
the named member, 101) and is an lvalue if the first expression is an
lvalue. If the first expression has qualified type, the result has the
so-qualified version of the type of the designated member.
"""

The footnote (101 in C18 - footnote numbers are not consistent between C
standard versions) is:

"""
If the member used to read the contents of a union object is not the
same as the member last used to store a value in the object, the
appropriate part of the object representation of the value is
reinterpreted as an object representation in the new type as described
in 6.2.6 (a process sometimes called "type punning"). This might be a
trap representation.
"""

These quotations are from C18 (draft N2346), which is the current C
standard (until C23 is finalised). They have not changed since C99,
when the footnote was added without a change to the normative text.
This means that as far as the C committee was concerned, using unions
for type-punning has always (since standardisation) been valid in C, but
they realised that the text was unclear and thus added the footnote.
(Arguably, since C90 did not clearly state that type-punning was
defined, the behaviour was in fact undefined - though probably all C
compilers allowed the behaviour.)
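
A minimal sketch of the kind of union type punning the footnote describes, assuming float and uint32_t have the same size; the names pun and float_bits_via_union are illustrative only:

```c
#include <stdint.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

union pun {
    float f;
    uint32_t u;
};

/* Store through member f, then read member u: the object
   representation is reinterpreted in the new type. */
uint32_t float_bits_via_union(float value)
{
    union pun p;
    p.f = value;
    return p.u;
}
```

Which bits come back depends on the implementation's representation of float, so the result is not portable even where the access is defined. On an IEEE 754 binary32 implementation, 1.0f reads back as 0x3F800000.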

> There is a special dispensation for what is known as a
> common initial sequence:
> sentence 1029
> http://c0x.shape-of-code.com/6.5.2.3.html

This is an additional guarantee that has a longer history - it
specifically allows a particular type of access that had been in regular
use from before C was standardised.
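
A sketch of the common initial sequence guarantee, using hypothetical type and function names: both structures start with the same tag member, so the tag may be inspected through either one while the union holds one of them.

```c
enum shape_kind { KIND_CIRCLE, KIND_RECT };

/* Both structures begin with the same member: a common initial sequence. */
struct circle { enum shape_kind kind; double radius; };
struct rect   { enum shape_kind kind; double width, height; };

union shape {
    struct circle c;
    struct rect r;
};

/* Store the rect member of the union. */
union shape make_rect(double w, double h)
{
    union shape s;
    s.r = (struct rect){ KIND_RECT, w, h };
    return s;
}

/* Inspect the tag through member c, whichever member was stored. */
enum shape_kind shape_kind_of(const union shape *s)
{
    return s->c.kind;
}
```

The declaration of the completed union type is visible where the tag is read, which is the condition the standard attaches to this guarantee.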

>> in the standards), but not in C++ where the language is expected to
>> enforce higher-level aspects of the data.
>
> This is a meaningless statement.

I disagree, but perhaps that is subjective. In C++, accessing a member
of a union other than the most recently written (or "active") member
is undefined behaviour, unless it matches the "initial sequence" exception.

Some useful references are:

<https://en.cppreference.com/w/c/language/union>
<https://en.cppreference.com/w/cpp/language/union>

While that site does not have the weight of the C or C++ standards, it
is supported by and contributed to by the C and C++ standards committees
and their ISO working groups. The site does not get that kind of thing
wrong.
[I see what the standard says, but I don't see how reinterpreting the
bits from one type to another can be fully defined. I've certainly done
it, but it never seemed very portable. -John]

Re: Union C++ standard

<21-11-011@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=140&group=comp.compilers#140
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: der...@NOSPAM-knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Mon, 29 Nov 2021 00:09:39 +0000
Organization: Compilers Central
Lines: 31
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-011@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers> <21-11-010@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="99482"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 28 Nov 2021 22:18:40 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-010@comp.compilers>
Content-Language: en-US
 by: Derek Jones - Mon, 29 Nov 2021 00:09 UTC

David,

>>> In C, type-punning via unions is allowed (i.e., fully defined behaviour
>>
>> That is not true.  Writing into one member and then reading from
>> another member is undefined behavior.
>
> No, it is correct. It would be helpful if you looked at the full

You have misunderstood the C conformance model, which revolves around
the use of "shall" and "shall not", and the kind of section in which
they appear (e.g., Constraints). See:
http://c0x.shape-of-code.com/4..html

For a longer discussion see: http://knosof.co.uk/cbook/

> """
> If the member used to read the contents of a union object is not the
> same as the member last used to store a value in the object, the
> appropriate part of the object representation of the value is
> reinterpreted as an object representation in the new type as described
> in 6.2.6 (a process sometimes called "type punning"). This might be a
> trap representation.
> """
>
> These quotations are from C18 (draft N2346), which is the current C
> standard (until C23 is finalised). They have not changed since C99,

This footnote was added in response to this DR (so it must have come
after C99):
http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_283.htm

Re: Union C++ standard

<21-11-012@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=141&group=comp.compilers#141
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: 480-992-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Mon, 29 Nov 2021 16:39:03 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 73
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-012@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers> <21-11-010@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="38682"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 29 Nov 2021 11:48:55 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Kaz Kylheku - Mon, 29 Nov 2021 16:39 UTC

On 2021-11-28, David Brown <david.brown@hesbynett.no> wrote:
> On 28/11/2021 13:51, Derek Jones wrote:
>> David,
>>
>>> In C, type-punning via unions is allowed (i.e., fully defined behaviour
>>
>> That is not true.  Writing into one member and then reading from
>> another member is undefined behavior.
>
> No, it is correct. It would be helpful if you looked at the full
> published standards, or (as most people do, since they are free) the
> final pre-publishing drafts. In particular, they contain the footnotes
> that appear to be missing in the format you linked here. Footnotes are
> not part of the normative text, but are added for clarification.

Not being normative means that anything that looks like a requirement
that is in a footnote is not actually a requirement.

Footnotes can only clarify requirements that are not themselves in a
footnote; they can't add new requirements.

If you cannot infer the existence of a requirement while ignoring all
footnotes and examples, it isn't there.

> These quotations are from C18 (draft N2346), which is the current C
> standard (until C23 is finalised). They have not changed since C99,
> when the footnote was added without a change to the normative text.

I have a copy of C99 (the final thing from ANSI, not a draft); I do not
see any such footnote; the paragraph has no footnotes.

> This means that as far as the C committee was concerned, using unions
> for type-punning has always (since standardisation) been valid in C,

Note that the "trap representation" terminology didn't exist prior to
C99, so any footnote referencing such a thing cannot possibly reflect
any intent about what C was going back to before standardization,
let alone some new footnote since C99.

> they realised that the text was unclear and thus added the footnote.
> (Arguably, since C90 did not clearly state that type-punning was
> defined, the behaviour was in fact undefined - though probably all C
> compilers allowed the behaviour.)

The concept of trap representations adds nuance to the requirements
for accessing objects; it doesn't make everything defined.

The trap concept is used to create a new (in C99) model of why accessing
an object as an array of "unsigned char" is okay: the unsigned char type
has no trap representation, so all combinations of bit patterns give rise
to a valid value.

For instance, if we have a union like this:

union u {
    int x;
    unsigned char y[sizeof (int)];
};

then certain requirements can be inferred if we store x and access y[0],
based on knowing the implementation's parameters, like size and
representation of the integer and byte order.
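
A sketch along these lines, using uint32_t instead of int so that the size is fixed (an assumption made for illustration) and assuming 8-bit bytes; u32_bytes and first_byte are hypothetical names:

```c
#include <stdint.h>

union u32_bytes {
    uint32_t x;
    unsigned char y[sizeof(uint32_t)];
};

/* Store x, then read the first byte of its object representation.
   unsigned char has no trap representations, so the read always
   yields a valid value. */
unsigned char first_byte(uint32_t v)
{
    union u32_bytes u;
    u.x = v;
    return u.y[0];
}
```

For 0x01020304 this returns 0x04 on a little-endian implementation and 0x01 on a big-endian one, so the result can be inferred only once the byte order is known.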

I don't believe that the footnote you quoted gives any special blessing
to union-based type punning over pointer-based type punning. It just
clarifies the fact that unions are type punning, subject to the same
requirements as any other type punning. It points the reader to section
6.2.6 where the real requirements are, using which all instances of
type punning are to be interpreted.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: Union C++ standard

<21-11-013@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=142&group=comp.compilers#142
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Mon, 29 Nov 2021 21:00:06 +0100
Organization: A noiseless patient Spider
Lines: 47
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-013@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers> <21-11-010@comp.compilers> <21-11-011@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="42763"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 29 Nov 2021 16:50:48 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-011@comp.compilers>
Content-Language: en-GB
 by: David Brown - Mon, 29 Nov 2021 20:00 UTC

On 29/11/2021 01:09, Derek Jones wrote:
> David,
>
>>>> In C, type-punning via unions is allowed (i.e., fully defined behaviour
>>>
>>> That is not true.  Writing into one member and then reading from
>>> another member is undefined behavior.
>>
>> No, it is correct.  It would be helpful if you looked at the full
>
> You have misunderstood the C conformance model, which revolves around
> the use of "shall" and "shall not", and the kind of section in which
> they appear (e.g., Constraints).  See:
> http://c0x.shape-of-code.com/4..html
>
> For a longer discussion see: http://knosof.co.uk/cbook/

I was not aware of your qualifications when I posted earlier - you have
been directly involved in things that I can only infer from reading the
standards and other material.

Let me put it this way. Those of us who read the C standards, but were
not involved in writing them, do our best to interpret the precise
meaning of the words in the normative text. Those meanings are not
always clear. When we see examples or footnotes, we know they were
added by the same people that wrote the standard, and are used as
clarification for the meaning of the normative text. The footnote
(added, as you say, in a C99 TC in response to a defect report - and
therefore AIUI as much part of C99 standard as the original published
text since the TCs replace previous versions) makes it perfectly clear
that type-punning via unions is defined behaviour in C. This codifies
the existing practice supported by most (if not all) C compilers, and
relied upon by code.

Since the change was a clarifying footnote, not a change to the
normative text, the implication is that the normative text was always
intended to support these semantics. The only three alternatives I see
to that are that the footnote was added by some committee members who
disagreed with what other committee members wrote in the normative text,
that the committee changed their minds about union semantics but did not
change the normative text, or that the footnote was deliberately added
to confuse people. None of these alternatives is appealing.

If my reasoning here is faulty, I'd be grateful if you could point out
the flaw.

David

Re: Union C++ standard

<21-11-014@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=143&group=comp.compilers#143
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: Keith.S....@gmail.com (Keith Thompson)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Mon, 29 Nov 2021 14:32:21 -0800
Organization: None to speak of
Lines: 62
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-014@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers> <21-11-010@comp.compilers> <21-11-012@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="7255"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards, comment
Posted-Date: 29 Nov 2021 18:09:01 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Keith Thompson - Mon, 29 Nov 2021 22:32 UTC

Kaz Kylheku <480-992-1380@kylheku.com> writes:
> On 2021-11-28, David Brown <david.brown@hesbynett.no> wrote:
>> On 28/11/2021 13:51, Derek Jones wrote:
>>> David,
>>>
>>>> In C, type-punning via unions is allowed (i.e., fully defined behaviour
>>>
>>> That is not true.  Writing into one member and then reading from
>>> another member is undefined behavior.
>>
>> No, it is correct. It would be helpful if you looked at the full
>> published standards, or (as most people do, since they are free) the
>> final pre-publishing drafts. In particular, they contain the footnotes
>> that appear to be missing in the format you linked here. Footnotes are
>> not part of the normative text, but are added for clarification.
>
> Not being normative means that anything that looks like a requirement
> that is in a footnote is not actually a requirement.
>
> Footnotes can only clarify requirements that are not themselves in a
> footnote; they can't add new requirements.
>
> If you cannot infer the existence of a requirement while ignoring all
> footnotes and examples, it isn't there.
>
>> These quotations are from C18 (draft N2346), which is the current C
>> standard (until C23 is finalised). They have not changed since C99,
>> when the footnote was added without a change to the normative text.

I don't think N2346 is a draft of "C18". The page headers say:

N2346 working draft — March 13, 2019 ISO/IEC 9899:202x (E)

The current C standard is usually referred to as "C17"; it's a minor
update to C11. Work is in progress on C2X, which will supersede C17.

> I have a copy of C99 (the final thing from ANSI, not a draft); I do not
> see any such footnote; the paragraph has no footnotes.

(It was published by ISO. ANSI adopted it and sold copies.)

The footnote does not appear in the published 1999 ISO C standard. It
was added by Technical Corrigendum 3, published in 2007 (and therefore
in the N1256 draft, which includes the 1999 standard with the three
Technical Corrigenda merged into it). As far as ISO is concerned, it's
part of C99 -- and of later editions of the standard, which have
superseded C99.

[...]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */
[I think this horse has been beaten about all it needs to. And
it reminds us that just because C says something is defined does
not mean it's portable or that the results are easily predictable.

This is hardly the only place, e.g., the exciting range of things
that might or might not happen if the result of an integer operation
doesn't fit in its result type. It might wrap, it might overflow,
it might saturate. Or it might not. -John]

Re: Union C++ standard

<21-11-015@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=144&group=comp.compilers#144

Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: der...@NOSPAM-knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Tue, 30 Nov 2021 00:46:04 +0000
Organization: Compilers Central
Lines: 39
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-015@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers> <21-11-010@comp.compilers> <21-11-011@comp.compilers> <21-11-013@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="46060"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 30 Nov 2021 13:17:07 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-013@comp.compilers>
Content-Language: en-US
 by: Derek Jones - Tue, 30 Nov 2021 00:46 UTC

David,

> I was not aware of your qualifications when I posted earlier - you have
> been directly involved in things that I can only infer from reading the
> standards and other material.

You should always infer meaning by reading from the standard, never
defer to anybody arguing from authority.

> Let me put it this way. Those of us who read the C standards, but were
> not involved in writing them, do our best to interpret the precise
> meaning of the words in the normative text. Those meanings are not
> always clear.

You have made the mistake of reading the standard as "plain English".
Almost everybody falls into this trap when they start out.
In fact the standard is a stylized version of English, with some phrases
specified to have a given meaning in specific contexts.

As the committee is always saying, the standard is not intended as
a tutorial. You probably need to read it three or four times to
get an idea of how it fits together (there is a strange logic to it).

Start by understanding how the text is styled.

The Conformance section specifies how "shall" and "shall not" are to be
interpreted.

You also need to understand "unspecified behaviors" and "undefined behaviors".

See Kaz Kylheku's discussion of the status of footnotes.

You need to trace a legalistic top down approach (which takes
practice).

There are people actively discussing standard C on comp.std.c

Footnotes state the obvious when it is not obvious to somebody.
They are also an enormous source of confusion and best ignored.

Re: Union C++ standard

<21-11-016@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=145&group=comp.compilers#145

Newsgroups: comp.compilers
Path: i2pn2.org!rocksolid2!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gneun...@comcast.net (George Neuner)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Tue, 30 Nov 2021 17:18:35 -0500
Organization: A noiseless patient Spider
Lines: 41
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-016@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers> <21-11-010@comp.compilers> <21-11-011@comp.compilers> <21-11-013@comp.compilers> <21-11-015@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="35941"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 30 Nov 2021 18:53:52 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: George Neuner - Tue, 30 Nov 2021 22:18 UTC

On Tue, 30 Nov 2021 00:46:04 +0000, Derek Jones
<derek@NOSPAM-knosof.co.uk> wrote:

>You have made the mistake of reading the standard as "plain English".
>Almost everybody falls into this trap when they start out.
>In fact the standard is a stylized version of English, with some phrases
>specified to have a given meaning in specific contexts.
>
>As the committee is always saying, the standard is not intended as
>a tutorial. You probably need to read it three or four times to
>get an idea of how it fits together (there is a strange logic to it).
>
>Start by understanding how the text is styled.
>
>The Conformance section specifies how "shall" and "shall not" are to be
>interpreted.

But it does NOT define "will" and "will not", and "must" and "must
not", and "does" and "does not" ... terms which are used liberally in
the documents, apparently without having any normative definition.

Not to mention that the Conformance section generally is not included
in draft documents. Nor are there easy to find, freely available,
references on how to read various standards documents.

A great many programmers are in work situations which can't support
purchasing every official document that might apply.

>You also need to understand "unspecified behaviors" and "undefined behaviors".
>
>See Kaz Kylheku's discussion of the status of footnotes.
>
>You need to trace a legalistic top down approach (which takes practice).
>
>There are people actively discussing standard C on comp.std.c
>
>Footnotes state the obvious when it is not obvious to somebody.
>They are also an enormous source of confusion and best ignored.

YMMV,
George

Re: Union C++ standard

<21-11-017@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=146&group=comp.compilers#146

Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard
Date: Tue, 30 Nov 2021 23:24:07 +0100
Organization: A noiseless patient Spider
Lines: 112
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-11-017@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers> <21-11-010@comp.compilers> <21-11-011@comp.compilers> <21-11-013@comp.compilers> <21-11-015@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="36342"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 30 Nov 2021 18:54:47 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-015@comp.compilers>
Content-Language: en-GB
 by: David Brown - Tue, 30 Nov 2021 22:24 UTC

On 30/11/2021 01:46, Derek Jones wrote:
> David,
>
>> I was not aware of your qualifications when I posted earlier - you have
>> been directly involved in things that I can only infer from reading the
>> standards and other material.
>
> You should always infer meaning by reading from the standard, never
> defer to anybody arguing from authority.

But it helps to listen to people or read sources that have established a
position of respect and a reputation for reliability. Of course anyone
can get things wrong. If several people whom I know to be experts in a
language all agree, but have an interpretation different from what I
read in the standard, then I must suspect my own interpretation - at the
very least, it warrants further investigation and discussion.

In this case, you are - I assume - a person with a high degree of
experience and knowledge of the C standards. I don't know you well
enough to judge for myself, having only read a few of your posts, but
your qualifications are significant. Your interpretation of the
standard here differs from mine, and from how I have seen many other
experts and reliable resources interpret it. So I am not deferring to
anyone - I am reading the standard. But I am asking to help figure out
if I am reading the standard correctly!

>> Let me put it this way.  Those of us who read the C standards, but were
>> not involved in writing them, do our best to interpret the precise
>> meaning of the words in the normative text.  Those meanings are not
>> always clear.
>
> You have made the mistake of reading the standard as "plain English".
> Almost everybody falls into this trap when they start out.
> In fact the standard is a stylized version of English, with some phrases
> specified to have a given meaning in specific contexts.
>

I am not making that mistake - at least, not as a general point. I am
well aware of the specialised and stylized language used in the
standard, and how specific terms and phrases can have meanings that are
not "plain English". It is, however, possible that I am making an error
in the interpretation of this particular issue. Just as I do not know
you, you do not know me - I've been studying and discussing the C
standards for a great many years, and I am not new to it. (Again, that
experience does not mean I think I know everything about it or that my
interpretation of it is flawless.) My programming field is quite
specialised - small-systems embedded programming - and I have not
bothered about parts of the standard that are not relevant there. But
I've gone through a lot of the "meat" of the documents, many times.

> As the committee is always saying, the standard is not intended as
> a tutorial.  You probably need to read it three or four times to
> get an idea of how it fits together (there is a strange logic to it).
>
> Start by understanding how the text is styled.
>
> The Conformance section specifies how "shall" and "shall not" are to be
> interpreted.
>

Yes.

> You also need to understand "unspecified behaviors" and "undefined
> behaviors".
>

I know.

> See Kaz Kylheku's discussion of the status of footnotes.
>

I did. I agree that the footnotes are not normative, but I don't agree
with his interpretation of the footnote. The footnote says very clearly
that type-punning using a union is defined as storing a value of one
type in the representation (given in 6.2.6, along with
implementation-dependent details), then re-interpreting that
representation as the new type when read.
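
A minimal sketch of the type-punning described here, assuming a C11 or later compiler and a float that is the same size as uint32_t. The union and helper names are invented for illustration. The bit pattern obtained is implementation-specific, as noted above.

```c
#include <stdint.h>

/* Assumption for this sketch: float and uint32_t have the same size. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "sketch assumes a 32-bit float");

union float_bits {
    float    f;
    uint32_t u;
};

/* Hypothetical helper: store a float, read the same bytes as uint32_t. */
uint32_t float_to_bits(float value)
{
    union float_bits pun;
    pun.f = value;   /* store a value of one type */
    return pun.u;    /* reinterpret the stored representation as the other */
}

/* Hypothetical helper: the reverse direction. */
float bits_to_float(uint32_t bits)
{
    union float_bits pun;
    pun.u = bits;
    return pun.f;
}
```

On an implementation that uses IEEE 754 single precision for float, float_to_bits(1.0f) would be expected to give 0x3F800000, but that depends on the implementation and is not promised by the C standard.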

> You need to trace a legalistic top down approach (which takes
> practice).
>
> There are people actively discussing standard C on comp.std.c
>

I follow that group.

> Footnotes state the obvious when it is not obvious to somebody.
> They are also an enormous source of confusion and best ignored.

I'm sorry, but none of what you wrote comes close to answering my
question. Your response merely says that the standard is written in
"standardese" and must be read appropriately - had I been new to the C
standards, it would have been useful.

The footnote was added specifically as a TC based on a DR that would
have been voted on, and it has been left untouched through three
revisions of the standard despite being arguably a critical part of the
language that many programs rely on. Either the footnote accurately
describes the "rules" of the language - that type-punning via unions is
defined behaviour that can be relied upon (albeit with some
implementation-specific details, and scope for undefined behaviour from
accessing trap representations - such as storing 2 to a char, then
reading it as a _Bool), or the footnote was added deliberately and
intentionally with the aim of confusing and misleading people.

I find it hard to believe the latter - despite that being your suggestion
here.
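
The trap-representation caveat mentioned above (storing 2 to a char, then reading it as a _Bool) can be sketched as follows. This is an illustration under assumptions, not something from the standard: it assumes a one-byte _Bool whose only valid object representations are the byte values 0 and 1, which is typical but not required. The union and function names are hypothetical.

```c
#include <stdbool.h>

/* Assumption for this sketch: _Bool occupies exactly one byte. */
_Static_assert(sizeof(bool) == sizeof(unsigned char),
               "sketch assumes a one-byte _Bool");

union byte_or_bool {
    unsigned char c;
    bool          b;
};

/* Hypothetical helper: reads the byte through the _Bool member only when
   it holds 0 or 1. Any other value is treated as a possible trap
   representation, so the _Bool member is never read in that case. */
bool read_bool_checked(unsigned char byte, bool *out)
{
    union byte_or_bool pun;
    pun.c = byte;
    if (pun.c > 1)
        return false;   /* refuse: reading pun.b could be undefined */
    *out = pun.b;
    return true;
}
```

The check is done on the unsigned char member because every bit pattern is a valid unsigned char value, so that read is always safe.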

Again, if you see a flaw in my reasoning, please say.

David

Re: Union C++ standard terminology

<21-12-001@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=147&group=comp.compilers#147

Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: der...@knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Re: Union C++ standard terminology
Date: Wed, 1 Dec 2021 13:35:57 +0000
Organization: Compilers Central
Lines: 47
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-12-001@comp.compilers>
References: <21-11-004@comp.compilers> <21-11-008@comp.compilers> <21-11-009@comp.compilers> <21-11-010@comp.compilers> <21-11-011@comp.compilers> <21-11-013@comp.compilers> <21-11-015@comp.compilers> <21-11-016@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="65296"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 05 Dec 2021 13:29:09 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-11-016@comp.compilers>
Content-Language: en-US
 by: Derek Jones - Wed, 1 Dec 2021 13:35 UTC

George,

>> The Conformance section specifies how "shall" and "shall not" are to be
>> interpreted.
>
> But it does NOT define "will" and "will not", and "must" and "must
> not", and "does" and "does not" ... terms which are used liberally in
> the documents, apparently without having any normative definition.

The ISO directives say:
'Do not use "must" as an alternative for "shall".'
https://isotc.iso.org/livelink/livelink?func=ll&objId=4230456&objAction=browse&sort=subtype
Although the IETF treats the terms similarly:
https://www.ietf.org/rfc/rfc2119.txt

My recollection is that the ISO directives used to strongly recommend
against the use of any form of "must".

The get-out-of-jail answer is to point out that
"ISO/IEC 2382-1:1993, Information technology — Vocabulary — Part 1:
Fundamental terms"
appears in the list of Normative references.
Back when libraries used to contain paper documents, I spent an
afternoon rummaging around the various parts of ISO 2382.
I was surprised to find out how few terms are defined and how
vague/general the definitions actually were.

I have been in committee meetings where people said the term was
defined in ISO 2382, we found out that it wasn't, then everybody
switched to saying: "Ok, common usage English applies" (whatever
that is; the "Longman Grammar of Spoken and Written English" is
great, but out of print, see the student edition).

There is one occurrence of the word "must" in the standard, in an
example.

"does not" is common, mostly in examples and footnotes.
The instances I have looked at look reasonable, e.g.,
"Each ? that does not begin one of the trigraphs..."

The three instances of "will not" all appear in footnotes.

> Not to mention that the Conformance section generally is not included
> in draft documents. Nor are there easy to find, freely available,
> references on how to read various standards documents.

It appears in every copy of the draft standard I have seen.
