
Subject (Author)
* Re: for or against equality, was Why are ambiguous grammars usually a bad idea? (Martin Ward)
`* Re: for or against equality, was Why are ambiguous grammars usually a bad idea? (David Brown)
 +* Re: what is defined, was for or against equality (Thomas Koenig)
 |+- Re: what is defined, was for or against equality (David Brown)
 |`* Re: what is defined, was for or against equality (Spiros Bousbouras)
 | `* Re: what is defined, was for or against equality (Thomas Koenig)
 |  +* Re: what is defined, was for or against equality (Spiros Bousbouras)
 |  |`* Re: what is defined, was for or against equality (Thomas Koenig)
 |  | `- Re: what is defined, was for or against equality (Spiros Bousbouras)
 |  +* Re: what is defined, was for or against equality (David Brown)
 |  |`* Re: what is defined, was for or against equality (Thomas Koenig)
 |  | `* Re: what is defined, was for or against equality (David Brown)
 |  |  `* Re: what is defined, was for or against equality (Kaz Kylheku)
 |  |   `* Re: what is defined, was for or against equality (gah4)
 |  |    `* Re: what is defined, was for or against equality (Thomas Koenig)
 |  |     +- Re: what is defined, was for or against equality (David Brown)
 |  |     `- Re: what is defined, was for or against equality (Thomas Koenig)
 |  `- Re: what is defined, was for or against equality (gah4)
 +- Re: for or against equality, was Why are ambiguous grammars usually a bad idea? (Robert Prins)
 +* Undefined behaviour, was: for or against equality (Martin Ward)
 |`- Re: Undefined behaviour, was: for or against equality (Spiros Bousbouras)
 `* Re: Undefined behaviour, was: for or against equality (David Brown)
  `* Re: Undefined behaviour, was: for or against equality (Anton Ertl)
   +- Re: Undefined behaviour, was: for or against equality (David Brown)
   `* Re: Undefined behaviour, was: for or against equality (Kaz Kylheku)
    `- Re: Undefined behaviour, was: for or against equality (George Neuner)

Re: for or against equality, was Why are ambiguous grammars usually a bad idea?

<22-01-016@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=178&group=comp.compilers#178

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: mar...@gkc.org.uk (Martin Ward)
Newsgroups: comp.compilers
Subject: Re: for or against equality, was Why are ambiguous grammars usually a bad idea?
Date: Wed, 5 Jan 2022 10:25:37 +0000
Organization: Compilers Central
Lines: 48
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-016@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="84231"; mail-complaints-to="abuse@iecc.com"
Keywords: PL/I, history, comment
Posted-Date: 05 Jan 2022 12:11:32 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk>
Content-Language: en-GB
 by: Martin Ward - Wed, 5 Jan 2022 10:25 UTC

On 04/01/2022 21:26, gah4 wrote:
> Stories are that COBOL programmers always
> keep the list of reserved words nearby, to avoid using them.

Our esteemed moderator claims:
> [COBOL doesn't have that many reserved words

I count 510 reserved words in IBM COBOL. Adding a few other dialects
can push the total to 700 or more. By comparison, C has about 32
reserved words.

The story I heard was of a COBOL shop where it was mandatory to
include a hyphen in every data name: in effect, *every* unhyphenated
word was treated as a reserved word. The slightly more manageable list
of *hyphenated* reserved words (149 in IBM COBOL, but 46 of these are
of the form COMP-0, COMP-1, COMP-2 etc) was printed out and posted on
the wall.

I just noticed that if you include a digit in the part of the name
before the first hyphen, you can guarantee to avoid all
the reserved words!

PL/I went to the other extreme of no reserved words in reaction
to COBOL. Also, the aim of PL/I was to be a language which does
everything: business programming (like COBOL) and scientific
programming (like FORTRAN). In theory, if you only wanted
to do, say, business programming, you only needed to learn
part of the language and you would not get tripped up by keywords
from the other part of the language that you didn't know about yet.

Using a language that you don't know in its entirety might seem
dangerous, but everybody seems to do it these days:
how many C programmers have read the entire 500+ pages of
the latest C standard and memorised the 200+ varieties
of "undefined behaviour" so that they can avoid all of them
in every line of code that they write?
--
Martin

Dr Martin Ward | Email: martin@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4
[IBM hoped everyone would switch from Fortran and COBOL to PL/I and
it was obvious Fortran programmers would not put up with reserved
words, particularly ones unrelated to scientific programming.
As far as the size of languages, that seems a matter of point of
view. Python is a large language if you consider the standard
library to be part of the language, a very small one if you don't.
-John]

Re: for or against equality, was Why are ambiguous grammars usually a bad idea?

<22-01-018@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=179&group=comp.compilers#179

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: for or against equality, was Why are ambiguous grammars usually a bad idea?
Date: Thu, 6 Jan 2022 09:11:29 +0100
Organization: A noiseless patient Spider
Lines: 50
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-018@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="22569"; mail-complaints-to="abuse@iecc.com"
Keywords: design
Posted-Date: 06 Jan 2022 10:49:10 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-016@comp.compilers>
Content-Language: en-GB
 by: David Brown - Thu, 6 Jan 2022 08:11 UTC

On 05/01/2022 11:25, Martin Ward wrote:

> Using a language that you don't know in its entirety might seem
> dangerous, but everybody seems to do it these days:
> how many C programmers have read the entire 500+ pages of
> the latest C standard and memorised the 200+ varieties
> of "undefined behaviour" so that they can avoid all of them
> in every line of code that they write?

I think it is normal not to know everything about the language you use.
And if you include the language's standard library, then there are very
few currently used languages where it would even be possible to learn it
all. By the time you learned all of the language and default libraries
of C++, Java, Python, etc., there would be a new version out and you'd
have more to learn.

The important things for writing code are to know enough to be able to
write the kind of code you are doing, and to avoid accidentally doing
things you didn't intend. Static warning tools are vital here - from
syntax-highlighting and check-as-you-type editors and IDEs, through
compiler warning flags, to stand-alone checkers. Your tools should
tell you if you are accidentally using a reserved word as an
identifier.

There is no need to memorize undefined behaviours for a language -
indeed, such a thing is impossible since everything not defined by a
language standard is, by definition, undefined behaviour. (C and C++
are not special here - the unusual thing is just that their standards
say this explicitly.)

The trick is to memorize the /defined/ behaviours, and stick to them.
You generally don't need to know if a language leaves (1 / 0) as
undefined, or gives a specific value, or prints an error message -
usually it is sufficient to know the values for which (x / y) /is/
defined, and stick to those values.
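
A minimal sketch of what "knowing the values for which (x / y) is defined" can look like in C. The helper name checked_div is hypothetical, not from the thread. For int operands the quotient is defined unless the divisor is zero or the result is not representable (INT_MIN / -1 on two's complement machines), so the helper refuses exactly those two cases.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: divides only when C defines the result.
   Returns false (and leaves *out untouched) otherwise. */
bool checked_div(int x, int y, int *out)
{
    if (y == 0)
        return false;               /* division by zero is undefined */
    if (x == INT_MIN && y == -1)
        return false;               /* quotient does not fit in an int */
    *out = x / y;                   /* defined: truncates toward zero */
    return true;
}
```

A caller that has already established y > 0 from its own input checks does not need the helper at all, which is the point being made above.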

Basically, trying to execute undefined behaviour is no more and no less
than a bug in the program - whether it is "undefined" in terms of the
language, the library, the code you wrote yourself, the customer's
specification, or anything else. People program primarily by trying to
write correct code - not by trying to think of all the ways they could
write incorrect code!

The real challenge from big languages and big standard libraries is not
/writing/ code, it is /reading/ it. It doesn't really matter if a C
programmer, when writing some code, does not know what the syntax "void
foo(int a[static 10]);" means. (Most C programmers don't know it, and
never miss it.) But it can be a problem if they have to read and
understand code that uses something they don't know.
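
For readers who have not met that syntax, a small sketch of what it means. The function name sum_first_ten is made up for illustration. In a parameter declaration, "static 10" is a promise by the caller that the argument points to the first element of an array with at least 10 elements.

```c
#include <stddef.h>

/* The caller must pass a non-null pointer to at least 10 ints;
   passing fewer is undefined behaviour, which lets the compiler
   read a[0]..a[9] without any checks. */
int sum_first_ten(const int a[static 10])
{
    int total = 0;
    for (size_t i = 0; i < 10; i++)
        total += a[i];
    return total;
}
```

Passing a longer array is fine; only the first ten elements are read here. Some compilers warn when they can see that the argument is too short or null, but the standard does not require a diagnostic.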

Re: what is defined, was for or against equality

<22-01-020@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=180&group=comp.compilers#180

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
Organization: news.netcologne.de
Lines: 33
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-020@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="46644"; mail-complaints-to="abuse@iecc.com"
Keywords: design, standards
Posted-Date: 06 Jan 2022 13:11:20 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Thomas Koenig - Thu, 6 Jan 2022 16:43 UTC

David Brown <david.brown@hesbynett.no> schrieb:

> There is no need to memorize undefined behaviours for a language -
> indeed, such a thing is impossible since everything not defined by a
> language standard is, by definition, undefined behaviour. (C and C++
> are not special here - the unusual thing is just that their standards
> say this explicitly.)

This is a rather C-centric view of things. The Fortran standard
uses a different model.

There are constraints, which are numbered. Any violation of such
a constraint needs to be reported by the compiler ("processor",
in Fortran parlance). If it fails to do so, this is a bug in
the compiler.

There are also phrases which have "shall" or "shall not". If this
is violated, this is an error in the program. Catching such a
violation is a good thing from quality of implementation standpoint,
but is not required. Many run-time errors such as array overruns
fall into this category.

[...]

> The real challenge from big languages and big standard libraries is not
> /writing/ code, it is /reading/ it. It doesn't really matter if a C
> programmer, when writing some code, does not know what the syntax "void
> foo(int a[static 10]);" means. (Most C programmers don't know it, and
> never miss it.) But it can be a problem if they have to read and
> understand code that uses something they don't know.

Agreed.

Re: for or against equality, was Why are ambiguous grammars usually a bad idea?

<22-01-021@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=181&group=comp.compilers#181

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: rob...@prino.org (Robert Prins)
Newsgroups: comp.compilers
Subject: Re: for or against equality, was Why are ambiguous grammars usually a bad idea?
Date: Thu, 6 Jan 2022 19:07:20 +0000
Organization: A noiseless patient Spider
Lines: 20
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-021@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="47503"; mail-complaints-to="abuse@iecc.com"
Keywords: design, comment
Posted-Date: 06 Jan 2022 13:14:07 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Robert Prins - Thu, 6 Jan 2022 19:07 UTC

On 2022-01-06 08:11, David Brown wrote:
> On 05/01/2022 11:25, Martin Ward wrote:
>
> Your tools should tell you if you are accidentally using a reserved word as an
> identifier.

Your language should not have reserved words, if PL/I (AD 1964) could already do
without them...

'nuff said!

Robert
--
Robert AH Prins
robert(a)prino(d)org
The hitchhiking grandfather - https://prino.neocities.org/
Some REXX code for use on z/OS - https://prino.neocities.org/zOS/zOS-Tools.html
[Just because it's possible to do something doesn't mean it is a good idea. A
lot of us think a reasonable number of reserved words are fine and make it less
likely that a typo will silently change the meaning of a program. -John]

Re: what is defined, was for or against equality

<22-01-026@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=185&group=comp.compilers#185

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Fri, 7 Jan 2022 12:06:12 +0100
Organization: A noiseless patient Spider
Lines: 68
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-026@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="30588"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, semantics
Posted-Date: 07 Jan 2022 20:24:03 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-020@comp.compilers>
Content-Language: en-GB
 by: David Brown - Fri, 7 Jan 2022 11:06 UTC

On 06/01/2022 17:43, Thomas Koenig wrote:
> David Brown <david.brown@hesbynett.no> schrieb:
>
>> There is no need to memorize undefined behaviours for a language -
>> indeed, such a thing is impossible since everything not defined by a
>> language standard is, by definition, undefined behaviour. (C and C++
>> are not special here - the unusual thing is just that their standards
>> say this explicitly.)
>
> This is a rather C-centric view of things. The Fortran standard
> uses a different model.
>
> There are constraints, which are numbered. Any violation of such
> a constraint needs to be reported by the compiler ("processor",
> in Fortran parlance). If it fails to do so, this is a bug in
> the compiler.

C has basically the same concept.

(IIRC, C++ has a few constraints, such as the "one definition rule",
where the standard says no diagnostics are necessary, because
identifying the error would mean the compiler has to see multiple
translation units at once. Compilers often diagnose these if they have
some kind of link-time optimisation or program-at-once mode.)

>
> There are also phrases which have "shall" or "shall not". If this
> is violated, this is an error in the program. Catching such a
> violation is a good thing from quality of implementation standpoint,
> but is not required. Many run-time errors such as array overruns
> fall into this category.

That is the same in C. From 4.2 "Conformance" :

"""
If a “shall” or “shall not” requirement that appears outside of a
constraint or runtime-constraint is violated, the behavior is undefined.
Undefined behavior is otherwise indicated in this International Standard
by the words “undefined behavior” or by the omission of any explicit
definition of behavior. There is no difference in emphasis among these
three; they all describe “behavior that is undefined”.
"""

The only difference I see from what you describe of Fortran (I have not
read any Fortran standards) is that the C standards also note that
behaviour that is not defined in the standards is undefined behaviour as
far as the standards are concerned. That is a tautology, of course, and
applies equally to Fortran and any other language.

It is quite possible that the details of which behaviours are defined or
not vary between the languages - things like division by 0,
out-of-bounds array access, etc., may be different. As I understand it,
passing aliased pointers or array references as different parameters to
the same function can lead to undefined behaviour in Fortran, whereas it
is defined in C (unless you use "restrict").
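
A minimal sketch of the C side of that difference, with made-up function names. Without "restrict" the two pointers may refer to the same array and the result is still defined; with "restrict" the caller promises there is no overlap, and breaking that promise is undefined behaviour.

```c
#include <stddef.h>

/* No restrict: dst and src may alias. Each element is read and then
   written in index order, so add_into(a, a, n) simply doubles a. */
void add_into(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}

/* With restrict: the caller guarantees dst and src do not overlap,
   so the compiler may reorder or vectorise the loads and stores.
   Calling this with overlapping arrays is undefined behaviour. */
void add_into_noalias(int *restrict dst, const int *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```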

> [...]
>
>> The real challenge from big languages and big standard libraries is not
>> /writing/ code, it is /reading/ it. It doesn't really matter if a C
>> programmer, when writing some code, does not know what the syntax "void
>> foo(int a[static 10]);" means. (Most C programmers don't know it, and
>> never miss it.) But it can be a problem if they have to read and
>> understand code that uses something they don't know.
>
> Agreed.

Re: what is defined, was for or against equality

<22-01-027@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=186&group=comp.compilers#186

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Fri, 7 Jan 2022 13:21:29 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 67
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-027@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="30860"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, semantics
Posted-Date: 07 Jan 2022 20:25:13 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-020@comp.compilers>
 by: Spiros Bousbouras - Fri, 7 Jan 2022 13:21 UTC

On Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
Thomas Koenig <tkoenig@netcologne.de> wrote:
> David Brown <david.brown@hesbynett.no> schrieb:
>
> > There is no need to memorize undefined behaviours for a language -
> > indeed, such a thing is impossible since everything not defined by a
> > language standard is, by definition, undefined behaviour. (C and C++
> > are not special here - the unusual thing is just that their standards
> > say this explicitly.)
>
> This is a rather C-centric view of things. The Fortran standard
> uses a different model.
>
> There are constraints, which are numbered. Any violation of such
> a constraint needs to be reported by the compiler ("processor",
> in Fortran parlance). If it fails to do so, this is a bug in
> the compiler.
>
> There are also phrases which have "shall" or "shall not". If this
> is violated, this is an error in the program. Catching such a
> violation is a good thing from quality of implementation standpoint,
> but is not required. Many run-time errors such as array overruns
> fall into this category.

This seems to me exactly like the C model. What difference do you see ?

Regarding the more general issue, it seems to me that undefined behaviour is
a red herring (which I think is the point David was making). Every time one
writes code in any language , one must have an expectation on how the code is
supposed to behave and some reasoning on why the code they wrote will behave
according to their expectations. The reasoning will be based (apart from
general rules from logic and mathematics) on what the standard of the
programming language specifies (if the language has a standard) , what the
translator/compiler documentation specifies , what the documentation of any
libraries they use specifies and so forth.

For example, let's say that I write in C

int a = INT_MAX + 1 ;

with the expectation that a will get the value INT_MIN. The onus is on me
to provide a reasoning why the code above will meet my expectation. If I
cannot provide such a reasoning then from my point of view the code is
already undefined. The fact that the C standard also says that the code is
undefined is irrelevant. Even if the C standard specified for example that
signed integer arithmetic uses wraparound, unless I could point to the place
in the standard where it said so, the code is still undefined from my point
of view so I should not use it.

But let's say that I have the above code and I intend to compile it with
GCC using the -fwrapv flag. Then my expectation is actually justified
based on the GCC documentation for what -fwrapv means and the parts
of the C standard which define what the various symbols in

int a = INT_MAX + 1 ;

mean. I'm not going to provide a proof because it should be obvious. But
any such proof would not need to cite any part of the C standard which
explicitly mentions undefined behaviour.

The only occasion where an explicit mention of undefined behaviour would be
relevant would be if the C standard (or any standard) were contradictory i.e.
it said in some place that some construct has a certain defined behaviour and
it said in some other place that the same construct has undefined behaviour.
But with a popular language like C , if such contradictions existed , they
would be caught early and corrected.

Undefined behaviour, was: for or against equality

<22-01-028@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=187&group=comp.compilers#187

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: mar...@gkc.org.uk (Martin Ward)
Newsgroups: comp.compilers
Subject: Undefined behaviour, was: for or against equality
Date: Fri, 7 Jan 2022 14:02:50 +0000
Organization: Compilers Central
Lines: 26
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-028@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="31121"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, semantics
Posted-Date: 07 Jan 2022 20:25:43 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-018@comp.compilers>
Content-Language: en-GB
 by: Martin Ward - Fri, 7 Jan 2022 14:02 UTC

On 06/01/2022 08:11, David Brown wrote:
> The trick is to memorize the /defined/ behaviours, and stick to them.

Isn't the set of defined behaviours bigger than the set
of undefined behaviours? How do you know what is defined
if you don't know what is undefined?

For example, a = b + c is precisely defined in C and C++ for
floating point variables, but the result can be "undefined behaviour"
for ordinary 32 bit signed integer values.

If you want to stick to defined behaviours then you need
to add extra code. For example, CERT recommends:

  if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
      ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
    /* Handle error */
  } else {
    sum = si_a + si_b;
  }
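
For reference, the same test wrapped into a self-contained function. The name checked_add and the choice to report failure through a bool return value are assumptions made for this sketch; the CERT fragment leaves the error handling open.

```c
#include <limits.h>
#include <stdbool.h>

/* Returns true and stores si_a + si_b in *sum when the sum fits in
   an int; returns false without performing the addition otherwise. */
bool checked_add(int si_a, int si_b, int *sum)
{
    if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
        ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
        return false;               /* would overflow */
    }
    *sum = si_a + si_b;
    return true;
}
```

Note that the tests themselves cannot overflow: INT_MAX - si_b is only evaluated for positive si_b, and INT_MIN - si_b only for negative si_b.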

--
Martin

Dr Martin Ward | Email: martin@gkc.org.uk | http://www.gkc.org.uk
G.K.Chesterton site: http://www.gkc.org.uk/gkc | Erdos number: 4

Re: Undefined behaviour, was: for or against equality

<22-01-029@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=188&group=comp.compilers#188

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour, was: for or against equality
Date: Fri, 7 Jan 2022 15:56:22 +0100
Organization: Compilers Central
Lines: 150
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-029@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <7f4f52f2-49ee-9e80-1f03-c3fb9c74f574@gkc.org.uk>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="31420"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, semantics
Posted-Date: 07 Jan 2022 20:27:05 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Content-Language: en-GB
In-Reply-To: <7f4f52f2-49ee-9e80-1f03-c3fb9c74f574@gkc.org.uk>
 by: David Brown - Fri, 7 Jan 2022 14:56 UTC

On 07/01/2022 15:02, Martin Ward wrote:
> On 06/01/2022 08:11, David Brown wrote:
>> The trick is to memorize the /defined/ behaviours, and stick to them.
>
> Isn't the set of defined behaviours bigger than the set
> of undefined behaviours? How do you know what is defined
> if you don't know what is undefined?

You know what is "defined" because you can find the definition for it -
everything else is undefined. You could enumerate all defined
behaviours for a language - after all, the documentation (language
standards, compiler manual, library documentation, etc.) is finite. It
doesn't really make sense to try to find how many undefined behaviours
there are - it's like asking how many things there are that are not apples.

Language standards tell you the defined behaviour for a language.
Anything that is not there, is undefined - that's simply what the word
"undefined" means.

Note that there are many other things besides language standards that
define behaviour of code in practice - compilers or interpreters can add
their own definitions to things that are not defined by the language
standards, as can additional standards such as POSIX.

If you write a function "foo" - perhaps written in the same language
(such as C), perhaps in a completely different language - then its
behaviour is not defined by the language standards. It is not mentioned
anywhere in those documents, so it is undefined. (That is different
from functions whose behaviour is specified in the standard, such as
"memcpy".)

Undefined behaviour, as far as language standards are concerned, is
omnipresent in programming - for all languages. The problem only comes
when you attempt to execute something that does not have its behaviour
defined /anywhere/. Then it is incorrect code - a bug.

When I learned to program (i.e., during my university education rather
than from books, magazines and trial and error previous to that), we
were very clear about how a function is specified. You have a
pre-condition and a post-condition. The function can assume the
pre-condition is logically "true", and it will guarantee that the
post-condition is true at the exit. (Typically you also have an
"invariant" that is a clause in both parts, but that is just for
convenience.) If the function is called when the pre-condition is
false, the function has no obligation to do anything - it can give an
error, launch nasal daemons, give the answer it thinks the programmer
hoped for, or anything else. The behaviour is undefined.
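
A minimal sketch of such a specification in C, using assert to write the conditions down. The function isqrt is a made-up example, not something from the post.

```c
#include <assert.h>

/* Pre-condition:  n >= 0.
   Post-condition: r * r <= n < (r + 1) * (r + 1) for the result r. */
int isqrt(int n)
{
    assert(n >= 0);                 /* pre-condition */

    int r = 0;
    /* Widen to long long so the square cannot overflow an int. */
    while ((long long)(r + 1) * (r + 1) <= n)
        r++;

    /* post-condition */
    assert((long long)r * r <= n && (long long)(r + 1) * (r + 1) > n);
    return r;
}
```

With assertions compiled out (NDEBUG), calling isqrt(-1) gets no diagnostic and whatever value the loop happens to leave in r; the specification says nothing about that case, which is exactly the situation described above.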

This concept has existed since the dawn of programming:

"""
On two occasions I have been asked, 'Pray, Mr. Babbage, if you put into
the machine wrong figures, will the right answers come out?' I am not
able rightly to apprehend the kind of confusion of ideas that could
provoke such a question.

Charles Babbage
"""

The C standards contain a fair number of explicit undefined behaviours.
They do that for convenience and clarity, and often to encourage
compiler developers towards greater efficiency rather than run-time
checks, and to encourage programmers towards not assuming particular
behaviours even if one compiler happens to define the behaviour. So a
compiler writer knows that they can assume "a + b" never overflows (for
integer arithmetic), and a programmer knows that they can't assume
signed arithmetic is wrapping even if the compiler they are using at the
time /guarantees/ wrapping behaviour. (I have never seen a C compiler
that guarantees this without explicit flags.)
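
A small illustration of that assumption; the function names are hypothetical. Because signed overflow is undefined, a compiler is allowed to simplify the first function to "return 1". Whether it actually does so depends on the compiler and the optimisation level. Unsigned arithmetic is defined to wrap, so the second function must return 0 for UINT_MAX.

```c
#include <limits.h>   /* UINT_MAX, INT_MAX for callers */

/* Signed: x + 1 can only fail to be larger when x == INT_MAX, and
   that case is undefined, so an optimiser may treat this as always 1. */
int signed_incr_is_larger(int x)
{
    return x + 1 > x;
}

/* Unsigned: arithmetic wraps modulo UINT_MAX + 1, so the result is
   0 when x == UINT_MAX and 1 otherwise. */
int unsigned_incr_is_larger(unsigned int x)
{
    return x + 1u > x;
}
```

Calling signed_incr_is_larger(INT_MAX) is itself the bug; unsigned_incr_is_larger(UINT_MAX) is well defined.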

C is a language that expects the programmer to take responsibility for
his or her code, and ensure that it is correct. Fortunately, good
compiler developers know this is difficult and provide tools to help
people find their bugs. Thus you have a language that can give
efficient results, /and/ provide good debugging and run-time checking,
as long as you get good tools and understand how to use them.

>
> For example, a = b + c is precisely defined in C and C++ for
> floating point variables, but the result can be "undefined behaviour"
> for ordinary 32 bit signed integer values.
>

Actually, it is not precisely defined for floating point operations - if
there is an "exceptional condition" during the evaluation (the result is
not mathematically defined or not in the range of representable values
for its type), the behaviour is undefined. That applies to all
expressions - integer and floating point.

Now, it is very common (but certainly not universal) for C
implementations to use IEEE floating point formats and rules. These
provide the "mathematical definitions" for floating point operations,
including handling of calculations outside the normal ranges. But if
you are not using these, such calculations could result in undefined
behaviour. (For example, if you use "gcc -ffast-math", the compiler
will assume that all expressions are normal finite numbers - that's
perfectly valid for C, and can be very much more efficient on a lot of
targets.)
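
A sketch of what the IEEE rules give you, assuming the implementation follows IEEE 754 (Annex F of the C standard) and the code is not built with -ffast-math. The helper name is hypothetical. Under those assumptions, adding two finite doubles whose sum is out of range produces an infinity that can be tested for afterwards.

```c
#include <float.h>
#include <math.h>
#include <stdbool.h>

/* Assumes IEEE 754 arithmetic. Returns true when a and b are both
   finite but their sum is too large in magnitude to represent. */
bool add_overflows_to_inf(double a, double b)
{
    /* volatile forces the sum to be rounded to double, even on
       targets that compute in a wider register format. */
    volatile double sum = a + b;
    return isfinite(a) && isfinite(b) && isinf(sum);
}
```

On an implementation without these guarantees, or with -ffast-math, the same source may behave differently, which is the distinction drawn above.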

Signed integer overflow is undefined behaviour on most compilers (the
size is not necessarily 32-bit). The only one I know that defines the
behaviour is gcc (and compatibles, such as clang and icc) with the
"-fwrapv" flag enabled.

And of course that makes perfect sense. It is logical to assume that if
you add two positive numbers, you get a positive number - it is
illogical to suppose that sometimes the "correct" answer will be
negative. Some programming languages (such as Java) specifically define
signed integer arithmetic to be wrapping - the result is that sometimes
you get the wrong answer in Java, while in C you would get undefined
behaviour. Wrong answers are less helpful - leaving the behaviour
undefined means you get more efficient code and that you can use
debugging tools (such as gcc's -fsanitize=undefined) to help find the
errors in your code.

> If you want to stick to defined behaviours then you need
> to add extra code. For example, CERT recommends:
>
>   if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
>       ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
>     /* Handle error */
>   } else {
>     sum = si_a + si_b;
>   }
>

That is /not/ code to "stick to defined behaviours". It is code to
identify problems and perhaps find some way to handle it (depending on
what the "handle error" code is).
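
As an illustration, the CERT-style test quoted above can be wrapped in a small function so that the caller decides what "handle error" means. This is only a sketch: the name checked_add and the choice to report overflow through a bool return value are assumptions made for the example, not part of the CERT recommendation.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true when the
   mathematical result fits in an int. Returns false, leaving *sum
   untouched, when the addition would overflow. */
bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b)) {
        return false;   /* would overflow: the caller chooses the policy */
    }
    *sum = a + b;       /* safe: the result is known to be in range */
    return true;
}
```

The range test itself never overflows, because INT_MAX - b is only evaluated for positive b and INT_MIN - b only for negative b.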

You can "stick to defined behaviour" much more simply:

int sum = (unsigned int) si_a + (unsigned int) si_b;

The behaviour is fully defined, and the result will be wrong if there is
an overflow - just like when you use a language that has fully defined
signed integer arithmetic by wrapping.
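
A minimal sketch of that idea as a function follows. The name wrapping_add is hypothetical. One caveat, stated as an assumption about the implementation rather than about ISO C: the unsigned addition is defined to wrap, but converting an out-of-range unsigned value back to int is implementation-defined in the C standard. gcc documents that conversion as reduction modulo 2^N, which gives the familiar two's-complement result.

```c
#include <limits.h>

/* Hypothetical helper: addition that wraps instead of overflowing. */
int wrapping_add(int a, int b)
{
    /* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1. */
    unsigned int usum = (unsigned int) a + (unsigned int) b;

    /* Implementation-defined when usum > INT_MAX; assumed here to be
       modular reduction, as gcc documents. */
    return (int) usum;
}
```

On such an implementation, wrapping_add(INT_MAX, 1) yields INT_MIN: a defined result, though rarely the one the programmer wanted.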

The answer here is /not/ to worry about what happens when your
expressions overflow and you get undefined behaviour. The answer is to
think about the code you are writing, and make sure that the types and
expressions you write are appropriate for the values you have. Check
your values for validity when you get them in (from files, user input,
etc.), then write code that is correct for the full range of values.
Simple. (Well, as simple as any programming!)
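
A small sketch of that approach: validate a value once at the boundary, then do arithmetic that is provably in range. The names parse_percent and percent_product and the range 0 to 100 are assumptions chosen purely for illustration.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal integer in [0, 100] from text.
   Returns false on malformed input or an out-of-range value. */
bool parse_percent(const char *text, int *out)
{
    char *end;
    errno = 0;
    long v = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0') {
        return false;   /* overflow in strtol, empty input, or trailing junk */
    }
    if (v < 0 || v > 100) {
        return false;   /* outside the range the rest of the code assumes */
    }
    *out = (int) v;
    return true;
}

/* Both arguments are assumed to come from parse_percent, so the product
   is at most 10000 and fits in any int (INT_MAX is at least 32767). */
int percent_product(int a, int b)
{
    return a * b;
}
```

Once the inputs are known to be in range, no overflow check is needed at the multiplication itself.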

Re: Undefined behaviour, was: for or against equality

<22-01-031@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=190&group=comp.compilers#190

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour, was: for or against equality
Date: Sat, 8 Jan 2022 03:41:55 -0000 (UTC)
Organization: Aioe.org NNTP Server
Lines: 47
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-031@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-028@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="54542"; mail-complaints-to="abuse@iecc.com"
Keywords: standards
Posted-Date: 07 Jan 2022 23:12:52 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
X-Organisation: Weyland-Yutani
 by: Spiros Bousbouras - Sat, 8 Jan 2022 03:41 UTC

On Fri, 7 Jan 2022 14:02:50 +0000
Martin Ward <martin@gkc.org.uk> wrote:
> On 06/01/2022 08:11, David Brown wrote:
> > The trick is to memorize the/defined/ behaviours, and stick to them.
>
> Isn't the set of defined behaviours bigger than the set
> of undefined behaviours?

That depends on how you define those sets. For example, any finite string is
a potential C source code and, of strings of length N (for any value of N),
only a very small percentage have defined behaviour. But regardless, you
need to know at least some defined behaviours to be able to programme at all
and, as long as you stick to those, you are not using any undefined
behaviours.

> How do you know what is defined
> if you don't know what is undefined?

As David has already said, you know by reading the definitions. And this is
the only way to know. Trying to guess what you're getting at, perhaps you
are thinking of someone who learns some C, then makes some unwarranted
assumptions from what they have learned and then has those assumptions scaled
back by coming across explicit mentions of "undefined behaviour" in the C
standard. Perhaps some people do behave this way. For example someone who
already knows assembly and begins to learn C may assume that all address
manipulations which would be legal in assembly are also legal using C
pointers. The correct remedy is not to make unwarranted assumptions to begin
with, whether one learns C or any other programming language. There is an
infinite number of unwarranted assumptions one can make and the C standard
can only caution against a finite number of them.

> For example, a = b + c is precisely defined in C and C++ for
> floating point variables, but the result can be "undefined behaviour"
> for ordinary 32 bit signed integer values.
>
> If you want to stick to defined behaviours then you need
> to add extra code. For example, CERT recommends:
>
> if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
> ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
> /* Handle error */
> } else {
> sum = si_a + si_b;
> }

Whether you need to add code as the above will depend on what you already
know about the types and values of si_a and si_b .

Re: what is defined, was for or against equality

<22-01-032@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=191&group=comp.compilers#191

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Sat, 8 Jan 2022 09:31:06 -0000 (UTC)
Organization: news.netcologne.de
Lines: 69
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-032@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="51668"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, optimize
Posted-Date: 08 Jan 2022 13:11:52 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Thomas Koenig - Sat, 8 Jan 2022 09:31 UTC

Spiros Bousbouras <spibou@gmail.com> schrieb:
> On Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
> Thomas Koenig <tkoenig@netcologne.de> wrote:
>> David Brown <david.brown@hesbynett.no> schrieb:
>>
>> > There is no need to memorize undefined behaviours for a language -
>> > indeed, such a thing is impossible since everything not defined by a
>> > language standard is, by definition, undefined behaviour. (C and C++
>> > are not special here - the unusual thing is just that their standards
>> > say this explicitly.)
>>
>> This is a rather C-centric view of things. The Fortran standard
>> uses a different model.
>>
>> There are constraints, which are numbered. Any violation of such
>> a constraint needs to be reported by the compiler ("processor",
>> in Fortran parlance). If it fails to do so, this is a bug in
>> the compiler.
>>
>> There are also phrases which have "shall" or "shall not". If this
>> is violated, this is an error in the program. Catching such a
>> violation is a good thing from quality of implementation standpoint,
>> but is not required. Many run-time errors such as array overruns
>> fall into this category.
>
> This seems to me exactly like the C model. What difference do you see ?

First, I see a difference in result. Highly intelligent and
knowledgeable people argue vehemently about whether a program should
be able to use undefined behavior, and a lot of vitriol is directed
against compiler writers who use the assumption that undefined
behavior cannot happen in their compilers for optimization,
especially if it turns out that existing code was broken and no
longer works after a compiler upgrade (just read a few of Linus
Torvalds's comments on that matter).

I see C conflating two separate concepts: program errors and
behavior that is outside the standard. "Undefined behavior is
always a programming error" does not work; that would make

#include <unistd.h>
#include <string.h>

int main()
{
    char a[] = "Hello, world!\n";
    write (1, a, strlen(a));
    return 0;
}

not more and not less erroneous than

int main()
{
    int *p = 0;
    *p = 42;
}

whereas I would argue that there is an important difference between
the two.

If the C standard replaced "the behavior is undefined" with "the
program is in error, and the subsequent behavior is undefined"
or something along those lines, the discussion would be much
muted.

(Somebody may point out to me that this is what the standard is
actually saying. If so, that would sort of reinforce my argument
that it should be clearer :-)

Re: Undefined behaviour, was: for or against equality

<22-01-033@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=192&group=comp.compilers#192

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: ant...@mips.complang.tuwien.ac.at (Anton Ertl)
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour, was: for or against equality
Date: Sat, 08 Jan 2022 17:52:02 GMT
Organization: Institut fuer Computersprachen, Technische Universitaet Wien
Lines: 17
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-033@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <7f4f52f2-49ee-9e80-1f03-c3fb9c74f574@gkc.org.uk> <22-01-029@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="53753"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, semantics, comment
Posted-Date: 08 Jan 2022 13:20:51 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Anton Ertl - Sat, 8 Jan 2022 17:52 UTC

David Brown <david.brown@hesbynett.no> writes:
>Undefined behaviour, as far as language standards are concerned, are
>omnipresent in programming - for all languages.

Please prove this astounding assertion. My impression is that managed
languages define everything, at least to some extent, and leave
nothing undefined. If they allowed nasal demons, the appeal of
managed languages would evaporate instantly.

- anton
--
M. Anton Ertl
anton@mips.complang.tuwien.ac.at
http://www.complang.tuwien.ac.at/anton/
[Things like .NET define a lot but they still are at the mercy
of their environment when you ask for a variable sized chunk of
storage. -John]

Re: what is defined, was for or against equality

<22-01-034@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=193&group=comp.compilers#193

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Sat, 8 Jan 2022 22:28:00 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 88
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-034@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="24246"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 08 Jan 2022 17:58:29 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-032@comp.compilers>
 by: Spiros Bousbouras - Sat, 8 Jan 2022 22:28 UTC

On Sat, 8 Jan 2022 09:31:06 -0000 (UTC)
Thomas Koenig <tkoenig@netcologne.de> wrote:
> Spiros Bousbouras <spibou@gmail.com> schrieb:
> > On Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
> > Thomas Koenig <tkoenig@netcologne.de> wrote:
> >> This is a rather C-centric view of things. The Fortran standard
> >> uses a different model.
> >>
> >> There are constraints, which are numbered. Any violation of such
> >> a constraint needs to be reported by the compiler ("processor",
> >> in Fortran parlance). If it fails to do so, this is a bug in
> >> the compiler.
> >>
> >> There are also phrases which have "shall" or "shall not". If this
> >> is violated, this is an error in the program. Catching such a
> >> violation is a good thing from quality of implementation standpoint,
> >> but is not required. Many run-time errors such as array overruns
> >> fall into this category.
> >
> > This seems to me exactly like the C model. What difference do you see ?
>
> First, I see a difference in result. Highly intelligent and
> knowledgeable people argue vehemently about whether a program should
> be able to use undefined behavior, and a lot of vitriol is directed
> against compiler writers who use the assumption that undefined
> behavior cannot happen in their compilers for optimization,
> especially if it turns out that existing code was broken and no
> longer works after a compiler upgrade (just read a few of Linus
> Torvalds's comments on that matter).
>
> I see C conflating two separate concepts: program errors and
> behavior that is outside the standard. "Undefined behavior is
> always a programming error" does not work; that would make

The C standard is in no position to say that some programme is in
error. This would require near omniscience from the standard
writers.

> #include <unistd.h>
> #include <string.h>
>
> int main()
> {
> char a[] = "Hello, world!\n";
> write (1, a, strlen(a));
> return 0;
> }
>
> not more and not less erroneous than
>
> int main()
> {
> int *p = 0;
> *p = 42;
> }
>
> whereas I would argue that there is an important difference between
> the two.

The only difference I see between the two is that the first is defined
by POSIX and the second is not. According to POSIX the first is required
to print something on stdout. I cannot imagine any extension which
would make the second programme do something useful and a conforming
implementation may well compile it as essentially a no-op.

But with something like

int main(void) {
    int *p = 0 ;
    *p = 42 ;
    /* ... do other stuff ... */
    return 0 ;
}

the C standard allows for a conforming implementation to do something
useful like perhaps store 42 to address 0.

> If the C standard replaced "the behavior is undefined" with "the
> program is in error, and the subsequent behavior is undefined"
> or something along those lines, the discussion would be much
> muted.
>
> (Somebody may point out to me that this is what the standard is
> actually saying. If so, that would sort of reinforce my argument
> that it should be clearer :-)

No , it most definitely does not say that nor could it possibly say
that.

Re: what is defined, was for or against equality

<22-01-035@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=194&group=comp.compilers#194

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Sun, 9 Jan 2022 00:09:19 -0000 (UTC)
Organization: news.netcologne.de
Lines: 70
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-035@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers> <22-01-034@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="67614"; mail-complaints-to="abuse@iecc.com"
Keywords: standards
Posted-Date: 08 Jan 2022 21:29:37 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Thomas Koenig - Sun, 9 Jan 2022 00:09 UTC

Spiros Bousbouras <spibou@gmail.com> schrieb:
> On Sat, 8 Jan 2022 09:31:06 -0000 (UTC)
> Thomas Koenig <tkoenig@netcologne.de> wrote:
>> Spiros Bousbouras <spibou@gmail.com> schrieb:
>> > On Thu, 6 Jan 2022 16:43:05 -0000 (UTC)
>> > Thomas Koenig <tkoenig@netcologne.de> wrote:
>> >> This is a rather C-centric view of things. The Fortran standard
>> >> uses a different model.
>> >>
>> >> There are constraints, which are numbered. Any violation of such
>> >> a constraint needs to be reported by the compiler ("processor",
>> >> in Fortran parlance). If it fails to do so, this is a bug in
>> >> the compiler.
>> >>
>> >> There are also phrases which have "shall" or "shall not". If this
>> >> is violated, this is an error in the program. Catching such a
>> >> violation is a good thing from quality of implementation standpoint,
>> >> but is not required. Many run-time errors such as array overruns
>> >> fall into this category.
>> >
>> > This seems to me exactly like the C model. What difference do you see ?
>>
>> First, I see a difference in result. Highly intelligent and
>> knowledgeable people argue vehemently about whether a program should
>> be able to use undefined behavior, and a lot of vitriol is directed
>> against compiler writers who use the assumption that undefined
>> behavior cannot happen in their compilers for optimization,
>> especially if it turns out that existing code was broken and no
>> longer works after a compiler upgrade (just read a few of Linus
>> Torvalds's comments on that matter).
>>
>> I see C conflating two separate concepts: program errors and
>> behavior that is outside the standard. "Undefined behavior is
>> always a programming error" does not work; that would make

> The C standard is in no position to say that some programme is in
> error. This would require near omniscience from the standard
> writers.

A standard (or other specification document) is certainly able to
state that some construct is in error. To grab an often-quoted
example:

J3/18-007r1, the Fortran 2018 interpretation documents, states in
subclause 9.5.3, "Array elements and array sections",

# The value of a subscript in an array element shall be within the
# bounds for its dimension.

No omniscience required to write or understand that sentence.

This puts the burden on the programmer. The compiler might catch
such an error and abort the program, or other unpredictable
things such as overwriting an unrelated variable might also happen.

Reading a language standard can be hard. Quite often, information
is scattered throughout the text and needs to be pieced together
to find the necessary information, especially definition of terms
which are crucial to understanding. Most programmers do not
read standards (at least final committee drafts can usually be
found these days on the Internet), but compiler writers should at
least be familiar with what they are implementing.

Programmers often rely on books, but these can also get things wrong.

Because programmers are human, they also can get ticked off when being
told that a construct they have used for years has been illegal
for decades :-|

Having a good standard is crucial to being able to write good compilers.

Re: what is defined, was for or against equality

<22-01-037@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=196&group=comp.compilers#196

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: spi...@gmail.com (Spiros Bousbouras)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Sun, 9 Jan 2022 21:30:13 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 106
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-037@comp.compilers>
References: <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers> <22-01-034@comp.compilers> <22-01-035@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="30306"; mail-complaints-to="abuse@iecc.com"
Keywords: Fortran, standards, comment
Posted-Date: 09 Jan 2022 16:53:26 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-035@comp.compilers>
 by: Spiros Bousbouras - Sun, 9 Jan 2022 21:30 UTC

On Sun, 9 Jan 2022 00:09:19 -0000 (UTC)
Thomas Koenig <tkoenig@netcologne.de> wrote:
> Spiros Bousbouras <spibou@gmail.com> schrieb:
> > On Sat, 8 Jan 2022 09:31:06 -0000 (UTC)
> > Thomas Koenig <tkoenig@netcologne.de> wrote:
> >> I see C conflating two separate concepts: program errors and
> >> behavior that is outside the standard. "Undefined behavior is
> >> always a programming error" does not work; that would make
>
> > The C standard is in no position to say that some programme is in
> > error. This would require near omniscience from the standard
> > writers.
>
> A standard (or other specification document) is certainly able to
> state that some construct is in error. To grab an often-quoted
> example:
>
> J3/18-007r1, the Fortran 2018 interpretation documents, states in
> subclause 9.5.3, "Array elements and array sections",
>
> # The value of a subscript in an array element shall be within the
> # bounds for its dimension.
>
> No omniscience required to write or understand that sentence.
>
> This puts the burden on the programmer. The compiler might catch
> such an error and abort the program, or other unpredictable
> things such as overwriting an unrelated variable might also happen.

I haven't read any Fortran standards so I can only go by the above quote.
Only the programmer knows what their requirements are and why they think that
the code they wrote will meet those requirements. My idea of error is that
either the code does not meet the requirements or it does so only by accident
and the programmer does not have a correct reasoning as to why their code
will meet those requirements. You seem to be reading the quote as saying

No matter what the programmer's requirements and no matter what extensions
their Fortran implementation offers, the programmer's requirements will
not be justifiably met if they use an array subscript outside the bounds
for its dimension.

Perhaps some Fortran implementation gives information as to the layout of
distinct variables so that one knows what will be overwritten by writing off
the bounds of some array and it will be overwritten in the way the programmer
wants. Unlikely (especially for Fortran) but it cannot be excluded. I can
imagine a C implementation for small embedded systems which does provide such
information and a programmer using it to reduce the number of instructions to
achieve a desired result. A more realistic example is the following :

#include <stdio.h>

int main(void) {
    int a = 12 , b = 14 ;
    printf("%2$d %1$d\n" , a , b) ;
    return 0 ;
}

The above code has undefined behaviour according to the C standard. It is
defined according to POSIX. Whether it is in error depends on whether the
programmer really wanted to print
14 12

and no standards committee can possibly know this. So I still think that your
reading requires omniscience from the Fortran standard writers. But perhaps
there are other parts of the standard which justify your reading. For example
some parts of the Common Lisp standard do state that an implementation must
not extend some construct to provide useful functionality beyond what the
standard specifies. I don't remember precisely how it states it and I can't
find those parts now.

> Reading a language standard can be hard. Quite often, information
> is scattered throughout the text and needs to be pieced together
> to find the necessary information, especially definition of terms
> which are crucial to understanding. Most programmers do not
> read standards (at least final committee drafts can usually be
> found these days on the Internet), but compiler writers should at
> least be familiar with what they are implementing.
>
> Programmers often rely on books, but these can also get things wrong.

C books at least usually don't go into the fine details of undefined
behaviour. To hone one's instincts in this area one should spend a few
months systematically reading comp.lang.c while consulting a draft
of the standard !

> Because programmers are human, they also can get ticked off when being
> told that a construct they have used for years has been illegal
> for decades :-|

This may happen but my impression with C is that the strongest complaints
come from people who

- have read the C standard (or at least the relevant parts of it)

- know that their code has undefined behaviour and know what the term means

- do not rely on any compiler extensions

yet still feel certain (dare I say "entitled" ?) that their code ought to
behave in a certain way. For an extreme example see Robert M. Hyatt of
crafty fame (a chess programme which has won awards in the past) :
http://www.open-chess.org/viewtopic.php?f=5&t=2519 .
[Fortran used to require that arrays were stored in column major order, that
double precision took twice the space of real and integer, and you were allowed
to use EQUIVALENCE and adjustable dimensions in argument arrays to do overlaying
assuming that layout. Dunno how much more modern Fortran has deprecated it. -John]

Re: what is defined, was for or against equality

<22-01-038@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=197&group=comp.compilers#197

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Sun, 9 Jan 2022 23:00:46 +0100
Organization: A noiseless patient Spider
Lines: 177
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-038@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="38539"; mail-complaints-to="abuse@iecc.com"
Keywords: Fortran, C, optimize, standards, comment
Posted-Date: 09 Jan 2022 17:44:26 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-032@comp.compilers>
Content-Language: en-GB
 by: David Brown - Sun, 9 Jan 2022 22:00 UTC

On 08/01/2022 10:31, Thomas Koenig wrote:
> Spiros Bousbouras <spibou@gmail.com> schrieb:

>> This seems to me exactly like the C model. What difference do you see ?
>
> First, I see a difference in result. Highly intelligent and
> knowledgeable people argue vehemently about whether a program should
> be able to use undefined behavior, and a lot of vitriol is directed
> against compiler writers who use the assumption that undefined
> behavior cannot happen in their compilers for optimization,
> especially if it turns out that existing code was broken and no
> longer works after a compiler upgrade (just read a few of Linus
> Torvalds's comments on that matter).

People want compilers to do what the programmer meant, not what he or
she wrote. And in particular, if a compiler did one thing once, they
want it to continue to do the same thing with the same code - as long as
they got what they wanted the first time round.

This is, of course, entirely natural for humans. But it is not natural
for computer programs like compilers.

Linus Torvalds is known for blowing his top on matters that he either
does not understand, or when he has mixed his personal opinions with
facts, or while only looking at a small part of the big picture. (He is
also known as an incredible programmer, a world-class project leader,
and a charismatic visionary who revolutionised the software world - but
that's beside the point here!).

A key example of his complaints in this area revolves around a function
that was something equivalent to :

int foo(int * p) {
    int x = *p;
    if (!p) return -1;
    return x;
}

His complaint was that the compiler saw that "*p" was accessed, and
therefore assumed "p" could not be zero and optimised away the test.
The compiler did exactly what it was asked to do - the optimisation is
perfectly valid according to the C standards and additional definitions
given by the compiler. But it was not what the programmer wanted, and
not what older versions of the compiler had done.
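
For comparison, here is a minimal sketch of the same function with the test moved ahead of the dereference. The name foo_checked is hypothetical. Because no dereference precedes the test, the compiler has nothing from which to infer that p is non-null, so the check cannot be dropped on those grounds.

```c
#include <stddef.h>

/* Hypothetical rewrite: validate the pointer before touching *p. */
int foo_checked(int *p)
{
    if (!p) {
        return -1;   /* null is handled before any access */
    }
    return *p;       /* p is known to be non-null on this path */
}
```

Written this way, the behaviour is defined by the C standard for every pointer value the caller may legitimately pass, including a null pointer.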

Of course, when a new optimisation simply makes object code more
efficient, programmers want that - they don't /always/ want the compiler
to handle things the way older versions did. They want the compiler to
read their minds and see what they meant to write, and generate optimal
code for that.

None of this is helped by the fact that C code often has to work
efficiently on a variety of targets and compilers, and some compilers
give extra guarantees about how they interpret code beyond the
definitions given in the C standards. Many more compilers can be relied
upon in practice to work in particular ways, though they don't guarantee
or document it, and this means the most efficient code that works in
practice on one compiler may be wrong and give incorrect results on
another compiler. You can write C code that is correct and widely
portable, but you can't write C code that is correct, optimally
efficient, and widely portable.

The big question here, is why do you think Fortran is any different? In
theory, there isn't a difference - nothing you have said here convinces
me that there is any fundamental difference between Fortran and C in
regards to undefined behaviour. (And there's no difference in the
implementations - the most commonly used Fortran compilers also handle
C, C++, and perhaps other languages.)

I believe it is a matter of who writes Fortran programs, and what these
programs do. Now, I don't know or use Fortran myself, so I might be
wrong here. However, it seems to me that Fortran is typically used by
experienced professional programmers and for scientific or numerical
programming. C is used by a much wider range of programmers, for a much
wider range of programming tasks. I think it is inevitable that you'll
get more people programming in C when they are not fully sure of what
they are doing, more code where subtle mistakes can be made, more people
using C when other languages would have been better choices, and more C
programmers who are likely to blame their tools for their own mistakes.

>
> I see C conflating two separate concepts: program errors and
> behavior that is outside the standard. "Undefined behavior is
> always a programming error" does not work; that would make
>
> #include <unistd.h>
> #include <string.h>
>
> int main()
> {
> char a[] = "Hello, world!\n";
> write (1, a, strlen(a));
> return 0;
> }
>

C does not have a "write" function in the standard library. So the
behaviour of "write" is not defined by the C standards - but that does
not mean the behaviour is undefined. It just means it is defined
elsewhere, not in the C standards. If the programmer doesn't know what
the "write" function does or how it is specified, then it might be
undefined behaviour - certainly it is bad programming.

> not more and not less erroneous than
>
> int main()
> {
> int *p = 0;
> *p = 42;
> }
>
> whereas I would argue that there is an important difference between
> the two.
>

There is no fundamental difference - if you know the behaviour is
defined, it is defined. (The program is then correct or incorrect
depending on how that definition matches your requirements.) If not, it
is undefined (and incorrect). In neither case is the behaviour defined
by the C standard, but the behaviour could be defined by something else
(library documentation or external definition of "write", or a C
compiler that specifically says it defines the behaviour of
dereferencing null pointers).

> If the C standard replaced "the behavior is undefined" with "the
> program is in error, and the subsequent behavior is undefined"
> or something along those lines, the discussion would be much
> muted.
>

That sounds like you dislike the "time travel" aspect of C's undefined
behaviour. Many would agree with that - they don't like the idea that
undefined behaviour later in the program can be used to change the
behaviour of code earlier on. The C standard considers undefined
behaviour to be program-wide - if you execute something that has
undefined behaviour (remembering that this means there is no definition
/anywhere/ of what will happen), the whole program is wrong and you
can't expect anything from it.

People often find this disturbing. They think perhaps it is fair enough
that dereferencing a null pointer can crash a program, but it shouldn't
affect things that came before it.

However, there are two key points to think about. First, the standard's
handling of undefined behaviour means that a compiler /can/ use UB to
change the object code generated for earlier source code, not that it
/must/ do so. A compiler always balances efficient code generation with
ease-of-use and ease-of-debugging. The ideal balance point will depend
on the programmer writing the code, so compiler flags are used to tune
it, but surprises can still happen.

The other point is to consider how the standards could say anything
else. If the standards required observable behaviour to be completed
before undefined behaviour occurred, the results would be terrible.
Dereferencing a null pointer or dividing by zero could cause a complete
crash (remember the "Windows for Warships" affair? A single divide by
zero brought the whole ship network down, leaving it dead in the water
for hours). That means the compiler would need to make sure any
volatile writes had hit main memory before reading a pointer. It would
have to ensure all file stream buffers were flushed to disk before doing
a division. You can be sure Linus Torvalds would have a thing or two to
say about such a compiler.

> (Somebody may point out to me that this is what the standard is
> actually saying. If so, that would sort of reinforce my argument
> that it should be clearer :-)
[Fortran has in principle historically allowed rather aggressive optimization,
e.g., A*B+A*C can turn into A*(B+C). On the other hand, in the real world,
when IBM improved their optimizing compiler Fortran H into Fortran X, the
developers said any new optimization had to produce bit identical results
to what the old compiler did. So this is not a new issue. -John]

Re: Undefined behaviour, was: for or against equality

<22-01-039@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=198&group=comp.compilers#198
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour, was: for or against equality
Date: Sun, 9 Jan 2022 23:53:52 +0100
Organization: A noiseless patient Spider
Lines: 63
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-039@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <7f4f52f2-49ee-9e80-1f03-c3fb9c74f574@gkc.org.uk> <22-01-029@comp.compilers> <22-01-033@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="48149"; mail-complaints-to="abuse@iecc.com"
Keywords: standards
Posted-Date: 09 Jan 2022 18:34:40 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-033@comp.compilers>
Content-Language: en-GB
 by: David Brown - Sun, 9 Jan 2022 22:53 UTC

On 08/01/2022 18:52, Anton Ertl wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> Undefined behaviour, as far as language standards are concerned, is
>> omnipresent in programming - for all languages.
>
> Please prove this astounding assertion. My impression is that managed
> languages define everything, at least to some extent, and leave
> nothing undefined. If they allowed nasal demons, the appeal of
> managed languages would evaporate instantly.
>

Certainly managed languages define far more than unmanaged languages.
But equally certainly, they do not define everything.

In Python, I can write:

x = flooble(123)

Nowhere in any part of the documentation for Python is a definition of
what the function "flooble" should do. Calling it is /undefined
behaviour/ as far as the language standards are concerned.

Certainly some aspects of calling it - such as the calling convention -
are defined. What should happen if the function does not exist is
defined. But the language and the standards do not define the behaviour
of "flooble".

Being "undefined behaviour as far as the language standards are
concerned" does not mean you can get nasal daemons, it means that the
language standards do not say what will happen. When one says "Division
by 0 is undefined behaviour in C", that is what is meant - as a compiler
or a host OS could give you well-defined and predictable behaviour when
you attempt to divide by 0.

A managed language may put limits on the kind of effect of undefined
behaviour. In Python (at least, CPython), it is possible to call
externally defined functions in shared libraries - even if the Python
bytecode interpreter limits possible effects of pure Python code,
calling external functions gets around those limits. I suppose you
could have a more locked-down managed language that does not allow any
external code, and has additional tracking on things like data space
usage, time usage, and other resources to stop run-away code.

Within such a closed language, you could have defined behaviour for all
code, since any code run or functions called would be in the same
language and have their definitions clear to the interpreter.

Personally, I don't see minimising undefined behaviour as part of the
appeal of managed languages. I make as much effort not to divide by
zero or work with invalid references in my Python code as I do in my C
code - it doesn't much matter if the program stops with a Python exception
or a crash. I use Python for the convenience of working with strings,
dictionaries, and other data structures with little concern for memory
management, for its libraries, and other high-level features.

When running unknown code - such as JavaScript from a website - it is
vital that the effect of any code is limited. Code may have behaviour
that is undefined by the language standards, but it will be defined by
other parts of the code or by its environment (browser, built-in
libraries, etc.). And while it may crash the JavaScript program or hang
the browser, it should never be able to launch nasal daemons.

Re: what is defined, was for or against equality

<22-01-041@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=200&group=comp.compilers#200
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Mon, 10 Jan 2022 12:04:02 -0000 (UTC)
Organization: news.netcologne.de
Lines: 114
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-041@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers> <22-01-038@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="56068"; mail-complaints-to="abuse@iecc.com"
Keywords: design, Fortran, comment
Posted-Date: 10 Jan 2022 14:39:16 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Thomas Koenig - Mon, 10 Jan 2022 12:04 UTC

David Brown <david.brown@hesbynett.no> schrieb:

> The big question here, is why do you think Fortran is any different? In
> theory, there isn't a difference - nothing you have said here convinces
> me that there is any fundamental difference between Fortran and C in
> regards to undefined behaviour.

I am not sure how to better explain it. I will try a bit, but
this will be my last reply to you in this thread. We seem to have
a fundamental difference in our understanding, and seem to be
unable to resolve it.

> (And there's no difference in the
> implementations - the most commonly used Fortran compilers also handle
> C, C++, and perhaps other languages.)

Sort of.

At the risk of boring most readers of this group, a very short, but
(hopefully) pertinent introduction of how modern compilers work:

A front end translates the source to an abstract syntax tree (which
you can view with gfortran with -fdump-fortran-original) and from
that into an intermediate representation (which you can view with
gfortran, or with gcc in general, with -fdump-tree-original).
This intermediate representation is then optimized, in
an architecture-independent way (usually using SSA) and then
translated into assembler or directly to object code using a
"back end", of which many compilers also have several.

An example: The program

print *,"Hello, world"
end

is translated into (code only)

WRITE UNIT=6 FMT=-1
TRANSFER 'Hello, world'
DT_END

and then into the intermediate representation:

MAIN__ ()
{ {
struct __st_parameter_dt dt_parm.0;

dt_parm.0.common.filename = &"hello.f90"[1]{lb: 1 sz: 1};
dt_parm.0.common.line = 2;
dt_parm.0.common.flags = 128;
dt_parm.0.common.unit = 6;
_gfortran_st_write (&dt_parm.0);
_gfortran_transfer_character_write (&dt_parm.0, &"Hello, world"[1]{lb: 1 sz: 1}, 12);
_gfortran_st_write_done (&dt_parm.0);
}
}

There is no compiler (if you mean a single binary) that handles both
C and Fortran. They are separate front ends to common middle
and back ends.

And there are certainly differences in the code that the front
ends hand to the middle end, so saying that there is "no
difference in the implementations" is not correct.

>> I see C conflating two separate concepts: Program errors and
>> behavior that is outside the standard. "Undefined behavior is
>> always a programming error" does not work; that would make
>>
>> #include <unistd.h>
>> #include <string.h>
>>
>> int main()
>> {
>> char a[] = "Hello, world!\n";
>> write (1, a, strlen(a));
>> return 0;
>> }
>>
>
> C does not have a "write" function in the standard library. So the
> behaviour of "write" is not defined by the C standards - but that does
> not mean the behaviour is undefined.

When interpreting a language standard, you _must_ follow the
definitions in the standards if they exist, you cannot use everyday
interpretations.

Subclause 3.4.3 (N2596) defines

# undefined behavior

# behavior, upon use of a nonportable or erroneous program
# construct or of erroneous data, for which this document imposes
# no requirements

write() is nonportable and the C standard imposes no requirements
on it. Therefore, the program above invokes undefined behavior.

> It just means it is defined
> elsewhere, not in the C standards.

Nope, see above.

(If you replaced every occurrence of "undefined behavior" in the C
standard with "WRTLPFMFT behavior" and "the behavior is undefined"
with "the behavior is WRTLPFMFT", the meaning of the standard
would not change.)
[It seems like nitpicking here. Yes, the C and POSIX standards are
different things, but we all know how common it is to use them
together. -John]

Re: what is defined, was for or against equality

<22-01-042@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=201&group=comp.compilers#201
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gah...@u.washington.edu (gah4)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Mon, 10 Jan 2022 16:58:55 -0800 (PST)
Organization: Compilers Central
Lines: 27
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-042@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="20314"; mail-complaints-to="abuse@iecc.com"
Keywords: C, standards
Posted-Date: 10 Jan 2022 21:28:16 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-032@comp.compilers>
 by: gah4 - Tue, 11 Jan 2022 00:58 UTC

On Saturday, January 8, 2022 at 10:11:55 AM UTC-8, Thomas Koenig wrote:

(snip)

> I see C conflating two separate concepts: Program errors and
> behavior that is outside the standard. "Undefined behavior is
> always a programming error" does not work; that would make

> #include <unistd.h>
> #include <string.h>
> int main()
> {
> char a[] = "Hello, world!\n";
> write (1, a, strlen(a));
> return 0;
> }

Without the:

#include <unistd.h>

I agree that this would be undefined behavior. But with the include file,
you are agreeing to use whatever standard the include file belongs to.

The include file defines the arguments to write(), but even more indicates
that you either supply (in another file), or use an otherwise supplied library
defining write().

Re: Undefined behaviour, was: for or against equality

<22-01-043@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=202&group=comp.compilers#202
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: 480-992-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour, was: for or against equality
Date: Tue, 11 Jan 2022 16:55:54 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 26
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-043@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <7f4f52f2-49ee-9e80-1f03-c3fb9c74f574@gkc.org.uk> <22-01-029@comp.compilers> <22-01-033@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="9819"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, Lisp
Posted-Date: 11 Jan 2022 13:07:32 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Kaz Kylheku - Tue, 11 Jan 2022 16:55 UTC

On 2022-01-08, Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
> David Brown <david.brown@hesbynett.no> writes:
>>Undefined behaviour, as far as language standards are concerned, is
>>omnipresent in programming - for all languages.
>
> Please prove this astounding assertion. My impression is that managed
> languages define everything, at least to some extent, and leave
> nothing undefined. If they allowed nasal demons, the appeal of
> managed languages would evaporate instantly.

The Lisp-like programming language Scheme has unspecified order of
argument evaluation. And you can stuff side effects into argument
expressions, like in C.

Its built-in imperatives have undefined return values.

ANSI Common Lisp leaves the effects undefined of modifying literals,
just like C. ANSI Lisp code that perpetrates some kind of error is
safe only if compiled in safe mode; if you compile with reduced safety,
e.g. (declare (optimize (safety 0))), then errors become undefined
behavior, including type errors. If you declare that some quantity is
a fixnum integer, and request safety 0 speed 3, and then it turns
out that it's other than an integer, woe to that code.
However, in these cases you're invoking the safety escape hatch;
it's not like C where you are shackled by chains of undefined behavior
which make themselves felt every time you squirm.

Re: what is defined, was for or against equality

<22-01-044@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=203&group=comp.compilers#203
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Tue, 11 Jan 2022 18:16:28 +0100
Organization: A noiseless patient Spider
Lines: 64
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-044@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers> <22-01-038@comp.compilers> <22-01-041@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="10445"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, theology
Posted-Date: 11 Jan 2022 13:08:30 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-041@comp.compilers>
Content-Language: en-GB
 by: David Brown - Tue, 11 Jan 2022 17:16 UTC

On 10/01/2022 13:04, Thomas Koenig wrote:
> David Brown <david.brown@hesbynett.no> schrieb:
>
>> The big question here, is why do you think Fortran is any different? In
>> theory, there isn't a difference - nothing you have said here convinces
>> me that there is any fundamental difference between Fortran and C in
>> regards to undefined behaviour.
>
> I am not sure how to better explain it. I will try a bit, but
> this will be my last reply to you in this thread. We seem to have
> a fundamental difference in our understanding, and seem to be
> unable to resolve it.
>

Fair enough. Maybe in a future discussion, one of us will have an
"Aha!" moment and understand the other's viewpoint, and progress will be
made - until then, there's no point in going around in circles. I'll
snip bits of your post here, and try to minimise new points (unless I
get that "Aha!") - but be sure I am reading and appreciating your entire
post.

>> (And there's no difference in the
>> implementations - the most commonly used Fortran compilers also handle
>> C, C++, and perhaps other languages.)
>
> Sort of.
>
> At the risk of boring most readers of this group, a very short, but
> (hopefully) pertinent introduction of how modern compilers work:
>
>
> There is no compiler (if you mean a single binary) that handles both
> C and Fortran. They are separate front ends to common middle
> and back ends.

Yes. But it is the middle end that handles most of the optimisations,
including those based on undefined behaviour. The front end determines
whether code can have undefined behaviour and in what circumstances.

>> C does not have a "write" function in the standard library. So the
>> behaviour of "write" is not defined by the C standards - but that does
>> not mean the behaviour is undefined.
>
> When interpreting a language standard, you _must_ follow the
> definitions in the standards if they exist, you cannot use everyday
> interpretations.
>
> Subclause 3.4.3 (N2596) defines
>
> # undefined behavior
>
> # behavior, upon use of a nonportable or erroneous program
> # construct or of erroneous data, for which this document imposes
> # no requirements
>
> write() is nonportable and the C standard imposes no requirements
> on it. Therefore, the program above invokes undefined behavior.

No. (As always, this is based on my interpretation of the standards -
consider everything to have "IMHO" attached.) The implementation of
"write" is outside the scope of the standards, and is therefore
undefined as far as the standards are concerned. That does not make it
undefined behaviour in the program - it just means the standards don't
say what "write" should do.

Re: what is defined, was for or against equality

<22-01-045@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=204&group=comp.compilers#204
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: 480-992-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Tue, 11 Jan 2022 19:19:31 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 136
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-045@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers> <22-01-038@comp.compilers> <22-01-041@comp.compilers> <22-01-044@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="34163"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, optimize
Posted-Date: 11 Jan 2022 14:47:23 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Kaz Kylheku - Tue, 11 Jan 2022 19:19 UTC

On 2022-01-11, David Brown <david.brown@hesbynett.no> wrote:
> On 10/01/2022 13:04, Thomas Koenig wrote:
>> David Brown <david.brown@hesbynett.no> schrieb:
>>
>>> The big question here, is why do you think Fortran is any different? In
>>> theory, there isn't a difference - nothing you have said here convinces
>>> me that there is any fundamental difference between Fortran and C in
>>> regards to undefined behaviour.
>>
>> I am not sure how to better explain it. I will try a bit, but
>> this will be my last reply to you in this thread. We seem to have
>> a fundamental difference in our understanding, and seem to be
>> unable to resolve it.
>
> Fair enough. Maybe in a future discussion, one of us will have an
> "Aha!" moment and understand the other's viewpoint, and progress will be
> made - until then, there's no point in going around in circles. I'll
> snip bits of your post here, and try to minimise new points (unless I
> get that "Aha!") - but be sure I am reading and appreciating your entire
> post.
>
>>> (And there's no difference in the
>>> implementations - the most commonly used Fortran compilers also handle
>>> C, C++, and perhaps other languages.)
>>
>> Sort of.
>>
>> At the risk of boring most readers of this group, a very short, but
>> (hopefully) pertinent introduction of how modern compilers work:
>>
>>
>> There is no compiler (if you mean a single binary) that handles both
>> C and Fortran. They are separate front ends to common middle
>> and back ends.
>
> Yes. But it is the middle end that handles most of the optimisations,
> including those based on undefined behaviour. The front end determines
> whether code can have undefined behaviour and in what circumstances.

More precisely, optimizations are based on the absence of undefined
behavior: the assumption that contracts are being upheld.

More precisely, that contracts are being upheld in the face of the
inability to determine and diagnose statically whether they are
violated; i.e. there is a "blind trust". (Though there do exist
situations in which, in principle, undefined behavior is easily
deducible at translation time, without a requirement to do so.)

Front-ends for different languages are written to the respective
requirements of those languages. Their first aim is to handle
well-defined constructs and situations. They target the intermediate
language of the compiler middle. That language has its own contracts.
The front end for each respective language has to ensure that every
situation in which behavior is defined (contract is upheld) is
translated to reliable intermediate code whose contract is upheld.
Care has to be taken that the intermediate code is expressed in the
right way so that it will not change behavior in invalid ways due to
optimizations.

This leaves a lot of room for Fortran and C to have entirely different
defined/undefined behaviors.

Even the front end for one single language can have a lot of switches
affecting what is defined or not.

There could be a switch which says that overflowing integer addition has
two's complement wrapping behavior. In that case, the compiler then
selects the intermediate instructions which provide that behavior
reliably (possibly simulating signed arithmetic with unsigned), and
also disables any inferences in the front end that might be based on the
assumption that overflow has not occurred.

>>> C does not have a "write" function in the standard library. So the
>>> behaviour of "write" is not defined by the C standards - but that does
>>> not mean the behaviour is undefined.
>>
>> When interpreting a language standard, you _must_ follow the
>> definitions in the standards if they exist, you cannot use everyday
>> interpretations.
>>
>> Subclause 3.4.3 (N2596) defines
>>
>> # undefined behavior
>>
>> # behavior, upon use of a nonportable or erroneous program
>> # construct or of erroneous data, for which this document imposes
>> # no requirements
>>
>> write() is nonportable and the C standard imposes no requirements
>> on it. Therefore, the program above invokes undefined behavior.
>
> No. (As always, this is based on my interpretation of the standards -

Yes; using any function that is not in the C program, or in the
standard, is ISO C undefined behavior.

A program which includes <unistd.h> is not required to compile
according to ISO C; it can fail with an error message about the
header not being defined. Or, #include <unistd.h> is allowed, in
a conforming implementation, to bring in tokens which have nothing
to do with POSIX.

Furthermore, a program which calls write, and does not provide such a
function itself, is not required to successfully link. If it does link,
there is no requirement that this symbol is a function described by
POSIX.

POSIX implementations have to go out of their way to allow C programs
to use write as an external name, which ISO C allows.

For instance, the GNU C Library defines write as a weak symbol for
some identifier which resembles __libc_write: the "strong" symbol.

The C library internally uses only that __libc_write: it never calls
write, because user code could replace it:

int write(char *x) { ... }

double write = 42.0;

When the application defines the external name write, the weak symbol
coming from glibc yields; it is suppressed in favor of the program's
definition.

> consider everything to have "IMHO" attached.) The implementation of
> "write" is outside the scope of the standards, and is therefore
> undefined as far as the standards are concerned. That does not make it
> undefined behaviour in the program - it just means the standards don't
> say what "write" should do.

Right; it's "ISO C formal undefined behavior", not "behavior that is
not defined by any party whatsoever" ... though it could well be.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: what is defined, was for or against equality

<22-01-046@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=205&group=comp.compilers#205
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gah...@u.washington.edu (gah4)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Tue, 11 Jan 2022 14:18:56 -0800 (PST)
Organization: Compilers Central
Lines: 32
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-046@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers> <22-01-038@comp.compilers> <22-01-041@comp.compilers> <22-01-044@comp.compilers> <22-01-045@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="48005"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, optimize
Posted-Date: 11 Jan 2022 20:08:03 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-045@comp.compilers>
 by: gah4 - Tue, 11 Jan 2022 22:18 UTC

On Tuesday, January 11, 2022 at 11:47:26 AM UTC-8, Kaz Kylheku wrote:

(big snip)

> This leaves a lot of room for Fortran and C to have entirely different
> defined/undefined behaviors.

> Even the front end for one single language can have a lot of switches
> affecting what is defined or not.

I suppose so. But more usually, the compiler works to the least
common denominator.

For one, C requires static variables, and especially external ones, to
initialize to zero, but Fortran doesn't. Fortran compilers that use C
compiler middle and back ends, tend to zero such variables.

I suspect that there are many more that I don't know about.
As long as the cost is small, and it satisfies both standards,
not much reason not to do it.

Fortran has stricter rules on aliasing than C. I don't actually know
about any effect on C programs, though, but it might be that
compilers do the same for C.

One that is not C or Fortran, but IEEE 754, is the effect of
relational operators with NaN. Comparisons with NaN,
except for "not equal", return false. That means that compilers
have to be careful optimizing such, and especially that
"greater than or equal" is not the logical complement of "less than".
(I haven't looked at how compilers handle this, or, even more,
how the hardware handles it.)

Re: Undefined behaviour, was: for or against equality

<22-01-047@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=206&group=comp.compilers#206
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gneun...@comcast.net (George Neuner)
Newsgroups: comp.compilers
Subject: Re: Undefined behaviour, was: for or against equality
Date: Tue, 11 Jan 2022 22:01:37 -0500
Organization: A noiseless patient Spider
Lines: 40
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-047@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <7f4f52f2-49ee-9e80-1f03-c3fb9c74f574@gkc.org.uk> <22-01-029@comp.compilers> <22-01-033@comp.compilers> <22-01-043@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="46913"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, Lisp
Posted-Date: 12 Jan 2022 17:51:05 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: George Neuner - Wed, 12 Jan 2022 03:01 UTC

On Tue, 11 Jan 2022 16:55:54 -0000 (UTC), Kaz Kylheku
<480-992-1380@kylheku.com> wrote:

>On 2022-01-08, Anton Ertl <anton@mips.complang.tuwien.ac.at> wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>>>Undefined behaviour, as far as language standards are concerned, is
>>>omnipresent in programming - for all languages.
>>
>> Please prove this astounding assertion. My impression is that managed
>> languages define everything, at least to some extent, and leave
>> nothing undefined. If they allowed nasal demons, the appeal of
>> managed languages would evaporate instantly.
>
>The Lisp-like programming language Scheme has unspecified order of
>argument evaluation. And you can stuff side effects into argument
>expressions, like in C.

In Scheme, the order of evaluation of the binding expressions in a
let is similarly unspecified.

There is at least one Scheme which deliberately randomizes the order
of function argument and let evaluation. And there are parallel
Schemes which evaluate function arguments and lets in parallel.
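
Since the quoted text draws the comparison with C, here is a minimal C sketch of an order-dependent call. It is an illustration, not Scheme, and the helper names are invented. The two argument expressions are function calls, so their relative order is unspecified but the behaviour is not undefined.

```c
#include <assert.h>

/* Returns the current counter value, then increments it (a side effect). */
int next_id(int *counter) { return (*counter)++; }

int subtract(int a, int b) { return a - b; }

/* The two calls to next_id() may run in either order, so the result
   is -1 (left argument first) or 1 (right argument first). */
int order_dependent(void)
{
    int counter = 0;
    return subtract(next_id(&counter), next_id(&counter));
}

/* Naming the intermediate values pins the order down. */
int order_fixed(void)
{
    int counter = 0;
    int first = next_id(&counter);
    int second = next_id(&counter);
    return subtract(first, second);   /* always 0 - 1 */
}

void eval_order_demo(void)
{
    int r = order_dependent();
    assert(r == -1 || r == 1);        /* either is a conforming result */
    assert(order_fixed() == -1);
}
```

An implementation that randomizes or parallelizes argument evaluation, as described above for some Schemes, would expose code like order_dependent() quickly.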

>Its built-in imperatives have undefined return values.
>
>ANSI Common Lisp leaves the effects undefined of modifying literals,
>just like C. ANSI Lisp code that perpetrates some kind of error is
>safe only if compiled in safe mode; if you compile with reduced safety,
>e.g. (declare (optimize (safety 0))), then errors become undefined
>behavior, including type errors. If you declare that some quantity is
>a fixnum integer, and request safety 0 speed 3, and then it turns
>out that it's other than an integer, woe to that code.
>However, in these cases you're invoking the safety escape hatch;
>it's not like C where you are shackled by chains of undefined behavior
>which make themselves felt every time you squirm.

And Lisp's optimization settings can be changed per function or per
compilation unit as well as globally. ["declaim" vs "declare"]

Re: what is defined, was for or against equality

<22-01-048@comp.compilers>

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Wed, 12 Jan 2022 19:02:48 -0000 (UTC)
Organization: news.netcologne.de
Lines: 48
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-048@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers> <22-01-038@comp.compilers> <22-01-041@comp.compilers> <22-01-044@comp.compilers> <22-01-045@comp.compilers> <22-01-046@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="47748"; mail-complaints-to="abuse@iecc.com"
Keywords: standards, comment
Posted-Date: 12 Jan 2022 17:53:19 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Thomas Koenig - Wed, 12 Jan 2022 19:02 UTC

gah4 <gah4@u.washington.edu> schrieb:
> On Tuesday, January 11, 2022 at 11:47:26 AM UTC-8, Kaz Kylheku wrote:
>
> (big snip)
>
>> This leaves a lot of room for Fortran and C to have entirely different
>> defined/undefined behaviors.
>
>> Even the front end for one single language can have a lot of switches
>> affecting what is defined or not.
>
> I suppose so. But more usually, the compiler works to the least
> common denominator.
>
> For one, C requires static variables, and especially external ones, to
> be initialized to zero, but Fortran doesn't. Fortran compilers that use C
> compiler middle and back ends tend to zero such variables.

This is more a matter of operating system and linker conventions
than of compilers.

Looking at the ELF standard, one finds

.bss

This section holds uninitialized data that contribute to the program's
memory image. By definition, the system initializes the data with zeros
when the program begins to run. The section occupies no file space, as
indicated by the section type, SHT_NOBITS.

which, unsurprisingly, matches exactly what C is doing.

Anybody who writes a Fortran compiler for an ELF system will
use .bss for COMMON blocks, because it is easiest. Initialization
with zeros then happens automatically.
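
For the C side of this, a minimal sketch follows. The variable and function names are hypothetical. Placement in .bss is what a typical ELF toolchain does rather than something the C standard promises; the standard only requires the zero values.

```c
#include <assert.h>
#include <stddef.h>

/* No initializers: C gives objects with static storage duration the
   value zero. On ELF targets a compiler will typically place these in
   .bss, so they take no space in the object file. */
static int hit_count;
static double samples[1024];
static const char *last_name;   /* pointers start out null */

/* Returns 1 if every object above still has its initial zero value. */
int statics_are_zero(void)
{
    if (hit_count != 0 || last_name != NULL)
        return 0;
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (samples[i] != 0.0)
            return 0;
    return 1;
}

void bss_demo(void)
{
    /* Nothing has written to the objects yet, so this holds. */
    assert(statics_are_zero());
}
```

On an ELF system, running a tool such as size or nm on the resulting object file should show these objects counted under bss rather than data.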

> I suspect that there are many more that I don't know about.
> As long as the cost is small, and it satisfies both standards,
> not much reason not to do it.
>
> Fortran has stricter rules on aliasing than C. I don't actually know
> about any effect on C programs, though, but it might be that
> compilers do the same for C.

The rules are different, and unless C is the intermediate language,
a good compiler will hand the corresponding hints to the middle end.
[I have used Fortran systems that initialized otherwise undefined data to a value that would
trap, to help find use-before-set errors. -John]

Re: what is defined, was for or against equality

<22-01-050@comp.compilers>

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.compilers
Subject: Re: what is defined, was for or against equality
Date: Thu, 13 Jan 2022 08:24:32 +0100
Organization: A noiseless patient Spider
Lines: 59
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-01-050@comp.compilers>
References: <17d70d74-1cf1-cc41-6b38-c0b307aeb35a@gkc.org.uk> <22-01-016@comp.compilers> <22-01-018@comp.compilers> <22-01-020@comp.compilers> <22-01-027@comp.compilers> <22-01-032@comp.compilers> <22-01-038@comp.compilers> <22-01-041@comp.compilers> <22-01-044@comp.compilers> <22-01-045@comp.compilers> <22-01-046@comp.compilers> <22-01-048@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="8334"; mail-complaints-to="abuse@iecc.com"
Keywords: standards
Posted-Date: 14 Jan 2022 12:39:06 EST
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-01-048@comp.compilers>
Content-Language: en-GB
 by: David Brown - Thu, 13 Jan 2022 07:24 UTC

On 12/01/2022 20:02, Thomas Koenig wrote:
> gah4 <gah4@u.washington.edu> schrieb:
>> On Tuesday, January 11, 2022 at 11:47:26 AM UTC-8, Kaz Kylheku wrote:

>> For one, C requires static variables, and especially external ones, to
>> be initialized to zero, but Fortran doesn't. Fortran compilers that use C
>> compiler middle and back ends tend to zero such variables.
>
> This is more a matter of operating system and linker conventions
> than of compilers.
>
> Looking at the ELF standard, one finds
>
> .bss
>
> This section holds uninitialized data that contribute to the program's
> memory image. By definition, the system initializes the data with zeros
> when the program begins to run. The section occupies no file space, as
> indicated by the section type, SHT_NOBITS.
>
> which, unsurprisingly, matches exactly what C is doing.
>
> Anybody who writes a Fortran compiler for an ELF system will
> use .bss for COMMON blocks, because it is easiest. Initialization
> with zeros then happens automatically.

I was under the impression that FORTRAN compilers typically put data in
the ".common" section of object files. A key difference between .common
and .bss is that (with standard linker setup) duplicate symbols in .bss
are an error, while duplicate symbols in .common are merged. But in C
startup code, .common is also zeroed (FORTRAN may have different startup
code here - with no experience of the language, I don't know such details).

The use of ".common" by C compilers such as gcc was common practice
precisely to improve compatibility with FORTRAN in the early days, and
it let people write "int global_x;" in headers and have everything work,
rather than the correct practice of "extern int global_x;" in headers
and a single "int global_x;" in one object file. The big disadvantages
are that if you have "int local_x;" in two files, and don't use static,
they'll be merged with no error, and if you have "int global_x;" in one
file and "double global_x;" in another, it's a mess. Since version 10, gcc
defaults to "-fno-common" to avoid this.
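
A sketch of the "correct practice" described above, with header and source file collapsed into one listing for brevity. The names global_x and local_x come from the paragraph; bump_counters and common_demo are made up for the example.

```c
#include <assert.h>

/* What would live in the header: a declaration only, no storage. */
extern int global_x;

/* What would live in exactly one .c file: the single definition.
   Without an initializer it is a tentative definition and starts at 0. */
int global_x;

/* Internal linkage: a second file can have its own local_x without
   the linker ever seeing a clash or silently merging the two. */
static int local_x;

int bump_counters(void)
{
    global_x += 1;
    local_x += 10;
    return global_x + local_x;
}

/* Meant to be called once, before anything else touches the counters. */
void common_demo(void)
{
    assert(global_x == 0);
    assert(bump_counters() == 11);
    assert(bump_counters() == 22);
}
```

With "-fno-common", a stray second "int global_x;" in another translation unit shows up as a multiple-definition error at link time instead of being merged.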

>
>> I suspect that there are many more that I don't know about.
>> As long as the cost is small, and it satisfies both standards,
>> not much reason not to do it.
>>
>> Fortran has stricter rules on aliasing than C. I don't actually know
>> about any effect on C programs, though, but it might be that
>> compilers do the same for C.
>
> The rules are different, and unless C is the intermediate language,
> a good compiler will hand the corresponding hints to the middle end.

AFAIUI the difference in aliasing rules is that in FORTRAN, pointer or
array parameters are assumed not to alias, while in C the compiler must
assume that they might alias, unless you use "restrict". Are there
other differences?
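
For reference, the C half of that difference can be sketched as follows (C99 or later for "restrict"; the function names are invented for the example):

```c
#include <assert.h>

/* Plain pointers: the compiler must allow for a == b, so it has to
   reload *a after the store through b. */
int store_both(int *a, int *b)
{
    *a = 1;
    *b = 2;
    return *a;          /* 2 if a and b alias, otherwise 1 */
}

/* With restrict the caller promises the two objects do not overlap,
   which lets the compiler return the constant 1 without reloading.
   Calling this with a == b would be undefined behaviour. */
int store_both_noalias(int *restrict a, int *restrict b)
{
    *a = 1;
    *b = 2;
    return *a;
}

void aliasing_demo(void)
{
    int x = 0, y = 0;
    assert(store_both(&x, &y) == 1);          /* distinct objects */
    assert(store_both(&x, &x) == 2);          /* aliased: allowed */
    assert(store_both_noalias(&x, &y) == 1);  /* promise kept */
}
```

The second function corresponds roughly to the no-alias assumption described above for FORTRAN parameters: the promise is the programmer's to keep, and the compiler is free to rely on it.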
