novaBBS - comp.lang.c - Re: bart again (UCX64)

On 9/7/23 10:16 PM, Keith Thompson wrote:
> Richard Damon <Richard@Damon-Family.org> writes:
>> On 9/7/23 4:33 PM, Keith Thompson wrote:
>>> Richard Damon <Richard@Damon-Family.org> writes:
>>>> On 9/5/23 11:53 PM, candycane wrote:
>>>>> KK> This doesn't:
>>>>> KK> int fred(void)
>>>>> KK> {
>>>>> KK> while (1) {
>>>>> KK> }
>>>>> KK> }
>>>>> KK> The previous one will be a nuisance diagnostic since the always_true
>>>>> KK> function always returns true.
>>>>> KK> If the belief is correct, why not restructure the code:
>>>>> KK> int fred(void)
>>>>> KK> {
>>>>> KK> extern int always_true(void);
>>>>> KK> for (;;) {
>>>>> KK> always_true(); // call for its side effect only
>>>>> KK> }
>>>>> KK> }
>>>>> I may be reading this wrong, but aren't you getting into the Halting
>>>>> Problem at this point?
>>>>
>>>> Fully solving the problem would be the Halting Problem, but there
>>>> are shortcuts that can get the vast majority.
>>>>
>>>> Mark functions that don't ever return with _Noreturn or [[noreturn]]
>>>> (or an equivalent).
>>>>
>>>> Assume loops with a non-constant control expression will at some point
>>>> terminate (yes, this isn't strictly true, but is a reasonable
>>>> assumption, and is made elsewhere in the standard).
>>>>
>>>> Since the above program has a for loop with an always true condition,
>>>> that can be assumed never to end (since there is no "break" in the
>>>> loop).
>>> It's not necessary to assume that such loops always terminate. Just
>>> don't assume anything one way or the other.
>>
>> Except that if you want to make falling off the end of a non-void
>> function an error, you can't do that.
>>
>> Note, from 6.8.5 p6 which states (using N2596):
>>
>> An iteration statement may be assumed by the implementation to
>> terminate if its controlling expression is not a constant
>> expression,171) and none of the following operations are performed in
>> its body, controlling expression or (in the case of a for statement)
>> its expression-3:172)
>> — input/output operations
>> — accessing a volatile object
>> — synchronization or atomic operations.
>>
>> This is a rule that allows optimizations to remove a loop that has no
>> external effect. A similar rule could be used to determine if the end
>> of the function is potentially reachable.
>
> That does complicate things. Note that currently an implementation is
> allowed but not required to "assume" that the loop terminates.
> Currently, the standard never requires control flow analysis to
> determine whether a constraint is violated.
>
> I'm not sure what the best solution is. I'd like to say that 6.8.5p6
> should be ignored when determining whether a diagnostic is required for
> potentially reaching a closing }.

The issue is that many functions DON'T reach the final }, and some
return values (particularly structures) might be "expensive", so it
would be desirable to minimize cases where a dummy return statement is
needed.

Using a simple test will catch almost every real case. It will be rare
for real code to use a non-constant expression whose value the
programmer KNOWS will always be true, and if that is being done for side
effects, it could just be changed to add a comma operator followed by
true to make it constant.

That, or you can add the dummy return statement if cheap (like it
normally will be).

>
> Full disclosure: I really dislike that statement in the standard, both
> because of its semantics and because it's written to permit an
> implementation to "assume" certain things rather than in terms of
> permitted behavior.

My guess is that some library + optimizer ran into the issue that some
cases generated "worthless" looping looking for something that was never
used. Seeing that the results of the loop had no effect, just removing
it helped speed up the code.

Note, the actual cases where it can be used are fairly limited, as it
says, in effect, that the loop creates no observable behavior, and the
compiler can determine the value of any variable changed in the loop
that is used later.
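
To make that concrete, here is a minimal sketch of the kind of loop the paragraph covers. The function names are made up for illustration and nothing here is taken from a real library. The controlling expression is not a constant expression and the body has no I/O, volatile access, or atomic operations, so an implementation may assume the loop terminates:

```c
/* Hypothetical example: count the steps of the "3n+1" iteration.
   Precondition for this sketch: n >= 1. */
unsigned collatz_steps(unsigned long n)
{
    unsigned steps = 0;
    /* Non-constant controlling expression, no I/O, no volatile access,
       no atomic or synchronization operations: 6.8.5p6 applies. */
    while (n != 1) {
        n = (n % 2 == 0) ? n / 2 : 3 * n + 1;
        steps++;
    }
    return steps;
}

void warm_up(unsigned long n)
{
    /* The result is discarded, so an optimizer is permitted (but not
       required) to remove the loop instead of running it. */
    (void)collatz_steps(n);
}
```

Whether a given compiler actually drops the loop in warm_up() depends on the optimization level and on whether it can see the definition of collatz_steps().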

Note that the standard is FULL of such assumptions, because any statement
that can have Undefined Behavior in certain conditions means the
implementation is allowed to assume that those conditions don't exist.
And that assumption can even precede the statement with Undefined
Behavior if no observable behavior happens in between.

The assumption just isn't as clearly spelled out.
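
A minimal sketch of the sort of inference meant here (the function name is hypothetical). The dereference has undefined behavior for a null pointer, so an implementation is allowed to assume the pointer is not null and treat the later check as dead code:

```c
#include <stddef.h>

/* Hypothetical example: the dereference comes before the null check. */
int first_element_or_zero(const int *p)
{
    int first = *p;      /* undefined behavior if p is a null pointer */
    if (p == NULL) {     /* so the compiler may assume p != NULL here */
        return 0;        /* and is allowed to drop this branch */
    }
    return first;
}
```

Called with a valid pointer the function behaves the same either way; the difference only shows up for the input that was already undefined.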

>
>>> For example:
>>> int fred(void) {
>>> while (1) {
>>> ;
>>> }
>>> }
>>> Since the condition is a constant expression, a reasonably clever
>>> compiler can prove that the closing } is unreachable, so no warning is
>>> necessary. But:
>>> int barney(void) {
>>> while (some_function()) {
>>> ;
>>> }
>>> }
>>> Since the condition is not constant, the compiler doesn't know whether
>>> the closing } is reachable. If, as I've suggested, C were to adopt the
>>> C# rules, a diagnostic message would be mandatory, because the closing }
>>> *might* be reached. (Even if the compiler is able to determine that
>>> some_function() always returns 1, because it's defined in the same
>>> translation unit, the diagnostic message would still be required.)
>>
>> Right, because we need to define the level of effort the compiler
>> needs to do. And it should avoid "surprises" as much as possible.
>>
>>> In other words, define a straightforward set of rules for determining
>>> whether a point in the code (particularly the closing } of a non-void
>>> function other than main) is potentially reachable, rules that do not
>>> make any assumptions about non-constant values.
>>>
>>
>> Right, and using the existing "rule" about loops with non-constant
>> control expressions makes the most sense.

Which is why I suggested it.

>>
>>> Such a change to C would break some existing code. That's absolutely a
>>> valid reason to oppose it. On the other hand, every major edition of
>>> the C standard has broken some existing code, and compilers have always
>>> supported older versions and/or used non-fatal warnings, so such code
>>> would not be *fatally* broken.
>>> With the C# rules, there are some unavoidable false positives, such as:
>>> int sign(int n) {
>>> if (n < 0) return -1;
>>> if (n == 0) return 0;
>>> if (n > 0) return 1;
>>> }
>>> which are fairly easy to "fix". There are, if I'm not mistaken, no
>>> false negatives.
>>
>> Right. But make the function take a double instead of an int, and it
>> CAN fall through (on some implementations) since there exist double
>> values that are none of >0, <0 or == 0.
>
> That's a different function. I wrote the sign() function above to make
> the point I wanted to make.
>
>> One option would be to allow an [[unreachable]] attribute to mark that
>> you can't get here, even if the base rules don't prove it, and leave
>> the results Undefined if you somehow do get there.
>
> I like it. It's similar to the /*NOTREACHED*/ comment recognized by
> many lint implementations.
>
> It could be added *instead* of requiring reachability analysis, and a
> programmer could add [[unreachable]] before the closing } of a non-void
> function. But adding the requirement I propose would catch errors made
> by programmers who don't do that.
>

On 2023-09-09, David Brown <david.brown@hesbynett.no> wrote:
> On 09/09/2023 03:57, Kaz Kylheku wrote:
>> But then you have to be sure that this logical fact is true;
>> since it is an iron-clad assertion, the justification behind it
>> (which the compiler doesn't see nor care about) has to be iron
>> clad.
>
> Yes.
>
> But if I write "p++;", I also have to be absolutely sure that this is
> the correct action at the time. That's how programming works - I really
> do not see why anyone would think "unreachable();" is so special.

The thing is, no not always. Sometimes whether we increment
a variable is a requirement matter, where there are choices
that are not fatal.

E.g. we are classifying packets with a score. Do I give a score += 2 for
this condition, or just score += 1? Neither one is going to make
demons fly out of my nose. The software will behave well in
either way, with the output having different merits or whatever.
(The score += 2 is one closer to INT_MAX, so without knowing anything
else, it carries a risk that is lower in score += 1).

We can change details in programs to explore different behaviors
and decide on what the requirements should be, without there always
being an absolutely correct single choice which, if not made,
will bomb the program.

>> There is a risk in it because it's possible that if you neither
>> add "return 0" nor "unreachable()" that the situation might not
>> be erroneous at all if that end is reached.
>
> "Possibly not erroneous" implies "possibly erroneous", which - by
> definition - means "undefined behaviour", because you can't be sure
> about what will happen. So you are worried about adding explicit
> undefined behaviour in a situation where you currently already have
> undefined behaviour?

No; more like adding undefined behavior in where we don't have all the
information to know whether anything is wrong in the first place.

This is a trade: the hope (more than hope, but make sure) that
unreachable() is in fact unreachable. Then if the program is smaller
and/or faster, it may represent a valid tradeoff.

There are some ways to try to obtain the benefits while containing
the risk, like having a wrapper for unreachable() like

#ifndef NDEBUG
#define assert_unreachable (abort())
#else
#define assert_unreachable (unreachable())
#endif

now in an assertion-enabled build we have the aborts, with
the code-bloat in our inline functions.

Based on confidence with testing that build, we can publish
an NDEBUG build.
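
A self-contained sketch of how such a wrapper might be used. The fallback definition of unreachable() for pre-C23 compilers, the enum, and the function name are assumptions of this example, not part of the proposal above:

```c
#include <stddef.h>   /* C23 defines the unreachable() macro here */
#include <stdlib.h>   /* abort() */

/* Assumed fallback for compilers without C23 unreachable(). */
#ifndef unreachable
#  if defined(__GNUC__) || defined(__clang__)
#    define unreachable() __builtin_unreachable()
#  else
#    define unreachable() abort()
#  endif
#endif

#ifndef NDEBUG
#define assert_unreachable (abort())
#else
#define assert_unreachable (unreachable())
#endif

enum color { RED, GREEN, BLUE };   /* hypothetical type */

const char *color_name(enum color c)
{
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    /* Debug build: abort here. NDEBUG build: optimizer hint, and
       undefined behavior if this point is ever actually reached. */
    assert_unreachable;
}
```

With this arrangement a stray value is caught as a crash while testing, and only the NDEBUG build carries the stronger promise.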

> Do you understand why I don't see writing "unreachable();" as
> particularly dangerous, compared to anything else in the code?

In summary no. There are changes we can make in programs that are
objectively safer than "insert UB here"; the amount of reasoning
required to convince ourselves that certain changes are okay is
considerably lower. An example is exploratory programming whereby
we tweak behaviors in the software which will then be captured
as requirements when settled.

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
Mastodon: @Kazinator@mstdn.ca
NOTE: If you use Google Groups, I don't see you, unless you're whitelisted.

Re: bart again (UCX64)

<87y1hfks06.fsf@bsb.me.uk>

https://www.novabbs.com/devel/article-flat.php?id=28992&group=comp.lang.c#28992

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: ben.use...@bsb.me.uk (Ben Bacarisse)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sat, 09 Sep 2023 21:31:53 +0100
Organization: A noiseless patient Spider
Lines: 36
Message-ID: <87y1hfks06.fsf@bsb.me.uk>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com>
<GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com>
<20230907235623.619@kylheku.com> <udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me>
<543b9acc-de99-425c-bb1d-485e9f98889dn@googlegroups.com>
<udf9jn$3fta8$1@dont-email.me>
<7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>
<874jk4mam8.fsf@bsb.me.uk>
<14ea840a-b27e-49c3-99b0-edebbb020662n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain
Injection-Info: dont-email.me; posting-host="af15565c84ea4ea7db7cc714af21fb35";
logging-data="219833"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+2WdoGXOM/sLd/d/tXfuF/vDDAI7ZIHp8="
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)
Cancel-Lock: sha1:4uBzVy6OB8I4P1r5T43wYike1Gs=
sha1:/Gxnc8CgHMdsrOrZ+tShR0Rd0Iw=
X-BSB-Auth: 1.57707f6f715b23d432f5.20230909213153BST.87y1hfks06.fsf@bsb.me.uk

by: Ben Bacarisse - Sat, 9 Sep 2023 20:31 UTC

Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

> On Saturday, 9 September 2023 at 01:52:30 UTC+1, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm.ar...@gmail.com> writes:
>>
>> > int sign(double x)
>> > {
>> > if (x < 0) return -1;
>> > if (x == 0) retun 0;
>> > if (x > 0) return 1;
>> > /* unreachable */
>> > }
>> Just for information...
>> > In fact sign() needs to return a double to handle the corner case properly.
>> >
>> > It's NaN of course. The sign of NaN is NaN.
>> In IEEE floating point (almost universal these days) NaNs (and
>> infinities) are signed, and the sign often carries useful information.
>> You can't just state that "the sign of NaN is NaN" as if it's a plain
>> fact. One might want, for example, something more like:
>>
>> return x == 0 ? 0 : 1 - 2*signbit(x);
>>
>> The information alluded to above being that some readers may not know
>> that C has a signbit macro.
>>
> IEEE also allows for signed zero. But the sign of zero is zero. Certainly for
> positive zero. I suppose you might argue that for the rare negative zeroes
> you should return -1.

One might argue that, yes. But I decided to try to make minimal changes
to the behaviour of the posted code. The hypothetical author appears to
want positive and negative zeros to have sign(...) == 0.
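
A slightly more defensive sketch of the same idea (the name sign_of is made up for this example). The standard only says that signbit yields a nonzero value when the sign is negative, not necessarily 1, so this version tests the result rather than doing arithmetic on it:

```c
#include <math.h>

/* Sketch: same intent as the one-liner above, without relying on
   signbit() returning exactly 1 for negative values. */
int sign_of(double x)
{
    if (x == 0) {
        return 0;                  /* covers both +0.0 and -0.0 */
    }
    return signbit(x) ? -1 : 1;    /* infinities and NaNs are classified by sign bit */
}
```

As with the original expression, both zeros map to 0 and everything else, NaNs included, is classified purely by its sign bit.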

--
Ben.

Re: bart again (UCX64)

<udk4a5$hkp5$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29008&group=comp.lang.c#29008

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 12:03:48 +0200
Organization: A noiseless patient Spider
Lines: 139
Message-ID: <udk4a5$hkp5$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me>
<543b9acc-de99-425c-bb1d-485e9f98889dn@googlegroups.com>
<udf9jn$3fta8$1@dont-email.me>
<7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 10 Sep 2023 10:03:49 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6626f03646a3960531a6c6dcf45c9912";
logging-data="578341"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18Gn0rdPOdGZUpIt180b7RgZ+aknUw7rX0="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.13.0
Cancel-Lock: sha1:4OhA5kQvgSgy6g19LnunQ0VBB90=
Content-Language: en-GB
In-Reply-To: <7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>

by: David Brown - Sun, 10 Sep 2023 10:03 UTC

On 09/09/2023 01:06, Malcolm McLean wrote:
> On Friday, 8 September 2023 at 15:03:51 UTC+1, David Brown wrote:
>> On 08/09/2023 11:47, Malcolm McLean wrote:
>>
>>> So a defined behaviour fallback is potentially useful. "This can't happen,
>>> but if it does, give me this diagnostic".
>> Where did you learn about logic? From watching Monty Python?
>>
>> Things that can't happen, can't happen. If the thing /might/ happen,
>> then it /can/ happen.
>>
>> Do you think I write my code by guesswork? Do you think I say to
>> myself, "I guess this is unlikely, and I don't want to consider it -
>> we'll pretend it's impossible and tell the compiler it's unreachable.
>> I'm sure no one will notice" ?
>>
> OK so
>
> int sign(double x)
> {
> if (x < 0) return -1;
> if (x == 0) retun 0;
> if (x > 0) return 1;
> /* unreachable */
> }
>
> Yes?

No :

// Precondition - "x" is a finite real-value double
// Postcondition - the return value is an integer -1, 0 or 1
// matching the sign of "x".
int sign(double x)
{
    if (x < 0) return -1;
    if (x == 0) return 0;
    if (x > 0) return 1;
    unreachable();
}

> No. In David Brown world, no-one would ever make that mistake.

I assume you are talking about double values that are unordered with
respect to 0, rather than your typo. Most people /do/ make typos like
this on occasion, but their editor or IDE would pick up on it
immediately - and their compiler would certainly spot it. I recommend
that you get a spell-checker for your Usenet client - it would pick up
on your regular typos. (I would make many more mistakes without a
spell-checker.)

You should write "unreachable();" to show that the line cannot be
reached by your code, as used in your program. If it might be reached,
don't write "unreachable();" - write the correct code to handle the
situation.

You seem to be under the strange misconception that some kind of omitted
or default behaviour is acceptable here. If "sign" is called with a NaN
or other value that is unordered with 0, then there is no plausible
correct value to return. A default "return 0;" would be wrong. Having
no return statement, so that in practice an unspecified value is
returned, is also wrong.

Writing "unreachable();" does not give a correct answer if someone fails
to follow the precondition for calling this function - but it makes it
clear to the reader and to the compiler that this line cannot be reached
when the function is used correctly. That is better than leaving a
mysterious and uninformed incorrect behaviour if the function is used
incorrectly.

> In reality,
> it's the sort of thing that is quite likely to slip through.

Not in my world, no. To me, programming is about specifying behaviour
and writing implementations of those specifications. It is certainly
the case that some specifications and details are never properly
documented, but they exist nonetheless. It seems entirely clear to me
that the function "sign" will be specified to work on real numbers, not
NaNs, because they do not have a sign.

> In fact sign() needs
> to return a double to handle the corner case properly.

No, it does not.

If you want to specify a function that takes a double and returns -1, 0,
1 or NaN if the argument does not have a sign, then you can do that -
calling the function "sign" would be a very bad name.

>
> It's NaN of course. The sign of NaN is NaN.
> Or you could say that NaN isn't in the domain of the function and either catch
> it, or just ignore it and accept any garbage - the real bug will be upstream.

You are beginning, slowly, to understand a little about the fundamentals
of programming. A function needs a specification. It has
pre-conditions, saying what must be true before the function starts. It
has post-conditions, saying what the function establishes when it is
finished. It has invariants, which are things that are true before it
starts, and remain true when it is finished. This gives you the
specification of the function.

Specifications are contracts - as the caller of the function, you
promise to fulfil the pre-conditions. The implementer of the function
then promises to fulfil the post-conditions. Stick to the contract, and
everyone is happy - lie and break the contract, and you have no reason
to expect anything.

Now, no one has given a clear specification of this "sign" function - we
have the name, and we can guess the specification from the name and the
original implementation. The pre-condition is that the argument is a
value that is ordered with respect to 0. The post-condition is an
integer -1, 0, or 1 representing that order with 0.
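
One way such a contract might be written down in code, as a sketch only: the name sign_checked is hypothetical, and the assert is merely a debug-build check of the caller's promise, not part of the specification itself.

```c
#include <assert.h>
#include <math.h>

/* Precondition:  x is ordered with respect to 0 (that is, x is not a NaN).
   Postcondition: returns -1, 0 or 1 according to how x compares with 0. */
int sign_checked(double x)
{
    assert(!isunordered(x, 0.0));  /* compiled out when NDEBUG is defined */
    if (x < 0) return -1;
    if (x == 0) return 0;
    return 1;                      /* only x > 0 remains under the precondition */
}
```

Infinities are ordered with respect to 0, so they satisfy the precondition; only NaNs violate it.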

If that is the correct specification, then the implementation is
correct. The final line can be "unreachable();", "return 42;", omitted
entirely, or anything else - it's all fine. The sensible choice that is
clearest to the reader and gives the compiler the best chance for
optimisation and static error checking is "unreachable();".

If you want to change the specification and have a different function,
that's fine - and then you have a different implementation.

> Or you could say that the code is unreachable because we always filter
> out NaN. But how would you prove that?
>

Don't give the function a NaN in the first place.

None of my programs ever generate NaNs, and never have to deal with
them. I realise that some kinds of coding find NaNs, infinities, and
other non-real floating point values to be useful, but I don't. You are
only going to see them in some pretty odd and unusual calculations, so
take them into consideration at that point in your code.

Re: bart again (UCX64)

<udk99c$i9n7$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29010&group=comp.lang.c#29010

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 13:28:43 +0200
Organization: A noiseless patient Spider
Lines: 103
Message-ID: <udk99c$i9n7$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me> <udeqdh$3dhqr$1@dont-email.me>
<udf0dl$3ehqn$1@dont-email.me> <20230908185106.539@kylheku.com>
<udi6v0$52sl$1@dont-email.me> <20230909103128.789@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 10 Sep 2023 11:28:44 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6626f03646a3960531a6c6dcf45c9912";
logging-data="599783"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18GKsSfPQn7aahM6eTLdCu42qsQcJAX5W0="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.13.0
Cancel-Lock: sha1:Mi7URwNNqhMm3yyQ89GObUKJ71Y=
Content-Language: en-GB
In-Reply-To: <20230909103128.789@kylheku.com>

by: David Brown - Sun, 10 Sep 2023 11:28 UTC

On 09/09/2023 20:19, Kaz Kylheku wrote:
> On 2023-09-09, David Brown <david.brown@hesbynett.no> wrote:
>> On 09/09/2023 03:57, Kaz Kylheku wrote:
>>> But then you have to be sure that this logical fact is true;
>>> since it is an iron-clad assertion, the justification behind it
>>> (which the compiler doesn't see nor care about) has to be iron
>>> clad.
>>
>> Yes.
>>
>> But if I write "p++;", I also have to be absolutely sure that this is
>> the correct action at the time. That's how programming works - I really
>> do not see why anyone would think "unreachable();" is so special.
>
> The thing is, no not always. Sometimes whether we increment
> a variable is a requirement matter, where there are choices
> that are not fatal.

And sometimes an incorrect increment /will/ be fatal.

Sometimes an incorrect "unreachable();" will be fatal, sometimes it will
not.

I can certainly agree that "unreachable();" should not be used lightly -
you don't put it in code unless you are entirely sure that this point
will never be reached. But that applies to most things - you don't put
"return" in your code unless you are entirely sure that at this point,
you want to exit the function.

>
>>> There is a risk in it because it's possible that if you neither
>>> add "return 0" nor "unreachable()" that the situation might not
>>> be erroneous at all if that end is reached.
>>
>> "Possibly not erroneous" implies "possibly erroneous", which - by
>> definition - means "undefined behaviour", because you can't be sure
>> about what will happen. So you are worried about adding explicit
>> undefined behaviour in a situation where you currently already have
>> undefined behaviour?
>
> No; more like adding undefined behavior in where we don't have all the
> information to know whether anything is wrong in the first place.

If you don't have all the information to know if there is anything
wrong, get that information! It would be completely unacceptable to me
to have a point in my code where I don't know if things are running
along nicely, or if the nasal daemons are dancing.

>
> This is a trade: the hope (more than hope, but make sure) that
> unreachable() is in fact unreachable. Then if the program is smaller
> and/or faster, it may represent a valid tradeoff.

You are also marking it for readers. If you want, as a human reader you
can interpret it as meaning, "the control flow will only hit this point
if you have really screwed up somewhere earlier". That is, IMHO, vastly
better than something like "return 0;" which tells the reader "this is a
normal exit from normal flow of the code - if you can't see how you can
reach this point, you have misunderstood the code", or alternatively
"abort()" which tells the reader "calling the function containing this
line may crash your program".

>
> There are some ways to try to obtain the benefits while containing
> the risk, like having a wrapper for unreachable() like
>
> #ifndef NDEBUG
> #define assert_unreachable (abort())
> #else
> #define assert_unreachable (unreachable())
> #endif

Sure. I don't see that as "containing the risk" - but I /do/ see it as
a debugging and fault-finding aid to help spot problems in your code.
In particular, it can help establish limits on where a problem has occurred.

>
> now in an assertion-enabled build we have the aborts, with
> the code-bloat in our inline functions.
>
> Based on confidence with testing that build, we can publish
> an NDEBUG build.
>
>> Do you understand why I don't see writing "unreachable();" as
>> particularly dangerous, compared to anything else in the code?
>
> In summary no. There are changes we can make in programs that are
> objectively safer than "insert UB here"; the amount of reasoning
> required to convince ourselves that certain changes are okay is
> considerably lower. An example is exploratory programming whereby
> we tweak behaviors in the software which will then be captured
> as requirements when settled.
>

I don't see "unreachable()" as saying "insert UB here". I see it as
telling the reader and the compiler that code flow will never reach this
point. And I don't put it there unless I am sure it is correct to do
so. The line is not reached at run-time - therefore, there can be no
run-time undefined behaviour from the line.

Re: bart again (UCX64)

<udkcak$in65$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29012&group=comp.lang.c#29012

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc...@freeuk.com (Bart)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 13:20:36 +0100
Organization: A noiseless patient Spider
Lines: 39
Message-ID: <udkcak$in65$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me> <udeqdh$3dhqr$1@dont-email.me>
<udf0dl$3ehqn$1@dont-email.me> <20230908185106.539@kylheku.com>
<udi6v0$52sl$1@dont-email.me> <20230909103128.789@kylheku.com>
<udk99c$i9n7$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 10 Sep 2023 12:20:36 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c4504a1daa8fa819485410b3a018db78";
logging-data="613573"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/u1noCl97939lYNg7IiMQnqSM+CEAIpWo="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.15.0
Cancel-Lock: sha1:Jn+joohIwoH1AWUCeWoz7hnUUCk=
In-Reply-To: <udk99c$i9n7$1@dont-email.me>

by: Bart - Sun, 10 Sep 2023 12:20 UTC

On 10/09/2023 12:28, David Brown wrote:
> On 09/09/2023 20:19, Kaz Kylheku wrote:

>> In summary no. There are changes we can make in programs that are
>> objectively safer than "insert UB here"; the amount of reasoning
>> required to convince ourselves that certain changes are okay is
>> considerably lower. An example is exploratory programming whereby
>> we tweak behaviors in the software which will then be captured
>> as requirements when settled.
>>
>
> I don't see "unreachable()" as saying "insert UB here". I see it as
> telling the reader and the compiler that code flow will never reach this
> point. And I don't put it there unless I am sure it is correct to do
> so. The line is not reached at run-time - therefore, there can be no
> run-time undefined behaviour from the line.

Would you be happy if the compiler used 'unreachable()' as a directive
to mean no normal function return code is necessary?

So that if control flow did somehow reach it, it would execute whatever
garbage instruction bytes followed.

This seems reasonable enough:

// a comment
return 1;
unreachable();
}

Then one day someone decides to improve the quality of the comments, and
that one changes to:

// a comment ending with \

Of course, the program is already working, and doesn't need retesting
because after all only a comment has been modified.
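
To make the splice concrete, here is a small sketch that stays within defined behaviour (the function name is made up). Line splicing happens before comments are recognised, so the backslash at the end of the // comment pulls the next source line into the comment:

```c
int spliced_value(void)
{
    int n = 1;
    // this comment ends with a backslash \
    n = 2;
    return n;   /* returns 1: the assignment above is part of the comment */
}
```

In the scenario above it is the "return 1;" line that gets swallowed, leaving control to run into unreachable(). Compilers can usually warn about this (GCC, for example, has -Wcomment), but only if the warning is enabled and read.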

Re: bart again (UCX64)

<udkdn0$it6d$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29013&group=comp.lang.c#29013

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 14:44:15 +0200
Organization: A noiseless patient Spider
Lines: 116
Message-ID: <udkdn0$it6d$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me> <20230908185812.914@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 10 Sep 2023 12:44:16 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6626f03646a3960531a6c6dcf45c9912";
logging-data="619725"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX196zPSgh4oeBmjnyXtaMIBhA1wfeiV542s="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.13.0
Cancel-Lock: sha1:RWshNwQzmvHNpe9MquKqPg/avDw=
In-Reply-To: <20230908185812.914@kylheku.com>
Content-Language: en-GB

by: David Brown - Sun, 10 Sep 2023 12:44 UTC

On 09/09/2023 04:17, Kaz Kylheku wrote:
> On 2023-09-08, David Brown <david.brown@hesbynett.no> wrote:
>> So if you are not sure the code point is unreachable, don't "call"
>> unreachable(). It's very simple.
>
> Here is a bit of evening entertainment.
>
> In a Lisp compiler I developed I was surprised to see that it optimized
> away the return sequence from a function due to that being unreachable
> (not by being declared that way, just deduced through control flow
> analysis).
>
> Watch what happens at optimization levels 0, 3 and 5 to the
> expression (while t (put-line "hello")), an infinite loop:
>
> I was surprised because I was not "after" that at all; the optimizations were
> not developed with that specific situation in mind.
>
> It's like when people write chess programs which then surprise them
> by playing better than their makers.
>

(I'm snipping the Lisp stuff that I can't easily follow - but keeping
your analogy, since I can follow that!)

>
> So now the function doesn't include a wasteful end instruction.
> That stood out to my eyes right away because right until that point I was so
> used to seeing those terminating "end" instructions.
>
> In C, a compiler could do something similar: when the end of the function is
> not reachable, it could omit emitting any register restoring code and ret
> instruction, making the function smaller.

Yes, and C compilers often do that.

There's no point in manually adding "unreachable()" unless it conveys
useful information to the compiler and/or the reader. So it would be
pointless here:

int negabs(int x) {
    if (x < 0) {
        return x;
    } else {
        return -x;
    }
    unreachable();
}

But it would be entirely appropriate here:

// uartNo must be 0, 1 or 2
uart_t * get_uart_instance(int uartNo) {
    if (uartNo == 0) return UART0;
    if (uartNo == 1) return UART1;
    if (uartNo == 2) return UART2;
    unreachable();
}

Alternatively, you could write:

// uartNo must be 0, 1 or 2
uart_t * get_uart_instance(int uartNo) {
    if ((uartNo < 0) || (uartNo > 2)) unreachable();
    if (uartNo == 0) return UART0;
    if (uartNo == 1) return UART1;
    if (uartNo == 2) return UART2;
}

We could also write:

// uartNo must be 0, 1 or 2
uart_t * get_uart_instance(int uartNo) {
    static uart_t * const uarts[] = { UART0, UART1, UART2 };
    return uarts[uartNo];
}

Would you be as concerned about the implicit undefined behaviour in this
third version as you are with the explicit cases?

A smart enough compiler can figure out many things, but sometimes the
programmer has more information than is available to the compiler, and
sometimes compilers are not as smart as we'd like. C does not really
have a good way to tell the compiler useful things, or to give clear
specifications to functions. (C++ is getting some of this in C++23.)
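
For anyone who wants to compile the first get_uart_instance() variant, here
is a minimal sketch. The uart_t type and the UART0..UART2 definitions are
hypothetical stand-ins (real code would take them from a device header), and
the unreachable() fallback assumes either a C23 <stddef.h> or the GCC/Clang
builtin.

```c
#include <stddef.h>   /* C23 defines unreachable() here */
#include <stdlib.h>   /* abort(), used only as a last-resort fallback */

/* Hypothetical stand-ins for the memory-mapped UART blocks. */
typedef struct { int id; } uart_t;
static uart_t uart_blocks[3] = { {0}, {1}, {2} };
#define UART0 (&uart_blocks[0])
#define UART1 (&uart_blocks[1])
#define UART2 (&uart_blocks[2])

/* Pre-C23 fallback: use the compiler builtin if there is one. */
#ifndef unreachable
#  if defined(__GNUC__)
#    define unreachable() __builtin_unreachable()
#  else
#    define unreachable() abort()
#  endif
#endif

/* uartNo must be 0, 1 or 2 */
uart_t *get_uart_instance(int uartNo)
{
    if (uartNo == 0) return UART0;
    if (uartNo == 1) return UART1;
    if (uartNo == 2) return UART2;
    unreachable();   /* documents the precondition; no return needed here */
}
```

Calling get_uart_instance(1) yields UART1; calling it with 3 breaks the
documented precondition, and the behaviour is then undefined.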

>
> Now if you lie to the compiler that the end is unreachable, even though it is,
> it could do the optimization code anyway, and leave the branch in place. Or
> remove the branch also, and adjust the jmp so that the loop becomes infinite.

Yes, lying to your compiler always ends in tears.

>
> For instance suppose the instruction tested some register t4, which gets
> loaded from some global variable that could be changing. that "if t4 5",
> in the false case, jumps to code that we have declared unreachable.
> There is no point in that, so the instruction gets erased.
>
> Since the instruction gets erased, the instruction which calculated t4
> by accessing the global variable now writes a dead t4 register.
> So that instruction gets removed. The backwards jmp in the loop
> now goes to whatever instruction follows: the loop has become infinite;
> repeating without testing that global variable.
>
> Wrong facts that you declare trigger a cascade of deductions that are
> increasingly removed from the circumstances of that fact.
>

Yes. And correct facts that you declare can trigger a cascade of
deductions that increasingly improve the effectiveness of the code and
of static analysis.

So don't lie to your compiler - but /do/ tell it useful truths.
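
As a small illustration of a "useful truth", here is a sketch assuming a
GCC-compatible compiler. ASSUME() and half() are hypothetical names of my
own, not from any library. Given the fact that x is non-negative, a compiler
may turn the signed division into a plain shift rather than the longer
sequence it needs for negative values.

```c
/* Hypothetical helper: states a fact the compiler may rely on.
   If the condition is ever false, the behaviour is undefined. */
#if defined(__GNUC__)
#  define ASSUME(cond) do { if (!(cond)) __builtin_unreachable(); } while (0)
#else
#  define ASSUME(cond) ((void)0)   /* no hint available: do nothing */
#endif

/* Precondition: x >= 0 */
int half(int x)
{
    ASSUME(x >= 0);   /* true by specification, so no UB is introduced */
    return x / 2;
}
```

Whether the generated code actually improves depends on the compiler and the
optimisation level; the result of half() is the same either way for valid
inputs.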

Re: bart again (UCX64)

<udkfp2$j78k$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29014&group=comp.lang.c#29014

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 15:19:29 +0200
Organization: A noiseless patient Spider
Lines: 69
Message-ID: <udkfp2$j78k$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me> <udeqdh$3dhqr$1@dont-email.me>
<udf0dl$3ehqn$1@dont-email.me> <20230908185106.539@kylheku.com>
<udi6v0$52sl$1@dont-email.me> <20230909103128.789@kylheku.com>
<udk99c$i9n7$1@dont-email.me> <udkcak$in65$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 10 Sep 2023 13:19:30 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6626f03646a3960531a6c6dcf45c9912";
logging-data="630036"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18dQW8I/3HxGVJyaeo/5xbHH1aqizkWb3k="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.13.0
Cancel-Lock: sha1:wc/AS7NbBt2pohVf/tvdEcfhlQQ=
In-Reply-To: <udkcak$in65$1@dont-email.me>
Content-Language: en-GB

by: David Brown - Sun, 10 Sep 2023 13:19 UTC

On 10/09/2023 14:20, Bart wrote:
> On 10/09/2023 12:28, David Brown wrote:
>> On 09/09/2023 20:19, Kaz Kylheku wrote:
>
>>> In summary no. There are changes we can make in programs that are
>>> objectively safer than "insert UB here"; the amount of reasoning
>>> required to convince ourselves that certain changes are okay is
>>> considerably lower. An example is exploratory programming whereby
>>> we tweak behaviors in the software which will then be captured
>>> as requirements when settled.
>>>
>>
>> I don't see "unreachable()" as saying "insert UB here". I see it as
>> telling the reader and the compiler that code flow will never reach
>> this point. And I don't put it there unless I am sure it is correct
>> to do so. The line is not reached at run-time - therefore, there can
>> be no run-time undefined behaviour from the line.
>
> Would you be happy if the compiler used 'unreachable()' as a directive
> to mean no normal function return code is necessary?
>

Yes - as compilers do. (i.e., compilers will not warn about missing
return statements if you have an unreachable() before falling off the
end of the function.)

> So that if control flow did somehow reach it, it would execute whatever
> garbage instruction bytes followed.

If you lie to your compiler, bad things happen.

>
> This seems reasonable enough:
>
>     // a comment
>         return 1;
>         unreachable();
>     }

It does not seem reasonable to me. Why would you put "unreachable()"
there? You don't write it in every unreachable situation!

>
> Then one day someone decides to improve the quality of the comments, and
> that one changes to:
>
> // a comment ending with \

Such comments ending in a line-continuation are a horrible idea, totally
independently of any other good or bad feature of the language or any
discussion on "unreachable". Line continuations are a necessary evil in
large macros - anywhere else, they are a mistake. (IMHO, of course!)
Fortunately compilers will warn about such things.

>
> Of course, the program is already working, and doesn't need retesting
> because after all only a comment has been modified.
>

Comments should not be taken so lightly. Put comments in code only when
they are useful, and be sure they are correct - they are important to
people reading the code, and thus need to be checked and maintained just
like the rest of the code. And never put something in a comment when it
can be expressed equally well in code.

I realise not everyone follows this practice for commenting, but I think
it is useful advice.

Re: bart again (UCX64)

<udkm33$k407$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29015&group=comp.lang.c#29015

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: bc...@freeuk.com (Bart)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 16:07:14 +0100
Organization: A noiseless patient Spider
Lines: 97
Message-ID: <udkm33$k407$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me> <udeqdh$3dhqr$1@dont-email.me>
<udf0dl$3ehqn$1@dont-email.me> <20230908185106.539@kylheku.com>
<udi6v0$52sl$1@dont-email.me> <20230909103128.789@kylheku.com>
<udk99c$i9n7$1@dont-email.me> <udkcak$in65$1@dont-email.me>
<udkfp2$j78k$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 10 Sep 2023 15:07:15 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="c4504a1daa8fa819485410b3a018db78";
logging-data="659463"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19NJpc9DxVvf67FJr7icH7jAGeEMSyKJa0="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.15.0
Cancel-Lock: sha1:npsyomDHuZ66Y4Tbt6iYGBmbsNw=
In-Reply-To: <udkfp2$j78k$1@dont-email.me>

by: Bart - Sun, 10 Sep 2023 15:07 UTC

On 10/09/2023 14:19, David Brown wrote:
> On 10/09/2023 14:20, Bart wrote:

>> Would you be happy if the compiler used 'unreachable()' as a directive
>> to mean no normal function return code is necessary?
>>
>
> Yes - as compilers do. (i.e., compilers will not warn about missing
> return statements if you have an unreachable() before falling off the
> end of the function.)
>
>> So that if control flow did somehow reach it, it would execute
>> whatever garbage instruction bytes followed.
>
> If you lie to your compiler, bad things happen.
>
>>
>> This seems reasonable enough:
>>
>>      // a comment
>>          return 1;
>>          unreachable();
>>      }
>
> It does not seem reasonable to me. Why would you put "unreachable()"
> there? You don't write it in every unreachable situation!

You're not looking past the example. The 'return 1;' represents a point
which you are confident will not be passed.

>>
>> Then one day someone decides to improve the quality of the comments, and
>> that one changes to:
>>
>> // a comment ending with \
>
> Such comments ending in a line-continuation are a horrible idea, totally
> independently of any other good or bad feature of the language or any
> discussion on "unreachable". Line continuations are a necessary evil in
> large macros - anywhere else, they are a mistake. (IMHO, of course!)

> Fortunately compilers will warn about such things.

There is too much reliance on compilers mitigating shortcomings in the
language.

(And still, I was able to run a crashing version of my example with very
little trouble, even with a host of -W options. What needs to be made
difficult is persuading the compiler to create the crashing version. But
that's unlikely to happen until compilers start to have a clue about
what qualifies as a hard error.

At the moment the easiest and laziest choices are (1) allow everything;
(2) allow nothing, using -Werror plus a bunch of -W options.)

Anyway, at the same time that // comments were introduced, which
necessitated an upgraded compiler, they could have introduced block macros:

#macro M(a,b,c)
.... // comment
.... // comment
#endmacro

At the moment that needs to be written in this error-prone way:

#define M(a,b,c) \
.... \
....

// comments are not allowed; you need to use /* ... */
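
A minimal sketch of what that looks like in practice (SWAP_INT is a
hypothetical macro, used only for illustration). A /* ... */ comment can sit
before the backslash, whereas a // comment would swallow the continuation and
every spliced line after it:

```c
/* Multi-line macro: every line but the last ends in a backslash, and
   only block comments can appear inside the body. */
#define SWAP_INT(a, b)                                  \
    do {                                                \
        int tmp_ = (a);  /* save the first operand */   \
        (a) = (b);                                      \
        (b) = tmp_;                                     \
    } while (0)

/* Small wrapper so the macro can be exercised as a function. */
void swap_ints(int *p, int *q)
{
    SWAP_INT(*p, *q);
}
```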

>>
>> Of course, the program is already working, and doesn't need retesting
>> because after all only a comment has been modified.
>>
>
> Comments should not be taken so lightly. Put comments in code only when
> they are useful, and be sure they are correct - they are important to
> people reading the code, and thus need to be checked and maintained just
> like the rest of the code. And never put something in a comment when it
> can be expressed equally well in code.

A line comment such as C's // should comment out everything to the end
of the line. You should never need to care what follows the //.

But C's line-splicing using \ screws that up.
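
A minimal sketch of the effect (the function name is made up for the
example). Lines are spliced before comments are removed, so the assignment
below becomes part of the comment and never executes:

```c
int value_after_comment(void)
{
    int x = 1;
    // this comment ends with a backslash, so the next line joins it \
    x = 2;
    return x;   /* still 1: the assignment above was commented out */
}
```

GCC and Clang report this as a "multi-line comment" when -Wcomment is
enabled, which -Wall does.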

> I realise not everyone follows this practice for commenting, but I think
> it is useful advice.
>
>

On 2023-09-10, David Brown <david.brown@hesbynett.no> wrote:
> On 09/09/2023 20:19, Kaz Kylheku wrote:
>> On 2023-09-09, David Brown <david.brown@hesbynett.no> wrote:
>>> On 09/09/2023 03:57, Kaz Kylheku wrote:
>>>> But then you have to be sure that this logical fact is true;
>>>> since it is an iron-clad assertion, the justification behind it
>>>> (which the compiler doesn't see nor care about) has to be iron
>>>> clad.
>>>
>>> Yes.
>>>
>>> But if I write "p++;", I also have to be absolutely sure that this is
>>> the correct action at the time. That's how programming works - I really
>>> do not see why anyone would think "unreachable();" is so special.
>>
>> The thing is, no not always. Sometimes whether we increment
>> a variable is a requirement matter, where there are choices
>> that are not fatal.
>
> And sometimes an incorrect increment /will/ be fatal.
>
> Sometimes an incorrect "unreachable();" will be fatal, sometimes it will
> not.

In the case of unreachable(), "incorrect" is a synonym of "executed".

Increments that are executed could be correct, and we can know that.

Executed unreachables are never correct; when we add them we must
know that they are not executed.

> I can certainly agree that "unreachable();" should not be used lightly -
> you don't put it in code unless you are entirely sure that this point
> will never be reached. But that applies to most things - you don't put
> "return" in your code unless you are entirely sure that at this point,
> you want to exit the function.

Being sure that you want some behavior change, and being sure that
some property of the program is true on penalty of undefined behavior,
are in a different class.

The behavior change has a risk, but that is separate from wanting
the behavior change.

There are all sorts of situations in which it is obvious that a behavior
change is okay. Not necessarily a change we might want to commit to the
product.

Say we are debugging a compiler and want to eliminate whether a certain
optimization is the culprit, like dead code elimination. Just go into
the eliminate_dead_code function and stick in an early return so it does
nothing. (That could be a run-time option, but it's missing.)

I never have to add a "return" thinking, "I have checked that this is
okay because it will never be reached" because that would be absurd;
of course I'm adding it because it is reached.

When you put unreachable() at the end of a function, you're saying
that you're so certain that it's not reached, that it's okay for the
compiler not to emit the return sequence instructions. If that spot is
reached, control will fall through to the next function in the object
file.

> I don't see "unreachable()" as saying "insert UB here".

But you must think of it that way, if it is specified that way
in its documentation. E.g. the GCC one:

Built-in Function: void __builtin_unreachable (void)

If control flow reaches the point of the __builtin_unreachable, the
program is undefined. It is useful in situations where the compiler
cannot deduce the unreachability of the code.
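
One common compromise, sketched here as an assumption rather than anything
proposed in this thread, is a wrapper that traps in debug builds and only
becomes the optimisation hint when NDEBUG is defined. UNREACHABLE() and
is_weekend() are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>

#if defined(NDEBUG) && defined(__GNUC__)
   /* Release build: reaching this point is undefined behaviour. */
#  define UNREACHABLE() __builtin_unreachable()
#else
   /* Debug build: report the location and stop. */
#  define UNREACHABLE() \
      (fprintf(stderr, "%s:%d: reached UNREACHABLE()\n", __FILE__, __LINE__), \
       abort())
#endif

/* day must be in 0..6, with 0 meaning Monday */
int is_weekend(int day)
{
    if (day >= 0 && day <= 4) return 0;
    if (day == 5 || day == 6) return 1;
    UNREACHABLE();
}
```

In a debug build a bad argument such as is_weekend(7) aborts with a message;
in a release build the same call is undefined, exactly as the GCC text above
describes.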

Re: bart again (UCX64)

<udkr54$kuso$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29017&group=comp.lang.c#29017

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 18:33:39 +0200
Organization: A noiseless patient Spider
Lines: 97
Message-ID: <udkr54$kuso$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me> <udeqdh$3dhqr$1@dont-email.me>
<udf0dl$3ehqn$1@dont-email.me> <20230908185106.539@kylheku.com>
<udi6v0$52sl$1@dont-email.me> <20230909103128.789@kylheku.com>
<udk99c$i9n7$1@dont-email.me> <udkcak$in65$1@dont-email.me>
<udkfp2$j78k$1@dont-email.me> <udkm33$k407$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 10 Sep 2023 16:33:40 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6626f03646a3960531a6c6dcf45c9912";
logging-data="687000"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1877EbOhhoZB6vqSGpK3yfDZlGhbUdeBEg="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.13.0
Cancel-Lock: sha1:Kv9BTDhiD5YmVZuAJmob78RqXds=
Content-Language: en-GB
In-Reply-To: <udkm33$k407$1@dont-email.me>

by: David Brown - Sun, 10 Sep 2023 16:33 UTC

On 10/09/2023 17:07, Bart wrote:
> On 10/09/2023 14:19, David Brown wrote:
>> On 10/09/2023 14:20, Bart wrote:
>
>>> Would you be happy if the compiler used 'unreachable()' as a
>>> directive to mean no normal function return code is necessary?
>>>
>>
>> Yes - as compilers do. (i.e., compilers will not warn about missing
>> return statements if you have an unreachable() before falling off the
>> end of the function.)
>>
>>> So that if control flow did somehow reach it, it would execute
>>> whatever garbage instruction bytes followed.
>>
>> If you lie to your compiler, bad things happen.
>>
>>>
>>> This seems reasonable enough:
>>>
>>>      // a comment
>>>          return 1;
>>>          unreachable();
>>>      }
>>
>> It does not seem reasonable to me. Why would you put "unreachable()"
>> there? You don't write it in every unreachable situation!
>
> You're not looking past the example. The 'return 1;' represents a point
> which you are confident will not be passed.

OK - that was not clear to me. In that case, adding "unreachable()"
would be fine if it adds something helpful to the code - if it makes
things clearer to human readers, and/or helps the compiler with
optimisations or static error checking. (That can include eliminating
false positives on warnings about missing returns, so that you can more
easily find the real problems.)

>
>
>>>
>>> Then one day someone decides to improve the quality of the comments, and
>>> that one changes to:
>>>
>>> // a comment ending with \
>>
>> Such comments ending in a line-continuation are a horrible idea,
>> totally independently of any other good or bad feature of the language
>> or any discussion on "unreachable". Line continuations are a
>> necessary evil in large macros - anywhere else, they are a mistake.
>> (IMHO, of course!)
>
>> Fortunately compilers will warn about such things.
>
> There is too much reliance on compilers mitigating shortcomings in the
> language.
>

There are lots of things that should not be required in a language like
C, but which can be checked in compilers (or other tools). Language
standards provide fixed requirements - a baseline, if you like.
Compilers (or other tools) can go beyond that with additional analysis
and checks that evolve and improve all the time, and that also vary
significantly from user to user.

Good tools are important. It's possible to write correct code with poor
tools - I used to write a lot of assembly code, and you don't get
anything like the tool help for assembly. But it's a poor developer who
does not take advantage of the tools available to reduce their risk of
errors, and reduce their effort during development. A C programmer
should rely on static warnings in their compiler in the same way a
carpenter should rely on a hammer rather than bashing nails in with a rock.

>
>
>>>
>>> Of course, the program is already working, and doesn't need retesting
>>> because after all only a comment has been modified.
>>>
>>
>> Comments should not be taken so lightly. Put comments in code only
>> when they are useful, and be sure they are correct - they are
>> important to people reading the code, and thus need to be checked and
>> maintained just like the rest of the code. And never put something in
>> a comment when it can be expressed equally well in code.
>
> A line comment such as C's // should comment out everything to the end
> of the line. You should never need to care what follows the //.
>
> But C's line-splicing using \ screws that up.

Allowing it avoids significant complications and re-writes of the
standard. Avoiding it is very easy, and compilers can easily warn about
it. I'd rather C did not allow line continuation backslashes at the end
of a single-line comment, but I don't lose any sleep over it.

Re: bart again (UCX64)

<udkrau$kuso$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29018&group=comp.lang.c#29018

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 18:36:46 +0200
Organization: A noiseless patient Spider
Lines: 21
Message-ID: <udkrau$kuso$2@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me> <udeqdh$3dhqr$1@dont-email.me>
<udf0dl$3ehqn$1@dont-email.me> <20230908185106.539@kylheku.com>
<udi6v0$52sl$1@dont-email.me> <20230909103128.789@kylheku.com>
<udk99c$i9n7$1@dont-email.me> <20230910075417.811@kylheku.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sun, 10 Sep 2023 16:36:47 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6626f03646a3960531a6c6dcf45c9912";
logging-data="687000"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+eCeEuUES3CIENFi69WRK+1u3ujniESs0="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.13.0
Cancel-Lock: sha1:DQHoQVpTrd8cBTlsfQH6jjSBY2c=
Content-Language: en-GB
In-Reply-To: <20230910075417.811@kylheku.com>

by: David Brown - Sun, 10 Sep 2023 16:36 UTC

On 10/09/2023 17:44, Kaz Kylheku wrote:
> On 2023-09-10, David Brown <david.brown@hesbynett.no> wrote:

>> I don't see "unreachable()" as saying "insert UB here".
>
> But you must think of it that way, if it is specified that way
> in its documentation. E.g. the GCC one:
>
> Built-in Function: void __builtin_unreachable (void)
>
> If control flow reaches the point of the __builtin_unreachable, the
> program is undefined. It is useful in situations where the compiler
> cannot deduce the unreachability of the code.
>

I know how it works (I have read most of the gcc manual, and I read that
description before first using __builtin_unreachable). But if you read
that sentence, it is a conditional - "/if/ control flow reaches the
point...". Control flow will /not/ reach the point in my code that has
"unreachable()", thus no undefined behaviour is introduced.

Re: bart again (UCX64)

<udmddn$v4e7$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29022&group=comp.lang.c#29022

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Mon, 11 Sep 2023 08:51:34 +0200
Organization: A noiseless patient Spider
Lines: 103
Message-ID: <udmddn$v4e7$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me>
<543b9acc-de99-425c-bb1d-485e9f98889dn@googlegroups.com>
<udf9jn$3fta8$1@dont-email.me>
<7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>
<udk4a5$hkp5$1@dont-email.me>
<8238a0d7-aeb2-4b26-8b1d-611caa37049an@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 11 Sep 2023 06:51:35 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f469f48e2216a857141c84288974ac33";
logging-data="1020359"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ETST7FxNGGInA/zulymiJSBFetihtXr0="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:yhoF8e7GTTy7wCLzDCPgVRSu1w0=
Content-Language: en-GB
In-Reply-To: <8238a0d7-aeb2-4b26-8b1d-611caa37049an@googlegroups.com>

by: David Brown - Mon, 11 Sep 2023 06:51 UTC

On 11/09/2023 06:36, Malcolm McLean wrote:
> On Sunday, 10 September 2023 at 11:04:05 UTC+1, David Brown wrote:
>> On 09/09/2023 01:06, Malcolm McLean wrote:
>>> Or you could say that the code is unreachable because we always filter
>>> out NaN. But how would you prove that?
>>>
>>
>> Don't give the function a NaN in the first place.
>>
>> None of my programs ever generate NaNs, and never have to deal with
>> them. I realise that some kinds of coding find NaNs, infinities, and
>> other non-real floating point values to be useful, but I don't. You are
>> only going to see them in some pretty odd and unusual calculations, so
>> take them into consideration at that point in your code.
>>
> You see this is the typical "I don't need a seatbelt because I would
> never crash, and if you want a seatbelt then that means you
> shouldn't be driving" David Brown.

Bollocks. I mean, complete and utter drivel.

Maybe this is news to you, but some programmers know what they are
doing. Some of us understand good software development practices, and
apply them appropriately.

It is /peanuts/ to write floating point code that does not have NaNs.
You simply don't get them in straightforward code. And you don't get
them in complicated code either, unless you specifically choose to write
code that lets error conditions build up because it is more efficient to
do lots of calculations and check for oddities (NaNs, infinities, etc.)
afterwards.

You are the one that is always telling people that floating point
numbers represent real quantities - distances, times, weights, whatever.
These don't have NaNs, and anything you do with them is not going to
give you a NaN.

And in /my/ code, specifically, floating point types always represent
real quantities. And NaNs, infinities, and other special cases do not
ever occur.

Would you like to explain where you get NaNs in your code, or think you
might get them, and not know perfectly well that they are a possibility
in that particular section of the code?

>
> Of course it's almost certainly a programming error to pass a NaN to
> sign().

Yes.

If you want a "sign_or_NaN" function, write that and use it when you
need it. It's a different function, with a different domain, and a
different specification.
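
To make the two specifications concrete, here is a minimal sketch. The body
of sign() is my assumption about the function under discussion, and
UNREACHABLE() and the sign_or_nan() signature are hypothetical.

```c
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

#if defined(__GNUC__)
#  define UNREACHABLE() __builtin_unreachable()
#else
#  define UNREACHABLE() abort()
#endif

/* Domain: any double except NaN. Passing a NaN is a caller bug. */
int sign(double x)
{
    if (x < 0)  return -1;
    if (x == 0) return 0;
    if (x > 0)  return 1;
    UNREACHABLE();   /* only a NaN fails all three comparisons */
}

/* Domain: any double. Returns false and leaves *out untouched for a NaN. */
bool sign_or_nan(double x, int *out)
{
    if (isnan(x))
        return false;
    *out = sign(x);
    return true;
}
```

Code that might hold a NaN calls sign_or_nan() and checks the result; code
that knows its data is real-valued calls sign() directly.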

> And of course in simple programs, you can often easily show
> that NaNs will never occur. In a program which isn't simple, you would
> have to go through every call to sign(), and prove that a NaN can't be
> passed to it. Which isn't going to be practical.

That is simply terrible programming practice. Seriously. You should
know about the data that you are dealing with at any point in the code.
NaNs represent errors - they can be a practical error management method
for some code (especially looped or vectored calculations where
throughput per clock cycle is important), but like any other error, they
should be contained to a small and manageable part of the code - not
leaked everywhere.

So you know when your floating point data could be a NaN or not - and if
it could be, you don't call sign() with it. In fact, you usually don't
do much for it except check it for validity before moving on.
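
A sketch of that containment pattern, assuming IEEE 754 arithmetic (so that
0.0/0.0 quietly produces a NaN). mean_ratio() is a hypothetical function:
the loop runs unchecked, and a single validity test at the end stops any NaN
or infinity from leaving it.

```c
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Computes the mean of num[i] / den[i] over n elements.
   Returns false, leaving *out untouched, if the result is not finite. */
bool mean_ratio(const double *num, const double *den, size_t n, double *out)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += num[i] / den[i];   /* no per-element checks in the loop */

    double mean = sum / (double)n;
    if (!isfinite(mean))          /* one check at the boundary */
        return false;

    *out = mean;
    return true;
}
```

Callers that get true back can pass the value on to functions such as sign()
without having to think about NaNs again.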

>
> NaNs may well be passed to sign(), and the unreachable() statement
> may be executed. Now of course if we realise this, we shouldn't use
> "unreachable()". We should handle it. If maybe we don't realise this,
> unreachable() should be our seatbelt.

No, what you are describing there is a drunk driver in a Ferrari driving
at 200 mph on a busy mountain road, wondering if a seatbelt is a good
idea. To get in the situation you describe, you have to make a whole
range of appallingly bad design, specification and coding decisions. This
is not the result of bugs in coding (these happen to everyone, and as
has been discussed, there are ways to help find bugs). This is the
result of bad project leadership or bad management.

>
> You need methods which are as robust as possible to the errors which
> inevitably occur.

You need methods to find and fix bugs that occur - programmers make
mistakes, and these bugs need to be found. That I agree on.

What you don't need is ways to cover up bugs and pass them on down
the line. You don't want programmers to have the attitude that it
doesn't really matter if your function is wrong and gives the wrong
answers, as the caller function expects wrong answers and will pass the
buck elsewhere. Take some responsibility for writing /correct/ code!

Re: bart again (UCX64)

<udmdte$uh79$10@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29023&group=comp.lang.c#29023

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: chris.m....@gmail.com (Chris M. Thomasson)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Sun, 10 Sep 2023 23:59:58 -0700
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <udmdte$uh79$10@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me>
<543b9acc-de99-425c-bb1d-485e9f98889dn@googlegroups.com>
<udf9jn$3fta8$1@dont-email.me>
<7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>
<udk4a5$hkp5$1@dont-email.me>
<8238a0d7-aeb2-4b26-8b1d-611caa37049an@googlegroups.com>
<udmddn$v4e7$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 11 Sep 2023 06:59:59 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="23a839a520af8f67c5f88d72ed24e588";
logging-data="1000681"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/8SSmQxtra5FopqDlULFWjUew/+kCymO0="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.15.0
Cancel-Lock: sha1:GCWFFIdxQuY2bdJ+qq0AaLlcJDQ=
In-Reply-To: <udmddn$v4e7$1@dont-email.me>
Content-Language: en-US

by: Chris M. Thomasson - Mon, 11 Sep 2023 06:59 UTC

On 9/10/2023 11:51 PM, David Brown wrote:
> On 11/09/2023 06:36, Malcolm McLean wrote:
>> On Sunday, 10 September 2023 at 11:04:05 UTC+1, David Brown wrote:
>>> On 09/09/2023 01:06, Malcolm McLean wrote:
>>>> Or you could say that the code is unreachable because we always filter
>>>> out NaN. But how would you prove that?
>>>>
>>>
>>> Don't give the function a NaN in the first place.
>>>
>>> None of my programs ever generate NaNs, and never have to deal with
>>> them. I realise that some kinds of coding find NaNs, infinities, and
>>> other non-real floating point values to be useful, but I don't. You are
>>> only going to see them in some pretty odd and unusual calculations, so
>>> take them into consideration at that point in your code.
>>>
>> You see this is the typical "I don't need a seatbelt because I would
>> never crash, and if you want a seatbelt then that means you
>> shouldn't be driving" David Brown.
>
> Bollocks. I mean, complete and utter drivel.
>
> Maybe this is news to you, but some programmers know what they are
> doing. Some of us understand good software development practices, and
> apply them appropriately.[...]

:^) C and C++ allow us to eat without putting corks on the forks...
This scene in Dirty Rotten Scoundrels still cracks me up:

https://youtu.be/SKDX-qJaJ08

Why is the cork on the fork? To prevent him from hurting himself. This
reminds me of people saying that C and/or C++ are too dangerous.

Still makes me giggle.

Re: bart again (UCX64)

<2ead0d9a-bc68-48c6-8fba-f10756997a6an@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=29024&group=comp.lang.c#29024

by: Malcolm McLean - Mon, 11 Sep 2023 08:01 UTC

On Monday, 11 September 2023 at 07:51:50 UTC+1, David Brown wrote:
> On 11/09/2023 06:36, Malcolm McLean wrote:
>
> It is /peanuts/ to write floating point code that does not have NaNs.
> You simply don't get them in straightforward code. And you don't get
> them in complicated code either, unless you specifically choose to write
> code that lets error conditions build up because it is more efficient to
> do lots of calculations and check for oddities (NaNs, infinities, etc.)
> afterwards.
>
For your code, maybe.
>
> You are the one that is always telling people that floating point
> numbers represent real quantities - distances, times, weights, whatever.
> These don't have NaNs, and anything you do with them is not going to
> give you a NaN.
>
Valid data shouldn't have NaNs. However sometimes you have missing values.
Sometimes it is incorrectly assumed that the data will not have missing values,
which is one way you can get a NaN.
Of course you'll say that you shouldn't incorrectly assume that data which can
have missing values can't have missing values. In your world.
>
> And in /my/ code, specifically, floating point types always represent
> real quantities. And NaNs, infinities, and other special cases do not
> ever occur.
>
> Would you like to explain where you get NaNs in your code, or think you
> might get them, and not know perfectly well that they are a possibility
> in that particular section of the code?
>
Degenerate geometry, basically. An example of degenerate geometry is a
quad which is actually a triangle because one of its sides is of zero length.
But there are many, many possible degeneracies. Because the calculations
are done with finite precision, there are also near-degeneracies which
cause the same problems.
It's virtually inevitable that occasionally a check for a degeneracy will be
either missed or set to too low an epsilon. And because the degeneracies are
often rare, it can take a long time for the error to actually manifest itself.

That's the main way that NaNs crop up unexpectedly in my code.

You seem to think that your own, highly supervised, highly controlled
environment which deals only with understood problems, is the way everyone
has to work.
In my case, the costs of errors aren't very high (the worst thing that can happen
is that someone's drawing gets ruined, which is embarrassing for the company,
but it's not like we'll be hit with a lawsuit for killing a baby). And a lot of the algorithms
are novel - no one has ever written a function to achieve the same transformation
before. And the criteria for acceptability are often "does it look OK?" and hard to
specify mathematically.

So basically what I want is a system that is robust to the inevitable programming
errors. It should return incorrect results rather than crashing out, because usually
the user can simply press "undo" if an operation hasn't worked, but his work is
still there. But it should also ideally tell me what has gone wrong and where.

Re: bart again (UCX64)

<udmlvm$10ckq$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29025&group=comp.lang.c#29025

by: David Brown - Mon, 11 Sep 2023 09:17 UTC

On 11/09/2023 10:01, Malcolm McLean wrote:
> On Monday, 11 September 2023 at 07:51:50 UTC+1, David Brown wrote:
>> On 11/09/2023 06:36, Malcolm McLean wrote:
>>
>> It is /peanuts/ to write floating point code that does not have NaNs.
>> You simply don't get them in straightforward code. And you don't get
>> them in complicated code either, unless you specifically choose to write
>> code that lets error conditions build up because it is more efficient to
>> do lots of calculations and check for oddities (NaNs, infinities, etc.)
>> afterwards.
>>
> For your code, maybe.

I don't believe there is anything special about my code in this respect.

>>
>> You are the one that is always telling people that floating point
>> numbers represent real quantities - distances, times, weights, whatever.
>> These don't have NaNs, and anything you do with them is not going to
>> give you a NaN.
>>
> Valid data shouldn't have NaNs.

Correct. And you should not be using potentially invalid data without
knowing that it is potentially invalid.

> However sometimes you have missing values.

That may be the case. But you don't fill in the missing values with
random bits - you deal with them appropriately (which will depend on the
application). So missing values are not a source of NaNs, unless the
application and algorithms mean it is sensible to use NaNs to fill in
the missing values. And if that's the case, you know it is the case.

> Sometimes it is incorrectly assumed that the data will not have missing values,
> which is one way you can get a NaN.

And there's the root of your problem - incorrect assumptions. I don't
know where you'd get such incorrect assumptions - perhaps from poor
specifications, or poor names for your functions, or poor testing, or
poor code reviews.

> Of course you'll say that you shouldn't incorrectly assume that data which can
> have missing values can't have missing values. In your world.

If programmers are making incorrect assumptions about the data, and the
code, then they have a far bigger problem than just an odd
"unreachable()" statement. I appreciate that some software is written
by incompetent programmers (from lack of training, lack of experience,
lack of leadership, lack of time, whatever). And you have to deal with
that in some way. But strangling good software is not the way to do it.

Data comes in two forms. Verified data, that you know is correct
(valid, in range, free from SQL injection attacks, etc.), and unverified
data that comes from unknown or untrusted outside sources. The boundary
between these can be data read from files or received from networks, or
it can be a public API for a library, or different modules of a program
written by different programming groups. Sometimes you put in temporary
boundaries and verifications, for debugging, testing and fault-finding.
Sometimes security concerns mean you have to add extra internal
boundaries and checks.

But in general, once data is verified and moved "inside" your code,
/you/ have full control and full responsibility. It doesn't go missing.
It doesn't acquire NaNs mysteriously. If functions are not well
specified, it is /your/ fault. If functions are not used according to
specification, it is /your/ fault. (Specifications don't have to be
formally written to be important.)
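
A minimal sketch of such a boundary check, assuming the unverified data arrives as text. The function name parse_finite is hypothetical, not something from this thread; it only lets a value "inside" once it is known to be a finite number.

```c
#include <errno.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical boundary check: parse untrusted text into a finite double.
   Returns true and stores the value only if the whole string is a number
   that is in range and neither NaN nor infinite. */
static bool parse_finite(const char *text, double *out)
{
    char *end;
    errno = 0;
    double v = strtod(text, &end);
    if (end == text || *end != '\0')      /* nothing parsed, or trailing junk */
        return false;
    if (errno == ERANGE || !isfinite(v))  /* out of range, "nan" or "inf" */
        return false;
    *out = v;
    return true;
}
```

Code behind this check can then state "finite values only" in its specification, and a caller that skips the check is the one at fault.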

And if you've made a mistake - such as data not being fully verified -
you have to fix that mistake. Crippling internal functions so that they
are inefficient and functionally useless is not the answer. Nor is
intentionally going out of your way to pass the errors around.

>>
>> And in /my/ code, specifically, floating point types always represent
>> real quantities. And NaNs, infinities, and other special cases do not
>> ever occur.
>>
>> Would you like to explain where you get NaNs in your code, or think you
>> might get them, and not know perfectly well that they are a possibility
>> in that particular section of the code?
>>
> Degenerate geometry, basically. An example of degenerate geometry is a
> quad which is actually a triangle because one of its sides is of zero length.

So check for it.

> But there are many, many possible degeneracies. Because the calculations
> are done with finite precision, there are also near-degeneracies which
> cause the same problems.

So do the calculations, then check for NaN or other problems, then deal
with those issues. Don't pass on crap data to the rest of the code.

And if you are getting these kinds of issues, you need to look a lot
harder at your algorithms. Making good numerical algorithms that are
stable in the face of awkward data like this is not easy. If you are not
qualified to do it, find someone who is (or more likely, find an
existing library that does the job right).

> It's virtually inevitable that occasionally a check for a degeneracy will be
> either missed or set to too low an epsilon. And because the degeneracies are
> often rare, it can take a long time for the error to actually manifest itself.
>
> That's the main way that NaNs crop up unexpectedly in my code.

If you know they can crop up, they are not unexpected - and you should
deal with them at the time.

>
> You seem to think that your own, highly supervised, highly controlled
> environment which deals only with understood problems, is the way everyone
> has to work.

Have you ever heard of "divide and conquer"? Break up your problems
until you have something small enough that it can be controlled and
understood. I appreciate that NaNs can turn up in some kinds of code -
I said that several times already. The flaw is letting them escape
thoughtlessly.

> In my case, the costs of errors aren't very high (the worst thing that can happen
> is that someone's drawing gets ruined, which is embarrassing for the company,
> but it's not like we'll be hit with a lawsuit for killing a baby). And a lot of the algorithms
> are novel - no one has ever written a function to achieve the same transformation
> before. And the criteria for acceptability are often "does it look OK?" and hard to
> specify mathematically.
>
> So basically what I want is a system that is robust to the inevitable programming
> errors. It should return incorrect results rather than crashing out, because usually
> the user can simply press "undo" if an operation hasn't worked, but his work is
> still there. But it should also ideally tell me what has gone wrong and where.
>

All I am saying is that you take responsibility for your code, and its
results. If you consider NaN an appropriate result, fair enough - deal
with it. Don't expect other code that handles finite values to be
changed to give meaningless results just because you want to pass in
meaningless arguments.

Re: bart again (UCX64)

<0cea903b-f004-4fb5-aae4-b6263e553e74n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=29026&group=comp.lang.c#29026

by: Malcolm McLean - Mon, 11 Sep 2023 10:16 UTC

On Monday, 11 September 2023 at 10:17:58 UTC+1, David Brown wrote:
> On 11/09/2023 10:01, Malcolm McLean wrote:
>
> > Degenerate geometry, basically. An example of degenerate geometry is a
> > quad which is actually a triangle because one of its sides is of zero length.
> So check for it.
>
You are just mouthing platitudes as if they were great insights, and arrogantly
presenting them as your superior wisdom.
So you seriously think that if it was as simple as "check for it" I wouldn't have
thought of that? Seriously?
..

Re: bart again (UCX64)

<udmpuu$110jq$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29027&group=comp.lang.c#29027

by: Bart - Mon, 11 Sep 2023 10:25 UTC

On 11/09/2023 07:51, David Brown wrote:
> On 11/09/2023 06:36, Malcolm McLean wrote:

>> You see this is the typical "I don't need a seatbelt because I would
>> never crash, and if you want a seatbelt then that means you
>> shouldn't be driving" David Brown.
>
> Bollocks. I mean, complete and utter drivel.

You seem to disagree.

> Maybe this is news to you, but some programmers know what they are
> doing. Some of us understand good software development practices, and
> apply them appropriately.
>
>
> It is /peanuts/ to write floating point code that does not have NaNs.
> You simply don't get them in straightforward code. And you don't get
> them in complicated code either, unless you specifically choose to write
> code that lets error conditions build up because it is more efficient to
> do lots of calculations and check for oddities (NaNs, infinities, etc.)
> afterwards.
>
> You are the one that is always telling people that floating point
> numbers represent real quantities - distances, times, weights, whatever.
> These don't have NaNs, and anything you do with them is not going to
> give you a NaN.
>
> And in /my/ code, specifically, floating point types always represent
> real quantities. And NaNs, infinities, and other special cases do not
> ever occur.
>
> Would you like to explain where you get NaNs in your code, or think you
> might get them, and not know perfectly well that they are a possibility
> in that particular section of the code?

You can get them when writing calculators or interpreters, or anything
where values are not known until runtime.

Even just reading numbers from a file. Or writing a compiler when you
need to reduce a constant expression.

Then INF or IND (I can't seem to get NAN unless IND means NAN) can occur
when performing calculations between those numbers.
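
A small sketch of how this happens with values that are only known at run time, assuming IEEE 754 doubles (the function names are made up for illustration). "IND" appears to be how Microsoft's C runtime prints the default "indefinite" quiet NaN, so it is a NaN.

```c
#include <math.h>

/* Divide two values that are only known at run time. No checks are made:
   with IEEE 754 arithmetic 1/0 gives infinity and 0/0 gives NaN. */
static double divide(double num, double den)
{
    return num / den;
}

/* Hypothetical helper: a short label for the class of a double. */
static const char *describe(double x)
{
    if (isnan(x)) return "nan";
    if (isinf(x)) return x > 0 ? "+inf" : "-inf";
    return "finite";
}
```

A calculator or constant folder that calls divide() on user input therefore has to decide what to do with the non-finite results, for example by testing them with isfinite() before going on.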

>>
>> Of course it's almost certainly a programming error to pass a NaN to
>> sign().
>
> Yes.
>
> If you want a "sign_or_NaN" function, write that and use it when you
> need it. It's a different function, with a different domain, and a
> different specification.

Another situation where you can't control the inputs is in writing
library functions, which can be called from 1000 different applications
written by different people.

Then +/- INF/NAN/IND is a possibility when writing a sign() function.

My bignum library which implements arbitrary precision floats
specifically has representations for infinity and nans. This is the
checking code needed for an add() operation:

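    /* Dispatch on the kinds of the two operands before adding. */
    switch (getbintype(a,b)) {
    case nn_types:              /* both normal: carry on to the real addition */
        break;
    case zz_types:              /* zero + zero */
        bn_setzero(c);
        return 1;
    case nz_types:              /* normal + zero: result is a copy of a */
        bn_dupl(c,a);
        return 1;
    case zn_types:              /* zero + normal: result is a copy of b */
        bn_dupl(c,b);
        return 1;
    default:                    /* anything else (infinity or nan operand) */
        bn_setnan(c);
        return 0;
    }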

(An 'n' type is a normal number; a 'z' type is just zero, which here is
a special number.)

Since this is a library, I can't predict what inputs will be supplied.

>> And of course in simple programs, you can often easily show
>> that NaNs will never occur. In a program which isn't simple, you would
>> have to go through every call to sign(), and prove that a NaN can't be
>> passed to it. Which isn't going to be practical.
>
> That is simply terrible programming practice. Seriously. You should
> know about the data that you are dealing with at any point in the code.

You just haven't been writing the right programs. You don't always have
control over all the code that is running.

Re: bart again (UCX64)

<580ff51c-dcee-4ec0-a0d2-4bf0dae0b8f3n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=29028&group=comp.lang.c#29028

by: Malcolm McLean - Mon, 11 Sep 2023 10:59 UTC

On Monday, 11 September 2023 at 11:25:49 UTC+1, Bart wrote:
>
> > That is simply terrible programming practice. Seriously. You should
> > know about the data that you are dealing with at any point in the code.
> You just haven't been writing the right programs. You don't always have
> control over all the code that is running.
>
Most code doesn't have to handle NaN specially, because the result of an
operation on a NaN is usually NaN. So if we take the length squared of
a vector which has a NaN member, we will obtain NaN, which is right.
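
A short sketch of that propagation, assuming IEEE 754 arithmetic and a hypothetical vec3 type: the function contains no NaN handling at all, yet a NaN member still produces a NaN result.

```c
#include <math.h>

typedef struct { double x, y, z; } vec3;   /* hypothetical type for illustration */

/* No special cases: any NaN component makes the whole sum NaN. */
static double length_squared(vec3 v)
{
    return v.x * v.x + v.y * v.y + v.z * v.z;
}
```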

But sign() is an exception. Despite what Ben says, the sign of NaN is NaN
(the fact that the sign bit doesn't physically disappear is irrelevant). So
to handle NaN correctly you've got to catch it, because when you branch
as the result of an operation on a NaN, you have to either take or not
take the branch.

sign() written in a non-NaN aware fashion is an example of a function which
might appear to have an unreachable closing brace, but in fact the brace is reachable,
if the function is called with NaN. It's a mistake which could easily be made.
If unreachable() behaves sensibly, then it's a seatbelt. We've made a mistake,
but we get an error message or other behaviour which is as useful as it
can be, given that the program is incorrect.
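
To make the fall-through concrete, here is a sketch assuming IEEE 754 comparison rules, under which every ordered comparison with NaN is false. The sentinel SIGN_FELL_THROUGH is invented for this example and stands in for the "unreachable" closing brace; it is not how anyone in the thread wrote sign(). Note that the C23 unreachable() macro from <stddef.h> has undefined behaviour if it is reached, so any seatbelt behaviour would have to come from something like a debug-build check instead.

```c
#include <math.h>

#define SIGN_FELL_THROUGH 2   /* hypothetical marker for the "unreachable" point */

/* NaN-unaware sign(): looks exhaustive, but all three tests fail for NaN. */
static int sign_naive(double x)
{
    if (x < 0.0)  return -1;
    if (x == 0.0) return 0;
    if (x > 0.0)  return 1;
    return SIGN_FELL_THROUGH;   /* reached only when x is NaN */
}

/* NaN-aware variant: reports through *ok whether x had a sign at all. */
static int sign_nan_aware(double x, int *ok)
{
    *ok = !isnan(x);
    if (!*ok)
        return 0;
    return (x > 0.0) - (x < 0.0);
}
```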

And then as you say, often you don't have control over the person calling the
code. So even if you check every call to sign() and somehow prove that it can
never be NaN, you can't prevent a third party calling it with NaN, and the
"unreachable" brace will be reached.

Re: bart again (UCX64)

<udn2u8$12b7a$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29029&group=comp.lang.c#29029

by: David Brown - Mon, 11 Sep 2023 12:58 UTC

On 11/09/2023 12:16, Malcolm McLean wrote:
> On Monday, 11 September 2023 at 10:17:58 UTC+1, David Brown wrote:
>> On 11/09/2023 10:01, Malcolm McLean wrote:
>>
>>> Degenerate geometry, basically. An example of degenerate geometry is a
>>> quad which is actually a triangle because one of its sides is of zero length.
>> So check for it.
>>
> You are just mouthing platitudes as if they were great insights, and arrogantly
> presenting them as your superior wisdom.
> So you seriously think that if it was as simple as "check for it" I wouldn't have
> thought of that? Seriously?
> .

From what you are saying, yes. Seriously.

From what you have said before about matrix determinants, I know you
don't have a lot of knowledge or experience in numerical methods and
algorithms. (There's nothing wrong with that - no one can be
experienced in everything.) So do I think you might have a geometry
algorithm in your code that is numerically unstable? Absolutely -
certainly if your application can tolerate some visual glitches (in such
cases, an occasionally unstable algorithm might be better than a more
complex stable algorithm). And do I think you are lax about checking?
Yes - you've said as much.

I know different kinds of software require different levels of
development quality. You need to put a lot more effort into ensuring
the correctness of code that will be in a card buried in an oil well head
at the bottom of the ocean than you need for a computer game that can be
easily updated online. But I like to believe that most programmers are
interested in writing correct code.

So if your data might be invalid, you check it. If your data might not
fit within the domain of the function you want to call, you check it.
If your function might not be able to return correct results, your
function needs to return error indicators in some way, so that the
caller function knows what it is getting. And then the caller function
handles the errors.

What you don't do is leave the problem data wandering around your
program, ready to cause more trouble. You don't cripple the performance
by writing all functions so that they handle NaNs in some predictable
way - you make sure they are not called with NaNs. You specify your
functions - including whether they can accept NaNs and what range of
valid data they accept, and what range of valid data and/or NaNs they
return. That all makes your code easier to write, lets you have more
efficient functions, keeps it cleaner, makes it easier to analyse (so
that you know when your data is valid), and makes it easier to debug.
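
One way to read "specify your functions" is as two separate functions with two separate domains, along the lines of the "sign_or_NaN" suggestion earlier in the thread. This is only a sketch: the names are illustrative, and using assert() for the precondition is an assumption about how one might choose to check it in debug builds.

```c
#include <assert.h>
#include <math.h>

/* Domain: finite x only. The precondition is checked in debug builds
   and compiled out when NDEBUG is defined. */
static int sign_finite(double x)
{
    assert(isfinite(x));
    return (x > 0.0) - (x < 0.0);
}

/* Domain: any double. NaN in, NaN out; otherwise -1.0, 0.0 or 1.0. */
static double sign_or_nan(double x)
{
    if (isnan(x))
        return x;
    return (double)((x > 0.0) - (x < 0.0));
}
```

Callers that already hold verified data use sign_finite() and pay for no NaN handling; code sitting next to a source of NaNs calls sign_or_nan() and deals with the NaN result there.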

No one said this was simple. Perhaps "check the data" is hard in your
case. Sometimes programming /is/ hard.

Re: bart again (UCX64)

<udn68a$12tcm$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29030&group=comp.lang.c#29030

by: Bart - Mon, 11 Sep 2023 13:55 UTC

On 11/09/2023 13:58, David Brown wrote:
> On 11/09/2023 12:16, Malcolm McLean wrote:
>> On Monday, 11 September 2023 at 10:17:58 UTC+1, David Brown wrote:
>>> On 11/09/2023 10:01, Malcolm McLean wrote:
>>>
> >>>> Degenerate geometry, basically. An example of degenerate geometry is a
>>>> quad which is actually a triangle because one of its sides is of
>>>> zero length.
>>> So check for it.
>>>
>> You are just mouthing platitudes as if they were great insights, and
>> arrogantly
>> presenting them as your superior wisdom.
>> So you seriously think that if it was as simple as "check for it" I
>> wouldn't have
>> thought of that? Seriously?
>> .
>
> From what you are saying, yes. Seriously.
>
> From what you have said before about matrix determinants, I know you
> don't have a lot of knowledge or experience in numerical methods and
> algorithms. (There's nothing wrong with that - no one can be
> experienced in everything.) So do I think you might have a geometry
> algorithm in your code that is numerically unstable? Absolutely -
> certainly if your application can tolerate some visual glitches (in such
> cases, an occasionally unstable algorithm might be better than a more
> complex stable algorithm). And do I think you are lax about checking?
> Yes - you've said as much.
>
> I know different kinds of software require different levels of
> development quality. You need to put a lot more effort into ensuring
> the correctness of code that will be in a card buried in an oil well head
> at the bottom of the ocean than you need for a computer game that can be
> easily updated online. But I like to believe that most programmers are
> interested in writing correct code.
>
> So if your data might be invalid, you check it. If your data might not
> fit within the domain of the function you want to call, you check it. If
> your function might not be able to return correct results, your function
> needs to return error indicators in some way, so that the caller
> function knows what it is getting. And then the caller function handles
> the errors.
>
> What you don't do is leave the problem data wandering around your
> program, ready to cause more trouble. You don't cripple the performance
> by writing all functions so that they handle NaNs in some predictable
> way - you make sure they are not called with NaNs. You specify your
> functions - including whether they can accept NaNs and what range of
> valid data they accept, and what range of valid data and/or NaNs they
> return. That all makes your code easier to write, lets you have more
> efficient functions, keeps it cleaner, makes it easier to analyse (so
> that you know when your data is valid), and makes it easier to debug.

It sounds like you wouldn't care if NaNs didn't exist and weren't
supported in hardware.

(I don't care much either, when using IEEE 754.)

But it does make you wonder why they bothered supporting NaNs at all, if
all you have to do is take more care in writing your programs.

> No one said this was simple. Perhaps "check the data" is hard in your
> case. Sometimes programming /is/ hard.

And maybe NaNs were added to make it easier. Given a program like this:

#include <stdio.h>
#include <math.h>

int main(void) {
    double x=-9.0;

    printf("%f\n", sqrt(x));
    printf("%d\n", (int)sqrt(x));
    printf("%X\n", (int)sqrt(x));
}

what behaviour would you expect or prefer?

The output I got from on-line Clang was first that 'sqrt' was not
defined. That's due to that ******* stupid -lm thing (FFS just fix it
instead of requiring millions to piss about with it).

With that out of the way, it produced:

-nan
-2147483648
80000000

(Not sure why it's -nan rather than nan)

Other compilers are similar; some show -1.#IND00 instead of 'nan'.
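
(Editorial sketch, not part of the original post.) The second and third
printf calls convert a NaN to int, a conversion the C standard leaves
undefined when the value cannot be represented, so the printed
-2147483648 is an artefact of the platform rather than a guarantee. One
way to make the behaviour predictable is to test the value before
converting. The helper name double_to_int is hypothetical, and the range
test assumes int is narrow enough that INT_MIN - 1 and INT_MAX + 1 are
exactly representable as double (true for 32-bit int).

```c
#include <limits.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: store the truncated value in *out and return true
 * only when 'value' is neither NaN nor outside the range of int. */
bool double_to_int(double value, int *out)
{
    if (isnan(value)) {
        return false;               /* no int can represent a NaN */
    }
    /* Conversion truncates toward zero, so anything strictly between
     * INT_MIN - 1 and INT_MAX + 1 is safe to convert. */
    if (value <= (double)INT_MIN - 1.0 || value >= (double)INT_MAX + 1.0) {
        return false;               /* also rejects +/- infinity */
    }
    *out = (int)value;
    return true;
}
```

A caller would pass sqrt(x) through this helper and print an error
message when it returns false, instead of casting directly.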

Let me guess: you'd rather it was unpredictable, or went haywire, or
just don't care because none of your programs would ever execute such
code. Since your software is NEVER buggy, not even experimental code
during development or throwaway programs of no consequence.

Re: bart again (UCX64)

<20230911064035.380@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=29031&group=comp.lang.c#29031

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 864-117-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Mon, 11 Sep 2023 13:57:03 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 106
Message-ID: <20230911064035.380@kylheku.com>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com>
<GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me>
<543b9acc-de99-425c-bb1d-485e9f98889dn@googlegroups.com>
<udf9jn$3fta8$1@dont-email.me>
<7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>
<udk4a5$hkp5$1@dont-email.me>
<8238a0d7-aeb2-4b26-8b1d-611caa37049an@googlegroups.com>
<udmddn$v4e7$1@dont-email.me> <udmpuu$110jq$1@dont-email.me>
<580ff51c-dcee-4ec0-a0d2-4bf0dae0b8f3n@googlegroups.com>
Injection-Date: Mon, 11 Sep 2023 13:57:03 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6a2d7add99c0dd7f4a95f93722b49487";
logging-data="1144739"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/OBaJMr0Kv/3FnSwf1Z5RMB3TNUQ4yv8s="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:dyIc88pzt7LM/woM8SH3YTCtQ+Y=

by: Kaz Kylheku - Mon, 11 Sep 2023 13:57 UTC

On 2023-09-11, Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
> On Monday, 11 September 2023 at 11:25:49 UTC+1, Bart wrote:
>>
>> > That is simply terrible programming practice. Seriously. You should
>> > know about the data that you are dealing with at any point in the code.
>> You just haven't been writing the right programs. You don't always have
>> control over all the code that is running.
>>
> Most code doesn't have to handle NaN specially, because the result of an
> operation on a NaN is usually NaN. So if we take the length squared of
> a vector which has a NaN member, we will obtain NaN, which is right.
>
> But sign() is an exception. Despite what Ben says, the sign of NaN is NaN
> (the fact that the sign bit doesn't physically disappear is irrelevant). So
> to handle NaN correctly you've got to catch it, because when you branch
> as the result of an operation on a NaN, you have to either take or not
> take the branch.

NaN is a device which lets error propagate across calculations so that
we can efficiently catch it at some convenient point, rather than bog
down every calculation step with checks that complicate/bloat the source
code and slow down the object code.

But you do have to check for NaN somewhere, otherwise the program
will report it as its output.

Yes; if you're converting the intermediate result to some other
representation which itself doesn't have anything analogous to NaN,
then you might want to check it at that point.

For instance if we convert an "a < b" comparison into two control flows
which no longer refer to a and b, and those operands could be NaN,
we should check for NaN. Otherwise the error could be swept under
the rug.

If all branches of the conditional use both input values to
continue the calculation, then you may be able to get away without
handling NaN at that point.

// ternary equivalent:

if (a < b) { return a + c; } else { return b + c; } // bad

Here the a < b test conceals a NaN: since neither branch refers to
both operands, a NaN in one of them could be swept under the rug.

if (a < b) { return b - a; } else { return a - b; }

Here we are okay.

It is the same with the proposed sign function.

> sign() written in a non-NaN aware fashion is an example of a function which
> might appear to have an unreachable closing brace, but in fact the brace is reachable,
> if the function is called with NaN.

If the function arbitrarily returned 0, it could work. The caller would
have to check the *operands* for NaN in the situations where that is
required, like when the result of sign() is used to select code paths,
not all of which pull all operands into the calculation.

> It's a mistake which could easily be made.
> If unreachable() behaves sensibly, then it's a seatbelt.

Nothing that invokes undefined behavior when the accident occurs
can be called a seatbelt.

You seem to be mistaking it for abort() or similar.

The unreachable() we have been discussing tells the compiler that the
code won't be reached. You've proven that somehow and the compiler can
optimize accordingly.

Given

#include <stddef.h>   /* C23: unreachable() */
#include <stdio.h>

void a(void)
{
    unreachable();
}

void b(void)
{
    puts("wut?");
}

It is possible for a call to a() to cause the puts to be called,
producing the "wut?" output.

That could happen if the functions are emitted with only no-op
instructions between them and, because of the unreachable assertion,
the compiler didn't emit the return sequence for a, so that a
effectively tail calls into b by falling through.

> We've made a mistake,
> but we get an error message or other behaviour which is as useful as it
> can be, given that the program is incorrect.

If you want that, you have to reach for something other than
unreachable assertions.
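
(Editorial sketch, not part of the original post.) A checked alternative
in the spirit of abort() might look like the following. The macro name
CHECKED_UNREACHABLE and the function name sign_checked are hypothetical;
the point is only that the failure path has defined behaviour (a message
followed by abort()) rather than undefined behaviour.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical "seatbelt": report where control arrived, then abort.
 * Unlike unreachable(), reaching this is well defined. */
#define CHECKED_UNREACHABLE()                                        \
    do {                                                             \
        fprintf(stderr, "%s:%d: reached code assumed unreachable\n", \
                __FILE__, __LINE__);                                 \
        abort();                                                     \
    } while (0)

int sign_checked(double x)
{
    if (x < 0) return -1;
    if (x == 0) return 0;
    if (x > 0) return 1;
    CHECKED_UNREACHABLE();   /* only a NaN argument gets here */
}
```

The cost is a call that the compiler cannot optimise away, which is the
trade-off against the unreachable assertion discussed above.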

Re: bart again (UCX64)

<udn6vo$1318q$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29032&group=comp.lang.c#29032

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Mon, 11 Sep 2023 16:07:51 +0200
Organization: A noiseless patient Spider
Lines: 186
Message-ID: <udn6vo$1318q$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me>
<543b9acc-de99-425c-bb1d-485e9f98889dn@googlegroups.com>
<udf9jn$3fta8$1@dont-email.me>
<7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>
<udk4a5$hkp5$1@dont-email.me>
<8238a0d7-aeb2-4b26-8b1d-611caa37049an@googlegroups.com>
<udmddn$v4e7$1@dont-email.me> <udmpuu$110jq$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 11 Sep 2023 14:07:52 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f469f48e2216a857141c84288974ac33";
logging-data="1148186"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19bWNUNiKtBdMzU5thYhUvVGcHZiqiipKQ="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:QhTgeWstNG5qRdr1ftGG+LSkuXc=
In-Reply-To: <udmpuu$110jq$1@dont-email.me>
Content-Language: en-GB

by: David Brown - Mon, 11 Sep 2023 14:07 UTC

On 11/09/2023 12:25, Bart wrote:
> On 11/09/2023 07:51, David Brown wrote:
>> On 11/09/2023 06:36, Malcolm McLean wrote:
>
>>> You see this is the typical "I don't need a seatbelt because I would
>>> never crash, and if you want a seatbelt then that means you
>>> shouldn't be driving" David Brown.
>>
>> Bollocks. I mean, complete and utter drivel.
>
> You seem to disagree.

:-)

>
>
>> Maybe this is news to you, but some programmers know what they are
>> doing. Some of us understand good software development practices, and
>> apply them appropriately.
>>
>>
>> It is /peanuts/ to write floating point code that does not have NaNs.
>> You simply don't get them in straightforward code. And you don't get
>> them in complicated code either, unless you specifically choose to
>> write code that lets error conditions build up because it is more
>> efficient to do lots of calculations and check for oddities (NaNs,
>> infinities, etc.) afterwards.
>>
>> You are the one that is always telling people that floating point
>> numbers represent real quantities - distances, times, weights,
>> whatever. These don't have NaNs, and anything you do with them is
>> not going to give you a NaN.
>>
>> And in /my/ code, specifically, floating point types always represent
>> real quantities. And NaNs, infinities, and other special cases do not
>> ever occur.
>>
>> Would you like to explain where you get NaNs in your code, or think
>> you might get them, and not know perfectly well that they are a
>> possibility in that particular section of the code?
>
> You can get them when writing calculators or interpreters, or anything
> where values are not known until runtime.
>
> Even just reading numbers from a file. Or writing a compiler when you
> need to reduce a constant expression.
>

Sure. These are all data from external sources, and need to be verified
and validated. And sometimes there is a possibility of a problem even
though the data is validated - that's okay, as long as you take it into
account. Floating point NaNs let you check for validity after the
calculation, and deal with problems appropriately (like printing an
error message).

And in other cases - such as the code I usually work with - there is no
possibility of NaNs getting into my code. I have no sources of external
data that could produce a NaN, and no external data that comes in
without checking. (I guess I am lucky there.)

> Then INF or IND (I can't seem to get NAN unless IND means NAN) can occur
> when performing calculations between those numbers.
>

Yes, if the numbers can be big enough (or otherwise poorly matched to
the calculations).

>>>
>>> Of course it's almost certainly a programming error to pass a NaN to
>>> sign().
>>
>> Yes.
>>
>> If you want a "sign_or_NaN" function, write that and use it when you
>> need it. It's a different function, with a different domain, and a
>> different specification.
>
> Another situation where you can't control the inputs is in writing
> library functions, which can be called from 1000 different applications
> written by different people.

Yes. Then you have perhaps three options:

1. Document the function, and insist that people use it correctly.
Those that don't, have themselves to blame - garbage in, garbage out.
This can be the right choice for small utility or calculation functions
or things that have to run efficiently, but would be a poor option for
something with security implications!

2. Document the requirements, then sanitize the inputs for safety and do
your best (which might mean doing nothing at all when the inputs are
invalid). Again, it's garbage in gives garbage out.

3. Document the function, providing some kind of clear error feedback in
the case of invalid inputs.

(Notice that documenting the function and its specification is a common
theme?)
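
(Editorial sketch, not part of the original post.) The three options
could look like this for a single small operation, a ratio of two
doubles. All three function names are hypothetical, and the choice of
0.0 as the fallback in the second variant is an arbitrary assumption.

```c
#include <math.h>
#include <stdbool.h>

/* Option 1: documented precondition only.
 * Requires: num and den are finite, den != 0. Garbage in, garbage out. */
double ratio_unchecked(double num, double den)
{
    return num / den;
}

/* Option 2: sanitize the inputs and do the best available.
 * Invalid inputs give a documented fallback of 0.0. */
double ratio_sanitized(double num, double den)
{
    if (!isfinite(num) || !isfinite(den) || den == 0.0) {
        return 0.0;
    }
    return num / den;
}

/* Option 3: explicit error feedback.
 * Returns false and leaves *out untouched when the inputs are invalid. */
bool ratio_checked(double num, double den, double *out)
{
    if (!isfinite(num) || !isfinite(den) || den == 0.0) {
        return false;
    }
    *out = num / den;
    return true;
}
```

In each case the comment above the function is the specification, which
is the common theme noted above.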

Sometimes checking or sanitizing the inputs means trying to use them and
identifying errors, such as NaNs, errno, or failure codes from other
functions.

>
> Then +/- INF/NAN/IND is a possibility when writing a sign() function.
>

I disagree. Don't call the function when the input might be a NaN. I'd
consider "sign" to be a "type 1" function from the list above.

Remember, NaNs don't have a sign - they are unordered with respect to
0. So it makes no sense to call the "sign" function on a NaN - doing so
is a user error. There is no possibility of getting a correct answer
when the function is specified to return -1, 0 or 1, and it is given a
NaN. In computer science, there is a name sometimes given to a function
that will produce the correct output when given incorrect input -
"magic". It's great in specifications, but can't be implemented.
People who expect a library function to perform magic are usually
disappointed.

It's fine to specify a different function that has a different
specification, and has a documented output for NaNs. It would be a
different function, with at least 4 possible output values - perhaps
given via an enumeration, or with an additional boolean "valid" flag, or
by setting errno.
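
(Editorial sketch, not part of the original post.) The enumeration
variant of such a function, with four possible outputs, might be written
as below. The type name sign_class, its enumerators, and the function
name classify_sign are hypothetical.

```c
#include <math.h>

/* Four documented outcomes instead of three. */
enum sign_class {
    SIGN_NEGATIVE = -1,
    SIGN_ZERO = 0,
    SIGN_POSITIVE = 1,
    SIGN_NOT_A_NUMBER = 2
};

/* Defined for every double, including NaN and the infinities. */
enum sign_class classify_sign(double x)
{
    if (isnan(x)) return SIGN_NOT_A_NUMBER;
    if (x < 0) return SIGN_NEGATIVE;
    if (x > 0) return SIGN_POSITIVE;
    return SIGN_ZERO;
}
```

Because the NaN case has its own enumerator, a switch over the result
can handle it explicitly, and many compilers can warn when a case is
missing.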

> My bignum library which implements arbitrary precision floats
> specifically has representations for infinity and nans. This is the
> checking code needed for an add() operation:
>
>     switch (getbintype(a,b)) {
>     case nn_types:
>         break;
>     case zz_types:
>         bn_setzero(c);
>         return 1;
>     case nz_types:
>         bn_dupl(c,a);
>         return 1;
>     case zn_types:
>         bn_dupl(c,b);
>         return 1;
>     default:
>         bn_setnan(c);
>         return 0;
>     }
>
> (An 'n' type is a normal number; a 'z' type is just zero, which here is
> a special number.)
>
> Since this is a library, I can't predict what inputs will be supplied.

Agreed.

There had been no suggestion that "sign" was a library function when it
was described, or that it would be at the boundary of code sections.

>
>
>>> And of course in simple programs, you can often easily show
>>> that NaNs will never occur. In a program which isn't simple, you would
>>> have to go through every call to sign(), and prove that a NaN can't be
>>> passed to it. Which isn't going to be practical.
>>
>> That is simply terrible programming practice. Seriously. You should
>> know about the data that you are dealing with at any point in the code.
>
> You just haven't been writing the right programs. You don't always have
> control over all the code that is running.
>

Of course you do.

Your compiler has no control over what people type - maybe the source
code will have "int x = 1234567890123456789012345678901234567890;". So
when you are reading data from the source file, it's a string - you know
that. You check it for validity as a number - now you know it is a
string of decimal digits. Your code to turn that into an integer
constant will look at the length and value - now you know it is too
long. And so on.

At each stage, you know about the data you have.
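
(Editorial sketch, not part of the original post.) The stages described
above (string, then string of decimal digits, then a value that may be
too long) can be written out with the standard strtoull function. The
names literal_status and parse_decimal_literal are hypothetical, and
using unsigned long long as the widest constant type is an assumption
made for the example.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

enum literal_status { LIT_OK, LIT_NOT_DIGITS, LIT_TOO_LARGE };

/* Validate 'text' step by step; *out is written only on LIT_OK. */
enum literal_status parse_decimal_literal(const char *text,
                                          unsigned long long *out)
{
    /* Stage 1: it must be a non-empty string of decimal digits. */
    if (*text == '\0') {
        return LIT_NOT_DIGITS;
    }
    for (const char *p = text; *p != '\0'; p++) {
        if (!isdigit((unsigned char)*p)) {
            return LIT_NOT_DIGITS;
        }
    }

    /* Stage 2: it must fit; strtoull sets errno to ERANGE on overflow. */
    errno = 0;
    unsigned long long value = strtoull(text, NULL, 10);
    if (errno == ERANGE) {
        return LIT_TOO_LARGE;
    }

    *out = value;
    return LIT_OK;
}
```

For the 40-digit constant in the example above, the function returns
LIT_TOO_LARGE, at which point a compiler would issue its diagnostic.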

Re: bart again (UCX64)

<20230911070056.703@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=29033&group=comp.lang.c#29033

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: 864-117-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Mon, 11 Sep 2023 14:13:19 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <20230911070056.703@kylheku.com>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com>
<GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me>
<543b9acc-de99-425c-bb1d-485e9f98889dn@googlegroups.com>
<udf9jn$3fta8$1@dont-email.me>
<7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>
<udk4a5$hkp5$1@dont-email.me>
<8238a0d7-aeb2-4b26-8b1d-611caa37049an@googlegroups.com>
<udmddn$v4e7$1@dont-email.me>
<2ead0d9a-bc68-48c6-8fba-f10756997a6an@googlegroups.com>
<udmlvm$10ckq$1@dont-email.me>
<0cea903b-f004-4fb5-aae4-b6263e553e74n@googlegroups.com>
<udn2u8$12b7a$1@dont-email.me>
<120791af-c99c-4fd8-8017-016f19038d8fn@googlegroups.com>
Injection-Date: Mon, 11 Sep 2023 14:13:19 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="6a2d7add99c0dd7f4a95f93722b49487";
logging-data="1149913"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18r/V+48cmi/y2LQO2YBJrJioYQwDG8w0I="
User-Agent: slrn/pre1.0.4-9 (Linux)
Cancel-Lock: sha1:UyPVJYvKsEINiMNKFHFSENY3JOk=

by: Kaz Kylheku - Mon, 11 Sep 2023 14:13 UTC

On 2023-09-11, Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
> Now this function
>
> int sign(double x)
> {
> if (x < 0) return -1;
> if (x == 0) return 0;
> if (x > 0) return 1;
> }
>
> is an exception. Control will fall off the end if we pass NaN. And because we've returned the
> sign as an int, we have no NaN representation. So it's not correct for NaN.

We could write it like this:

int sign(double x)
{
    if (x < 0) return -1;
    if (x > 0) return 1;
    return 0;   /* x is zero or NaN */
}

Then in the reference manual we add the admonishment: "If the input is
NaN, then sign returns 0; a NaN input is not converted to a NaN output,
since the return type has no such representation. The application should
check the argument for NaN in situations where that is necessary."

The zero return is correct for NaN because we defined it that way;
it means that the input is neither positive nor negative, not
that it's a number.

This is the same as defining pow(0, 0) to be 1, etc.
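
(Editorial sketch, not part of the original post.) Under that
specification the NaN check moves to the caller, in exactly those places
where the result of sign() selects a code path that no longer involves
the operand. The caller describe below is hypothetical.

```c
#include <math.h>

/* Specified to return 0 for NaN as well as for zero. */
int sign(double x)
{
    if (x < 0) return -1;
    if (x > 0) return 1;
    return 0;
}

/* Hypothetical caller: the branches below discard x, so a NaN would be
 * reported as "zero" unless it is checked for first. */
const char *describe(double x)
{
    if (isnan(x)) {
        return "not a number";
    }
    switch (sign(x)) {
    case -1: return "negative";
    case 1:  return "positive";
    default: return "zero";
    }
}
```

A caller that only multiplies another value by sign(x) * x, say, would
not need the check, since the NaN still flows into the result.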

Re: bart again (UCX64)

<udn7vg$136e1$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=29034&group=comp.lang.c#29034

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: bart again (UCX64)
Date: Mon, 11 Sep 2023 16:24:47 +0200
Organization: A noiseless patient Spider
Lines: 149
Message-ID: <udn7vg$136e1$1@dont-email.me>
References: <1262755563@f172.n1.z21.fsxnet>
<A5pKM.1180698$TPw2.694138@fx17.iad>
<87h6o51rts.fsf@nosuchdomain.example.com> <GWwKM.320358$uLJb.75975@fx41.iad>
<87y1hhmehf.fsf@nosuchdomain.example.com> <20230907235623.619@kylheku.com>
<udeksu$3crce$1@dont-email.me>
<4945f15a-22dd-431e-a732-81ed36615f27n@googlegroups.com>
<udepd1$3dal1$2@dont-email.me>
<543b9acc-de99-425c-bb1d-485e9f98889dn@googlegroups.com>
<udf9jn$3fta8$1@dont-email.me>
<7cf7b6bd-39d9-4e22-baee-e128cfb2f214n@googlegroups.com>
<udk4a5$hkp5$1@dont-email.me>
<8238a0d7-aeb2-4b26-8b1d-611caa37049an@googlegroups.com>
<udmddn$v4e7$1@dont-email.me>
<2ead0d9a-bc68-48c6-8fba-f10756997a6an@googlegroups.com>
<udmlvm$10ckq$1@dont-email.me>
<0cea903b-f004-4fb5-aae4-b6263e553e74n@googlegroups.com>
<udn2u8$12b7a$1@dont-email.me>
<120791af-c99c-4fd8-8017-016f19038d8fn@googlegroups.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 11 Sep 2023 14:24:48 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="f469f48e2216a857141c84288974ac33";
logging-data="1153473"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+buG6w29i39KvwjW7UCTN2CvjdAzARyxM="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101
Thunderbird/102.11.0
Cancel-Lock: sha1:fiErg9BIACjkFrAYaKuOyR+qSl0=
Content-Language: en-GB
In-Reply-To: <120791af-c99c-4fd8-8017-016f19038d8fn@googlegroups.com>

by: David Brown - Mon, 11 Sep 2023 14:24 UTC

On 11/09/2023 15:35, Malcolm McLean wrote:
> On Monday, 11 September 2023 at 13:59:03 UTC+1, David Brown wrote:
>> On 11/09/2023 12:16, Malcolm McLean wrote:
>>> On Monday, 11 September 2023 at 10:17:58 UTC+1, David Brown wrote:
>>>> On 11/09/2023 10:01, Malcolm McLean wrote:
>>>>
> >>>>> Degenerate geometry, basically. An example of degenerate geometry is a
>>>>> quad which is actually a triangle because one of its sides is of zero length.
>>>> So check for it.
>>>>
>>> You are just mouthing platitudes as if they were great insights, and arrogantly
>>> presenting them as your superior wisdom.
>>> So you seriously think that if it was as simple as "check for it" I wouldn't have
>>> thought of that? Seriously?
>>> .
>> From what you are saying, yes. Seriously.
>>
>> From what you have said before about matrix determinants, I know you
>> don't have a lot of knowledge or experience in numerical methods and
>> algorithms. (There's nothing wrong with that - no one can be
>> experienced in everything.) So do I think you might have a geometry
>> algorithm in your code that is numerically unstable? Absolutely -
>> certainly if your application can tolerate some visual glitches (in such
>> cases, an occasionally unstable algorithm might be better than a more
>> complex stable algorithm). And do I think you are lax about checking?
>> Yes - you've said as much.
>>
>> I know different kinds of software require different levels of
>> development quality. You need to put a lot more effort into ensuring
>> the correctness of code that will be in a card buried in an oil well head
>> at the bottom of the ocean than you need for a computer game that can be
>> easily updated online. But I like to believe that most programmers are
>> interested in writing correct code.
>>
>> So if your data might be invalid, you check it. If your data might not
>> fit within the domain of the function you want to call, you check it.
>> If your function might not be able to return correct results, your
>> function needs to return error indicators in some way, so that the
>> caller function knows what it is getting. And then the caller function
>> handles the errors.
>>
>> What you don't do is leave the problem data wandering around your
>> program, ready to cause more trouble. You don't cripple the performance
>> by writing all functions so that they handle NaNs in some predictable
>> way - you make sure they are not called with NaNs. You specify your
>> functions - including whether they can accept NaNs and what range of
>> valid data they accept, and what range of valid data and/or NaNs they
>> return. That all makes your code easier to write, lets you have more
>> efficient functions, keeps it cleaner, makes it easier to analyse (so
>> that you know when your data is valid), and makes it easier to debug.
>>
>> No one said this was simple. Perhaps "check the data" is hard in your
>> case. Sometimes programming /is/ hard.
>>
> No. But the point is that this function
>
> double lengthSQ(Vector *v)
> {
> return v->x *v->x + v->y *v->y;
> }
>
> will return NaN if v has a NaN member. And it might well be the case that
> the person who wrote it is completely ignorant that IEEE floating points have
> a NaN representation.
> And that's typical. Most operations on NaN return NaN, so most functions are
> correct within the domain of NaN.

Okay. So that function is "NaN propagating" - that's common for
functions that operate on real numbers and return real numbers.
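
(Editorial sketch, not part of the original post.) Propagation means the
check can be deferred to the point where the result is consumed. The
Vector layout is assumed from the quoted function, and the wrapper
checked_lengthSQ is hypothetical.

```c
#include <math.h>
#include <stdbool.h>

typedef struct { double x, y; } Vector;   /* layout assumed from the quote */

/* NaN propagating: a NaN in either member gives a NaN result. */
double lengthSQ(const Vector *v)
{
    return v->x * v->x + v->y * v->y;
}

/* Hypothetical boundary check: one test after the calculation instead
 * of one test per member before it. */
bool checked_lengthSQ(const Vector *v, double *out)
{
    double sq = lengthSQ(v);
    if (isnan(sq)) {
        return false;
    }
    *out = sq;
    return true;
}
```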

>
> Now this function
>
> int sign(double x)
> {
> if (x < 0) return -1;
> if (x == 0) return 0;
> if (x > 0) return 1;
> }
>
> is an exception. Control will fall off the end if we pass NaN. And because we've returned the
> sign as an int, we have no NaN representation. So it's not correct for NaN. And that's a bit
> unusual.

It is not at all unusual. This is not a calculation function, it is a
classification function, operating on finite numbers and returning an
integer.

If that's not what you want, use or write a different function.

>
> Now if we make it double valued and "return x" instead of falling off the brace, we won't "cripple
> performance". But there might be a slight performance impact.

The impact on the performance is dependent on the way the code is used -
a function like this will typically be inlined. "Cripple" may be too
strong a word - it may also be too weak.

> Is it worthwhile? Hard to
> say.

Agreed.

But it won't give you the sign of the argument when passed a NaN, and so
the name would be wrong.

> But there's definitely a case for it. It just depends what the consequences of getting things
> incorrect are.

I am usually much more interested in encouraging correct coding than
worrying about the consequences of incorrect coding. Sometimes the
consequences of incorrect coding need to be the focus, such as in
security-critical code. But most often, you are better off spending the
effort getting the code right than minimising the consequences of
getting it wrong.

And code that does not do what it says, or does not say what it does, is
encouraging incorrect coding.

> And, as you say, probably if you are passing NaN to sign() that indicates a
> programming error upstream anyway.

Yes. So concentrate on that, not on the function that is correct.

>
> But checking for every possible degeneracy that could cause a NaN in geometrical algorithms
> is hard, and sometimes unacceptably expensive. And degenerate doesn't always mean "invalid".
> For instance a quad with a zero-length side is still clockwise or anti-clockwise. However you
> can't tell which by taking a wedge product of two adjacent sides, if one of them is the zero
> length side.
>

If that is an issue, then perhaps you have the wrong data structure.
Maybe you should store additional information (such as the orientation)
along with the points. Maybe you should be using different formats for
your points, such as quaternions instead of three real dimensions.
Maybe you should take care to order your quads so that any zero-length
sides come last, or split one side in two so that your sides all have
non-zero length. Or change your orientation function's algorithm to
find the most numerically stable choice of sides and use that for
determining the orientation.

There are many possibilities, and the "best" will of course be a
trade-off, and of course depend on how you are using the data.
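
(Editorial sketch, not part of the original post.) One reading of the
last suggestion, choosing the most stable pair of sides, is to take the
cross product of the adjacent sides at every corner and keep the one
with the largest magnitude. The Point type and the function name
quad_orientation are hypothetical, the quad is assumed to be convex, and
a positive result means anti-clockwise in a y-up coordinate system.

```c
typedef struct { double x, y; } Point;

/* Returns 1 for anti-clockwise, -1 for clockwise, and 0 when every
 * corner is degenerate (or an input is NaN). Assumes a convex quad. */
int quad_orientation(const Point p[4])
{
    double best = 0.0;       /* cross product with the largest magnitude */
    double best_mag = 0.0;

    for (int i = 0; i < 4; i++) {
        const Point *prev = &p[(i + 3) % 4];
        const Point *cur  = &p[i];
        const Point *next = &p[(i + 1) % 4];

        /* Cross product of the incoming and outgoing sides at corner i. */
        double cross = (cur->x - prev->x) * (next->y - cur->y)
                     - (cur->y - prev->y) * (next->x - cur->x);
        double mag = cross < 0 ? -cross : cross;

        if (mag > best_mag) {
            best_mag = mag;
            best = cross;
        }
    }

    if (best > 0) return 1;
    if (best < 0) return -1;
    return 0;
}
```

A corner that touches a zero-length side contributes a cross product of
zero and is simply outvoted by the other corners, so a quad that is
really a triangle still gets an orientation.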
