Re: Paper about ISO C

Path: rocksolid2!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.arch
Subject: Re: Paper about ISO C
Date: Fri, 8 Oct 2021 13:43:12 +0200
Organization: A noiseless patient Spider
Lines: 231
Message-ID: <sjpasg$a6f$1@dont-email.me>
References: <87fstdumxd.fsf@hotmail.com> <sjni60$97d$1@dont-email.me>
<sjnmcp$630$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 8 Oct 2021 11:43:12 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="747e27c5905b57615cabb4adc19381e0";
logging-data="10447"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+gTberuOKQGVnK2VJEHG3zEtQLxYdYK/o="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:D9KKRbGjfVFQmDB0MsYSpR8ZoEs=
In-Reply-To: <sjnmcp$630$1@dont-email.me>
Content-Language: en-GB

On 07/10/2021 22:47, BGB wrote:
> On 10/7/2021 2:35 PM, David Brown wrote:
>> On 07/10/2021 12:52, clamky@hotmail.com wrote:
>>> Might strike a chord with Anton and/or Mich
>>>
>>> https://www.yodaiken.com/2021/10/06/plos-2021-paper-how-iso-c-became-unusable-for-operating-system-development/
>>>
>>>
>>
>> It seems to me that the authors have misunderstood the language entirely.
>>
>> C was never meant to be a "high-level assembler" or a "portable
>> assembler".  It was never intended that you could write low-level
>> systems programs in portable C.  It was designed to reduce the need to
>> write code in assembler, and to improve the portability of code.
>> (Re-read that sentence if you like.)  It was intended to be useable for
>> writing some kinds of code in a highly portable manner, and also to be
>> useful in non-portable systems code that depended strongly on the target
>> and the compiler.
>>
>>  From day one, operating systems "written in C" contained assembly code,
>> compiler extensions, and other non-portable code.  That remains true
>> today.  "Coding tricks" were often used to get efficient results due to
>> more limited compilers - these are much less common now as compilers
>> optimise better.
>>
>>
>> As I got further in the document, it seems to be just another "the
>> compiler should do what I want it to do, not necessarily what I /told/
>> it to do" rant.  "C today should work the way /I/ want it to do - the
>> way /I/ interpret the precise wording of what someone decades ago".
>> "Thompson and Ritchie were saints and prophets, and their words were
>> holy and eternal - regardless of how computers, languages, and
>> programming needs have changed since the inception of C".  "We were all
>> happy and no bugs existed in code until gcc sinned by optimising code,
>> and we were thrown out of Eden into a hell of undefined behaviour".
>> "Yes, we know that compilers often give us ways to get the semantics we
>> want by specific flags, but that's not good enough - everyone else
>> should get slower code so that we don't have to think as much or follow
>> the rules of the language".
>>
>
> The author of the paper (Victor Yodaiken) has sort of a long history of
> "making a mountain out of a molehill" about all this stuff...
>

It seems that way. Often there are some good points - as I wrote, no
language or tool is perfect, and it's important that people find weak
points or make suggestions for improvement. Massively exaggerated
rants, however, are worse than useless - they encourage other fanatics
who feed on the same myths, doom-saying and exaggerations. And they
ensure that those who actually have a say in the tools and language -
the standards committees and compiler developers - will ignore all their
concerns and write them off as fanatics. Any good points they raise get
thrown out and ignored along with the reams of rubbish,
misunderstandings and misrepresentations that hide them.

>
> I have my opinions, but will note that "the system" is self-regulating
> to some extent, as compiler people would have a hard-time keeping people
> using their compilers if they set out to break large amounts of existing
> "real world" code (and will tend to back down when they face sufficient
> backlash).

Agreed. Compiler writers have no interest in making tools that people
don't want to use. If feedback suggests a new compiler version does not
work the way people expect and want, then the compiler developers take
that seriously - they discuss matters with the users, they change the
behaviour of the tools, they add flags or options to give users more
control of the behaviour. In particular, they aim to test out new
changes during pre-release and beta test stages so that potential
problems get handled sooner.

A similar process applies to language standards - there are long periods
of proposals, discussions, and voting before changes are made. Big
compiler users (like Linux distributions, corporations such as IBM,
Microsoft, Google), big compiler implementers, researchers, and anyone
else who is interested can get involved and share their opinion.

You are never going to please everyone all of the time. And you are
never going to please the small but vocal community who object to
everything and accuse compiler writers and/or language committees of all
sorts of evil motives and conspiracies. But the compiler writers and
committees do a solid job overall, which is why languages like C are
still a good choice for a lot of tasks.

>
> In cases where stuff has broken in more significant ways, it has usually
> been fairly isolated (a particular build of a particular compiler),
> rather than a long-term change to the status quo.
>

And usually a "significant break" is found during testing stages, rather
than post release.

>>
>> Are there imperfections in C as a language standard?  Sure.  Show me a
>> perfect language standard as an alternative.  Are there questionable
>> judgements in some of the design decisions of C toolchains?  Sure -
>> again, show me a perfect toolchain.  Is C not quite an exact fit for all
>> your needs for writing an OS?  Sure.  It's a general purpose language
>> that is good for a very wide range of uses - but it's unlikely to be
>> /perfect/ for any given use-case.  And it never has been, and it never
>> will be.  Nor will anything else.  Modern C is a vastly better language
>> than the earliest K&R C.  Modern C toolchains are vastly better than the
>> tools of that era.  As well as the hundred steps forward, there have no
>> doubt been a dozen steps backwards for some people and some uses.
>> That's life.  If you don't like it, give up programming and stick to a
>> safer hobby.
>>
>
>>
>> (For the record, I don't think type-based alias analysis actually gives
>> very significant optimisation opportunities in most code.  On the other
>> hand, I also don't think it causes problems very often - not nearly as
>> often as the "optimisation is evil" crowd seem to behave.  I write
>> low-level code all the time, and it is extraordinarily rarely that I
>> have to find a way around it.)
>>
>
> Will agree on this point.
>
> In my own testing, basic caching while also aggressively invalidating
> the cached results (on any explicit memory store), can give much of the
> same performance benefit, while still being fairly conservative
> semantically.
>

There is a move towards "provenance tracking" for memory and pointers as
an alternative to things like type-based alias analysis. This is a lot
more powerful, as it tracks when different references to the same type
cannot alias, and thus can give much more optimisation opportunities and
static error analysis benefits. It subsumes most TBAA, since pointers
to different types usually have different provenances. But in cases
where the provenance is actually the same, aliases (or potential
aliases) are handled correctly. However, it turns out to be quite
difficult to specify all this, and it is a work in progress (both for
language standards and implementations).
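
As a rough illustration (these function names are made up, not from any
real code base), the distinction looks something like this:

void scale(float *f, const int *n)
{
    /* Under TBAA the stores through "f" cannot change *n (different
       types), so *n may be loaded once and kept in a register. */
    for (int i = 0; i < *n; i++)
        f[i] *= 2.0f;
}

void copy(int *dst, const int *src, int count)
{
    /* Both pointers have the same type, so TBAA says nothing here;
       only provenance tracking (or "restrict") could show that the
       stores to dst never touch *src. */
    for (int i = 0; i < count; i++)
        dst[i] = src[i];
}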

> The problem though is when optimizations start being treated as "all or
> nothing", eg, "you either accept TBAA or can't use optimizations at
> all", which is a problem IMO.
>
> Say:
>   x=foo->x;
>   *ptr=3;
>   y=foo->x;  //Do we reuse prior result here?
>
> Under TBAA, the second "foo->x" may reuse the result of the first, but
> under more conservative semantics, the "*ptr=" would effectively
> invalidate the cached result regardless of the pointer type.
>
> Whereas, say:
>   x=foo->bar->x;
>   y=foo->bar->y;
> Becomes, effectively:
>   t0=foo->bar;
>   x=t0->x;
>   y=t0->y;
>
> Where no invalidation happens because local variables don't count as
> stores (it is generally pretty safe to assume that local variables exist
> off in their own space, roughly entirely independent from physical
> memory, unless their address is taken).
>
>
> One option is to specify subsets of optimizations based on settings, say:
>   Os, optimize for size (take slower options if smaller);
>   O1, optimize for speed, favoring size minimization over speed;
>   O2, optimize for speed, favoring speed over size;
>   O3, optimize for speed, enable more aggressive optimizations.
>
> So, for example, with Os, O1, and O2, TBAA is not used, whereas for O3
> or similar, things like TBAA would be enabled (with secondary options to
> enable or disable TBAA semantics).
>

That is pretty much what compilers do at the moment, except that TBAA is
a perfectly valid optimisation, relatively fast in the compiler, and
with no risk of producing bigger or slower code (unlike, say, loop
unrolling) - thus it is in -Os, -O2 and -O3 in gcc.
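
To make the quoted snippet concrete (struct and variable names are
invented for the example), this is the case TBAA decides:

struct Foo { int x; };

int demo(struct Foo *foo, long *ptr)
{
    int x = foo->x;
    *ptr = 3;        /* store through a pointer of a different type */
    int y = foo->x;  /* TBAA: may reuse x; conservative: must reload */
    return x + y;
}

With -O2, gcc is free to load foo->x just once; with
-fno-strict-aliasing it has to assume ptr might point at foo->x and
reload it after the store.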

If you think that TBAA is not necessarily valid, then it should not be
enabled at /any/ level - but only be an explicit flag. If you think
that TBAA /is/ valid, but that some code might want to break the strict
aliasing rules, then what is needed is a standardised pragma to change
the language semantics. You can write:

#ifdef __GNUC__
#pragma GCC optimize "-fno-strict-aliasing"
#endif

It would be better if there were standardised pragmas here, in the
manner of the "#pragma STDC FP_CONTRACT ON" that already exists for
fine-tuning floating-point semantics.
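
For reference, that existing pragma looks like this in use (the
function is just an illustration):

#pragma STDC FP_CONTRACT OFF   /* forbid contracting a*x + y into fma */

double axpy(double a, double x, double y)
{
    return a * x + y;   /* the product is rounded before the addition */
}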

>
> MSVC does something vaguely similar to this, though from what I have
> found, for some types of programs (such as my emulators), O1 tends to be
> faster than O2 (I suspect because MSVC's O2 option is a lot more prone
> to make use of "misguided attempts at autovectorization" when compared
> with O1).
>

All compilers do something similar.

But "aggressive optimisations" at higher "O" levels do not mean
optimisations that might "break" code, veer from the standards, or
generate incorrect object code. (Compilers might have options like
that, but they will require explicit enabling.) It means optimisations
that might take excessive compile time in relation to their benefits, or
might result in code that is still correct but is actually slower than
you would otherwise get.

>
> Granted, arguably I could upgrade MSVC and see if anything is
> different, but given I am kinda battling with space on "C:\" (and there
> is no real way to make it bigger), not particularly inclined to do so.
>
> Eg, "320GB should be more than enough space for anyone."
> Windows: "Nope!"
> Me: "Can you at least give me an option to expand the partition?"
> Windows: "What sort of insanity is this?..."
>
> Still have space on other drives, but this isn't of much use if one
> wants to install another version of Visual Studio or similar.
>
> ...
>
