Re: "Some sanity for C and C++ development on Windows" by Chris Wellons

<krMCJ.117486$_Y5.84485@fx29.iad>

https://www.novabbs.com/devel/article-flat.php?id=19924&group=comp.lang.c#19924

Path: i2pn2.org!i2pn.org!news.swapon.de!newsreader4.netcologne.de!news.netcologne.de!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx29.iad.POSTED!not-for-mail
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.4.1
Subject: Re: "Some sanity for C and C++ development on Windows" by Chris Wellons
Content-Language: en-US
Newsgroups: comp.lang.c
References: <sr0psj$g2d$1@dont-email.me>
<761b391e-f071-484e-8507-f58eeb44a8e9n@googlegroups.com>
<sr53qo$vbl$1@dont-email.me> <_mpBJ.219710$qz4.56726@fx97.iad>
<36c23681-a90b-4de4-8451-e31e74f6c838n@googlegroups.com>
<b13c9427-f475-4bcc-98c8-5de476b4e75bn@googlegroups.com>
<27fc916b-9aee-4a76-85e8-6d4a2281b74bn@googlegroups.com>
<884c9725-5b12-4727-98a1-6b7c46efb4aen@googlegroups.com>
<c52c7902-0ce0-4db2-af97-1f9fc5c2a9fan@googlegroups.com>
<74dd4f1f-c5ff-4c9e-9a04-3616a978fb04n@googlegroups.com>
<4a405512-8c50-479a-9928-857fc7d5fac4n@googlegroups.com>
<000e93e1-4d5e-4dda-91da-67ded6d70f83n@googlegroups.com>
<ccc7a06a-6454-4ae4-b29c-075cd76494f9n@googlegroups.com>
<srdd2b$om7$1@gioia.aioe.org>
<c8b3d237-403a-44a4-a74c-91a3ae26605an@googlegroups.com>
<srf2q6$c63$1@gioia.aioe.org>
<2f16854b-ab61-4aa1-af0a-d976535eaa00n@googlegroups.com>
<srg1jr$1rpr$1@gioia.aioe.org>
From: Rich...@Damon-Family.org (Richard Damon)
In-Reply-To: <srg1jr$1rpr$1@gioia.aioe.org>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Lines: 113
Message-ID: <krMCJ.117486$_Y5.84485@fx29.iad>
X-Complaints-To: abuse@easynews.com
Organization: Forte - www.forteinc.com
X-Complaints-Info: Please be sure to forward a copy of ALL headers otherwise we will be unable to process your complaint properly.
Date: Sun, 9 Jan 2022 21:00:48 -0500
X-Received-Bytes: 7251

by: Richard Damon - Mon, 10 Jan 2022 02:00 UTC

On 1/9/22 8:19 PM, Manfred wrote:
> On 1/9/2022 11:35 PM, Öö Tiib wrote:
>> On Sunday, 9 January 2022 at 18:34:25 UTC+2, Manfred wrote:
>>> On 1/9/2022 1:44 PM, Öö Tiib wrote:
>>>> On Sunday, 9 January 2022 at 03:17:10 UTC+2, Manfred wrote:
>>>>> On 1/8/2022 11:50 PM, Öö Tiib wrote:
>>>>>> On Saturday, 8 January 2022 at 20:49:01 UTC+2, james...@alumni.caltech.edu wrote:
>>>>>>> On Saturday, January 8, 2022 at 11:17:53 AM UTC-5, Öö Tiib wrote:
>>>>>>>> On Saturday, 8 January 2022 at 06:52:33 UTC+2, james...@alumni.caltech.edu wrote:
>> ...
>>>>
>>>>>>>> FILE *f = fopen( "Foo😀Bar.txt", "w");
>>>>>>>> That should work unless underlying file system does not support files
>>>>>>>> named "Foo😀Bar.txt" If it supports but the code does not work then it
>>>>>>>> indicates bad standard that allows implementations to weasel away. No
>>>>>>>> garbage like u8fopen( u8"Foo😀Bar.txt", "w") coming somewhere maybe in
>>>>>>>> C35 or so is needed as it already works like in my example on vast
>>>>>>>> majority of things.
>>>>>>>
>>>>>>> Nothing in the standard prevents an implementation from doing that. If
>>>>>>> one doesn't already do so, that's a choice made by the implementors,
>>>>>>> and you should ask them about it. Your real beef is with the
>>>>>>> implementors, not the standard.
>>>>>>
>>>>>> My beef is with standards. Adding garbage that does not work to standard
>>>>>> is wrong and not adding what everybody at least half sane does use to
>>>>>> standard is also wrong.
>>>>>>
>>>>> Also agreed, but since utf-8 is transparent to ascii functions, what
>>>>> should have been added?
>>>>
>>>> Something that makes it clear that it is defect when "Foo≡ƒÿÇBar.txt" is
>>>> silently opened on file-system that fully supports files named
>>>> "Foo😀Bar.txt" I suppose.
>>>>
>>> Assuming that "Foo≡ƒÿÇBar.txt" "Foo😀Bar.txt" have the same binary
>>> representation, what's the difference? One form or the other shows up
>>> only when it is displayed in some UI - the filesystem isn't one, which
>>> leads to the implementation's runtime behavior.
>>
>> How you mean same binary representation? Both "Foo≡ƒÿÇBar.txt" and
>> "Foo😀Bar.txt" files can be in same directory. Both have Unicode
>> names in underlying file system precisely as posted.
>
> I mean the same byte sequence in their name, but different UI
> representation, e.g. when decoded as utf-8 or w-1252 or whatever.

But that seems to imply that the file system keeps track of the file name
encoding at the entry level, and I don't know of any that do.

>
> What you are saying assumes a Unicode-aware filesystem, that's not free
> from the point of view of the standard.
> But, in order to support utf-8, it would be enough to have a char based
> filesystem that treats names as plain 0-terminated char[]. That's
> easier, probably free on most platforms, but it's different from
> Unicode-aware (which could be UTF-16 like Windows, and there you have
> your problems).
>
>>
>>> If they are actually different in their binary sequence, and this is the
>>> result of the utf-8 string being wrongly converted multiple times, this
>>> looks like a bad implementation, rather than a problem with the
>>> standard.
>>> IIUC you are advocating for some statement in the standard that prevents
>>> implementations from messing up with "character sets" in null terminated
>>> char strings?
>>
>> I mean that standard should require that all char* texts are treated as
>> UTF-8 by standard library unless said otherwise. If implementation needs
>> some other encoding of such byte sequence then it provides
>> platform-specific functions or compiler switches and/or extends language
>> with implementation-defined char_iso8859_1_t character types and
>> prefixes. If it is noteworthy handy type then add it to standards too, I
>> don't care.
>
> I see this hard to win, and probably not ideal - suppose in 10 years
> some better encoding than utf-8 shows up, then you are screwed again.
>
> I'd rather stick to the fact that utf-8 is compatible with 0-terminated
> char[], and so a plausible wish would be that such strings are not
> screwed by the implementation; for example when you store a file name in
> a filesystem with fopen() and the name is given as char[], then the
> standard could mandate that reading back that same name as char[] gives
> back the same byte sequence.
>
> Currently I guess one could use a utf-8 string as a name to fopen() on
> Windows, then the OS assumes it is W-1252 and converts it into UTF-16,
> at which point it is screwed, and when you read it back into char[] it
> is garbage.
>
>>
>> If standard can define that overflow in signed atomics is well defined
>> and two's complement is mandated there then it also can define that all
>> char* texts are UTF-8. The only question is if what I suggest is
>> reasonable or not. From viewpoint of implementer of standard library or
>> users it is likely blessing ... so I think it is question of
>> business/politics/religions.
>
> I agree with Richard here. Two's complement is not like utf-8.
> I still think it's technical rather than business/politics/religions in
> this case - as I said above I'm not sure it would even be ideal.
