devel / comp.lang.c / Re: "Some sanity for C and C++ development on Windows" by Chris Wellons

Re: "Some sanity for C and C++ development on Windows" by Chris Wellons

<cfbb82e9-fa5c-46c6-bad7-e797b1c49db3n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19957&group=comp.lang.c#19957

X-Received: by 2002:a05:622a:15cc:: with SMTP id d12mr2234855qty.190.1641868936996;
Mon, 10 Jan 2022 18:42:16 -0800 (PST)
X-Received: by 2002:a05:620a:199d:: with SMTP id bm29mr1840279qkb.450.1641868936860;
Mon, 10 Jan 2022 18:42:16 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Mon, 10 Jan 2022 18:42:16 -0800 (PST)
In-Reply-To: <1b26a8ea-98c3-4424-95e8-237cb9c31e04n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=94.246.251.164; posting-account=pysjKgkAAACLegAdYDFznkqjgx_7vlUK
NNTP-Posting-Host: 94.246.251.164
References: <sr0psj$g2d$1@dont-email.me> <761b391e-f071-484e-8507-f58eeb44a8e9n@googlegroups.com>
<sr53qo$vbl$1@dont-email.me> <_mpBJ.219710$qz4.56726@fx97.iad>
<36c23681-a90b-4de4-8451-e31e74f6c838n@googlegroups.com> <b13c9427-f475-4bcc-98c8-5de476b4e75bn@googlegroups.com>
<27fc916b-9aee-4a76-85e8-6d4a2281b74bn@googlegroups.com> <884c9725-5b12-4727-98a1-6b7c46efb4aen@googlegroups.com>
<c52c7902-0ce0-4db2-af97-1f9fc5c2a9fan@googlegroups.com> <74dd4f1f-c5ff-4c9e-9a04-3616a978fb04n@googlegroups.com>
<4a405512-8c50-479a-9928-857fc7d5fac4n@googlegroups.com> <000e93e1-4d5e-4dda-91da-67ded6d70f83n@googlegroups.com>
<ccc7a06a-6454-4ae4-b29c-075cd76494f9n@googlegroups.com> <srdd2b$om7$1@gioia.aioe.org>
<c8b3d237-403a-44a4-a74c-91a3ae26605an@googlegroups.com> <srf2q6$c63$1@gioia.aioe.org>
<2f16854b-ab61-4aa1-af0a-d976535eaa00n@googlegroups.com> <35c5adf1-a311-4832-a1a9-1090074b2621n@googlegroups.com>
<91da42d7-2503-4525-9ea4-8764a475b2b3n@googlegroups.com> <1b26a8ea-98c3-4424-95e8-237cb9c31e04n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <cfbb82e9-fa5c-46c6-bad7-e797b1c49db3n@googlegroups.com>
Subject: Re: "Some sanity for C and C++ development on Windows" by Chris Wellons
From: oot...@hot.ee (Öö Tiib)
Injection-Date: Tue, 11 Jan 2022 02:42:16 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 148
 by: Öö Tiib - Tue, 11 Jan 2022 02:42 UTC

On Monday, 10 January 2022 at 19:01:31 UTC+2, james...@alumni.caltech.edu wrote:
> On Monday, January 10, 2022 at 1:11:08 AM UTC-5, Öö Tiib wrote:
> > On Monday, 10 January 2022 at 07:24:01 UTC+2, james...@alumni.caltech.edu wrote:
> > > On Sunday, January 9, 2022 at 5:35:42 PM UTC-5, Öö Tiib wrote:
> > > > On Sunday, 9 January 2022 at 18:34:25 UTC+2, Manfred wrote:
> > > > > On 1/9/2022 1:44 PM, Öö Tiib wrote:
> > > ...
> > > > > > Something that makes it clear that it is a defect when "Foo≡ƒÿÇBar.txt" is silently opened
> > > > > > on a file system that fully supports files named "Foo😀Bar.txt", I suppose.
> > > > > >
> > > > > Assuming that "Foo≡ƒÿÇBar.txt" and "Foo😀Bar.txt" have the same binary
> > > > > representation, what's the difference? One form or the other shows up
> > > > > only when it is displayed in some UI - the filesystem isn't one, which
> > > > > leads to the implementation's runtime behavior.
> > > > How you mean same binary representation? Both "Foo😀Bar.txt" and
> > > > "Foo≡ƒÿÇBar.txt" files can be in same directory. Both have Unicode
> > > > names in underlying file system precisely as posted.
> > > Have you checked to make sure? Any system where passing the UTF-8 string
> > > "Foo😀Bar.txt" to fopen() opens a file whose name displays as Foo≡ƒÿÇBar.txt
> > > is likely to be a system where the file names are displayed using some single-
> > > byte encoding. The UTF-8 encoding of "Foo😀Bar.txt" is
> > >
> > > 0X46 0X6F 0X6F 0XF0 0X9F 0X98 0X80 0X42 0X61 0X72 0X2E 0X74 0X78 0X74
> > >
> > > After quite a bit of searching, I found Code page 865 (MS-DOS Nordic), which
> > > has 0XF0 = '≡', 0x9F = 'ƒ' and 0X98 = 'ÿ'. If the utilities that you use to display file
> > > names used that encoding to interpret the file name, that would explain your results.
> > Need a screenshot with files named "Foo😀Bar.txt" and "Foo≡ƒÿÇBar.txt"
> > side-by-side? ...
>
> No, that wouldn't be relevant. I realized shortly after I posted my message that you
> might not understand what I meant by "Have you checked to make sure?" I started
> composing a message in my head explaining in more detail. However, it was very late,
> I had to get to bed, and when I checked this morning you'd already confirmed that you
> didn't realize what I meant.
>
> As I understand it, you've opened a file using "Foo😀Bar.txt" as the file name, and
> somehow determined that the "actual" name of the file that got opened was
> "Foo≡ƒÿÇBar.txt". You didn't specify, but I presume you reached that conclusion by
> doing something like get a directory listing at the command line or using a GUI file
> browser to look at the directory.
> The UTF-8 encoding of "Foo😀Bar.txt" is
> 0X46 0X6F 0X6F 0XF0 0X9F 0X98 0X80 0X42 0X61 0X72 0X2E 0X74 0X78 0X74
>
> Those same bytes, if interpreted using Code page 865, represent the string
> "Foo≡ƒÿÇBar.txt". I know very little about Windows internals, so I'm not sure why that
> might be relevant. It could be just a coincidence, but that seems excessively
> unlikely. What I was suggesting, and what I think Manfred was hinting at, is that the
> string you provide as the file name is stored by the file system using UTF-8 encoding.
> Whatever method you used to determine the "actual" file name interpreted those
> bytes using a single-byte encoding, which could be Code Page 865, or possibly some
> other encoding that encodes those particular characters the same way as Code Page
> 865. There's a lot of different code pages out there, so I couldn't check them all, but of
> the dozen or so I checked, that is the only one where 0xF0 represents '≡'. The
> "MS-DOS Nordic" code page was one of the first ones I checked, based upon your e-mail
> address oot...@hot.ee, where I presume "ee" refers to Estonia.
>
> If that is indeed the case, consider what should happen if you try to open a file using
> the name "Foo≡ƒÿÇBar.txt". If I understand you correctly, I believe that you would
> expect it to open the same file that got opened when you specified "Foo😀Bar.txt".
> However, the UTF-8 encoding of "Foo≡ƒÿÇBar.txt" is
>
> 0X46 0X6F 0X6F 0XE2 0X89 0XA1 0XC6 0X92 0XC3 0XBF 0XC3 0X87 0X42 0X61 0X72 0X2E 0X74 0X78 0X74
>
> I expect that you will end up opening a different file. If the file name is being displayed
> using a single-byte encoding, it should have 19 characters. If that encoding is in fact
> Code Page 865, then that name should be "FooΓëí╞Æ├┐├çBar.txt". So, what result do
> you get?
>
> If the problem is in fact that the file name is being interpreted using a single byte
> encoding by whatever utility you're using to determine what the actual name is, then
> there's absolutely nothing the standards can do about that - the behavior of any such
> utility is completely outside the scope of either standard.
>
> > ... Just that for "Foo😀Bar.txt" one needs to use non-standard
> > _wfopen( L"Foo😀Bar.txt", L"w"). That is default both with MSVC and MinGW
> > gcc.
> As you say, it's non-standard. Therefore, nothing the C standard says could do
> anything to constrain its behavior. If your complaint is indeed about the behavior of
> _wfopen(), it's not relevant to either the C or C++ standards, and should be posted to a
> Windows-specific forum.

Hmm. I think your analysis is correct. To clarify: wherever I refer to a file
named "Foo😀Bar.txt" and a file named "Foo≡ƒÿÇBar.txt", that is how Windows
(or another operating system, if I make the files reachable) displays the names of the files.
The string "Foo😀Bar.txt" is what I see in the text editor in the UTF-8 source code. The
bytes you gave are correct. The Windows C runtime translates all char* strings to
UTF-16, so it is all about the algorithm it uses to do that.
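
The code-page reading discussed above can be checked mechanically. What follows is only a sketch in portable C, not a description of what the Windows runtime actually does: it treats each byte of a string as one CP865 character and re-emits the result as UTF-8. The table covers just the four high bytes that occur in the UTF-8 encoding of U+1F600 (the three mappings quoted in the thread, plus 0x80 = 'Ç'), and the function names are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Partial CP865 -> UTF-8 table (illustrative only): just the four high
   bytes found in the UTF-8 encoding of U+1F600. A real converter would
   cover all of 0x80..0xFF. */
static const char *cp865_high_to_utf8(unsigned char b)
{
    switch (b) {
    case 0xF0: return "\xE2\x89\xA1"; /* U+2261, identical-to sign */
    case 0x9F: return "\xC6\x92";     /* U+0192, f with hook */
    case 0x98: return "\xC3\xBF";     /* U+00FF, y with diaeresis */
    case 0x80: return "\xC3\x87";     /* U+00C7, C with cedilla */
    default:   return "?";            /* not covered by this sketch */
    }
}

/* Treat every byte of `in` as one CP865 character and write the result to
   `out` as UTF-8. Returns the number of bytes written (excluding the
   terminator), or 0 if `out` is too small. */
size_t cp865_bytes_to_utf8(const char *in, char *out, size_t outsize)
{
    size_t n = 0;
    if (outsize == 0)
        return 0;
    for (; *in != '\0'; in++) {
        unsigned char b = (unsigned char)*in;
        char ascii[2] = { (char)b, '\0' };
        const char *piece = (b < 0x80) ? ascii : cp865_high_to_utf8(b);
        size_t len = strlen(piece);
        if (n + len + 1 > outsize)
            return 0;
        memcpy(out + n, piece, len);
        n += len;
    }
    out[n] = '\0';
    return n;
}
```

Feeding it the 14 bytes 46 6F 6F F0 9F 98 80 42 61 72 2E 74 78 74 yields the 19-byte sequence 46 6F 6F E2 89 A1 C6 92 C3 BF C3 87 42 61 72 2E 74 78 74, which is the second byte listing quoted above.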

Re: "Some sanity for C and C++ development on Windows" by Chris Wellons

<52ccd9a4-8e51-497f-95af-4ead180bd31an@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19958&group=comp.lang.c#19958

X-Received: by 2002:ac8:5845:: with SMTP id h5mr2994719qth.365.1641894732772;
Tue, 11 Jan 2022 01:52:12 -0800 (PST)
X-Received: by 2002:ad4:5ba3:: with SMTP id 3mr2932671qvq.59.1641894732590;
Tue, 11 Jan 2022 01:52:12 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Tue, 11 Jan 2022 01:52:12 -0800 (PST)
In-Reply-To: <3FWCJ.205729$np6.54600@fx46.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=84.50.190.130; posting-account=pysjKgkAAACLegAdYDFznkqjgx_7vlUK
NNTP-Posting-Host: 84.50.190.130
References: <sr0psj$g2d$1@dont-email.me> <761b391e-f071-484e-8507-f58eeb44a8e9n@googlegroups.com>
<sr53qo$vbl$1@dont-email.me> <_mpBJ.219710$qz4.56726@fx97.iad>
<36c23681-a90b-4de4-8451-e31e74f6c838n@googlegroups.com> <b13c9427-f475-4bcc-98c8-5de476b4e75bn@googlegroups.com>
<27fc916b-9aee-4a76-85e8-6d4a2281b74bn@googlegroups.com> <884c9725-5b12-4727-98a1-6b7c46efb4aen@googlegroups.com>
<c52c7902-0ce0-4db2-af97-1f9fc5c2a9fan@googlegroups.com> <74dd4f1f-c5ff-4c9e-9a04-3616a978fb04n@googlegroups.com>
<4a405512-8c50-479a-9928-857fc7d5fac4n@googlegroups.com> <314f4088-9ea3-4117-b034-356d77a705cen@googlegroups.com>
<73f9b4a9-fa69-4a99-a9cb-15daa9725048n@googlegroups.com> <srcll6$14f3$1@gioia.aioe.org>
<87tuec49lz.fsf@bsb.me.uk> <srgsnj$5gn$3@gioia.aioe.org> <1e59a2a7-baaf-4648-afa5-9bc636f7883en@googlegroups.com>
<3FWCJ.205729$np6.54600@fx46.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <52ccd9a4-8e51-497f-95af-4ead180bd31an@googlegroups.com>
Subject: Re: "Some sanity for C and C++ development on Windows" by Chris Wellons
From: oot...@hot.ee (Öö Tiib)
Injection-Date: Tue, 11 Jan 2022 09:52:12 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 42
 by: Öö Tiib - Tue, 11 Jan 2022 09:52 UTC

On Monday, 10 January 2022 at 15:38:21 UTC+2, Richard Damon wrote:
> On 1/10/22 8:15 AM, Öö Tiib wrote:
> > On Monday, 10 January 2022 at 11:02:54 UTC+2, Mateusz Viste wrote:
> >>
> >> Now, I am not saying that writing a utf-8 strlen() is incredibly
> >> difficult of course. I am only saying it is an extra layer of
> >> complexity compared to UCS-2 or UTF-32. And that is why I understand
> >> why people often choose to internally store strings in one of these
> >> encodings instead of utf-8 (esp. if dealing with fixed-width character
> >> outputs). It's simply easier to deal with an array of values that maps
> >> directly to codepoints rather than parse a utf-8 string taking care not
> >> to explode on encoding errors or edge cases.
> >
> > That argument feels like result of misinterpretation of formation of
> > UCS-2 and UTF-32 glyphs. Both encodings contain variety of combining
> > characters, modifiers, accents and tabulators. So even with monospaced
> > font (in world where proportional fonts are more frequently used) one
> > can't decide the width of result on screen without examining all
> > characters in sequence. But if to examine all characters of sequence
> > anyway then UTF-8 is often just a bit less memory to examine. Somehow
> > it does not look like the other options are "simply easier".
> >
> And this is the reason that Unicode doesn't really meet the requirements
> of a C 'Wide Character Type'. Wide characters are supposed to be 1
> character = 1 storage unit. Because of combining characters Unicode
> doesn't meet this requirement.
>
> Ultimately, we have to live with it and accept that programming in the
> face of full compliance with the rules of the character set is going to
> add complexity.

Unicode is quite successful at supporting the nuances of text that
people expect to see, so it is the most popular text format in the world.
If it contradicts the requirements of C, then ... C has to change,
as the world is a lot harder to change.
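
To make the combining-character point concrete, here is a minimal sketch of the "utf-8 strlen()" mentioned above. The name utf8_codepoints is made up for illustration, and the function assumes its input is already valid UTF-8 (it does no validation).

```c
#include <stddef.h>

/* Count code points in a UTF-8 string by skipping continuation bytes
   (those of the form 10xxxxxx). Assumes valid UTF-8; no validation. */
size_t utf8_codepoints(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            count++;
    }
    return count;
}
```

Precomposed "é" ("\xC3\xA9") counts as one code point, while "e" followed by U+0301 COMBINING ACUTE ACCENT ("e\xCC\x81") counts as two, although both display as a single character. The same mismatch exists in UTF-32, so counting code points alone does not give the on-screen width in any of these encodings.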

Re: "Some sanity for C and C++ development on Windows" by Chris Wellons

<a9c0e089-f4a8-4c9c-9431-ba7cfab461a7n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19959&group=comp.lang.c#19959

X-Received: by 2002:a05:622a:1a08:: with SMTP id f8mr3180844qtb.94.1641900027842;
Tue, 11 Jan 2022 03:20:27 -0800 (PST)
X-Received: by 2002:a05:620a:1a24:: with SMTP id bk36mr2643476qkb.513.1641900027695;
Tue, 11 Jan 2022 03:20:27 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Tue, 11 Jan 2022 03:20:27 -0800 (PST)
In-Reply-To: <52ccd9a4-8e51-497f-95af-4ead180bd31an@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:8d7f:6c89:e920:9028;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:8d7f:6c89:e920:9028
References: <sr0psj$g2d$1@dont-email.me> <761b391e-f071-484e-8507-f58eeb44a8e9n@googlegroups.com>
<sr53qo$vbl$1@dont-email.me> <_mpBJ.219710$qz4.56726@fx97.iad>
<36c23681-a90b-4de4-8451-e31e74f6c838n@googlegroups.com> <b13c9427-f475-4bcc-98c8-5de476b4e75bn@googlegroups.com>
<27fc916b-9aee-4a76-85e8-6d4a2281b74bn@googlegroups.com> <884c9725-5b12-4727-98a1-6b7c46efb4aen@googlegroups.com>
<c52c7902-0ce0-4db2-af97-1f9fc5c2a9fan@googlegroups.com> <74dd4f1f-c5ff-4c9e-9a04-3616a978fb04n@googlegroups.com>
<4a405512-8c50-479a-9928-857fc7d5fac4n@googlegroups.com> <314f4088-9ea3-4117-b034-356d77a705cen@googlegroups.com>
<73f9b4a9-fa69-4a99-a9cb-15daa9725048n@googlegroups.com> <srcll6$14f3$1@gioia.aioe.org>
<87tuec49lz.fsf@bsb.me.uk> <srgsnj$5gn$3@gioia.aioe.org> <1e59a2a7-baaf-4648-afa5-9bc636f7883en@googlegroups.com>
<3FWCJ.205729$np6.54600@fx46.iad> <52ccd9a4-8e51-497f-95af-4ead180bd31an@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <a9c0e089-f4a8-4c9c-9431-ba7cfab461a7n@googlegroups.com>
Subject: Re: "Some sanity for C and C++ development on Windows" by Chris Wellons
From: malcolm....@gmail.com (Malcolm McLean)
Injection-Date: Tue, 11 Jan 2022 11:20:27 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 61
 by: Malcolm McLean - Tue, 11 Jan 2022 11:20 UTC

On Tuesday, 11 January 2022 at 09:52:19 UTC, Öö Tiib wrote:
> On Monday, 10 January 2022 at 15:38:21 UTC+2, Richard Damon wrote:
> > On 1/10/22 8:15 AM, Öö Tiib wrote:
> > > On Monday, 10 January 2022 at 11:02:54 UTC+2, Mateusz Viste wrote:
> > >>
> > >> Now, I am not saying that writing a utf-8 strlen() is incredibly
> > >> difficult of course. I am only saying it is an extra layer of
> > >> complexity compared to UCS-2 or UTF-32. And that is why I understand
> > >> why people often choose to internally store strings in one of these
> > >> encodings instead of utf-8 (esp. if dealing with fixed-width character
> > >> outputs). It's simply easier to deal with an array of values that maps
> > >> directly to codepoints rather than parse a utf-8 string taking care not
> > >> to explode on encoding errors or edge cases.
> > >
> > > That argument feels like result of misinterpretation of formation of
> > > UCS-2 and UTF-32 glyphs. Both encodings contain variety of combining
> > > characters, modifiers, accents and tabulators. So even with monospaced
> > > font (in world where proportional fonts are more frequently used) one
> > > can't decide the width of result on screen without examining all
> > > characters in sequence. But if to examine all characters of sequence
> > > > anyway then UTF-8 is often just a bit less memory to examine. Somehow
> > > it does not look like the other options are "simply easier".
> > >
> > And this is the reason that Unicode doesn't really meet the requirements
> > of a C 'Wide Character Type'. Wide characters are supposed to be 1
> > character = 1 storage unit. Because of combining characters Unicode
> > doesn't meet this requirement.
> >
> > Ultimately, we have to live with it and accept that programming in the
> > face of full compliance with the rules of the character set is going to
> > add complexity.
> Unicode is quite successful at supporting the nuances of text that
> people expect to see, so it is the most popular text format in the world.
> If it contradicts the requirements of C, then ... C has to change,
> as the world is a lot harder to change.
>
UTF-8 is designed so that programs written in C, as well as many other
programming languages, have a good chance of working correctly
even if not UTF-8 aware.
However if you are displaying text, and start developing for an exclusively
English-speaking end user, then moving to non-English texts isn't always as
simple as changing the raster patterns of the glyphs. That's inherent in the
complexities of human languages and writing systems. Representing
strings in UTF-8 is just the start.
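
A small illustration of that design property, as a sketch rather than anything from the thread: every byte of a multi-byte UTF-8 sequence is 0x80 or above, so a byte-oriented search for an ASCII character can never match inside one. The helper name file_extension is hypothetical.

```c
#include <string.h>

/* Return a pointer to the extension (including the dot), or NULL if there
   is none. Purely byte-oriented: it knows nothing about UTF-8, yet it
   cannot land inside a multi-byte sequence, because '.' (0x2E) never
   occurs as part of one. */
const char *file_extension(const char *name)
{
    const char *dot = strrchr(name, '.');
    return (dot != NULL && dot != name) ? dot : NULL;
}
```

Called on the UTF-8 bytes of "Foo😀Bar.txt" it returns a pointer to ".txt", exactly as it would for a pure ASCII name. Display width, case mapping, and collation are where this kind of unaware code stops working.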

Re: "Some sanity for C and C++ development on Windows" by Chris Wellons

<srni6d$mhl$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19962&group=comp.lang.c#19962

Path: i2pn2.org!i2pn.org!aioe.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: vir.camp...@invalid.invalid (Vir Campestris)
Newsgroups: comp.lang.c
Subject: Re: "Some sanity for C and C++ development on Windows" by Chris
Wellons
Date: Wed, 12 Jan 2022 21:45:49 +0000
Organization: A noiseless patient Spider
Lines: 34
Message-ID: <srni6d$mhl$1@dont-email.me>
References: <sr0psj$g2d$1@dont-email.me>
<4a405512-8c50-479a-9928-857fc7d5fac4n@googlegroups.com>
<000e93e1-4d5e-4dda-91da-67ded6d70f83n@googlegroups.com>
<ccc7a06a-6454-4ae4-b29c-075cd76494f9n@googlegroups.com>
<srdd2b$om7$1@gioia.aioe.org>
<c8b3d237-403a-44a4-a74c-91a3ae26605an@googlegroups.com>
<srf2q6$c63$1@gioia.aioe.org>
<2f16854b-ab61-4aa1-af0a-d976535eaa00n@googlegroups.com>
<GvKCJ.77541$KV.71777@fx14.iad>
<471b4523-4568-4f46-9bdd-5fb5bcc7cee3n@googlegroups.com>
<sri8rh$cg1$1@dont-email.me> <A82DJ.76383$cW6.58145@fx08.iad>
<4286cf61-5e4c-43d7-94d5-97c4e4f07808n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 12 Jan 2022 21:45:49 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="fb8a4df5007d24b7c34bae6f869b4e56";
logging-data="23093"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19l+PyaQx/UrVaUTSdyS1wQFeuJkIjE4F0="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Cancel-Lock: sha1:FMyaZpeUUzdTphyCdsKmoImOo6U=
In-Reply-To: <4286cf61-5e4c-43d7-94d5-97c4e4f07808n@googlegroups.com>
Content-Language: en-GB
 by: Vir Campestris - Wed, 12 Jan 2022 21:45 UTC

On 11/01/2022 02:03, Öö Tiib wrote:
> On Tuesday, 11 January 2022 at 00:09:48 UTC+2, Scott Lurndal wrote:
>> Vir Campestris <vir.cam...@invalid.invalid> writes:
>>> On 10/01/2022 00:51, Öö Tiib wrote:
>>>> But do there exist machines that do want to support texts as char* but
>>>> do not want to support UTF-8? Describe those machines, give examples.
>>>
>>> All the mainframes that run EBCDIC. There are a lot of them still.
>> Here's the implementation guide for one:
>>
>> https://public.support.unisys.com/aseries/docs/ClearPath-MCP-18.0/86002268-207.pdf
>>
>> See appendix E for I18N.
>
> Seems out of context as these guys do not look like wanting to upgrade to C99 or
> something. But maybe at 2060 or so they start to think about usefulness of
> UTF-8 too. The EBCDIC is usual, but irrelevant red herring in discussions like this.
>

I'm not going to read the whole of that spec - but it does say the
compiler by default reads EBCDIC encoded source files (with a switch for
ASCII) and it suggests text data files are EBCDIC too.

I haven't programmed for an EBCDIC machine in 30 years - in fact I
didn't learn C until I stopped using them - but that doesn't mean they
don't exist.

I don't see why they are a red herring.

It would be a perfectly valid decision to target Windows only, or
Android only, but if you want to be truly portable EBCDIC should always
be in the back of your mind.

Andy
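
As a concrete case of what keeping EBCDIC "in the back of your mind" can mean, here is a minimal sketch with made-up function names. The C standard guarantees that the digits '0' to '9' are contiguous, but makes no such promise for letters, and in EBCDIC the lowercase letters are split into three runs.

```c
#include <ctype.h>

/* Non-portable: assumes 'a'..'z' are contiguous. True in ASCII; in EBCDIC
   other codes sit between 'i' and 'j' and between 'r' and 's', so this
   range test would also accept non-letters there. */
int is_lower_range(int c)
{
    return c >= 'a' && c <= 'z';
}

/* Portable: defer to the implementation's <ctype.h>. */
int is_lower_portable(int c)
{
    return islower((unsigned char)c) != 0;
}
```

On an ASCII system in the default "C" locale the two functions agree for every character, which is why the range test tends to go unnoticed until the code is ported.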

Re: "Some sanity for C and C++ development on Windows" by Chris Wellons

<2f236bc7-f6c3-454e-914a-2fcf493ed790n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19963&group=comp.lang.c#19963

X-Received: by 2002:a37:9f52:: with SMTP id i79mr1788570qke.717.1642038001662;
Wed, 12 Jan 2022 17:40:01 -0800 (PST)
X-Received: by 2002:a05:620a:1906:: with SMTP id bj6mr1924826qkb.94.1642038001479;
Wed, 12 Jan 2022 17:40:01 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Wed, 12 Jan 2022 17:40:01 -0800 (PST)
In-Reply-To: <srni6d$mhl$1@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=94.246.251.164; posting-account=pysjKgkAAACLegAdYDFznkqjgx_7vlUK
NNTP-Posting-Host: 94.246.251.164
References: <sr0psj$g2d$1@dont-email.me> <4a405512-8c50-479a-9928-857fc7d5fac4n@googlegroups.com>
<000e93e1-4d5e-4dda-91da-67ded6d70f83n@googlegroups.com> <ccc7a06a-6454-4ae4-b29c-075cd76494f9n@googlegroups.com>
<srdd2b$om7$1@gioia.aioe.org> <c8b3d237-403a-44a4-a74c-91a3ae26605an@googlegroups.com>
<srf2q6$c63$1@gioia.aioe.org> <2f16854b-ab61-4aa1-af0a-d976535eaa00n@googlegroups.com>
<GvKCJ.77541$KV.71777@fx14.iad> <471b4523-4568-4f46-9bdd-5fb5bcc7cee3n@googlegroups.com>
<sri8rh$cg1$1@dont-email.me> <A82DJ.76383$cW6.58145@fx08.iad>
<4286cf61-5e4c-43d7-94d5-97c4e4f07808n@googlegroups.com> <srni6d$mhl$1@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2f236bc7-f6c3-454e-914a-2fcf493ed790n@googlegroups.com>
Subject: Re: "Some sanity for C and C++ development on Windows" by Chris Wellons
From: oot...@hot.ee (Öö Tiib)
Injection-Date: Thu, 13 Jan 2022 01:40:01 +0000
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Lines: 53
 by: Öö Tiib - Thu, 13 Jan 2022 01:40 UTC

On Wednesday, 12 January 2022 at 23:46:03 UTC+2, Vir Campestris wrote:
> On 11/01/2022 02:03, Öö Tiib wrote:
> > On Tuesday, 11 January 2022 at 00:09:48 UTC+2, Scott Lurndal wrote:
> >> Vir Campestris <vir.cam...@invalid.invalid> writes:
> >>> On 10/01/2022 00:51, Öö Tiib wrote:
> >>>> But do there exist machines that do want to support texts as char* but
> >>>> do not want to support UTF-8? Describe those machines, give examples.
> >>>
> >>> All the mainframes that run EBCDIC. There are a lot of them still.
> >> Here's the implementation guide for one:
> >>
> >> https://public.support.unisys.com/aseries/docs/ClearPath-MCP-18.0/86002268-207.pdf
> >>
> >> See appendix E for I18N.
> >
> > Seems out of context as these guys do not look like wanting to upgrade to C99 or
> > something. But maybe at 2060 or so they start to think about usefulness of
> > UTF-8 too. The EBCDIC is usual, but irrelevant red herring in discussions like this.
> >
> I'm not going to read the whole of that spec - but it does say the
> compiler by default reads EBCDIC encoded source files (with a switch for
> ASCII) and it suggests text data files are EBCDIC too.
>
> I haven't programmed for an EBCDIC machine in 30 years - in fact I
> didn't learn C until I stopped using them - but that doesn't mean they
> don't exist.
>
> I don't see why they are a red herring.

As I tried to explain: because the vendors of those platforms do not want to migrate even to
the C99 standard, which is more than 20 years old.

> It would be a perfectly valid decision to target Windows only, or
> Android only, but if you want to be truly portable EBCDIC should always
> be in the back of your mind.

"Truly portable" is another red herring. Everything is portable only between
certain platforms and versions. A Windows program written now most likely
does not run on Windows 2000. Android programs released today always come with
a list of the Android versions they run on. Portability is achieved with a certain
amount of work per supported platform. Some old platforms are not worth
investing that work in.

Re: "Some sanity for C and C++ development on Windows" by Chris Wellons

<20220113150059.667@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=19964&group=comp.lang.c#19964

Newsgroups: comp.lang.c
Path: i2pn2.org!rocksolid2!news.neodome.net!news.mixmin.net!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: 480-992-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.c
Subject: Re: "Some sanity for C and C++ development on Windows" by Chris Wellons
Date: Thu, 13 Jan 2022 23:12:59 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 44
Message-ID: <20220113150059.667@kylheku.com>
References: <sr0psj$g2d$1@dont-email.me>
<4a405512-8c50-479a-9928-857fc7d5fac4n@googlegroups.com>
<000e93e1-4d5e-4dda-91da-67ded6d70f83n@googlegroups.com>
<ccc7a06a-6454-4ae4-b29c-075cd76494f9n@googlegroups.com>
<srdd2b$om7$1@gioia.aioe.org>
<c8b3d237-403a-44a4-a74c-91a3ae26605an@googlegroups.com>
<srf2q6$c63$1@gioia.aioe.org>
<2f16854b-ab61-4aa1-af0a-d976535eaa00n@googlegroups.com>
<GvKCJ.77541$KV.71777@fx14.iad>
<471b4523-4568-4f46-9bdd-5fb5bcc7cee3n@googlegroups.com>
<sri8rh$cg1$1@dont-email.me> <A82DJ.76383$cW6.58145@fx08.iad>
<4286cf61-5e4c-43d7-94d5-97c4e4f07808n@googlegroups.com>
<srni6d$mhl$1@dont-email.me>
<2f236bc7-f6c3-454e-914a-2fcf493ed790n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 13 Jan 2022 23:12:59 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="19a3c29b31366dfc04ef300bce13b80c";
logging-data="16771"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+Bztiz35m/JkM21KZ8oLD7i3Suxb9iC4M="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:G3W4BypRHhr36tPnqv2TIXlDWfo=
 by: Kaz Kylheku - Thu, 13 Jan 2022 23:12 UTC

On 2022-01-13, Öö Tiib <ootiib@hot.ee> wrote:
> On Wednesday, 12 January 2022 at 23:46:03 UTC+2, Vir Campestris wrote:
>> It would be a perfectly valid decision to target Windows only, or
>> Android only, but if you want to be truly portable EBCDIC should always
>> be in the back of your mind.
>
> "Truly portable" is another red herring. Everything is portable only between

Indeed, "actually ported" beats "truly portable".

If you don't have a single test case in your system which covers EBCDIC,
or have written a few, but never had the opportunity to run them, then
the portability of the system to EBCDIC is only theoretical.
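
As an illustration (a sketch added for clarity, not code from the original post), here is the kind of character-set assumption such a test case would have to cover. The C standard guarantees that '0'..'9' are consecutive, but not 'a'..'z'; in EBCDIC the lowercase letters sit in three separate runs. The function names letters_are_contiguous, letter_index and test_letter_index are hypothetical.

```c
#include <assert.h>
#include <string.h>

static const char lower[] = "abcdefghijklmnopqrstuvwxyz";

/* Returns 1 if 'a'..'z' occupy 26 consecutive code points in the
   execution character set (true for ASCII, false for EBCDIC). */
int letters_are_contiguous(void)
{
    size_t i;
    for (i = 1; i < sizeof lower - 1; i++)
        if (lower[i] != lower[i - 1] + 1)
            return 0;
    return 1;
}

/* Alphabet position of a lowercase letter (0..25), or -1.
   Does not rely on c - 'a', so it works whether or not the
   letters are contiguous. */
int letter_index(char c)
{
    /* strchr would match the terminating '\0', so exclude it. */
    const char *p = (c != '\0') ? strchr(lower, c) : NULL;
    return p ? (int)(p - lower) : -1;
}

/* A test like this only proves something on a platform where it
   has actually been run. */
void test_letter_index(void)
{
    assert(letter_index('a') == 0);
    assert(letter_index('z') == 25);
    assert(letter_index('A') == -1);
}
```

Passing on an ASCII machine says nothing about EBCDIC until the same test has been compiled and run there, which is the point being made above.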

There could be snags in an actual porting effort that you have no clue
about.

When we write code, we have certain platforms in mind (and their
toolchains). Practical portable coding means having a few more kinds of
platforms in mind (such as ones that actually exist and that some of the
code might plausibly go to), without going overboard.

Where it is practical and easy, you make the coding decisions to be a
little more portable than actually required.

In the C language, a lot of this thinking is spent not on platform
concerns but on compiler concerns. We have to worry a lot less about EBCDIC
or 36 bit pointers (which will never happen to our code) than, say,
about how deeply some future compiler makes inferences based on the
assumption of well-defined behavior. A C programmer's portability brain
cycles are much more profitably (or less unprofitably) spent on these
language semantic issues: as a strategist, you have to fortify yourself
where you expect to be attacked!
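
A minimal sketch of that kind of language-semantics issue (an illustration added here, not code from the original post; the function names are hypothetical). Signed integer overflow is undefined behavior, so a compiler is allowed to assume it never happens and may fold the first check below to a constant 0. The second version tests the range before adding and stays well-defined.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Fragile: for a == INT_MAX, a + 1 overflows, which is undefined
   behavior. An optimizer may assume a + 1 > a always holds and
   reduce this function to "return 0". */
int next_overflows_fragile(int a)
{
    return a + 1 < a;
}

/* Well-defined: check the range first, add only when it is safe.
   Stores a + b in *out and returns 1 on success; returns 0 and
   leaves *out untouched if the sum would not fit in an int. */
int add_checked(int a, int b, int *out)
{
    assert(out != NULL);
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```

The fragile version tends to pass its tests at low optimization levels and with older compilers, which is why this class of problem is easy to miss until a toolchain upgrade.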

Strategically, worrying about EBCDIC is like building a fortress against
rocks and arrows, when your enemy has long abandoned those and moved on
to rockets and grenades. Well, kind of. Imagine if there was a way of
fortifying that works against rockets and grenades, but succumbs to
arrows and rocks ... but arrows and rocks won't happen because almost
nobody remembers them. :)

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

