K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<86pmqwkm7u.fsf@levado.to>

https://www.novabbs.com/devel/article-flat.php?id=19074&group=comp.lang.c#19074

 by: Meredith Montgomery - Fri, 19 Nov 2021 18:57 UTC

I did not get Brian Kernighan's concern with setting EOF to a char c.
The context is

char c = getchar();

And Kernighan says:

--8<---------------cut here---------------start------------->8---
We must declare c to be a type big enough to hold any value that getchar
returns. We can't use char since c must be big enough to hold EOF in
addition to any possible char. Therefore we use int.
--8<---------------cut here---------------end--------------->8---

I'm trying to justify this using the C99 standard. Here's what I find.

--8<---------------cut here---------------start------------->8---
Section 6.2.5 Types
[...]

3. An object declared as type char is large enough to store any member
of the basic execution character set. If a member of the basic execution
character set is stored in a char object, its value is guaranteed to be
positive. If any other character is stored in a char object, the
resulting value is implementation-defined but shall be within the range
of values that can be represented in that type.
--8<---------------cut here---------------end--------------->8---

So I can't be sure a char would always be signed, for example. I can't
be sure EOF would fit in a char.

Am I looking at the right place, thinking the right thing? Thank you.

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<878rxjq4l9.fsf@nosuchdomain.example.com>

https://www.novabbs.com/devel/article-flat.php?id=19078&group=comp.lang.c#19078

 by: Keith Thompson - Fri, 19 Nov 2021 20:21 UTC

Meredith Montgomery <mmontgomery@levado.to> writes:
> I did not get Brian Kernighan's concern with setting EOF to a char c.
> The context is
>
> char c = getchar();
>
> And Kernighan says:
>
> --8<---------------cut here---------------start------------->8---
> We must declare c to be a type big enough to hold any value that getchar
> returns. We can't use char since c must be big enough to hold EOF in
> addition to any possible char. Therefore we use int.
> --8<---------------cut here---------------end--------------->8---
>
>
> I'm trying to justify this using the C99 standard. Here's what I find.
>
> --8<---------------cut here---------------start------------->8---
> Section 6.2.5 Types
> [...]
>
> 3. An object declared as type char is large enough to store any member
> of the basic execution character set. If a member of the basic execution
> character set is stored in a char object, its value is guaranteed to be
> positive. If any other character is stored in a char object, the
> resulting value is implementation-defined but shall be within the range
> of values that can be represented in that type.
> --8<---------------cut here---------------end--------------->8---
>
> So I can't be sure a char would always be signed, for example. I can't
> be sure EOF would fit in a char.
>
> Am I looking at the right place, thinking the right thing? Thank you.

Yes, that's pretty much it -- except that the same issue applies (with
some differences) whether plain char is signed or unsigned.

For simplicity, assume CHAR_BIT==8, 2's-complement for signed types, and
EOF==-1. Then a char object can hold any of exactly 256 distinct
values, either 0..255 or -128..127.

A call to getchar() can return any of exactly 257 distinct values. If
it succeeds, the result is in the range 0..255. If it fails, it returns
EOF. (A cleaner design choice might have been to separate the data from
the status, but C does it this way and we're stuck with it.)
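
A minimal sketch of the idiom this leads to: keep the result in an int until it has been compared against EOF. The helper name count_bytes is made up for illustration, and it takes a FILE * (using getc) rather than calling getchar() directly only so that it is easy to try on any stream.

```c
#include <assert.h>
#include <stdio.h>

/* Count the bytes remaining on `stream`.
 * `count_bytes` is a hypothetical helper, not something from the thread. */
long count_bytes(FILE *stream)
{
    long n = 0;
    int c;  /* int, not char: must hold every unsigned char value and EOF */

    assert(stream != NULL);
    while ((c = getc(stream)) != EOF)
        n++;  /* a byte with value 255 is counted like any other byte */
    return n;
}
```

Calling count_bytes(stdin) gives the same loop as the usual `while ((c = getchar()) != EOF)`.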

Suppose you (unwisely) store the result of getchar() in a char:

char c = getchar();

If char is signed and getchar() returns EOF, then c==-1 -- but -1 could
also be a valid character value. The test (c == EOF) will succeed if
you read a valid character with the value -1.

If char is unsigned and getchar() returns EOF, then c==255 -- and again,
255 could be a valid character value. But now the test (c == EOF) will
never succeed, because the converted value is no longer equal to EOF.
(In the 8-bit Latin-1 character set, character 255 is 'ÿ').
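
To make the two cases concrete, here is a small sketch under the same assumptions (CHAR_BIT==8, 2's-complement, EOF==-1). The function names are hypothetical. Narrowing an out-of-range value to signed char is implementation-defined, so the results noted in the comments are what those assumptions typically give, not a guarantee.

```c
#include <assert.h>
#include <stdio.h>

/* Mimics `char c = getchar(); c == EOF` on a platform where char is signed.
 * `value` stands for whatever getchar() returned. */
int looks_like_eof_signed(int value)
{
    signed char c = (signed char)value;  /* 255 typically becomes -1 */
    int promoted = c;                    /* the promotion done by c == EOF */
    return promoted == EOF;              /* true for EOF and for byte 255 */
}

/* The same test on a platform where char is unsigned. */
int looks_like_eof_unsigned(int value)
{
    unsigned char c = (unsigned char)value;  /* EOF (-1) becomes 255 */
    int promoted = c;                        /* always nonnegative here */
    return promoted == EOF;                  /* never true: EOF is negative */
}
```

With signed char, looks_like_eof_signed(255) reports end-of-file for ordinary data; with unsigned char, looks_like_eof_unsigned(EOF) never reports it, so a read loop written that way would not terminate.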

The byte value 0xff never appears in valid UTF-8, but of course it can
easily appear in binary data.

Note that if you have an exotic implementation with CHAR_BIT==16 and
sizeof(int)==1, storing the result of getchar() in an int still doesn't
let you distinguish between EOF and a valid character with the same
value. In practice, you probably wouldn't be doing normal character
input on such a system (which is likely to be a DSP). But if you had to,
you could call feof() and ferror() after the getchar() call to determine
whether the EOF value is an actual end-of-file indicator or a valid
16-bit character.
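
A rough sketch of that feof()/ferror() check follows. The enum and the function name read_one are invented for illustration. On ordinary systems one of the two indicators is always set when getc() returns EOF, so the fall-through case only matters on the exotic implementations described above.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical status codes for this sketch. */
enum read_status { READ_OK, READ_END_OF_FILE, READ_ERROR };

/* Read one character from `stream` into *out and report what happened. */
enum read_status read_one(FILE *stream, int *out)
{
    int c;

    assert(stream != NULL && out != NULL);
    c = getc(stream);
    if (c == EOF) {
        if (feof(stream))
            return READ_END_OF_FILE;
        if (ferror(stream))
            return READ_ERROR;
        /* Neither indicator is set: the value that compares equal to EOF
         * was real data (only possible when int cannot hold every
         * unsigned char value in addition to EOF). */
    }
    *out = c;
    return READ_OK;
}
```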

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<lk1gpglc7tieirh54eeqdbiujmid08f8ej@4ax.com>

https://www.novabbs.com/devel/article-flat.php?id=19079&group=comp.lang.c#19079

 by: Barry Schwarz - Fri, 19 Nov 2021 20:22 UTC

On Fri, 19 Nov 2021 15:57:09 -0300, Meredith Montgomery
<mmontgomery@levado.to> wrote:

>I did not get Brian Kernighan's concern with setting EOF to a char c.
>The context is
>
> char c = getchar();
>
>And Kernighan says:
>
>--8<---------------cut here---------------start------------->8---
>We must declare c to be a type big enough to hold any value that getchar
>returns. We can't use char since c must be big enough to hold EOF in
>addition to any possible char. Therefore we use int.
>--8<---------------cut here---------------end--------------->8---
>
>I'm trying to justify this using the C99 standard. Here's what I find.
>
>--8<---------------cut here---------------start------------->8---
>Section 6.2.5 Types
>[...]
>
>3. An object declared as type char is large enough to store any member
>of the basic execution character set. If a member of the basic execution
>character set is stored in a char object, its value is guaranteed to be
>positive. If any other character is stored in a char object, the
>resulting value is implementation-defined but shall be within the range
>of values that can be represented in that type.
>--8<---------------cut here---------------end--------------->8---
>
>So I can't be sure a char would always be signed, for example. I can't
>be sure EOF would fit in a char.
>
>Am I looking at the right place, thinking the right thing? Thank you.

Yes, you are. On some systems (for example, on IBM mainframes which
use EBCDIC coding), char is unsigned. Since EOF must be negative, it
would not fit. Thus getchar returns an int.
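
As a small illustration (a sketch only; fits_in_plain_char is a made-up name), the range of plain char can be checked against EOF using the limits from <limits.h>. Where char is unsigned, CHAR_MIN is 0 and EOF falls outside the range. Where char is signed, EOF usually fits, but then it shares its value with a possible data byte, so int is still needed.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* The standard requires EOF to be a negative integer constant expression. */
_Static_assert(EOF < 0, "EOF must be negative");

/* Nonzero if `value` lies within the range of plain char on this platform. */
int fits_in_plain_char(int value)
{
    return value >= CHAR_MIN && value <= CHAR_MAX;
}
```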

--
Remove del for email

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<86wnl3kg25.fsf@levado.to>

https://www.novabbs.com/devel/article-flat.php?id=19083&group=comp.lang.c#19083

 by: Meredith Montgomery - Fri, 19 Nov 2021 21:10 UTC

Barry Schwarz <schwarzb@delq.com> writes:

> On Fri, 19 Nov 2021 15:57:09 -0300, Meredith Montgomery
> <mmontgomery@levado.to> wrote:
>
>>I did not get Brian Kernighan's concern with setting EOF to a char c.
>>The context is
>>
>> char c = getchar();
>>
>>And Kernighan says:
>>
>>--8<---------------cut here---------------start------------->8---
>>We must declare c to be a type big enough to hold any value that getchar
>>returns. We can't use char since c must be big enough to hold EOF in
>>addition to any possible char. Therefore we use int.
>>--8<---------------cut here---------------end--------------->8---
>>
>>I'm trying to justify this using the C99 standard. Here's what I find.
>>
>>--8<---------------cut here---------------start------------->8---
>>Section 6.2.5 Types
>>[...]
>>
>>3. An object declared as type char is large enough to store any member
>>of the basic execution character set. If a member of the basic execution
>>character set is stored in a char object, its value is guaranteed to be
>>positive. If any other character is stored in a char object, the
>>resulting value is implementation-defined but shall be within the range
>>of values that can be represented in that type.
>>--8<---------------cut here---------------end--------------->8---
>>
>>So I can't be sure a char would always be signed, for example. I can't
>>be sure EOF would fit in a char.
>>
>>Am I looking at the right place, thinking the right thing? Thank you.
>
> Yes, you are. On some systems (for example, on IBM mainframes which
> use EBCDIC coding), char is unsigned. Since EOF must be negative, it
> would not fit. Thus getchar returns an int.

Nice. Thanks very much for the IBM mainframe example.

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<86ee7bkfv6.fsf@levado.to>

https://www.novabbs.com/devel/article-flat.php?id=19084&group=comp.lang.c#19084

 by: Meredith Montgomery - Fri, 19 Nov 2021 21:14 UTC

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Meredith Montgomery <mmontgomery@levado.to> writes:
>> I did not get Brian Kernighan's concern with setting EOF to a char c.
>> The context is
>>
>> char c = getchar();
>>
>> And Kernighan says:
>>
>> --8<---------------cut here---------------start------------->8---
>> We must declare c to be a type big enough to hold any value that getchar
>> returns. We can't use char since c must be big enough to hold EOF in
>> addition to any possible char. Therefore we use int.
>> --8<---------------cut here---------------end--------------->8---
>>
>>
>> I'm trying to justify this using the C99 standard. Here's what I find.
>>
>> --8<---------------cut here---------------start------------->8---
>> Section 6.2.5 Types
>> [...]
>>
>> 3. An object declared as type char is large enough to store any member
>> of the basic execution character set. If a member of the basic execution
>> character set is stored in a char object, its value is guaranteed to be
>> positive. If any other character is stored in a char object, the
>> resulting value is implementation-defined but shall be within the range
>> of values that can be represented in that type.
>> --8<---------------cut here---------------end--------------->8---
>>
>> So I can't be sure a char would always be signed, for example. I can't
>> be sure EOF would fit in a char.
>>
>> Am I looking at the right place, thinking the right thing? Thank you.
>
> Yes, that's pretty much it -- except that the same issue applies (with
> some differences) whether plain char is signed or unsigned.
>
> For simplicity, assume CHAR_BIT==8, 2's-complement for signed types, and
> EOF==-1. Then a char object can hold any of exactly 256 distinct
> values, either 0..255 or -128..127.
>
> A call to getchar() can return any of exactly 257 distinct values. If
> it succeeds, the result is in the range 0..255. If it fails, it returns
> EOF. (A cleaner design choice might have been to separate the data from
> the status, but C does it this way and we're stuck with it.)
>
> Suppose you (unwisely) store the result of getchar() in a char:
>
> char c = getchar();
>
> If char is signed and getchar() returns EOF, then c==-1 -- but -1 could
> also be a valid character value. The test (c == EOF) will succeed if
> you read a valid character with the value -1.
>
> If char is unsigned and getchar() returns EOF, then c==255 -- and again,
> 255 could be a valid character value. But now the test (c == EOF) will
> never succeed, because the converted value is no longer equal to EOF.
> (In the 8-bit Latin-1 character set, character 255 is 'ÿ').
>
> The byte value 0xff never appears in valid UTF-8, but of course it can
> easily appear in binary data.
>
> Note that if you have an exotic implementation with CHAR_BIT==16 and
> sizeof(int)==1, storing the result of getchar() in an int still doesn't
> let you distinguish between EOF and a valid character with the same
> value. In practice, you probably wouldn't be doing normal character
> input on such a system (which is likely to be a DSP). But if you had to,
> you could call feof() and ferror() after the getchar() call to determine
> whether the EOF value is an actual end-of-file indicator or a valid
> 16-bit character.

Thank you for the careful description of what could happen in each case ---
and for the example of the DSP system, which appears to be hardware made by
Texas Instruments.

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<87zgpzokzf.fsf@nosuchdomain.example.com>

https://www.novabbs.com/devel/article-flat.php?id=19086&group=comp.lang.c#19086

 by: Keith Thompson - Fri, 19 Nov 2021 22:10 UTC

Meredith Montgomery <mmontgomery@levado.to> writes:
[...]
> Thank you for the careful description of what could happen in each case ---
> and for the example of the DSP system, which appears to be hardware made by
> Texas Instruments.

DSP (Digital Signal Processor) is a generic term for a kind of
specialized microprocessor chip. TI makes them, but so do other
manufacturers.

I've never worked with them myself. They're of interest in the context
of C mainly because they're perhaps the only kind of modern system
likely to have a C implementation with CHAR_BIT != 8.
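
For what it's worth, code can ask the implementation about this directly instead of guessing. The sketch below is illustrative only (eof_is_unambiguous is a made-up name): it tests whether int can hold every unsigned char value as a nonnegative number, which is the condition under which a negative EOF can never collide with data.

```c
#include <assert.h>
#include <limits.h>

/* The standard guarantees at least 8 bits per char; DSPs may have 16 or 32. */
_Static_assert(CHAR_BIT >= 8, "CHAR_BIT is at least 8 on any conforming implementation");

/* Nonzero when every unsigned char value fits in a nonnegative int,
 * so the (negative) EOF result of getchar() is unambiguous. */
int eof_is_unambiguous(void)
{
    return UCHAR_MAX <= INT_MAX;
}
```

On a typical system with CHAR_BIT==8 and a 32-bit int this is true; with CHAR_BIT==16 and sizeof(int)==1 it would be false.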

https://en.wikipedia.org/wiki/Digital_signal_processor

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<snapep$qp4$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19089&group=comp.lang.c#19089

 by: David Brown - Sat, 20 Nov 2021 12:24 UTC

On 19/11/2021 21:22, Barry Schwarz wrote:
> On Fri, 19 Nov 2021 15:57:09 -0300, Meredith Montgomery
> <mmontgomery@levado.to> wrote:

>> So I can't be sure a char would always be signed, for example. I can't
>> be sure EOF would fit in a char.
>>
>> Am I looking at the right place, thinking the right thing? Thank you.
>
> Yes, you are. On some systems (for example, on IBM mainframes which
> use EBCDIC coding), char is unsigned. Since EOF must be negative, it
> would not fit. Thus getchar returns an int.
>

Unsigned plain char is also common on microcontrollers. Really, it is
the concept of having plain chars be /signed/ that is the anachronism,
coming from a time when there was no alternative explicit "signed char"
and "unsigned char" and thus plain "char" doubled up as a small integer.

Basically, if your code depends on whether plain char is signed or
unsigned, you've written pointlessly unportable code. Don't assume that
"char" is signed unless you are using a dinosaur or something weird and
niche, in the way that you can assume that CHAR_BIT == 8 unless you are
using a dinosaur or something niche (like a DSP).
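
One common place where the signedness of plain char bites is the <ctype.h> functions, which expect an argument representable as unsigned char (or EOF). A minimal sketch of code that behaves the same either way, with count_alpha as a made-up helper name:

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Count the alphabetic characters in a NUL-terminated string. */
size_t count_alpha(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++) {
        /* Cast first: if char is signed, a byte such as 0xFF is negative,
         * and passing a negative value other than EOF to isalpha() is
         * undefined behaviour. */
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

The cast costs nothing where char is already unsigned and removes the dependence where it is signed.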

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<86lf1iin7j.fsf@levado.to>

https://www.novabbs.com/devel/article-flat.php?id=19096&group=comp.lang.c#19096

 by: Meredith Montgomery - Sat, 20 Nov 2021 20:30 UTC

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Meredith Montgomery <mmontgomery@levado.to> writes:
> [...]
>> Thank you for the careful description of what could happen in each case ---
>> and for the example of the DSP system, which appears to be hardware made by
>> Texas Instruments.
>
> DSP (Digital Signal Processor) is a generic term for a kind of
> specialized microprocessor chip. TI makes them, but so do other
> manufacturers.

Thanks!

> I've never worked with them myself. They're of interest in the context
> of C mainly because they're perhaps the only kind of modern system
> likely to have a C implementation with CHAR_BIT != 8.

Thank you! So, yes, it's an excellent example for the context.

> https://en.wikipedia.org/wiki/Digital_signal_processor

Thanks!

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<86czmuin5n.fsf@levado.to>

https://www.novabbs.com/devel/article-flat.php?id=19097&group=comp.lang.c#19097

 by: Meredith Montgomery - Sat, 20 Nov 2021 20:32 UTC

David Brown <david.brown@hesbynett.no> writes:

> On 19/11/2021 21:22, Barry Schwarz wrote:
>> On Fri, 19 Nov 2021 15:57:09 -0300, Meredith Montgomery
>> <mmontgomery@levado.to> wrote:
>
>>> So I can't be sure a char would always be signed, for example. I can't
>>> be sure EOF would fit in a char.
>>>
>>> Am I looking at the right place, thinking the right thing? Thank you.
>>
>> Yes, you are. On some systems (for example, on IBM mainframes which
>> use EBCDIC coding), char is unsigned. Since EOF must be negative, it
>> would not fit. Thus getchar returns an int.
>>
>
> Unsigned plain char is also common on microcontrollers. Really, it is
> the concept of having plain chars be /signed/ that is the anachronism,
> coming from a time when there was no alternative explicit "signed char"
> and "unsigned char" and thus plain "char" doubled up as a small integer.
>
> Basically, if your code depends on whether plain char is signed or
> unsigned, you've written pointlessly unportable code. Don't assume that
> "char" is signed unless you are using a dinosaur or something weird and
> niche, in the way that you can assume that CHAR_BIT == 8 unless you are
> using a dinosaur or something niche (like a DSP).

Will surely do! Thanks!

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<g3umJ.116418$IW4.64708@fx48.iad>

https://www.novabbs.com/devel/article-flat.php?id=19119&group=comp.lang.c#19119

 by: Scott Lurndal - Sun, 21 Nov 2021 16:01 UTC

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>Meredith Montgomery <mmontgomery@levado.to> writes:
>[...]
>> Thank you for the careful description of what could happen in each case ---
>> and for the example of the DSP system, which appears to be hardware made by
>> Texas Instruments.
>
>DSP (Digital Signal Processor) is a generic term for a kind of
>specialized microprocessor chip. TI makes them, but so do other
>manufacturers.

Tensilica, Ceva and a handful of others.

>
>I've never worked with them myself. They're of interest in the context
>of C mainly because they're perhaps the only kind of modern system
>likely to have a C implementation with CHAR_BIT != 8.

They're also more likely to have a Harvard Architecture.

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<a0f04370-f1b1-44c3-93e7-5bc348d67477n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19129&group=comp.lang.c#19129

 by: Mark Bluemel - Mon, 22 Nov 2021 08:27 UTC

On Friday, 19 November 2021 at 18:59:47 UTC, Meredith Montgomery wrote:
> I did not get Brian Kernighan's concern with setting EOF to a char c.
> The context is
>
> char c = getchar();
>
> And Kernighan says:
>
> --8<---------------cut here---------------start------------->8---
> We must declare c to be a type big enough to hold any value that getchar
> returns. We can't use char since c must be big enough to hold EOF in
> addition to any possible char . Therefore we use int.
> --8<---------------cut here---------------end--------------->8---
>
> I'm trying to justify this using C99's document. Here's what I find.
>
> --8<---------------cut here---------------start------------->8---
> Section 6.2.5 Types
> [...]
>
> 3. An object declared as type char is large enough to store any member
> of the basic execution character set. If a member of the basic execution
> character set is stored in a char object, its value is guaranteed to be
> positive. If any other character is stored in a char object, the
> resulting value is implementation-defined but shall be within the range
> of values that can be represented in that type.
> --8<---------------cut here---------------end--------------->8---
>
> So I can't be sure a char would always be signed, for example. I can't
> be sure EOF would fit in a char.
>
> Am I looking at the right place, thinking the right thing? Thank you.

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<adc13d23-34da-4340-a092-f22c38f3fc1dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19130&group=comp.lang.c#19130

X-Received: by 2002:ad4:5f87:: with SMTP id jp7mr97847456qvb.65.1637570059996;
Mon, 22 Nov 2021 00:34:19 -0800 (PST)
X-Received: by 2002:a0c:eb49:: with SMTP id c9mr97537470qvq.30.1637570059879;
Mon, 22 Nov 2021 00:34:19 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Mon, 22 Nov 2021 00:34:19 -0800 (PST)
In-Reply-To: <86pmqwkm7u.fsf@levado.to>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23c7:a280:3401:2c9a:df92:189a:ec79;
posting-account=3LA7mQoAAAByiBtHIUvpFq0_QEKnHGc9
NNTP-Posting-Host: 2a00:23c7:a280:3401:2c9a:df92:189a:ec79
References: <86pmqwkm7u.fsf@levado.to>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <adc13d23-34da-4340-a092-f22c38f3fc1dn@googlegroups.com>
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
From: mark.blu...@gmail.com (Mark Bluemel)
Injection-Date: Mon, 22 Nov 2021 08:34:19 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 47
 by: Mark Bluemel - Mon, 22 Nov 2021 08:34 UTC

On Friday, 19 November 2021 at 18:59:47 UTC, Meredith Montgomery wrote:
> I did not get Brian Kernighan's concern with setting EOF to a char c.
> The context is
>
> char c = getchar();
>
> And Kernighan says:
>
> --8<---------------cut here---------------start------------->8---
> We must declare c to be a type big enough to hold any value that getchar
> returns. We can't use char since c must be big enough to hold EOF in
> addition to any possible char . Therefore we use int.
> --8<---------------cut here---------------end--------------->8---
>
> I'm trying to justify this using C99's document. Here's what I find.
>
> --8<---------------cut here---------------start------------->8---
> Section 6.2.5 Types
> [...]
>
> 3. An object declared as type char is large enough to store any member
> of the basic execution character set. If a member of the basic execution
> character set is stored in a char object, its value is guaranteed to be
> positive. If any other character is stored in a char object, the
> resulting value is implementation-defined but shall be within the range
> of values that can be represented in that type.
> --8<---------------cut here---------------end--------------->8---
>
> So I can't be sure a char would always be signed, for example. I can't
> be sure EOF would fit in a char.
>
> Am I looking at the right place, thinking the right thing? Thank you.

I find the question of whether or not char is signed unhelpful.

For me, the best approach is to consider reading binary data,
not text, from a file.

You need to be able to read a byte (in C terms "char") at a time from
the data. So char needs to be able to contain every conceivable byte
value.

You also need to be able to recognise end of file. This can't be represented
by a specific byte value, as that could also be valid in binary data.

So, given that C only allows single return values from functions, the return
value from "read()" needs to be able to represent any byte (char) value plus
one other value. So it needs to be bigger than a 'C' char.

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<87mtlwik0c.fsf@nosuchdomain.example.com>

https://www.novabbs.com/devel/article-flat.php?id=19131&group=comp.lang.c#19131

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: Keith.S....@gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Date: Mon, 22 Nov 2021 02:04:35 -0800
Organization: None to speak of
Lines: 12
Message-ID: <87mtlwik0c.fsf@nosuchdomain.example.com>
References: <86pmqwkm7u.fsf@levado.to>
<adc13d23-34da-4340-a092-f22c38f3fc1dn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="58c43985092162f95fdbb2eb9dbfb717";
logging-data="15407"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18eGlZMZac0i7lSR3jPL5jG"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:T/Azha9+RU3Ly9f+Uv78qoef63s=
sha1:Hh3UYBxAe40RQeGXzUE3Nnc0g4s=
 by: Keith Thompson - Mon, 22 Nov 2021 10:04 UTC

Mark Bluemel <mark.bluemel@gmail.com> writes:
[...]
> So, given that C only allows single return values from functions, the return
> value from "read()" needs to be able to represent any byte (char) value plus
> one other value. So it needs to be bigger than a 'C' char.

You mean from getchar(), not read().

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<snfr3l$chl$1@solani.org>

https://www.novabbs.com/devel/article-flat.php?id=19132&group=comp.lang.c#19132

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!reader5.news.weretis.net!news.solani.org!.POSTED!not-for-mail
From: pkk...@spth.de (Philipp Klaus Krause)
Newsgroups: comp.lang.c
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Date: Mon, 22 Nov 2021 11:23:48 +0100
Message-ID: <snfr3l$chl$1@solani.org>
References: <86pmqwkm7u.fsf@levado.to>
<878rxjq4l9.fsf@nosuchdomain.example.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 22 Nov 2021 10:23:49 -0000 (UTC)
Injection-Info: solani.org;
logging-data="12853"; mail-complaints-to="abuse@news.solani.org"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
Cancel-Lock: sha1:fvFuPfQbFLjITO1zXbZOPfcslbw=
X-User-ID: eJwNx8cBwDAIBLCVONPCOBSz/wiOflI2WLuYmujqRiOJFd/wtOPEWikT9aYHhnH+g7eulN+8DxDrESA=
Content-Language: en-US
In-Reply-To: <878rxjq4l9.fsf@nosuchdomain.example.com>
 by: Philipp Klaus Krause - Mon, 22 Nov 2021 10:23 UTC

Am 19.11.21 um 21:21 schrieb Keith Thompson:

> Note that if you have an exotic implementation with CHAR_BIT==16 and
> sizeof(int)==1, storing the result of getchar() in an int still doesn't
> let you distinguish between EOF and a valid character with the same
> value.
Many such implementations, however, do use 8-bit octets for interfacing
with the outside world. I.e. on such an implementation, even where
CHAR_MAX >= 256, getchar will still always return either a value in the
range [0, 255] or EOF.

Philipp

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<ca864bb6-0f64-4274-a5d7-4d0481a3ee5bn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19134&group=comp.lang.c#19134

X-Received: by 2002:a05:622a:53:: with SMTP id y19mr30109520qtw.96.1637577979650;
Mon, 22 Nov 2021 02:46:19 -0800 (PST)
X-Received: by 2002:a05:6214:c65:: with SMTP id t5mr33858812qvj.27.1637577979488;
Mon, 22 Nov 2021 02:46:19 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Mon, 22 Nov 2021 02:46:19 -0800 (PST)
In-Reply-To: <87mtlwik0c.fsf@nosuchdomain.example.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23c7:a280:3401:2c9a:df92:189a:ec79;
posting-account=3LA7mQoAAAByiBtHIUvpFq0_QEKnHGc9
NNTP-Posting-Host: 2a00:23c7:a280:3401:2c9a:df92:189a:ec79
References: <86pmqwkm7u.fsf@levado.to> <adc13d23-34da-4340-a092-f22c38f3fc1dn@googlegroups.com>
<87mtlwik0c.fsf@nosuchdomain.example.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <ca864bb6-0f64-4274-a5d7-4d0481a3ee5bn@googlegroups.com>
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
From: mark.blu...@gmail.com (Mark Bluemel)
Injection-Date: Mon, 22 Nov 2021 10:46:19 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 8
 by: Mark Bluemel - Mon, 22 Nov 2021 10:46 UTC

On Monday, 22 November 2021 at 10:04:48 UTC, Keith Thompson wrote:
> Mark Bluemel <mark.b...@gmail.com> writes:
> [...]
> > So, given that C only allows single return values from functions, the return
> > value from "read()" needs to be able to represent any byte (char) value plus
> > one other value. So it needs to be bigger than a 'C' char.
> You mean from getchar(), not read().

True - thanks for the correction, Keith.

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<87ilwjj4gl.fsf@nosuchdomain.example.com>

https://www.novabbs.com/devel/article-flat.php?id=19141&group=comp.lang.c#19141

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: Keith.S....@gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Date: Mon, 22 Nov 2021 12:55:06 -0800
Organization: None to speak of
Lines: 20
Message-ID: <87ilwjj4gl.fsf@nosuchdomain.example.com>
References: <86pmqwkm7u.fsf@levado.to>
<878rxjq4l9.fsf@nosuchdomain.example.com> <snfr3l$chl$1@solani.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="58c43985092162f95fdbb2eb9dbfb717";
logging-data="8872"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/iWIZhKAd1coKb/iJIXe9u"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:MZUPA2IRCNZcrXEsV1YhWjhLjPk=
sha1:8jSHBYlpA94FyUbHTrLbb2Bpxdc=
 by: Keith Thompson - Mon, 22 Nov 2021 20:55 UTC

Philipp Klaus Krause <pkk@spth.de> writes:
> Am 19.11.21 um 21:21 schrieb Keith Thompson:
>> Note that if you have an exotic implementation with CHAR_BIT==16 and
>> sizeof(int)==1, storing the result of getchar() in an int still doesn't
>> let you distinguish between EOF and a valid character with the same
>> value.
> Many such implementations, however, do use 8-bit octets for interfacing
> with the outside world. I.e. on such an implementation, even where
> CHAR_MAX >= 256, getchar will still always return either a value in the
> range [0, 255] or EOF.

Interesting. That's likely to be non-conforming, since for example you
should be able to write a byte value of 0xABCD to a stream and read it
back as 0xABCD (N1570 7.21.2) -- but useful and non-conforming is
sometimes better than conforming.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<6c7ef07b-29d4-47f0-82eb-4d071fe65db8n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19154&group=comp.lang.c#19154

X-Received: by 2002:ac8:5b83:: with SMTP id a3mr3272928qta.62.1637644950472;
Mon, 22 Nov 2021 21:22:30 -0800 (PST)
X-Received: by 2002:ac8:7f4f:: with SMTP id g15mr3234902qtk.309.1637644950332;
Mon, 22 Nov 2021 21:22:30 -0800 (PST)
Path: i2pn2.org!rocksolid2!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Mon, 22 Nov 2021 21:22:30 -0800 (PST)
In-Reply-To: <87mtlwik0c.fsf@nosuchdomain.example.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:1d5:2649:210d:3556;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:1d5:2649:210d:3556
References: <86pmqwkm7u.fsf@levado.to> <adc13d23-34da-4340-a092-f22c38f3fc1dn@googlegroups.com>
<87mtlwik0c.fsf@nosuchdomain.example.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6c7ef07b-29d4-47f0-82eb-4d071fe65db8n@googlegroups.com>
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
From: malcolm....@gmail.com (Malcolm McLean)
Injection-Date: Tue, 23 Nov 2021 05:22:30 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 9
 by: Malcolm McLean - Tue, 23 Nov 2021 05:22 UTC

On Monday, 22 November 2021 at 10:04:48 UTC, Keith Thompson wrote:
> Mark Bluemel <mark.b...@gmail.com> writes:
> [...]
> > So, given that C only allows single return values from functions, the return
> > value from "read()" needs to be able to represent any byte (char) value plus
> > one other value. So it needs to be bigger than a 'C' char.
> You mean from getchar(), not read().
>
getc() / fgetc(). getchar() reads from stdin, which is text. ASCII does in fact
have end of text / end of transmission codes (ETX = 3, EOT = 4).

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<87wnkzh0hj.fsf@nosuchdomain.example.com>

https://www.novabbs.com/devel/article-flat.php?id=19155&group=comp.lang.c#19155

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: Keith.S....@gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Date: Mon, 22 Nov 2021 22:03:52 -0800
Organization: None to speak of
Lines: 29
Message-ID: <87wnkzh0hj.fsf@nosuchdomain.example.com>
References: <86pmqwkm7u.fsf@levado.to>
<adc13d23-34da-4340-a092-f22c38f3fc1dn@googlegroups.com>
<87mtlwik0c.fsf@nosuchdomain.example.com>
<6c7ef07b-29d4-47f0-82eb-4d071fe65db8n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="b82abc4b3fdda89f0afa055b2002bcb0";
logging-data="2850"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/mFgK20MEMLRIIqVzpPrX2"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:DpGl67LmywgvS1SFywK34Si5aQo=
sha1:OAD8CzwUhF0T84sB+0pNfBoWlHo=
 by: Keith Thompson - Tue, 23 Nov 2021 06:03 UTC

Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
> On Monday, 22 November 2021 at 10:04:48 UTC, Keith Thompson wrote:
>> Mark Bluemel <mark.b...@gmail.com> writes:
>> [...]
>> > So, given that C only allows single return values from functions, the return
>> > value from "read()" needs to be able to represent any byte (char) value plus
>> > one other value. So it needs to be bigger than a 'C' char.
>> You mean from getchar(), not read().
>>
> getc() / fgetc(). getchar() reads from stdin, which is text. ASCII does in fact
> have end of text / end of transmission codes (ETX = 3, EOT = 4).

getchar() does read from stdin, but getc() and fgetc() take a FILE*
argument and can easily read from binary streams.

Yes, ASCII does define ETX and EOT characters. EOT is commonly entered
as Control-D, which typically triggers an end-of-file condition on
Unix-like systems -- but typically those characters when stored in a
file are just treated as ordinary characters (for which iscntrl() will
return a true value). An EOT character stored in a file is meaningless
(or rather, it has whatever meaning you choose to assign to it).

Few systems use more than a handful of the ASCII control characters with
their originally intended meanings.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<Ve7nJ.62021$np6.3586@fx46.iad>

https://www.novabbs.com/devel/article-flat.php?id=19165&group=comp.lang.c#19165

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx46.iad.POSTED!not-for-mail
X-newsreader: xrn 9.03-beta-14-64bit
Sender: scott@dragon.sl.home (Scott Lurndal)
From: sco...@slp53.sl.home (Scott Lurndal)
Reply-To: slp53@pacbell.net
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Newsgroups: comp.lang.c
References: <86pmqwkm7u.fsf@levado.to> <adc13d23-34da-4340-a092-f22c38f3fc1dn@googlegroups.com> <87mtlwik0c.fsf@nosuchdomain.example.com> <6c7ef07b-29d4-47f0-82eb-4d071fe65db8n@googlegroups.com> <87wnkzh0hj.fsf@nosuchdomain.example.com>
Lines: 35
Message-ID: <Ve7nJ.62021$np6.3586@fx46.iad>
X-Complaints-To: abuse@usenetserver.com
NNTP-Posting-Date: Tue, 23 Nov 2021 14:52:37 UTC
Organization: UsenetServer - www.usenetserver.com
Date: Tue, 23 Nov 2021 14:52:37 GMT
X-Received-Bytes: 2427
X-Original-Bytes: 2376
 by: Scott Lurndal - Tue, 23 Nov 2021 14:52 UTC

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>> On Monday, 22 November 2021 at 10:04:48 UTC, Keith Thompson wrote:
>>> Mark Bluemel <mark.b...@gmail.com> writes:
>>> [...]
>>> > So, given that C only allows single return values from functions, the return
>>> > value from "read()" needs to be able to represent any byte (char) value plus
>>> > one other value. So it needs to be bigger than a 'C' char.
>>> You mean from getchar(), not read().
>>>
>> getc() / fgetc(). getchar() reads from stdin, which is text. ASCII does in fact
>> have end of text / end of transmission codes (ETX = 3, EOT = 4).
>
>getchar() does read from stdin, but getc() and fgetc() take a FILE*
>argument and can easily read from binary streams.
>
>Yes, ASCII does define ETX and EOT characters. EOT is commonly entered
>as Control-D, which typically triggers an end-of-file condition on
>Unix-like systems -- but typically those characters when stored in a
>file are just treated as ordinary characters (for which iscntrl() will
>return a true value). An EOT character stored in a file is meaningless
>(or rather, it has whatever meaning you choose to assign to it).
>
>Few systems use more than a handful of the ASCII control characters with
>their originally intended meanings.

Yes, the old protocols for poll-select and contention mode.

[SYN] SOH <ad1> <ad2> <xm> STX <data bytes> ETX BCC

<ad1> <ad2> are two byte poll/select station address.
<xm> poll vs. select byte.

Still used when accessing obsolete block-mode terminals (I have two
working examples).

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<snj3eu$6a8$1@solani.org>

https://www.novabbs.com/devel/article-flat.php?id=19169&group=comp.lang.c#19169

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!reader5.news.weretis.net!news.solani.org!.POSTED!not-for-mail
From: pkk...@spth.de (Philipp Klaus Krause)
Newsgroups: comp.lang.c
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Date: Tue, 23 Nov 2021 17:04:46 +0100
Message-ID: <snj3eu$6a8$1@solani.org>
References: <86pmqwkm7u.fsf@levado.to>
<878rxjq4l9.fsf@nosuchdomain.example.com> <snfr3l$chl$1@solani.org>
<87ilwjj4gl.fsf@nosuchdomain.example.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 23 Nov 2021 16:04:46 -0000 (UTC)
Injection-Info: solani.org;
logging-data="6472"; mail-complaints-to="abuse@news.solani.org"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
X-User-ID: eJwFwQkBACAIBLBK8mMcgaN/BDcTJ+9QN1db27Ttl9t5ARKvunlGETwpiNI5A6U1BvhpyAcwuxGi
Content-Language: en-US
Cancel-Lock: sha1:HiOgjccXd5rd1TW0OxNOjR8c8Oc=
In-Reply-To: <87ilwjj4gl.fsf@nosuchdomain.example.com>
 by: Philipp Klaus Krause - Tue, 23 Nov 2021 16:04 UTC

Am 22.11.21 um 21:55 schrieb Keith Thompson:
> but useful and non-conforming is
> sometimes better than conforming.
>

A 16-bit ptrdiff_t on a freestanding implementation that does not
support objects bigger than (1 << 15) would be another common example.
Though that is only nonconforming for C99, C11 and C17, but valid C90,
C95 and C23.

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<87pmqqhinf.fsf@nosuchdomain.example.com>

https://www.novabbs.com/devel/article-flat.php?id=19179&group=comp.lang.c#19179

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: Keith.S....@gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Date: Tue, 23 Nov 2021 09:43:48 -0800
Organization: None to speak of
Lines: 17
Message-ID: <87pmqqhinf.fsf@nosuchdomain.example.com>
References: <86pmqwkm7u.fsf@levado.to>
<878rxjq4l9.fsf@nosuchdomain.example.com> <snfr3l$chl$1@solani.org>
<87ilwjj4gl.fsf@nosuchdomain.example.com> <snj3eu$6a8$1@solani.org>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="b82abc4b3fdda89f0afa055b2002bcb0";
logging-data="17228"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ARipILqaqBhPbzNIM1z98"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:FusH6fZYFyhfKqSmqJegArx6Hzw=
sha1:EmJE28f3rzL7aNYf/TzZ4uf0Ttg=
 by: Keith Thompson - Tue, 23 Nov 2021 17:43 UTC

Philipp Klaus Krause <pkk@spth.de> writes:
> Am 22.11.21 um 21:55 schrieb Keith Thompson:
>> but useful and non-conforming is
>> sometimes better than conforming.
>
> A 16-bit ptrdiff_t on a freestanding implementation that does not
> support objects bigger than (1 << 15) would be another common example.
> Though that is only nonconforming for C99, C11 and C17, but valid C90,
> C95 and C23.

How would it be valid C23? The N2731 draft defines PTRDIFF_WIDTH in
<stdint.h> and requires it to be at least 17.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<snl40j$7qn$1@solani.org>

https://www.novabbs.com/devel/article-flat.php?id=19207&group=comp.lang.c#19207

Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!reader5.news.weretis.net!news.solani.org!.POSTED!not-for-mail
From: pkk...@spth.de (Philipp Klaus Krause)
Newsgroups: comp.lang.c
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Date: Wed, 24 Nov 2021 11:26:27 +0100
Message-ID: <snl40j$7qn$1@solani.org>
References: <86pmqwkm7u.fsf@levado.to>
<878rxjq4l9.fsf@nosuchdomain.example.com> <snfr3l$chl$1@solani.org>
<87ilwjj4gl.fsf@nosuchdomain.example.com> <snj3eu$6a8$1@solani.org>
<87pmqqhinf.fsf@nosuchdomain.example.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
Injection-Date: Wed, 24 Nov 2021 10:26:27 -0000 (UTC)
Injection-Info: solani.org;
logging-data="8023"; mail-complaints-to="abuse@news.solani.org"
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.14.0
In-Reply-To: <87pmqqhinf.fsf@nosuchdomain.example.com>
Content-Language: en-US
X-User-ID: eJwFwQkBwDAIA0BLUAiPnMEa/xJ6BwuNTQ+Eg+CVzdsMcv8dcsoSpDZnXOW7aigZsXPapOQBRm8Rjg==
Cancel-Lock: sha1:EPnCOpFzcnWgVO4f0XM2JDR+nSo=
 by: Philipp Klaus Krause - Wed, 24 Nov 2021 10:26 UTC

Am 23.11.21 um 18:43 schrieb Keith Thompson:
> Philipp Klaus Krause <pkk@spth.de> writes:
>> Am 22.11.21 um 21:55 schrieb Keith Thompson:
>>> but useful and non-conforming is
>>> sometimes better than conforming.
>>
>> A 16-bit ptrdiff_t on a freestanding implementation that does not
>> support objects bigger than (1 << 15) would be another common example.
>> Though that is only nonconforming for C99, C11 and C17, but valid C90,
>> C95 and C23.
>
> How would it be valid C23? The N2731 draft defines PTRDIFF_WIDTH in
> <stdint.h> and requires it to be at least 17.
>

N2808 was voted into C23 on Thursday (conditionally on not causing
problems for C++, which will be discussed at one of the monthly
WG14/WG21 liaison meetings).

Philipp

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<4fcda596-2e00-4dd6-bf8f-4616218f4398n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19254&group=comp.lang.c#19254

X-Received: by 2002:a05:620a:4495:: with SMTP id x21mr13917435qkp.633.1637940717103;
Fri, 26 Nov 2021 07:31:57 -0800 (PST)
X-Received: by 2002:ac8:5950:: with SMTP id 16mr16232596qtz.462.1637940716782;
Fri, 26 Nov 2021 07:31:56 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Fri, 26 Nov 2021 07:31:56 -0800 (PST)
In-Reply-To: <86pmqwkm7u.fsf@levado.to>
Injection-Info: google-groups.googlegroups.com; posting-host=142.112.241.86; posting-account=TrQTOgoAAAC2NG571LtljP2OpjnuTN79
NNTP-Posting-Host: 142.112.241.86
References: <86pmqwkm7u.fsf@levado.to>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <4fcda596-2e00-4dd6-bf8f-4616218f4398n@googlegroups.com>
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
From: dave.dun...@gmail.com (Dave Dunfield)
Injection-Date: Fri, 26 Nov 2021 15:31:57 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 98
 by: Dave Dunfield - Fri, 26 Nov 2021 15:31 UTC

On Friday, November 19, 2021 at 1:59:47 PM UTC-5, Meredith Montgomery wrote:

>I did not get Brian Kernighan's concern with setting EOF to a char c.

A bit of an interesting topic. As someone who created a somewhat well
received C compiler "back in the day", here is my $0.02

* Note, I wrote my compiler mid 80's, and although I later updated it to
* support some more modern syntax, it is essentially a K&R compiler.
+ I've not read through all the comments, so sorry if I duplicate.

Let's assume a very simple 16-bit system with byte support:
    int           = -32768 to 32767 = 0x8000 to 0x7FFF : -1 = 0xFFFF
    unsigned int  =      0 to 65535 = 0x0000 to 0xFFFF
    char          =   -128 to 127   = 0x80 to 0x7F     : -1 = 0xFF
    unsigned char =      0 to 255   = 0x00 to 0xFF

Also, let's call signed int/char variables: si/sc
and unsigned int/char variables: ui/uc

getc() must return 16 bits, because a char value of 255 inside a binary
file is NOT EOF.

Assuming 255 is the next byte read:

If you do: uc = getc()
then uc will have 255 which is NOT EOF(-1)

If you do: sc = getc()
then sc will have -1 which IS EOF! (wrong)

** The effect is that if you are reading an ASCII text file where 0xFF
** never occurs, it appears to work correctly.
** But if you are reading a binary file into signed characters (something
** I would never do) it might see EOF instead of 0xFF characters.

If however you do: si = getc()
then si will have 255!

I find in practice with older compilers (including mine), this may
not always "get you". Consider a CPU with registers R16 and R8:
if((c = getc()) == EOF) ...
might generate something like:

CALL getc ; 16-bit result comes back in R16
STORE lower-half-of R16 to c
CMP R16,0xFFFF ; R16 still has 0x00FF
JZ ... ; Not taken!

In this case, because getc() returned int, even though it was stored
in a byte location, the compiler knows it's still an int value and
would probably do full 16 bit compare.

If however:
c = getc();
if(c == EOF) ...
You might see something like:

CALL getc
STORE lower-half-of R16 to c
LOAD R8 from c
CMP R8,0xFF ; now TRUE
JZ ... ; Now it IS taken!
And yes, optimization might prevent the extra load but the compiler would
still do an 8-bit compare - since it's comparing c with -1, 255 is EOF!

Regarding signed vs. unsigned, I find that many older compilers only
take this into account for magnitude comparisons. Simple == and != would
often work regardless of signedness. This may not apply to more modern
standards, where extra code might be generated to cause an unsigned 255
(0xFF) to NOT be the same as a signed -1 (0xFF). In general I tend not
to trust that they will be different (or the same)!

I worked mainly on (and designed my compiler for) very small embedded
systems.. most "real programmers" dislike my style as I tend to use "unsigned"
and "unsigned char" instead of "int" and "char" when I don't actually
need/want signed values. On very tiny CPUs sign-extension may require more
code than simple zero-extension.

I also don't much rely on signed characters in general. If I want to
know a character is not ASCII (0x00-0x7F) I will usually: if(c & 0x80) ...

-- Interesting but mostly irrelevant note --

This does actually come into play with my C-FLEA, a virtual CPU I designed
to be a fairly optimal target for my compiler with very small code size.
C-FLEA has one "accumulator" which is always 16-bit. 8-bit is only supported
for memory accesses which when used as a value is auto zero-extended.

This means "char" appears as a "signed" positive value between 0 and 255
where "unsigned char" is an unsigned positive value between 0 and 255!
The "signed"/"unsigned" attribute only affects how the compiler propagates
the signedness of results.

Dave

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<91a56855-b917-45a2-89be-7e2ed806d357n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=19303&group=comp.lang.c#19303

X-Received: by 2002:ac8:5a84:: with SMTP id c4mr8513539qtc.565.1638977191596;
Wed, 08 Dec 2021 07:26:31 -0800 (PST)
X-Received: by 2002:a05:622a:48e:: with SMTP id p14mr8455537qtx.553.1638977191171;
Wed, 08 Dec 2021 07:26:31 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!border2.nntp.dca1.giganews.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Wed, 8 Dec 2021 07:26:30 -0800 (PST)
In-Reply-To: <4fcda596-2e00-4dd6-bf8f-4616218f4398n@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=142.112.241.86; posting-account=TrQTOgoAAAC2NG571LtljP2OpjnuTN79
NNTP-Posting-Host: 142.112.241.86
References: <86pmqwkm7u.fsf@levado.to> <4fcda596-2e00-4dd6-bf8f-4616218f4398n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <91a56855-b917-45a2-89be-7e2ed806d357n@googlegroups.com>
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
From: dave.dun...@gmail.com (Dave Dunfield)
Injection-Date: Wed, 08 Dec 2021 15:26:31 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 173
 by: Dave Dunfield - Wed, 8 Dec 2021 15:26 UTC

Having nothing better to do this morning, I decided to write a little
test of this for various compilers I have readily available:

My own Micro-C (DOS and DVM-Win32)
Borland Turbo-C (DOS)
LCCwin32 (Win32)
GCC (Linux, Ubuntu 20.10)

The test simply creates a small (5-byte) file containing
'a', 'b', 0xFF, 'c' and 'd'
then reads it, testing for EOF with:
if((var = getc(fp)) == EOF) ...
and also:
var = getc(fp);
if(var == EOF) ...
using a 'var' variable of:
signed int, unsigned int, signed char & unsigned char
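
For readers who do not want to decode the attachment, a test along these lines can be sketched as below. This is NOT the TEST.C from this post, only an independent illustration: the function names, the use of tmpfile(), and the READ_CAP limit are all assumptions made here, and only three of the four variable types are shown.

```c
#include <stdio.h>

#define READ_CAP 9  /* assumption: stop loops that never detect EOF */

/* Write the five test bytes to a temporary file and rewind it. */
static FILE *make_test_file(void)
{
    static const unsigned char data[] = { 'a', 'b', 0xFF, 'c', 'd' };
    FILE *fp = tmpfile();
    if (fp == NULL)
        return NULL;
    if (fwrite(data, 1, sizeof data, fp) != sizeof data) {
        fclose(fp);
        return NULL;
    }
    rewind(fp);
    return fp;
}

/* Each function returns how many values were read before EOF was
   detected, READ_CAP if it never was, or -1 on setup failure. */
int count_with_int(void)
{
    FILE *fp = make_test_file();
    int v, n = 0;
    if (fp == NULL)
        return -1;
    while (n < READ_CAP && (v = getc(fp)) != EOF)
        n++;
    fclose(fp);
    return n;
}

int count_with_signed_char(void)
{
    FILE *fp = make_test_file();
    signed char v;
    int n = 0;
    if (fp == NULL)
        return -1;
    /* The 0xFF byte typically becomes -1 here and looks like EOF. */
    while (n < READ_CAP && (v = getc(fp)) != EOF)
        n++;
    fclose(fp);
    return n;
}

int count_with_unsigned_char(void)
{
    FILE *fp = make_test_file();
    unsigned char v;
    int n = 0;
    if (fp == NULL)
        return -1;
    /* v is 0..255 after assignment, so it can never equal a negative EOF. */
    while (n < READ_CAP && (v = getc(fp)) != EOF)
        n++;
    fclose(fp);
    return n;
}
```

On a typical implementation the int version reads all five bytes, the signed char version stops early at the 0xFF byte, and the unsigned char version only stops because of the cap.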

Here are the results:
--------------------------------
Micro-C/DOS EOF= -1 FFFF
signed int : if(v=getc())..
'a''b'[FF]'c''d' =5
unsigned int : if(v=getc())..
'a''b'[FF]'c''d' =5
signed char : if(v=getc())..
'a''b'[FFFF]'c''d' =5
unsigned char : if(v=getc())..
'a''b'[FF]'c''d' =5
signed int : v=getcc();if(v)...
'a''b'[FF]'c''d'[FFFF] =5
unsigned int : v=getcc();if(v)...
'a''b'[FF]'c''d'[FFFF] =5
signed char : v=getcc();if(v)...
'a''b'[FFFF] =2
unsigned char : v=getcc();if(v)...
'a''b'[FF] =2

Micro-C/DVM EOF= -1 FFFF
signed int : if(v=getc())..
'a''b'[FF]'c''d' =5
unsigned int : if(v=getc())..
'a''b'[FF]'c''d' =5
signed char : if(v=getc())..
'a''b'[FF]'c''d' =5
unsigned char : if(v=getc())..
'a''b'[FF]'c''d' =5
signed int : v=getcc();if(v)...
'a''b'[FF]'c''d'[FFFF] =5
unsigned int : v=getcc();if(v)...
'a''b'[FF]'c''d'[FFFF] =5
signed char : v=getcc();if(v)...
'a''b'[FF] =2
unsigned char : v=getcc();if(v)...
'a''b'[FF] =2

Turbo-C EOF= -1 ffff
signed int : if(v=getc())..
'a''b'[ff]'c''d' =5
unsigned int : if(v=getc())..
'a''b'[ff]'c''d' =5
signed char : if(v=getc())..
'a''b' =2
unsigned char : if(v=getc())..
'a''b'[ff]'c''d'[ff][ff][ff][ff] =9
signed int : v=getcc();if(v)...
'a''b'[ff]'c''d'[ffff] =5
unsigned int : v=getcc();if(v)...
'a''b'[ff]'c''d'[ffff] =5
signed char : v=getcc();if(v)...
'a''b'[ffff] =2
unsigned char : v=getcc();if(v)...
'a''b'[ff]'c''d'[ff][ff][ff][ff] =9

LCC: EOF= -1 ffffffff
signed int : if(v=getc())..
'a''b'[ff]'c''d' =5
unsigned int : if(v=getc())..
'a''b'[ff]'c''d' =5
signed char : if(v=getc())..
'a''b' =2
unsigned char : if(v=getc())..
'a''b'[ff]'c''d'[ff][ff][ff][ff] =9
signed int : v=getcc();if(v)...
'a''b'[ff]'c''d'[ff] =5
unsigned int : v=getcc();if(v)...
'a''b'[ff]'c''d'[ff] =5
signed char : v=getcc();if(v)...
'a''b'[ff] =2
unsigned char : v=getcc();if(v)...
'a''b'[ff]'c''d'[ff][ff][ff][ff] =9

GCC: EOF= -1 ffffffff
signed int : if(v=getc())..
'a''b'[ff]'c''d' =5
unsigned int : if(v=getc())..
'a''b'[ff]'c''d' =5
signed char : if(v=getc())..
'a''b' =2
unsigned char : if(v=getc())..
'a''b'[ff]'c''d'[ff][ff][ff][ff] =9
signed int : v=getcc();if(v)...
'a''b'[ff]'c''d'[ffffffff] =5
unsigned int : v=getcc();if(v)...
'a''b'[ff]'c''d'[ffffffff] =5
signed char : v=getcc();if(v)...
'a''b'[ffffffff] =2
unsigned char : v=getcc();if(v)...
'a''b'[ff]'c''d'[ff][ff][ff][ff] =9
--------------------------------

Some "interesting" notes:

Both Micro-C and Turbo-C are 16-bit compilers; LCC and GCC are 32-bit.

In all cases, EOF appears to have a value of -1.

Micro-C does not do "extra work" to compare char (byte) variables. This is
mainly because it was designed with VERY small systems in mind (limited word
operations and often not much code storage space). In these cases "char"
variables are used a LOT and Micro-C generates only byte operations in
expressions having only 8 bit types.

Due to its C-FLEA CPU, in Micro-C/DVM, "char" is treated as a 16-bit signed
positive value between 0-255.

Other compilers appear to do more with "char" values: either they sign-extend
them, or EOF is defined in such a way that it cannot be interpreted as a byte
(-1 can be a byte).

Also, Micro-C keeps the type/content of an expression when a partial result
is stored into a different type "part way along". Some other compilers
appear to treat the whole result differently when this happens...

In case anyone wants to look at: TEST.C
This file is ENCTXTed to protect from online reformatting.
To Decode get: Daves Old Computers->Personal->Downloads->ENCTXT
----------------------------------------------------------------------
7u7p7J7f7p7f8T8k8y8z8y7f8n8g8t8j8r8o8t8m7f8u8l7f8E8O8F7f828o8z8n7f8y8o
8m8t8k8j7u808t:v:g7f8o8t8z7u8i8n8g8x:::B7I8x8k8y808r8z:P:T8i8u8s8v7t8r
8g8t8m7t8i7f8v8u8y8z:::Q:::B:P:B8D8g818k7f8D808t8l8o8k8r8j7f7f7f7s::At
8n8z8z8v8y757u7u8jA:Am7t8z8n8k8s8o8t8j8l8g8i8z8u8x847t::AG:::B7u7J7i8o
8t8i8r808j8k7f778y8z8j8o8u7t8n797J7J7i8j8k8l8o8t8k7I8T8F8I8L8E7I7h8T8M
8P7t8D8A8T7h7I7I7u7u:::E8s8v7f8l8o8r8k8t8g8s8k7J7J:PB57J7I7p8l8v767J7J
Af:n:P:07f7p8V:PCV8a8c7f787f86:fCH818g8x8o8g8h8r8k7f8z848v8k7f:PCV8y7J
7I7hAv:p7h7r::DZBP:n:fDmB:CpBvDm:fED7f88::Ck::CJ7w8y8z7f8z:::G7f757f8m
8k8z8i7n7o7fA::88y::BP:v:t8u:PDF7f:::M:fEl:::t8g8z7f7h8y::CW7f8z8o8s8k
7hAvBv8S8T7w7n817o8b7J7I8o8l7n7n81::C9:fEt8l8v7o7o7f7878:P:W:PFz7I8h8x
8k8g8q76::F08S8n8u82::Fx:PGZ8m8u::FE8g7w:vEc7xA:FMHvEq8n8k8t::Hh::FW81
8k8x7f8o8yA:FOBPFk7x:vFxBPF7C:GZ::F3::F7D:GKBfGo:PGd7f8g:fCv::BN::Ht81
8g8r808k7f7n8o8t7f8H8E8X7f8o8l7f8t8u8z7f8v8x:::w:PDK7o:PEe8N8u8z8k757f
8A8y8y80::DW7f8A8S8C8I8IAvJyB:JT8y8k8z7g7J818u8o8j:fJM7nAvCn7o7J867J7I
:fJz8l7n7f7n7n8i7f777f7m7f7m7o7f87877f::LQ797f7m897m::GG7;7f7h8a7k838c
7h::Eq7h7m7k8i7m7h7r7f8i7f7o767J88:fEd8P8k8x8l8u8x8s:PHh:vEl8u8t::JR8m
8o81::HkAfDG7l:vEl:PDP:vKt8D8u8T8y8zB:K3:PCx8s8u8j::J7::LE:::w7f8S:::w
767J7IBP:n7f8o7r7f8U:vNe:fCw8S:P:0BPNh:fCw8U:vOB7J:::782::BJ:fIU::F1::
C97v::NhA:LH7h7k8y::Eq7k8y8b8t7f7f:PDm7I:vC2:PNQ7l7y8c:PPI7n:PNQ::LS7z
:fLm:PIn78:vEt7o7t7t:fLv:vPu::Ew76:PIn::P27t7h::GR::L97J:PCI8I8t7f8i8g
8y8k:f:W:PJv8j8k8z8k::JZ8j7t7f8z8x847f777w7v:fFe8y7J8g7w75:PF27q7q8o::
LS7w7v7o7f8y:::b8i8n:fPd7o7f::LE7I:fQf7v7I75:vFs:PNd:POo:vRl7wAPRs:PN0
AvR47xAfRs:P:0AvR47yAfSDBvSj7z:vRs7xC:Rz70APTKBvSL71AfTKBvSj72AfTh:vSj
7I88AvOy7f787k808b8t::L48o7s7w:vL9:PNZ8s8g8o8t7n:fLCAv:n:PObAPO0:::X78
7f7k::As7k83:fUv:::X:fVvAPQU8F8o8x::Ej8i::GV8z8k:fCQ::AF::J28o8t:P:Q70
AvJT::DY7u7u7I7f7m8g7m7v83717w7r7f7m8h:PWy7x7r7f7v838F::Vz7m8i:PWy7y::
W38j:PWy7z:fF17g::GD::C98l8u8v8k8t7n:fB47r7f7h828h7h7o::GG:PRiAPO08C8g
8t7m8z7f8W8R8I8T8E:vP:::L4:fB4:fR48x8k8z808x8t7f7w767f::Ug8l8v808z8y7n
7h8g8h::L4:vOm::Yx8i7n:vXC:vOmA:Yw8i8jAfY58l8i8r8u8y8k:vOlEvXY8xEPXz8R
8E8A8DEvYQ7x:PYr::Yu8u8x7n8o787v767f:PRN73767f::RL7o::GR:vM88o:fQUBvZl
A:Yj::Ow887J
----------------------------------------------------------------------

Regards,
Dave - see "Daves Old computers" -> Personal

Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''

<87a6hac0oj.fsf@nosuchdomain.example.com>


https://www.novabbs.com/devel/article-flat.php?id=19304&group=comp.lang.c#19304

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: Keith.S....@gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Subject: Re: K&R, 2nd edition, Brian's concerns with ``char c = EOF''
Date: Wed, 08 Dec 2021 12:16:12 -0800
Organization: None to speak of
Lines: 44
Message-ID: <87a6hac0oj.fsf@nosuchdomain.example.com>
References: <86pmqwkm7u.fsf@levado.to>
<4fcda596-2e00-4dd6-bf8f-4616218f4398n@googlegroups.com>
<91a56855-b917-45a2-89be-7e2ed806d357n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain
Injection-Info: reader02.eternal-september.org; posting-host="e65fa27a3e2f2815f32ec2b8e7052d4b";
logging-data="18679"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19BNHg3mD0buussOeQOMYyB"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:ty66pph1vCY6eyxxpr1oJ4Gn/Ks=
sha1:DslNY7EXp2J1Q0/vOHjP6xrSJ2Q=
 by: Keith Thompson - Wed, 8 Dec 2021 20:16 UTC

Dave Dunfield <dave.dunfield@gmail.com> writes:
> Having nothing better to do this morning, I decided to write a little
> test of this for various compilers I have readily available:
>
> My own Micro-C (DOS and DVM-Win32)
> Borland Turbo-C (DOS)
> LCCwin32 (Win32)
> GCC (Linux, Ubuntu 20.10)
>
> The test simply creates a small (5-byte) file containing
> 'a', 'b', 0xFF, 'c' and 'd'
> then reads it, testing for EOF with:
> if((var = getc(fp)) == EOF) ...
> and also:
> var = getc(fp);
> if(var == EOF) ...
> using a 'var' variable of:
> signed int, unsigned int, signed char & unsigned char
>
> Here are the results:
[SNIP]

> In case anyone wants to look at: TEST.C
> This file is ENCTXTed to protect from online reformatting.
> To Decode get: Daves Old Computers->Personal->Downloads->ENCTXT
> ----------------------------------------------------------------------
> 7u7p7J7f7p7f8T8k8y8z8y7f8n8g8t8j8r8o8t8m7f8u8l7f8E8O8F7f828o8z8n7f8y8o
> 8m8t8k8j7u808t:v:g7f8o8t8z7u8i8n8g8x:::B7I8x8k8y808r8z:P:T8i8u8s8v7t8r
[...]
> 8E8A8DEvYQ7x:PYr::Yu8u8x7n8o787v767f:PRN73767f::RL7o::GR:vM88o:fQUBvZl
> A:Yj::Ow887J
> ----------------------------------------------------------------------

I don't think many people are going to download ENCTXT so they can
decode your source file.

You can just include code inline in a post. As long as the lines aren't
too long it shouldn't be a problem. If absolutely necessary, you can
use a standard encoding like base64.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */

