novaBBS - comp.lang.c - Re: 32-bit pointers, 64-bit longs

Bart <bc@freeuk.com> writes:
>On 19/05/2021 21:12, Scott Lurndal wrote:
>> Bart <bc@freeuk.com> writes:
>>> On 19/05/2021 20:04, Scott Lurndal wrote:
>>>> Bart <bc@freeuk.com> writes:
>>>>> On 19/05/2021 18:30, Scott Lurndal wrote:
>>>>>> Bart <bc@freeuk.com> writes:
>>>>>
>>>>>>> And here, also, the most used locals will likely reside in registers not
>>>>>>> memory.
>>>>>>
>>>>>> By far, the most prevalent data in a real program will be allocated
>>>>>> off the heap (e.g. structs), but for every four 64-bit locals you
>>>>>> need a cache line, where you can fit eight 32-bit locals in the same line.
>>>>>
>>>>> But I'm talking about individual variables, that is named variables
>>>>> comprising a single int value, of which there will be a small number in
>>>>> any function.
>>>>
>>>> Most functions in my experience aren't limited to stack local
>>>> data, and thus neither is their cache footprint dependent upon
>>>> local declarations.
>>>>
>>>
>>> This is my point. Whether those are int32 or int64 is not relevant on
>>> 64-bit machines.
>>
>> How do you come to that conclusion? Their cache footprint is not dependent
>> upon the locals _alone_ because there is other data accessed from the function that
>> affects the cache footprint _more_. However, the locals still affect
>> the cache footprint, potentially significantly.
>>
>
>I'm slightly losing track of whether you are arguing for or against using
>64-bit integers.

Neither. Use the appropriate type for the job.

If you need a value that fits in a 32-bit integer, declare it as such.
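
A rough sketch of the footprint argument, assuming a 64-byte cache line (common on current x86-64 parts, but an assumption here; the four-versus-eight figures quoted above correspond to a 32-byte line). CACHE_LINE_BYTES and lines_needed are names made up for illustration, and nothing is queried from the hardware:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; not queried from the hardware. */
#define CACHE_LINE_BYTES 64u

/* Number of cache lines covered by `count` contiguous objects of
   `size` bytes each, assuming the first object is line-aligned. */
static size_t lines_needed(size_t count, size_t size)
{
    assert(size != 0);
    size_t bytes = count * size;
    return (bytes + CACHE_LINE_BYTES - 1) / CACHE_LINE_BYTES;
}
```

Under those assumptions, lines_needed(16, sizeof(int32_t)) comes out as one line and lines_needed(16, sizeof(int64_t)) as two, which is the whole of the argument: same count of values, twice the lines.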

David Brown <david.brown@hesbynett.no> writes:

> On 18/05/2021 14:53, Ben Bacarisse wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>>
>>> On 18/05/2021 14:15, Ben Bacarisse wrote:
>>>> David Brown <david.brown@hesbynett.no> writes:
>>>>
>>>>> We have a far greater range of languages now, with a wide selection of
>>>>> balances between features, simplicity, run-time efficiency, developer
>>>>> efficiency, ease-of-use, safety, etc. Some are more minimal, with
>>>>> perhaps just a single integer type at 64-bit, others have a selection of
>>>>> different sizes for different purposes.
>>>>>
>>>>> IME and IMHO, I would say there are three basic kinds of integers,
>>>>> depending on their usage.
>>>>>
>>>>> There are low-level uses - required for interaction with hardware,
>>>>> connection to data outside the program (file formats, network protocols,
>>>>> foreign-function interfaces, etc.), or for when you want accurate
>>>>> control such as for getting maximal efficiency from large data
>>>>> structures. In these cases, you want size-specific types. Whether you
>>>>> call them int32_t, i32, Int<32>, etc., is a matter of taste. But they
>>>>> should be size-specific and explicit.
>>>>
>>>> I don't see why you need size-specific integers for that. To avoid a
>>>> lot of pain, you need an integer type that is big enough for the values
>>>> you might have to do arithmetic on, but I don't see the need for
>>>> explicit sized integer types.
>>>
>>> As has been mentioned, you can use bytes to access file formats or
>>> network protocols. Size-specific integers can make it simpler and more
>>> efficient (given appropriate endianness and alignment), but they are not
>>> strictly necessary.
>>
>> Right. You seemed to be saying they were needed for this purpose.
>
> Wanted, rather than needed.

Ah, OK. I was not 100% sure what that "required" referred to and the
"should" sounded very dogmatic.

> But I would say that it is something you
> /really/ want. It's a little like saying C doesn't /need/ the "for"
> statement - "if" and "goto" can cover your needs. But you /want/
> "for".

Probably. The alternatives in C are not very attractive, but that's not
a strong argument outside of C.

>> And I don't see why they would necessarily be more efficient. On some
>> architectures, arithmetic on shorter integers is slower (or at least no
>> faster) than on long ones.
>
> You use them primarily for accessing data, rather than for arithmetic.

If you don't access them arithmetically, then why do they need to be
integer types?

> Code for handling externally-defined fixed structures is just much
> simpler and clearer to write when you can define a "struct" full of
> size-specific types that map directly to the defined format. And it is
> not unlikely that the results will be more efficient at run-time as
> well.

If you can't do anything else, you have to do this, but that does not
mean lots of different arithmetic types are needed.

> It is with good reason that most compilers (IME) have pre-defined macros
> that tell you the endianness of the target, many have extensions to let
> you have explicitly big-endian or little-endian types, and (IME) all
> have extensions letting you have "packed" structures.

All because there is no better way to do it in C. Having endianness
attached to types seems odd to me, but I've not seen that in any
compiler I've used. Maybe it works. It's needed when describing an
external format, but it's not an obvious attribute of an integer type.

> This means you
> read your file data or network packet directly into a struct and access
> the fields (checking for sanity, of course - never trust external data
> to be in the right format!).

Right. Lots of extensions needed because C's solution is not really very
good.

>>> For hardware access, you need size-specific accesses.
>>
>> Yes. And alignment. And representation.
>
> Alignment is usually the smaller of the cpu bit width and the width of
> the type, but there are exceptions. However, it's easy to check (a
> static assertion on the size of the struct compared to the known size of
> the externally defined structure is simple and reliable).

Unless there are multiple alignment issues that don't show up in the
size, but I agree that for the usual situation this is another issue
that is not hard to overcome.

> Representation is two's complement for signed integers, IEEE for
> floating point, no padding bits. There hasn't been anything else made
> for the last 40 years or so, except a few Burroughs mainframes.

It helps when the world aligns itself with your preferred way of doing
things. I started using C in a networked environment in very different
times.

> If you need the absolute most portable code, then you have to do things
> long-hand with reading chars, building up your bigger types as you go.
> Certainly it can be done - but it is nice not to have to do it.

Yup. I never said these problems can't be solved.
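
For reference, the long-hand approach looks roughly like this. It is only a sketch: it assumes the external format is four octets, most significant first, and the names load_be32 and store_be32 are invented for the example.

```c
#include <stdint.h>

/* Assemble a 32-bit unsigned value from four octets stored most
   significant first, independent of host endianness or alignment. */
static uint_least32_t load_be32(const unsigned char *p)
{
    return ((uint_least32_t)(p[0] & 0xFFu) << 24) |
           ((uint_least32_t)(p[1] & 0xFFu) << 16) |
           ((uint_least32_t)(p[2] & 0xFFu) << 8)  |
            (uint_least32_t)(p[3] & 0xFFu);
}

/* Store the low 32 bits of v in the same big-endian layout. */
static void store_be32(unsigned char *p, uint_least32_t v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFFu);
    p[1] = (unsigned char)((v >> 16) & 0xFFu);
    p[2] = (unsigned char)((v >> 8) & 0xFFu);
    p[3] = (unsigned char)(v & 0xFFu);
}
```

Note that it needs no exact-width type at all, only one that is at least 32 bits wide, and no packed structs or endianness macros.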

>> Giving the language
>> size-specific integers is not the solution to this problem, but it's a
>> cheap one and has caught on at the expense of doing it "properly".
>
> Define "properly".

Well, the scare quotes were there because I don't really want to! I
suppose it boils down to "without all the extensions and static asserts
being needed".

<cut>
> Can you give an example of what you would like here?

I am going to plead lack of time and the fact that it doesn't really
matter. Nothing I come up with will be of any value other than to explain
what I mean, and I hope I've given enough of a flavour of that already.

--
Ben.

On 19/05/2021 21:50, Ben Bacarisse wrote:
> Bart <bc@freeuk.com> writes:
>
>> unsigned long long
>> long unsigned long
>> long long unsigned
>> int unsigned long long
>> unsigned int long long
>> unsigned long int long
>> unsigned long long int
>> int long unsigned long
>> long int unsigned long
>> long unsigned int long
>> long unsigned long int
>> int long long unsigned
>> long int long unsigned
>> long long int unsigned
>> long long unsigned int
>
> How can a language with such a foolish, lax, permissive attitude to word
> order ever be understood? Sorry, I meant the permissive, lax, foolish
> attitude.

Or maybe your 'foolish, attitude, permissive, lax', or just 'permissive
lax foolish'?

Try declaring a function with 3 or 4 parameters like this (maybe with
each one using a different order), then see if it's still easy to
understand /at a glance/.

The other day I tried to change all 'int' types in a program to 'Int'
(which was going to be typedefed to something else), then found it
didn't work because of things like 'long long Int'.

Even a primitive type can be multiple tokens in arbitrary order.
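
A small check of that claim, as a sketch using C11's _Generic and static_assert. The typedef names and the IS_ULL macro are invented for the example; it compiles only if all three spellings really denote the same type.

```c
#include <assert.h>

/* Each typedef spells the same type with the specifiers in a different
   order; the typedef names are invented for this example. */
typedef unsigned long long ull_a;
typedef long unsigned int long ull_b;
typedef int long long unsigned ull_c;

/* _Generic (C11) selects on the type of its operand, so this yields 1
   only when T is exactly unsigned long long. */
#define IS_ULL(T) _Generic((T)0, unsigned long long: 1, default: 0)

static_assert(IS_ULL(ull_a) && IS_ULL(ull_b) && IS_ULL(ull_c),
              "all three spellings name unsigned long long");
```

It also shows why the 'Int' substitution fails: 'int' here is one specifier among several, whereas a typedef name has to stand for the complete type.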

>> If you choose to make it const, then I think there are 124 ways of
>> writing the combinations with /one/ 'const' (and an unlimited number
>> with more than one, all legal).
>
> My, my, my. Some languages, eh?

Would /you/ have designed the language like this?

Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

> On Tuesday, 18 May 2021 at 16:50:20 UTC+1, Ben Bacarisse wrote:
>>
>> First off, I'm not talking about C. C manages, and when there are
>> problems, we work round them, usually using external tools like the
>> build system. This method is unlikely to come unstuck because you
>> probably won't come across a machine with 36-bit sign and magnitude
>> integers (and thus no int32_t).
>>
>> C has, perhaps, led people to think that the solution must be lots of
>> integer types, ideally all with known sizes. I am saying that what you
>> really need is a notation in the language to describe data formats.
>>
>> Obviously there would be ways to get and set the bits without caring how
>> they map to a value. For example, a network address might simply be
>> copied from one place to another. But sometimes you need to get or set
>> the numeric value of a field. A language with, say, only one unbounded
>> integer type could do this just fine. If 'field' has been defined as
>> being 24-bits wide using sign and magnitude big-endian representation,
>> then
>>
>> header.field.asInteger
>>
>> will get the value for us to do arithmetic on.
>>
>> Such a notation would also probably include explicit alignment and
>> padding, so there would be no need for the external format to match what
>> the compiler produces.
>>
> This was my idea for B64. You'd have only two basic types, a 64 bit
> integer and a 64 bit floating point type. Strings would be zero-padded
> multiples of 8 bytes.
>
> Then you'd have "bit buffers" for talking to non-B64 routines, and for
> specifying higher-level structures. I never worked out an acceptable
> bit buffer description language, however.

It's a long way from the Full Monty, but BCPL's SLCT ... OF expressions
were interesting, though too restrictive to be generally useful. (BCPL
can be thought of as a language with only one type -- a word.)

"Expressions of the form: SLCT len:shift:offset pack the three
constants len, shift and offset into a word. Such packed constants are
used by the field selection operator OF described in the next section.
SLCT shift:offset means SLCT 0:shift:offset, and SLCT offset means
SLCT 0:0:offset."

"An expression of the form K OF E accesses a field of consecutive bits
in memory. K must be a manifest constant (see section 2.2.10) equal to
SLCT len:shift:offset and E must yield a pointer, p say. The field is
contained entirely in the word at position p+offset. It has a bit
length of len and is shift bits from the right hand end of the word. A
length of zero is interpreted as the longest length possible consistent
with shift and the word length of the implementation. The operator ::
is a synonym of OF. Both may be used on right and left hand side of
assignments statements but not as the operand of @. When used in a
right hand context the selected field is shifted to the right hand end
of the result with vacated positions, if any, filled with zeros. A
shift to the left is performed when a field is updated. Suppose p!3
holds the value #x12345678, then after the assignment:

(SLCT 12:8:3) OF p := 1 + (SLCT 8:20:3) OF p

the value of p!3 is #x12302478."
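
For anyone who wants to check the arithmetic, here is a rough C imitation of the selector. It is a sketch under stated assumptions, not BCPL semantics in full: words are taken to be 32 bits, the "length of zero" rule is not modelled, and struct slct, field_get and field_set are names invented here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C stand-in for a BCPL selector: a field of `len` bits,
   `shift` bits from the right-hand end of the word at p[offset].
   Assumes 32-bit words, len >= 1 and len + shift <= 32. */
struct slct { unsigned len, shift, offset; };

static uint32_t field_mask(struct slct k)
{
    assert(k.len >= 1 && k.len + k.shift <= 32);
    uint32_t ones = (k.len == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << k.len) - 1);
    return ones << k.shift;
}

/* Right-hand context: extract the field, shifted down, zero-filled. */
static uint32_t field_get(struct slct k, const uint32_t *p)
{
    return (p[k.offset] & field_mask(k)) >> k.shift;
}

/* Left-hand context: shift the value up and replace only the field. */
static void field_set(struct slct k, uint32_t *p, uint32_t v)
{
    uint32_t m = field_mask(k);
    p[k.offset] = (p[k.offset] & ~m) | ((v << k.shift) & m);
}
```

With p[3] holding 0x12345678, field_get on {8, 20, 3} yields 0x23, and storing 0x24 through {12, 8, 3} leaves 0x12302478, matching the manual's example.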

--
Ben.

David Brown <david.brown@hesbynett.no> writes:
> On 19/05/2021 22:33, Keith Thompson wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>> [...]
>>> I haven't heard of any. I know of a few DSP's with odd integer types,
>>> such as 18-bit char. And some have registers of unusual sizes, such as
>>> mainly 32-bit registers but a 40-bit "accumulator" for
>>> multiply-accumulate instructions.
>>
>> The R.O.U.S.'s? I don't think they exist. 8-)}
>
> You must be mixing it up with the Gruffalo, because such DSP's /do/
> exist. They are the kind of device that are only programmed by a few
> dozen people, but produced in countless millions - so they can be as
> horrible ISA's as they like if it saves a cent.

It's a line from The Princess Bride.

Buttercup: Westley, what about the R.O.U.S.'s?
Westley: Rodents Of Unusual Size? I don't think they exist.
[Immediately, an R.O.U.S. attacks him]

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

On 5/19/2021 7:18 AM, Joe Pfeiffer wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>
>> On Tuesday, 18 May 2021 at 09:14:37 UTC+1, muta...@gmail.com wrote:
>>>
>>>> There are "just a number" uses - a counter, an index, etc. The type for
>>>> this needs to be big enough that it is not going to overflow - the
>>>> programmer can treat it as though it were an unlimited size mathematical
>>>> integer. These days, that really means 64-bit - or an unlimited integer
>>>> type that grows as needed. The type name here should reflect that -
>>>> "number" would be good. In C, this is "int".
>>> But int is almost always 32-bits "these days". How do you
>>> reconcile that?
>>>
>> You only rarely have 2 billion data points. A 1024 * 1024 image is quite large,
>> for example, but it's only a million pixels.
>> Whilst modern CPU operations with 64 bits will be as fast as 32 bit
>> operations, you can store more 32 bit integers in the cache, and it's
>> cache misses which are the main determiner of performance.
>
> My son does deep learning at Microsoft. He tells me one of their recent
> projects gets 1GB of data per hour.
>
> My daughter does retina research at U Utah. I asked what sort of image
> sizes she works with (EM photos of retinas). The conversation went
>
> Me: I'm trying to remember -- how big (in bytes) did you say the images
> you're dealing with are? Came across someone saying a 2GB image is
> rare.
>
> Her: Lol
> Our volumes (stacks of images) are in the terabytes
> Off the scope I want to say a standard image (montage of tiles) is
> around 50, but that is super off the cuff. I'd have to actually
> look to know a real number.
>
> But 50gb images aren't unusual these days.
>

Ohhh yeah, your daughter knows! A large volumetric rendering can be
8192^3. For what it's worth, I have some code that generates a "special"
Mandelbulb I created in the form of a volume. It's not contained in a
single DICOM file, but the volume is a stack of images. Here is my crude
code:

https://pastebin.com/raw/07TWQQYF

I take the image stack and view it using a volumetric renderer. ImageJ
in this case:

https://www.fractalforums.com/index.php?action=gallery;sa=view;id=17187

I am wondering what volumetric rendering software she is using? Is it an
expensive DICOM renderer for viewing 3d medical images? It almost has to be.

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

> Padding *bits* are bits within the representation of an integer type
> that do not contribute to its value. The concept was first explicitly
> acknowledged in C99. Note that the requirements on the predefined
> integer types are defined in terms of lower and upper bounds, not sizes.
>
> Most implementations don't use padding bits.

I think a case can be made that many, if not most, implementations do use
padding bits in _Bool objects.

If the CHAR_BIT-1 bits are not padding bits then they must be value
bits, and while I don't think C prohibits a _Bool object having a value
other than 0 or 1, I think most will happily trap on, or optimise away,
code like this:

_Bool b;
memcpy(&b, (unsigned char [1]){42}, sizeof b);
if (b > 1) printf("%d\n", b);

That's permitted if the extra bits are padding bits.

--
Ben.

On 19/05/2021 23:34, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> On 19/05/2021 22:33, Keith Thompson wrote:
>>> David Brown <david.brown@hesbynett.no> writes:
>>> [...]
>>>> I haven't heard of any. I know of a few DSP's with odd integer types,
>>>> such as 18-bit char. And some have registers of unusual sizes, such as
>>>> mainly 32-bit registers but a 40-bit "accumulator" for
>>>> multiply-accumulate instructions.
>>>
>>> The R.O.U.S.'s? I don't think they exist. 8-)}
>>
>> You must be mixing it up with the Gruffalo, because such DSP's /do/
>> exist. They are the kind of device that are only programmed by a few
>> dozen people, but produced in countless millions - so they can be as
>> horrible ISA's as they like if it saves a cent.
>
> It's a line from The Princess Bride.
>
> Buttercup: Westley, what about the R.O.U.S.'s?
> Westley: Rodents Of Unusual Size? I don't think they exist.
> [Immediately, an R.O.U.S. attacks him]
>

I know that it was from The Princess Bride, but I didn't know the
context - that makes it clearer. It also makes it a lot more like the
Gruffalo. (If you haven't read that book, get hold of a young child,
grandchild, nephew, niece, etc., as an excuse and read it to them.)

On 19/05/2021 23:46, Ben Bacarisse wrote:
> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>
>> Padding *bits* are bits within the representation of an integer type
>> that do not contribute to its value. The concept was first explicitly
>> acknowledged in C99. Note that the requirements on the predefined
>> integer types are defined in terms of lower and upper bounds, not sizes.
>>
>> Most implementations don't use padding bits.
>
> I think a case can be made that many, if not most, implementations do use
> padding bits in _Bool objects.

_Bool does not count as an integer type, does it?

>
> If the CHAR_BIT-1 bits are not padding bits then they must be value
> bits, and while I don't think C prohibits a _Bool object having a value
> other than 0 or 1, I think most will happily trap on, or optimise away,
> code like this:
>
> _Bool b;
> memcpy(&b, (unsigned char [1]){42}, sizeof b);
> if (b > 1) printf("%d\n", b);
>
> That's permitted if the extra bits are padding bits.
>

I've seen cases where a _Bool was set (via an unsigned char* pointer) to
a value other than 0 or 1, and where it then failed both an "if (b)" and
an "if (!b)" test. It was an interesting debugging session.

On 5/18/2021 5:46 AM, Malcolm McLean wrote:
> On Tuesday, 18 May 2021 at 12:06:47 UTC+1, Bart wrote:
>> On 18/05/2021 11:43, Malcolm McLean wrote:
>>> On Tuesday, 18 May 2021 at 09:14:37 UTC+1, muta...@gmail.com wrote:
>>>>
>>>>> There are "just a number" uses - a counter, an index, etc. The type for
>>>>> this needs to be big enough that it is not going to overflow - the
>>>>> programmer can treat it as though it were an unlimited size mathematical
>>>>> integer. These days, that really means 64-bit - or an unlimited integer
>>>>> type that grows as needed. The type name here should reflect that -
>>>>> "number" would be good. In C, this is "int".
>>>> But int is almost always 32-bits "these days". How do you
>>>> reconcile that?
>>>>
>>> You only rarely have 2 billion data points. A 1024 * 1024 image is quite large,
>>> for example, but it's only a million pixels.
>> Loads of everyday figures can exceed two billion:
>>
>> * The size of a file (eg. a video file) expressed as bytes
>> * The capacity of a disk
>> * The amount of memory in a machine as bytes
>> * The world population
>> * How many seconds someone has been alive
>> * The number of views of a youtube video (and the total number of videos)
>> * The number of grains of rice on a square you will have when you're
>> only halfway along that chessboard
>>
>> Now take one of these figures, and try and use it in a calculation.
>>
>> i64 or u64 will handle all of these with ease (except for the final
>> square on the chessboard).
>>
> "Rarely" means that "you can find some counter-examples".
>>
>> Weren't you advocating for 64-bit ints everywhere a few years ago? (And
>> didn't we have the same discussion even more recently!)
>>
> Yes. I was advocating for a simplified language and architecture in which
> floating point values and integers are both 64 bits. However most people
> have decided to use 32 bits as the default for an integer, and there are
> reasons for that.
>

For deep zooms on a fractal, 64-bit floating point is not going to cut
it at all. We need to resort to arbitrary precision to get the really
deep zooms... ;^)
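
A tiny sketch of where double gives out, assuming IEEE binary64: near 1.0 adjacent doubles are DBL_EPSILON (about 2.2e-16) apart, so once the distance between neighbouring pixels drops below roughly half of that, they land on the same coordinate. pixels_distinct is a made-up helper, not anything from the fractal code linked earlier.

```c
#include <float.h>

/* Returns 1 if a pixel at coordinate `center` and its neighbour
   `step` away still map to different double values, 0 otherwise. */
static int pixels_distinct(double center, double step)
{
    /* volatile forces the sum to be rounded to double even on
       targets that evaluate in extended precision. */
    volatile double next = center + step;
    return next != center;
}
```

So pixels_distinct(1.0, 1e-10) holds, while a step of 1e-17 at the same coordinate does not survive the addition.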

On 5/19/2021 2:34 PM, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> On 19/05/2021 22:33, Keith Thompson wrote:
>>> David Brown <david.brown@hesbynett.no> writes:
>>> [...]
>>>> I haven't heard of any. I know of a few DSP's with odd integer types,
>>>> such as 18-bit char. And some have registers of unusual sizes, such as
>>>> mainly 32-bit registers but a 40-bit "accumulator" for
>>>> multiply-accumulate instructions.
>>>
>>> The R.O.U.S.'s? I don't think they exist. 8-)}
>>
>> You must be mixing it up with the Gruffalo, because such DSP's /do/
>> exist. They are the kind of device that are only programmed by a few
>> dozen people, but produced in countless millions - so they can be as
>> horrible ISA's as they like if it saves a cent.
>
> It's a line from The Princess Bride.
>
> Buttercup: Westley, what about the R.O.U.S.'s?
> Westley: Rodents Of Unusual Size? I don't think they exist.
> [Immediately, an R.O.U.S. attacks him]
>

:^D lol.

On 19/05/2021 23:18, Ben Bacarisse wrote:
> David Brown <david.brown@hesbynett.no> writes:
>
>> On 18/05/2021 14:53, Ben Bacarisse wrote:
>>> David Brown <david.brown@hesbynett.no> writes:
>>>
>>>> On 18/05/2021 14:15, Ben Bacarisse wrote:
>>>>> David Brown <david.brown@hesbynett.no> writes:
>>>>>
>>>>>> We have a far greater range of languages now, with a wide selection of
>>>>>> balances between features, simplicity, run-time efficiency, developer
>>>>>> efficiency, ease-of-use, safety, etc. Some are more minimal, with
>>>>>> perhaps just a single integer type at 64-bit, others have a selection of
>>>>>> different sizes for different purposes.
>>>>>>
>>>>>> IME and IMHO, I would say there are three basic kinds of integers,
>>>>>> depending on their usage.
>>>>>>
>>>>>> There are low-level uses - required for interaction with hardware,
>>>>>> connection to data outside the program (file formats, network protocols,
>>>>>> foreign-function interfaces, etc.), or for when you want accurate
>>>>>> control such as for getting maximal efficiency from large data
>>>>>> structures. In these cases, you want size-specific types. Whether you
>>>>>> call them int32_t, i32, Int<32>, etc., is a matter of taste. But they
>>>>>> should be size-specific and explicit.
>>>>>
>>>>> I don't see why you need size-specific integers for that. To avoid a
>>>>> lot of pain, you need an integer type that is big enough for the values
>>>>> you might have to do arithmetic on, but I don't see the need for
>>>>> explicit sized integer types.
>>>>
>>>> As has been mentioned, you can use bytes to access file formats or
>>>> network protocols. Size-specific integers can make it simpler and more
>>>> efficient (given appropriate endianness and alignment), but they are not
>>>> strictly necessary.
>>>
>>> Right. You seemed to be saying they were needed for this purpose.
>>
>> Wanted, rather than needed.
>
> Ah, OK. I was not 100% sure what that "required" referred to and the
> "should" sounded very dogmatic.

I probably overstated things (it is a bad habit I have) - consider my
wording to be changed to "want" rather than "need".

>
>> But I would say that it is something you
>> /really/ want. It's a little like saying C doesn't /need/ the "for"
>> statement - "if" and "goto" can cover your needs. But you /want/
>> "for".
>
> Probably. The alternatives in C are not very attractive, but that's not
> a strong argument outside of C.
>
>>> And I don't see why they would necessarily be more efficient. On some
>>> architectures, arithmetic on shorter integers is slower (or at least no
>>> faster) than on long ones.
>>
>> You use them primarily for accessing data, rather than for arithmetic.
>
> If you don't access them arithmetically, then why do they need to be
> integer types?

Often they would not need to be integer types. It would be possible to
have:

typedef struct { uint32_t x; } data32_t;

and then use "data32_t" in at least some situations. (Actually, that
might be convenient when using gcc's "scalar_storage_order" attribute,
since it can only be attached to a struct, not a scalar.)

Having a set of fixed-size scalar types that can be assigned and read,
but without arithmetic operations, would be fine too. In a non-C
language suitable for this kind of thing, I would like types that are
designed for data storage and access with different fixed sizes, but
without arithmetic.
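
In C the nearest approximation is probably the struct wrapper above plus explicit conversions. A minimal sketch, with data32_from and data32_value as invented helper names:

```c
#include <stdint.h>

/* Storage-only 32-bit datum: because the value is wrapped in a struct,
   `a + b` on two data32_t objects does not compile, while assignment
   and copying still work. */
typedef struct { uint32_t raw; } data32_t;

/* Explicit conversions in and out are the only route to arithmetic. */
static inline data32_t data32_from(uint32_t v)
{
    return (data32_t){ v };
}

static inline uint32_t data32_value(data32_t d)
{
    return d.raw;
}
```

Anything arithmetic then has to be written as data32_from(data32_value(d) + 1), which makes the switch between "data" and "number" visible in the source.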

>
>> Code for handling externally-defined fixed structures is just much
>> simpler and clearer to write when you can define a "struct" full of
>> size-specific types that map directly to the defined format. And it is
>> not unlikely that the results will be more efficient at run-time as
>> well.
>
> If you can't do anything else, you have to do this, but that does not
> mean lots of different arithmetic types are needed.
>
>> It is with good reason that most compilers (IME) have pre-defined macros
>> that tell you the endianness of the target, many have extensions to let
>> you have explicitly big-endian or little-endian types, and (IME) all
>> have extensions letting you have "packed" structures.
>
> All because there is no better way to do it in C. Having endianness
> attached to types seems odd to me, but I've not seen that in any
> compiler I've used. Maybe it works. It's needed when describing an
> external format, but it's not an obvious attribute of an integer type.
>

Again, I am happy to think of these (in a non-C language) as "data
types" rather than integer types. Attaching endianness to the types (of
the scalars or of a struct) is a convenient short-hand rather than
attaching it to the operations of reading or writing. It is a little
like declaring objects to be "volatile", when really it is the accesses
to the objects that are "volatile".
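
To illustrate the "volatile access" view, here is a sketch in which the object is an ordinary uint32_t and volatility is requested per access through a volatile-qualified lvalue. The helper names are invented; compilers generally honour such accesses in practice (the Linux kernel's READ_ONCE/WRITE_ONCE rest on the same idea), but treat that as an assumption about implementations rather than a reading of the standard.

```c
#include <stdint.h>

/* The object itself is not declared volatile ... */
static uint32_t status_word;

/* ... volatility is requested per access instead, by reading or
   writing through a volatile-qualified lvalue. */
static inline uint32_t read_volatile_u32(const uint32_t *p)
{
    return *(const volatile uint32_t *)p;
}

static inline void write_volatile_u32(uint32_t *p, uint32_t v)
{
    *(volatile uint32_t *)p = v;
}
```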

>> This means you
>> read your file data or network packet directly into a struct and access
>> the fields (checking for sanity, of course - never trust external data
>> to be in the right format!).
>
> Right. Lots of extensions needed because C's solution is not really very
> good.
>
>>>> For hardware access, you need size-specific accesses.
>>>
>>> Yes. And alignment. And representation.
>>
>> Alignment is usually the smaller of the cpu bit width and the width of
>> the type, but there are exceptions. However, it's easy to check (a
>> static assertion on the size of the struct compared to the known size of
>> the externally defined structure is simple and reliable).
>
> Unless there are multiple alignment issues that don't show up in the
> size, but I agree that for the usual situation this is another issue
> that is not hard to overcome.
>
>> Representation is two's complement for signed integers, IEEE for
>> floating point, no padding bits. There hasn't been anything else made
>> for the last 40 years or so, except a few Burroughs mainframes.
>
> It helps when the world aligns itself with your preferred way of doing
> things. I started using C in a networked environment in very different
> times.
>
>> If you need the absolute most portable code, then you have to do things
>> long-hand with reading chars, building up your bigger types as you go.
>> Certainly it can be done - but it is nice not to have to do it.
>
> Yup. I never said these problems can't be solved.
>
>>> Giving the language
>>> size-specific integers is not the solution to this problem, but it's a
>>> cheap one and has caught on at the expense of doing it "properly".
>>
>> Define "properly".
>
> Well, the scare quotes were there because I don't really want to! I
> suppose it boils down to "without all the extensions and static asserts
> being needed".
>

Compiler extensions are not always appropriate, but static assertions
are free, document assumptions or requirements in the code, and give you
a bit of extra safety from certain kinds of errors. I use them
regularly. (Prior to C11 I used macro versions.)
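
As an illustration of the kind of check I mean, with a made-up 8-byte record layout (struct wire_header and its fields are hypothetical, not from any real protocol):

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical 8-byte on-the-wire record header, invented for this
   example. */
struct wire_header {
    uint8_t  version;
    uint8_t  flags;
    uint16_t length;
    uint32_t sequence;
};

/* The total size catches padding that changes the size of the struct ... */
static_assert(sizeof(struct wire_header) == 8,
              "wire_header must be 8 bytes");

/* ... and the offsets catch layout differences that leave the size
   unchanged. */
static_assert(offsetof(struct wire_header, length) == 2,
              "length must be at offset 2");
static_assert(offsetof(struct wire_header, sequence) == 4,
              "sequence must be at offset 4");
```

The offset checks cover the case of alignment issues that do not show up in the size. None of this says anything about endianness, which still has to be handled separately.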

> <cut>
>> Can you give an example of what you would like here?
>
> I am going to plead lack of time and the fact that it doesn't really
> matter. Nothing I come up with will be of any value other than to explain
> what I mean, and I hope I've given enough of a flavour of that already.
>

Fair enough.

David Brown <david.brown@hesbynett.no> writes:
> On 19/05/2021 23:46, Ben Bacarisse wrote:
>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>> Padding *bits* are bits within the representation of an integer type
>>> that do not contribute to its value. The concept was first explicitly
>>> acknowledged in C99. Note that the requirements on the predefined
>>> integer types are defined in terms of lower and upper bounds, not sizes.
>>>
>>> Most implementations don't use padding bits.
>>
>> I think a case can be made that many, if not most, implementations do use
>> padding bits in _Bool objects.
>
> _Bool does not count as an integer type, does it?

It does. Specifically, it's a *standard unsigned integer type* and a
*standard integer type*. See N1570 6.2.5p6-7.

I think my weekend project will be to study what the standard says
about _Bool. Tentatively, I think an implementation *could* have
_Bool with 1 value bit, 7 padding bits, and 254 trap representations.
I'll try to figure out whether the standard allows it to have more
than one value bit. (Since conversions to _Bool yield 0 or 1,
getting a different value into a _Bool object without undefined
behavior is at best tricky.)
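
A sketch of the well-defined side of this, for contrast with the memcpy example: conversion to _Bool always produces 0 or 1, and the stored byte can be inspected through unsigned char. The helper names are invented, and what first_byte returns for a true value is whatever the implementation stores, not something the sketch promises.

```c
#include <stdbool.h>
#include <string.h>

/* Converting any scalar to _Bool yields 0 or 1, never the original
   value: nonzero becomes 1, zero stays 0. */
static int as_bool(int v)
{
    bool b = v;
    return b;
}

/* Reading the object representation through unsigned char is always
   allowed; this copies out the first byte of a bool object. */
static unsigned char first_byte(bool b)
{
    unsigned char byte;
    memcpy(&byte, &b, 1);
    return byte;
}
```

So as_bool(42) is 1, not 42; the only way to get 42 into the object is to go around the conversion, as the quoted memcpy does.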

>> If the CHAR_BIT-1 bits are not padding bits then they must be value
>> bits, and while I don't think C prohibits a _Bool object having a value
>> other than 0 or 1, I think most will happily trap on, or optimise away,
>> code like this:
>>
>> _Bool b;
>> memcpy(&b, (unsigned char [1]){42}, sizeof b);
>> if (b > 1) printf("%d\n", b);
>>
>> That's permitted if the extra bits are padding bits.
>>
>
> I've seen cases where a _Bool was set (via an unsigned char* pointer) to
> a value other than 0 or 1, and where it then failed both an "if (b)" and
> an "if (!b)" test. It was an interesting debugging session.

That would be valid behavior if the stored value is a trap representation.

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

"Chris M. Thomasson" <chris.m.thomasson.1@gmail.com> writes:
>
> I am wondering what volumetric rendering software she is using? Is it
> an expensive DICOM renderer for viewing 3d medical images? It almost
> has to be.

Her work is on the retina, so it's pretty close to all 2D. I don't know
what volumetric or other rendering software she's using; last I talked
to her she was mainly running clustering algorithms classifying cells
and identifying connections (her research has been on macular
degeneration, in particular the breakdown and restructuring of the
connections between neurons during the process).

Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

> David Brown <david.brown@hesbynett.no> writes:
>
>> We have a far greater range of languages now, with a wide selection of
>> balances between features, simplicity, run-time efficiency, developer
>> efficiency, ease-of-use, safety, etc. Some are more minimal, with
>> perhaps just a single integer type at 64-bit, others have a selection of
>> different sizes for different purposes.
>>
>> IME and IMHO, I would say there are three basic kinds of integers,
>> depending on their usage.
>>
>> There are low-level uses - required for interaction with hardware,
>> connection to data outside the program (file formats, network protocols,
>> foreign-function interfaces, etc.), or for when you want accurate
>> control such as for getting maximal efficiency from large data
>> structures. In these cases, you want size-specific types. Whether you
>> call them int32_t, i32, Int<32>, etc., is a matter of taste. But they
>> should be size-specific and explicit.
>
> I don't see why you need size-specific integers for that. To avoid a
> lot of pain, you need an integer type that is big enough for the values
> you might have to do arithmetic on, but I don't see the need for
> explicit sized integer types.

Size-specific types are important for people who insist
on thinking in assembly language.

Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

[...]

> I think my weekend project will be to study what the standard says
> about _Bool. Tentatively, I think an implementation *could* have
> _Bool with 1 value bit, 7 padding bits, and 254 trap representations.
> I'll try to figure out whether the standard allows it to have more
> than one value bit. (Since conversions to _Bool yield 0 or 1,
> getting a different value into a _Bool object without undefined
> behavior is at best tricky.)

I'm looking forward to reading your analysis.

On 20/05/2021 02:44, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> On 19/05/2021 23:46, Ben Bacarisse wrote:
>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>> Padding *bits* are bits within the representation of an integer type
>>>> that do not contribute to its value. The concept was first explicitly
>>>> acknowledged in C99. Note that the requirements on the predefined
>>>> integer types are defined in terms of lower and upper bounds, not sizes.
>>>>
>>>> Most implementations don't use padding bits.
>>>
>>> I think a case can be made that many, if not most, implementations do use
>>> padding bits in _Bool objects.
>>
>> _Bool does not count as an integer type, does it?
>
> It does. Specifically, it's a *standard unsigned integer type* and a
> *standard integer type*. See N1570 6.2.5p6-7.
>

You are (unsurprisingly) correct - and I think that would make Ben
(again, unsurprisingly) correct as well. _Bool is an unsigned integer
type with padding bits. (Since it has only 1 significant bit, but must
be stored as a sequence of bytes of at least 8 bits, padding is
unavoidable AFAIUI.)

N2176 (I have the C17 standard open at the moment, but there is little
difference from C11) 6.7.2.1p4 has a footnote:

"""
While the number of bits in a _Bool object is at least CHAR_BIT, the
width (number of sign and value bits) of a _Bool may be just 1 bit.
"""

The "may" here is interesting.

I think that would be possible.

> I'll try to figure out whether the standard allows it to have more
> than one value bit. (Since conversions to _Bool yield 0 or 1,
> getting a different value into a _Bool object without undefined
> behavior is at best tricky.)

The value bits in an unsigned type have to be consecutive powers of 2,
so I don't think multiple value bits are allowed. But we'll know more,
or be more confident of the answers, after you have had a chance to read
and think in detail.

>
>>> If the CHAR_BIT-1 bits are not padding bits then they must be value
>>> bits, and while I don't think C prohibits a _Bool object having a value
>>> other than 0 or 1, I think most will happily trap on, or optimise away,
>>> code like this:
>>>
>>> _Bool b;
>>> memcpy(&b, (unsigned char [1]){42}, sizeof b);
>>> if (b > 1) printf("%d\n", b);
>>>
>>> That's permitted if the extra bits are padding bits.
>>>
>>
>> I've seen cases where a _Bool was set (via an unsigned char* pointer) to
>> a value other than 0 or 1, and where it then failed both an "if (b)" and
>> an "if (!b)" test. It was an interesting debugging session.
>
> That would be valid behavior if the stored value is a trap representation.
>

I think that would be the case for a value of 2 or more stored in a
_Bool (6.2.6.1p5).

On 20/05/2021 05:29, Tim Rentsch wrote:
> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> David Brown <david.brown@hesbynett.no> writes:
>>
>>> We have a far greater range of languages now, with a wide selection of
>>> balances between features, simplicity, run-time efficiency, developer
>>> efficiency, ease-of-use, safety, etc. Some are more minimal, with
>>> perhaps just a single integer type at 64-bit, others have a selection of
>>> different sizes for different purposes.
>>>
>>> IME and IMHO, I would say there are three basic kinds of integers,
>>> depending on their usage.
>>>
>>> There are low-level uses - required for interaction with hardware,
>>> connection to data outside the program (file formats, network protocols,
>>> foreign-function interfaces, etc.), or for when you want accurate
>>> control such as for getting maximal efficiency from large data
>>> structures. In these cases, you want size-specific types. Whether you
>>> call them int32_t, i32, Int<32>, etc., is a matter of taste. But they
>>> should be size-specific and explicit.
>>
>> I don't see why you need size-specific integers for that. To avoid a
>> lot of pain, you need an integer type that is big enough for the values
>> you might have to do arithmetic on, but I don't see the need for
>> explicit sized integer types.
>
> Size-specific types are important for people who insist
> on thinking in assembly language.
>

That is probably true. But they are also very handy (but not actually
essential) for people who use C to reduce the need to write in assembly.
I am sure you are aware of the difference.

David Brown <david.brown@hesbynett.no> writes:

> On 20/05/2021 02:44, Keith Thompson wrote:
>> David Brown <david.brown@hesbynett.no> writes:
>>> On 19/05/2021 23:46, Ben Bacarisse wrote:
>>>> Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
>>>>> Padding *bits* are bits within the representation of an integer type
>>>>> that do not contribute to its value. The concept was first explicitly
>>>>> acknowledged in C99. Note that the requirements on the predefined
>>>>> integer types are defined in terms of lower and upper bounds, not sizes.
>>>>>
>>>>> Most implementations don't use padding bits.
>>>>
>>>> I think a case can be made that many, if not most, implementations do use
>>>> padding bits in _Bool objects.
>>>
>>> _Bool does not count as an integer type, does it?
>>
>> It does. Specifically, it's a *standard unsigned integer type* and a
>> *standard integer type*. See N1570 6.2.5p6-7.
>
> You are (unsurprisingly) correct - and I think that would make Ben
> (again, unsurprisingly) correct as well. _Bool is an unsigned integer
> type with padding bits. (Since it has only 1 significant bit, but must
> be stored as a sequence of bytes of at least 8 bits, padding is
> unavoidable AFAIUI.)

I don't think that padding is unavoidable because the values in _Bool
objects are not restricted to 0 and 1. Those restrictions apply to
conversions, so, for example, you can't assign a value other than 0 or
1, but larger values can get into a _Bool object by other means without
there being any undefined behaviour.

My claim was only that most implementations do view the extra bits as
padding and use the consequent permission to trap (e.g. gcc's sanitizer)
or to optimise.

> N2176 (I have the C17 standard open at the moment, but there is little
> difference from C11) 6.7.2.1p4 has a footnote:
>
> """
> While the number of bits in a _Bool object is at least CHAR_BIT, the
> width (number of sign and value bits) of a _Bool may be just 1 bit.
> """
>
> The "may" here is interesting.

I think the standard wants to leave open the possibility that there may
be more value bits.

--
Ben.

On 20/05/2021 04:29, Tim Rentsch wrote:
> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>
>> David Brown <david.brown@hesbynett.no> writes:
>>
>>> We have a far greater range of languages now, with a wide selection of
>>> balances between features, simplicity, run-time efficiency, developer
>>> efficiency, ease-of-use, safety, etc. Some are more minimal, with
>>> perhaps just a single integer type at 64-bit, others have a selection of
>>> different sizes for different purposes.
>>>
>>> IME and IMHO, I would say there are three basic kinds of integers,
>>> depending on their usage.
>>>
>>> There are low-level uses - required for interaction with hardware,
>>> connection to data outside the program (file formats, network protocols,
>>> foreign-function interfaces, etc.), or for when you want accurate
>>> control such as for getting maximal efficiency from large data
>>> structures. In these cases, you want size-specific types. Whether you
>>> call them int32_t, i32, Int<32>, etc., is a matter of taste. But they
>>> should be size-specific and explicit.
>>
>> I don't see why you need size-specific integers for that. To avoid a
>> lot of pain, you need an integer type that is big enough for the values
>> you might have to do arithmetic on, but I don't see the need for
>> explicit sized integer types.
>
> Size-specific types are important for people who insist
> on thinking in assembly language.

Yeah, you're right.

That's why languages like Go, Julia, Odin, Nim, Rust, Zig and Swift all
have size-specific types such as Int32.

With languages such as D, C#, Java and Scala, while they might use
denotations like byte, short, int, long, those are precisely defined by
the language to be 1, 2, 4 or 8 bytes wide respectively, so are
effectively size-specific.

All used by assembly language programmers of course!

I mean, if you want to call a FFI function where a parameter uses an
'int' type in its implementation language, then all you need is to use
'int' in whatever language /you/ happen to be using.

Who cares whether they are actually the same size or not; that's just
some pesky detail that only assembly coders need to concern themselves with.

On 20/05/2021 14:20, Bart wrote:
> On 20/05/2021 04:29, Tim Rentsch wrote:
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>
>>> David Brown <david.brown@hesbynett.no> writes:
>>>
>>>> We have a far greater range of languages now, with a wide selection of
>>>> balances between features, simplicity, run-time efficiency, developer
>>>> efficiency, ease-of-use, safety, etc. Some are more minimal, with
>>>> perhaps just a single integer type at 64-bit, others have a selection of
>>>> different sizes for different purposes.
>>>>
>>>> IME and IMHO, I would say there are three basic kinds of integers,
>>>> depending on their usage.
>>>>
>>>> There are low-level uses - required for interaction with hardware,
>>>> connection to data outside the program (file formats, network protocols,
>>>> foreign-function interfaces, etc.), or for when you want accurate
>>>> control such as for getting maximal efficiency from large data
>>>> structures. In these cases, you want size-specific types. Whether you
>>>> call them int32_t, i32, Int<32>, etc., is a matter of taste. But they
>>>> should be size-specific and explicit.
>>>
>>> I don't see why you need size-specific integers for that. To avoid a
>>> lot of pain, you need an integer type that is big enough for the values
>>> you might have to do arithmetic on, but I don't see the need for
>>> explicit sized integer types.
>>
>> Size-specific types are important for people who insist
>> on thinking in assembly language.
>
>
> Yeah, you're right.
>
> That's why languages like Go, Julia, Odin, Nim, Rust, Zig and Swift all
> have size-specific types such as Int32.
>
> With languages such as D, C#, Java and Scala, while they might use
> denotations like byte, short, int, long, those are precisely defined by
> the language to be 1, 2, 4 or 8 bytes wide respectively, so are
> effectively size-specific.
>
> All used by assembly language programmers of course!
>
> I mean, if you want to call a FFI function where a parameter uses an
> 'int' type in its implementation language, then all you need is to use
> 'int' in whatever language /you/ happen to be using.
>
> Who cares whether they are actually the same size or not; that's just
> some pesky detail that only assembly coders need to concern themselves with.
>

I don't want to try and guess what Tim might have meant to imply, but he
did not write that size-specific types are important /only/ for people
who think in assembly.

On 5/19/21 8:44 PM, Keith Thompson wrote:
> David Brown <david.brown@hesbynett.no> writes:
....
>> _Bool does not count as an integer type, does it?
>
> It does. Specifically, it's a *standard unsigned integer type* and a
> *standard integer type*. See N1570 6.2.5p6-7.

_Bool also qualifies as an "unsigned integer type" (p6), and therefore
as an "integer type" (p17). These might all seem obvious implications of
the fact that "standard unsigned integer type" contains the phrase
"integer type". However, as a general rule, when the standard defines
the meaning of a term, you're not allowed to derive any implications
from the individual words that make up that term, only from the things
that the standard explicitly says about it. (I know that you know this,
I'm explaining that for the benefit of others).
My favorite example is the "indeterminate value", which could be a "trap
representation", despite the fact that a trap representation "need not
represent a value".

My understanding is that the number of value bits in _Bool can be as
small as 1, but is not required to be 1. On an implementation where it has
more than one value bit, assigning a true value will set one of them,
but the only way to set any of the other value bits with
(implementation-)defined behavior is by type-punning.

"Been through Hell? Whaddya bring back for me?" -- A. Brilliant
