
Subject  Author
* Padding between char array/VLA in struct?  Ian Pilcher
+* Re: Padding between char array/VLA in struct?  James Kuyper
|`* Re: Padding between char array/VLA in struct?  Keith Thompson
| `- Re: Padding between char array/VLA in struct?  James Kuyper
+* Re: Padding between char array/VLA in struct?  Bart
|`* Re: Padding between char array/VLA in struct?  Keith Thompson
| `* Re: Padding between char array/VLA in struct?  Bart
|  +* Re: Padding between char array/VLA in struct?  Andrey Tarasevich
|  |+- Re: Padding between char array/VLA in struct?  Andrey Tarasevich
|  |`* Re: Padding between char array/VLA in struct?  Keith Thompson
|  | `* Re: Padding between char array/VLA in struct?  Andrey Tarasevich
|  |  `- Re: Padding between char array/VLA in struct?  David Brown
|  +- Re: Padding between char array/VLA in struct?  Keith Thompson
|  `* Re: Padding between char array/VLA in struct?  BGB
|   `* Re: Padding between char array/VLA in struct?  Andrey Tarasevich
|    `* Re: Padding between char array/VLA in struct?  BGB
|     `- Re: Padding between char array/VLA in struct?  BGB
+* Re: Padding between char array/VLA in struct?  Keith Thompson
|`* Re: Padding between char array/VLA in struct?  Bart
| `- Re: Padding between char array/VLA in struct?  Keith Thompson
+- Re: Padding between char array/VLA in struct?  Andrey Tarasevich
`* Re: Padding between char array/VLA in struct?  Ian Pilcher
 +* Re: Padding between char array/VLA in struct?  Tim Rentsch
 |`* Re: Padding between char array/VLA in struct?  Ian Pilcher
 | +- Re: Padding between char array/VLA in struct?  Keith Thompson
 | `- Re: Padding between char array/VLA in struct?  antispam
 `* Re: Padding between char array/VLA in struct?  Keith Thompson
  `* Re: Padding between char array/VLA in struct?  Manfred
   `- Re: Padding between char array/VLA in struct?  Keith Thompson

Re: Padding between char array/VLA in struct?

<sedvjc$u6i$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=17804&group=comp.lang.c#17804

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.c
Subject: Re: Padding between char array/VLA in struct?
Date: Wed, 4 Aug 2021 14:01:47 +0200
Organization: A noiseless patient Spider
Lines: 88
Message-ID: <sedvjc$u6i$1@dont-email.me>
References: <se9b5g$l2n$1@dont-email.me> <se9cfv$uta$1@dont-email.me>
<875ywnaehw.fsf@nosuchdomain.example.com> <se9htr$5ll$1@dont-email.me>
<se9jh0$gre$1@dont-email.me> <87o8af8u7v.fsf@nosuchdomain.example.com>
<seaatu$kii$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 4 Aug 2021 12:01:48 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="5c84e891ee178787f1df57542d7264bc";
logging-data="30930"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/mJ53vzMdSZbxPcJ+VrQq1Q7RWpkWW5kY="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101
Thunderbird/78.11.0
Cancel-Lock: sha1:PxRhKWSiXzh2OiLMTaf4ltmeccQ=
In-Reply-To: <seaatu$kii$1@dont-email.me>
Content-Language: en-GB
 by: David Brown - Wed, 4 Aug 2021 12:01 UTC

On 03/08/2021 04:50, Andrey Tarasevich wrote:
> On 8/2/2021 1:31 PM, Keith Thompson wrote:
>>
>> No, gcc's extension appears to allow VLAs as struct members.  This
>> compiles and runs without error and, as expected, prints different
>> values when it's executed again after more than a second:
>>
>> #include <stdio.h>
>> #include <time.h>
>> #include <stddef.h>
>> int main(void) {
>>      const int size = time(NULL) % 10 + 10;
>>      struct s {
>>          char s0[size];
>>          char s1[size];
>>          char s2[size];
>>      };
>>
>>      struct s obj;
>>
>>      printf("sizeof obj = %zu\n", sizeof obj);
>>      printf("%zu %zu %zu\n",
>>             offsetof(struct s, s0),
>>             offsetof(struct s, s1),
>>             offsetof(struct s, s2));
>> }
>>
>> (I didn't find this extension in gcc's documentation, but I didn't look
>> very hard.)

<https://gcc.gnu.org/onlinedocs/gcc/Variable-Length.html>

>>
>
> Oh, indeed... I missed that. Thank you for pointing it out.
>
> Interestingly, when GCC has to deal with a bunch of independent VLAs
>
>   char s0[n0];
>   char s1[n1];
>   char s2[n2];
>
> in the generated code it typically implements them as simple individual
> `char *` pointers, pointing into memory allocated by `alloca`-like
> mechanism.

There is a significant difference between VLAs and alloca, AFAIUI -
memory from alloca is deallocated at the end of the function, while for
a VLA it can be deallocated at the end of the VLA's lifetime (the
containing block). But in each case, you have an "alloca-like
mechanism" - generally a simple manipulation of the CPU stack pointer.
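
To make the lifetime point concrete, here is a minimal sketch (the function name fill_and_sum is invented for illustration). The VLA is declared inside the loop body, so its lifetime ends at each closing brace and its stack space can be reused on the next pass; alloca() calls in the same loop would instead pile up until the function returns.

```c
#include <stddef.h>

/* Sums 1..n for each n in 1..rounds, using a fresh VLA per iteration.
 * The array's lifetime ends at the closing brace of the loop body,
 * so the compiler may release its stack space before the next pass.
 * Assumes rounds <= 255 so every value fits in an unsigned char. */
unsigned long fill_and_sum(size_t rounds)
{
    unsigned long total = 0;
    for (size_t n = 1; n <= rounds; n++) {
        unsigned char buf[n];            /* VLA: size chosen at run time */
        for (size_t i = 0; i < n; i++)
            buf[i] = (unsigned char)(i + 1);
        for (size_t i = 0; i < n; i++)
            total += buf[i];
    }                                    /* buf's lifetime ends here */
    return total;
}
```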

>
> But once they are bundled into a struct
>
>   struct s
>   {
>     char s0[n0];
>     char s1[n1];
>     char s2[n2];
>   } obj;
>
> GCC no longer has a luxury of implementing each array as an individual
> pointer. It wants to maintain a flat memory layout for a struct object,
> which forces it to generate code that allocates a memory buffer of size
> `n0+n1+n2` and then use arithmetic on-the-fly to locate each individual
> array in that buffer, e.g. to locate the beginning of `s2` it has to
> calculate `n0+n1`.
>
> In other words, the latter implementation does not come "for free": it
> is markedly different from the former.
>

All of n0, n1 and n2 must be known at the time the struct is declared,
and the sizes of the VLAs remain constant even if the variables n0, n1
and n2 are later changed.
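
A small sketch of that rule for an ordinary (non-struct) VLA, with an invented function name: the size expression is evaluated once, when the declaration is reached, and changing the variable afterwards does not resize the array.

```c
#include <stddef.h>

/* Returns sizeof a VLA after its length variable has been modified.
 * Assumes n > 0. The result is the original n, not the doubled one. */
size_t vla_size_after_change(int n)
{
    char a[n];      /* size evaluated here, once */
    n *= 2;         /* later changes to n do not resize a */
    (void)n;
    return sizeof a;
}
```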

So in the first case, the compiler might make three "char *" pointers
with a known relation between them (as they are all allocated on the
stack at the same time). In the second case, it could make one "struct
s *" pointer and use that with known offsets to access the individual
arrays. In each case, the addresses and the data known to the compiler
are the same. There is likely to be no significant difference in
implementation between the two, as the compiler will mix and match
multiple pointers with pointers plus offset, according to what is most
efficient at the time.
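
As a rough sketch of the "one pointer plus offsets" form in portable C: standard C has no VLA struct members, so this imitates the flat layout by hand. struct flat3 and its helper functions are invented names, and this is not a claim about the code GCC actually generates.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hand-rolled stand-in for a struct with three run-time-sized char
 * arrays: one flat buffer plus the sizes captured at creation time. */
struct flat3 {
    size_t n0, n1, n2;
    char *base;          /* n0 + n1 + n2 bytes */
};

/* Allocates the flat buffer; returns 0 on success, -1 on failure.
 * Assumes at least one of the sizes is non-zero. */
int flat3_init(struct flat3 *f, size_t n0, size_t n1, size_t n2)
{
    f->n0 = n0; f->n1 = n1; f->n2 = n2;
    f->base = malloc(n0 + n1 + n2);
    return f->base ? 0 : -1;
}

/* Each array is located by adding the sizes of its predecessors. */
char *flat3_s0(const struct flat3 *f) { return f->base; }
char *flat3_s1(const struct flat3 *f) { return f->base + f->n0; }
char *flat3_s2(const struct flat3 *f) { return f->base + f->n0 + f->n1; }

void flat3_free(struct flat3 *f) { free(f->base); f->base = NULL; }
```

Whether a compiler keeps three pointers or one base plus computed offsets, the information it holds is the same, which is the point made above.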

Re: Padding between char array/VLA in struct?

<seea3q$7a6$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=17807&group=comp.lang.c#17807

Path: i2pn2.org!rocksolid2!i2pn.org!aioe.org!Puiiztk9lHEEQC0y3uUjRA.user.46.165.242.75.POSTED!not-for-mail
From: non...@add.invalid (Manfred)
Newsgroups: comp.lang.c
Subject: Re: Padding between char array/VLA in struct?
Date: Wed, 4 Aug 2021 17:01:14 +0200
Organization: Aioe.org NNTP Server
Message-ID: <seea3q$7a6$1@gioia.aioe.org>
References: <se9b5g$l2n$1@dont-email.me> <se9fa1$ikv$1@dont-email.me>
<87sfzr8umf.fsf@nosuchdomain.example.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gioia.aioe.org; logging-data="7494"; posting-host="Puiiztk9lHEEQC0y3uUjRA.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.12.0
Content-Language: en-US
X-Notice: Filtered by postfilter v. 0.9.2
 by: Manfred - Wed, 4 Aug 2021 15:01 UTC

On 8/2/2021 10:22 PM, Keith Thompson wrote:
> Ian Pilcher <arequipeno@gmail.com> writes:
>> On 8/2/21 12:48 PM, Ian Pilcher wrote:
>>> Given the following:
>>>   struct foo {
>>>       /* other members */
>>>       char prefix[PREFIX_SIZE];
>>>       char name[];
>>>   };
>>
>> As has been pointed out, the 'name' member is a flexible array member,
>> not a variable length array. I apologize for the mistake.
>>
>> Also, PREFIX_SIZE is intended to represent a macro that expands to a
>> positive(!) integer constant expression.
>>
>> The consensus seems to be that no one can think of a good reason that a
>> compiler would insert padding between the 'prefix' and 'name' members,
>> but there's nothing in any of the language standards that would forbid
>> it from doing so. That would certainly explain my failure to find any
>> such rule.
>>
>> (_Static_assert here I come!)
>
> The ABI for whatever target platform you're interested in might have
> something to say about it.

It may be worth pointing out that ABIs set requirements for exported
interfaces - they don't pose restrictions to compilers for accesses to
local data and functions.
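
Whatever the ABI says, the _Static_assert approach mentioned in the quoted text makes the assumption checkable at compile time. A minimal sketch, where the PREFIX_SIZE value and the id member (standing in for "other members") are placeholders:

```c
#include <stddef.h>

#define PREFIX_SIZE 16   /* placeholder value for illustration */

struct foo {
    int id;                       /* stands in for "other members" */
    char prefix[PREFIX_SIZE];
    char name[];                  /* flexible array member */
};

/* Fails to compile if any padding sits between prefix and name. */
_Static_assert(offsetof(struct foo, name) ==
               offsetof(struct foo, prefix) + PREFIX_SIZE,
               "unexpected padding between prefix and name");
```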

>
> (I can imagine that some compiler might find it convenient to give the
> name[] member an alignment stricter than 1 byte.)
>

Re: Padding between char array/VLA in struct?

<seehcc$r4e$1@dont-email.me>


https://www.novabbs.com/devel/article-flat.php?id=17811&group=comp.lang.c#17811

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: cr88...@gmail.com (BGB)
Newsgroups: comp.lang.c
Subject: Re: Padding between char array/VLA in struct?
Date: Wed, 4 Aug 2021 12:05:13 -0500
Organization: A noiseless patient Spider
Lines: 174
Message-ID: <seehcc$r4e$1@dont-email.me>
References: <se9b5g$l2n$1@dont-email.me> <se9cfv$uta$1@dont-email.me>
<875ywnaehw.fsf@nosuchdomain.example.com> <se9htr$5ll$1@dont-email.me>
<sebo28$ig7$1@dont-email.me> <sebpqf$utm$1@dont-email.me>
<sebvpf$bfs$1@dont-email.me>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 4 Aug 2021 17:05:16 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="2d7a2369067cbd151bcbb7e0f56184ff";
logging-data="27790"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+DAeZFUuXk5BhRzKOzPW9X"
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101
Thunderbird/78.12.0
Cancel-Lock: sha1:AJDXxmsnCHhYS78eDZZ4+36Nxok=
In-Reply-To: <sebvpf$bfs$1@dont-email.me>
Content-Language: en-US
 by: BGB - Wed, 4 Aug 2021 17:05 UTC

On 8/3/2021 12:52 PM, BGB wrote:
> On 8/3/2021 11:10 AM, Andrey Tarasevich wrote:
>> On 8/3/2021 8:40 AM, BGB wrote:
>>>
>>> FWIW: My compiler (BGBCC) also supports VLAs, though only for local
>>> variables and similar. In these contexts, they effectively decay into
>>> pointers, and are transformed into an "alloca()" call.
>>> ...
>>> But, as can be noted, my compiler doesn't do anything too fancy to
>>> support "alloca()"; in effect this gets turned by the compiler into a
>>> special runtime call for "call malloc() and then add the returned
>>> pointer to a linked list", and then in the epilog a function call is
>>> added to free anything that was added to the list.
>>>
>>> I also make no claim that multidimensional VLAs "actually work"...
>>> ...
>>
>> It appears that the amount of effort you spent implementing your
>> pseudo-VLA would be quite sufficient to implement the
>> standard-compliant VLA. You just needed to channel that effort
>> slightly differently. Which makes one wonder why you decided to eschew
>> the standard semantics...
>>
>
> Partly it was because this was what seemed easier, and also allowed
> doing pretty much all of the VLA stuff to be handled in the front-end
> (things like lambdas are also handled in the frontend, as are many
> operations involving "variant" types, *1, ...).
>

Also, decaying to a pointer type is easier from a semantics POV, since
fully dealing with VLAs in the type-system would be a lot more
complicated than:
  int foo[n];
effectively decaying into:
  int *foo;
Even if the simple case falls on its face if someone tries to write:
  int bar[m][n];
Or similar.

Closest one can really handle the latter case is to decay it into
multiple levels of indirection, so for:
  int bar[m][n];
  i=bar[x][y];
Might become, say:
  int *bar;
  i=((int ***)bar)[-1][x][y];

Or, a memory blob pointing to the start of the lowest level array,
preceded by a pointer to the top level array, and followed by a number
of array indirection pointers, ...

With the type-system still needing to remember that it was a VLA, and
its number of indirection levels, ...

But, this latter case falls into "there be dragons here" territory.
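
For comparison only: the conventional way to lower a 2-D VLA without extra indirection is row-major index arithmetic on a single flat pointer. This is not the indirection scheme described above, just the usual alternative; idx2 and check_view are made-up helpers.

```c
#include <stddef.h>

/* Row-major lowering: bar[x][y] in an m-by-n array becomes
 * flat[x * n + y]; only the inner dimension n is needed. */
static inline size_t idx2(size_t n, size_t x, size_t y)
{
    return x * n + y;
}

/* Fills a flat m*n block so that element (x, y) holds 10*x + y,
 * then reads one element back through a 2-D VLA pointer view. */
int check_view(size_t m, size_t n, int *flat, size_t x, size_t y)
{
    for (size_t i = 0; i < m; i++)
        for (size_t j = 0; j < n; j++)
            flat[idx2(n, i, j)] = (int)(10 * i + j);

    int (*bar)[n] = (int (*)[n])flat;   /* VLA-typed pointer to the rows */
    return bar[x][y];
}
```

The type system still has to carry n around at run time, which is the part a plain "decay to int *" loses.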

> With the ABI design, variable-sized stack frames would have added more
> complexity relative to fixed-size stack frames (there is no frame
> pointer or similar in this case, and everything on-stack is referenced
> using fixed displacements relative to the stack pointer, and
> prolog/epilog adjustments use fixed offsets, ...).
>
>
> Note that this is generally for a target where:
>   I am using 128K as the default stack size;
>     (Except for interrupt handlers, which use an 8K stack).
>   The target is (or assumes) No-MMU operation;
>   The RAM space is typically measured in MB.
>
> Using malloc ends up preferable for larger memory allocations (VLAs or
> large stack arrays), since it isn't bound by the stack-size limit, and
> doesn't assume spending inordinate amounts of RAM on the program stacks.
>
> Though, a few cases exist where the total RAM is measured in KB (and the
> heap is just sorta wedged between ".bss" and the stack).
>

This does assume that the malloc is fast enough to not have a
significant adverse effect on performance, which seems to generally be
true in this case.
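
The malloc-backed scheme described upthread (allocate, link the block into a list, free the whole list in the epilog) could be sketched roughly as follows. The names fake_alloca, fake_alloca_release and struct fa_node are hypothetical, and this is not BGBCC's actual runtime code.

```c
#include <stddef.h>
#include <stdlib.h>

/* One node per emulated alloca() block; the payload follows the link. */
struct fa_node {
    struct fa_node *next;   /* previously allocated block, if any */
    max_align_t data[];     /* payload, suitably aligned for any type */
};

/* Allocates size bytes and links the block into *list. */
void *fake_alloca(struct fa_node **list, size_t size)
{
    struct fa_node *n = malloc(sizeof *n + size);
    if (!n)
        return NULL;
    n->next = *list;
    *list = n;
    return n->data;
}

/* What the function epilog would call: frees every block on the list
 * and returns how many blocks were released. */
size_t fake_alloca_release(struct fa_node **list)
{
    size_t freed = 0;
    while (*list) {
        struct fa_node *n = *list;
        *list = n->next;
        free(n);
        freed++;
    }
    return freed;
}
```

A compiler using this approach would keep one list head per function activation and emit the release call on every exit path.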

In the average case, most of the time goes into mapping the size to a
"size index" and then retrieving an item from a free-list corresponding
to this index. If one is repeatedly allocating and freeing objects of
the same size, this case works out reasonably fast.

It could be made faster if one could precalculate this index in the
compiler, but this would still be N/A for VLAs.

The size-index is essentially a specialized 8-bit microfloat (E5.F3),
which stores the size in a rounded-up form (relative to an 8-byte unit
size), eg:
  0, 8, 16, 24, 32, 40, 48, 56,        //E=0, Step=8
  64, 72, 80, 88, 96, 104, 112, 120,   //E=1, Step=8
  128, 144, 160, 176, 192, 208, 224, 240, //E=2, Step=16
  256, 288, 320, 352, 384, ...         //E=3, Step=32
  ...
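
One encoding that reproduces the table above, as a sketch: the post does not give the exact bit layout, so the packing (E in the top 5 bits, F in the low 3) and both function names are assumptions. E=0 is treated as a denormal range where F counts 8-byte units directly; for E>=1 the unit count is (8+F) << (E-1).

```c
#include <stdint.h>

/* Decode: index -> block size in bytes. */
uint64_t size_from_index(uint8_t idx)
{
    unsigned e = idx >> 3, f = idx & 7;
    uint64_t units = (e == 0) ? f : (uint64_t)(8 + f) << (e - 1);
    return units * 8;
}

/* Encode: smallest index whose block size is >= size.
 * Assumes size + 7 does not overflow. */
uint8_t index_from_size(uint64_t size)
{
    uint64_t units = (size + 7) / 8;      /* round up to 8-byte units */
    if (units < 8)
        return (uint8_t)units;            /* E=0: F holds the unit count */
    unsigned e = 1;
    while (units > 15) {                  /* normalize mantissa into 8..15 */
        units = (units + 1) >> 1;         /* halve, rounding up */
        e++;
    }
    if (e > 31)
        return 0xFF;                      /* clamp: beyond the 5-bit exponent */
    return (uint8_t)((e << 3) | (units - 8));
}
```

The loop is the per-allocation cost referred to above; for a size known at compile time the index could be folded to a constant.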

For small objects, a cell-based allocator can be used, which allocates
the object as runs of 8-byte cells using an allocation bitmap.

For medium objects, the allocator can use linked-lists of memory blocks.
For large objects, the allocator might use memory pages instead.
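
A toy version of the cell-based idea for small objects, assuming a fixed pool of 64 cells tracked by a single 64-bit bitmap and a first-fit search. The pool size, the names, and the fact that the caller passes the size back on free are all simplifications for the sketch, not details from the post.

```c
#include <stddef.h>
#include <stdint.h>

#define CELL_SIZE 8
#define NUM_CELLS 64

static uint64_t cell_bitmap;   /* bit i set = cell i in use */
static _Alignas(8) unsigned char cell_pool[NUM_CELLS * CELL_SIZE];

/* Builds a mask of 'need' low bits (need is 1..64). */
static uint64_t run_mask(size_t need)
{
    return (need == 64) ? ~UINT64_C(0) : ((UINT64_C(1) << need) - 1);
}

/* Allocates a run of ceil(size/8) contiguous free cells, first fit. */
void *cell_alloc(size_t size)
{
    size_t need = (size + CELL_SIZE - 1) / CELL_SIZE;
    if (need == 0 || need > NUM_CELLS)
        return NULL;
    uint64_t mask = run_mask(need);
    for (size_t start = 0; start + need <= NUM_CELLS; start++) {
        if ((cell_bitmap & (mask << start)) == 0) {
            cell_bitmap |= mask << start;
            return cell_pool + start * CELL_SIZE;
        }
    }
    return NULL;
}

/* Frees a run; the caller supplies the size since no header is kept. */
void cell_free(void *p, size_t size)
{
    size_t need = (size + CELL_SIZE - 1) / CELL_SIZE;
    if (need == 0 || need > NUM_CELLS)
        return;
    size_t start = (size_t)((unsigned char *)p - cell_pool) / CELL_SIZE;
    cell_bitmap &= ~(run_mask(need) << start);
}
```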

>
> *1: The variant types are a non-standard extension which add dynamic
> type-checking via tagged pointers; and turns pretty much every operation
> on these types into a function call into the runtime.
>

FWIW: Given the nature of dynamic tags, there is no way to make them
particularly fast. The compiler does optimize for a few cases where it
knows the tag layout, but given pretty much every operation requires a
sort of dynamic dispatch, it is not viable to do this inline.

For most of it though, the compiler just sort of treats it as-if one had
written it out as a bunch of function calls.

In this case, a scheme is used where 64-bit tagged-references are used
with the tag bits in the high-order bits (can encode "fixnum" and
"flonum" types using 62 bits, with 48 bits available for pointers).

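A sketch of what such a tagged reference might look like for the fixnum case, assuming a 2-bit tag in bits 63..62 and a 62-bit two's-complement payload. The tag values and all names here are hypothetical; the post only gives the bit widths.

```c
#include <stdint.h>

typedef uint64_t tagref;

enum { TAG_SHIFT = 62 };
/* Hypothetical tag assignments; the real values are not given. */
enum tag { TAG_POINTER = 0, TAG_FIXNUM = 1, TAG_FLONUM = 2 };

#define PAYLOAD_MASK ((UINT64_C(1) << TAG_SHIFT) - 1)

/* Extracts the tag from the two high-order bits. */
static inline enum tag tagref_tag(tagref r)
{
    return (enum tag)(r >> TAG_SHIFT);
}

/* Packs a signed integer into the low 62 bits (caller ensures it fits). */
static inline tagref tagref_from_fixnum(int64_t v)
{
    return ((uint64_t)TAG_FIXNUM << TAG_SHIFT) | ((uint64_t)v & PAYLOAD_MASK);
}

/* Recovers the integer by sign-extending from bit 61. */
static inline int64_t tagref_to_fixnum(tagref r)
{
    uint64_t payload = r & PAYLOAD_MASK;
    uint64_t sign = UINT64_C(1) << (TAG_SHIFT - 1);
    return (int64_t)((payload ^ sign) - sign);
}
```

Even in this simple form, every arithmetic operation needs a tag check before it can touch the payload, which is why most of it ends up as runtime calls.
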
As noted, I also have a few features from the C23 proposal lists.

Lambdas:
  int (*fn)(int x);
  int i, j;
  fn = [=](int x)->int { return(x*j); };

Currently only supports capture by value, and an extension syntax:
  fn = __function(int x):int { return(x*j); };

Which differs slightly in that it supports unbounded lifespan (my
understanding of the proposal being that it only supports
automatic lifespan, and doesn't provide any notation to specify dynamic
/ heap-allocated lambdas). Implicitly, capture-by-value does not care
if/when the original stack-frame that created it is destroyed (this
would be a bigger issue for capture-by-reference though).
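
For anyone without a compiler that accepts the lambda syntax, roughly the same capture-by-value behaviour can be written out by hand in standard C. This only illustrates the semantics and is not how BGBCC lowers lambdas; struct mul_closure and make_mul are invented names.

```c
/* A hand-written closure: the captured value is copied into the struct
 * when it is created, so the closure stays valid after the creating
 * stack frame is gone. */
struct mul_closure {
    int (*call)(const struct mul_closure *self, int x);
    int j;                      /* captured by value */
};

/* Body of the lambda: return(x*j), with j read from the closure. */
static int mul_closure_call(const struct mul_closure *self, int x)
{
    return x * self->j;
}

/* Creates the closure, copying j at this point in time. */
struct mul_closure make_mul(int j)
{
    struct mul_closure c = { mul_closure_call, j };
    return c;
}
```

Calling it looks like fn.call(&fn, 7). With capture by reference the struct would hold an int * instead, and that pointer would dangle once the creating frame returned.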

Variable-sized integers:
  _BitInt( 96) li; //96-bit integer (padded to 128 bits)
  _BitInt(192) lj; //192 bits (padded to 256 bits)
  _BitInt(384) lk; //384 bits
For sizes <= 128 bits, falling back to a corresponding integer type.

For sizes larger than 128 bits, padding to a multiple of 128 bits, and
implementing most operations via runtime calls.
In terms of the implementation, they have behavior partway between
structs and arrays (they have struct-like semantics, but in terms of the
type-system are represented more like a special sub-type of an array of
128-bit elements).

Note that 128-bit integers can at-least be semi-competently handled in
the base ISA in this case, so many operations can be handled inline
(more so if the use of 128-bit ALU operations is enabled in the compiler).

No support yet for "_Generic", but partly this is because I have yet to
run into many cases where using it "makes sense".

Support for features from newer standards is a bit cherry-picked though,
mostly "stuff that seemed useful" (along with features which overlap
with my own custom language).

....

Re: Padding between char array/VLA in struct?

<87fsvp879i.fsf@nosuchdomain.example.com>


https://www.novabbs.com/devel/article-flat.php?id=17813&group=comp.lang.c#17813

Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: Keith.S....@gmail.com (Keith Thompson)
Newsgroups: comp.lang.c
Subject: Re: Padding between char array/VLA in struct?
Date: Wed, 04 Aug 2021 10:11:21 -0700
Organization: None to speak of
Lines: 42
Message-ID: <87fsvp879i.fsf@nosuchdomain.example.com>
References: <se9b5g$l2n$1@dont-email.me> <se9fa1$ikv$1@dont-email.me>
<87sfzr8umf.fsf@nosuchdomain.example.com>
<seea3q$7a6$1@gioia.aioe.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Injection-Info: reader02.eternal-september.org; posting-host="99395c8d90a6c5103d50d5b3645d1b26";
logging-data="27407"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/zQEEKvkEk59JvH7XR6OMr"
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)
Cancel-Lock: sha1:xDUHi5YT205zbDMnB6NhO2oicdc=
sha1:Mpi7ou7bGekiwmFRF3BjTqTbLk0=
 by: Keith Thompson - Wed, 4 Aug 2021 17:11 UTC

Manfred <noname@add.invalid> writes:
> On 8/2/2021 10:22 PM, Keith Thompson wrote:
>> Ian Pilcher <arequipeno@gmail.com> writes:
>>> On 8/2/21 12:48 PM, Ian Pilcher wrote:
>>>> Given the following:
>>>>   struct foo {
>>>>       /* other members */
>>>>       char prefix[PREFIX_SIZE];
>>>>       char name[];
>>>>   };
>>>
>>> As has been pointed out, the 'name' member is a flexible array member,
>>> not a variable length array. I apologize for the mistake.
>>>
>>> Also, PREFIX_SIZE is intended to represent a macro that expands to a
>>> positive(!) integer constant expression.
>>>
>>> The consensus seems to be that no one can think of a good reason that a
>>> compiler would insert padding between the 'prefix' and 'name' members,
>>> but there's nothing in any of the language standards that would forbid
>>> it from doing so. That would certainly explain my failure to find any
>>> such rule.
>>>
>>> (_Static_assert here I come!)
>> The ABI for whatever target platform you're interested in might have
>> something to say about it.
>
> It may be worth pointing out that ABIs set requirements for exported
> interfaces - they don't pose restrictions to compilers for accesses to
> local data and functions.

Sure, but it would be bizarre to use the ABI-imposed layout for struct foo
if it's exported, but a different layout if it isn't.

>> (I can imagine that some compiler might find it convenient to give
>> the name[] member an alignment stricter than 1 byte.)

--
Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
Working, but not speaking, for Philips
void Void(void) { Void(); } /* The recursive call of the void */
