novaBBS - comp.lang.c - Re: destructor for C? (static analisys)

On 2/2/2023 1:01 PM, Öö Tiib wrote:
> On Thursday, 2 February 2023 at 03:14:32 UTC+2, BGB wrote:
>> On 2/1/2023 4:51 PM, Öö Tiib wrote:
>>> On Wednesday, 1 February 2023 at 22:05:17 UTC+2, Thiago Adams wrote:
>>>>
>>>> The main feedback I would like to have is :
>>>>
>>>> What do you think about the idea of having a 100% compile time
>>>> guarantee that some function is called in C? (Similar of C++ destructor
>>>> guarantee but in this case we need to call it manually)
>>>
>>> What I would like is that tools as Valgrind are made more accessible to
>>> novices and that everybody are taught to use those tools. That may be
>>> hard but at least feels worth of any effort, however little.
>>>
>> In my project (C compiler + ISA), I do sort-of have things like an
>> experimental feature for bounds-checked arrays (with partial ISA level
>> support).
>>
>> Currently, it only applies to local (stack based) arrays, and to global
>> arrays. Applying it to malloc or similar (in any transparent way) would
>> be non-trivial. Could work maybe if one had a special "malloc_array()"
>> or similar, but otherwise wouldn't really play as nicely with code which
>> implements custom memory allocators.
>>
>>
>> One other drawback is that I had to shoe-horn the bounds-check data into
>> 12 bits, which only really allows for approximate bounds checks:
>> E5.F3 minifloat giving the main array size, and 4-bits giving a
>> denormalized offset bias (shares the same exponent scale).
>>
>> Some trickery was used to allow a "LEAT.B" operation which combines a
>> LEA with updating the bounds-check bias, using a carry-flag out of the
>> low bits when adjusting the bias.
>>
>> Some limitations of the scheme though is that it can't effectively
>> encode out-of-bounds offsets, and requires padding arrays and
>> bounds-checks slightly to deal with "error".
>>
>>
>>
>> Otherwise, it is basically invisible to normal C code.
>> Performance impact seems to be fairly small, though it does have some
>> cost in terms of code density.
>>
>> Does have a minor ABI impact in that array pointers originating in code
>> which has the feature may not necessarily work "flawlessly" if used in
>> code built without bounds-checks (some operations, such like calculating
>> the difference between pointers, or performing comparisons, require
>> stripping off the high-order bits in the bounds-checked case).
>>
>> Granted, both cases could be partially addressed if there were a
>> "Subtract but sign extend from the low 48 bits" operation.
>
> Maybe yes can encode couple bits in pointer as meta-information and
> so gain hardware support to some of it ... but approximate bounds
> sound bad. Off-by-one is most common violation of bounds.
> Also how can it work with nested object like array in struct in array?
>

It doesn't really work for off-by-1, since by the time bounds are tight
enough to detect an off-by-1, they are also tight enough to get
false-positive bounds failures.

For stack and global arrays, it pads the array slightly, such that
"known good" accesses will not give false-positives, and such that
"slightly out of bounds" accesses should not hit into any other memory
(so, arrays are padded with a small "no man's land" area).
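To make the "approximate" part concrete, here is a minimal sketch of how an E5.F3 size code ends up rounding an array size upward. The exact bit layout and the function names (bounds_decode, bounds_encode_up, bounds_slack) are assumptions for illustration, not the actual BGBCC/ISA encoding, and the 4-bit offset bias is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 8-bit size code: 5-bit exponent in the high bits,
 * 3-bit fraction in the low bits (an assumed layout). */
uint64_t bounds_decode(unsigned code)
{
    unsigned e = (code >> 3) & 31;
    unsigned f = code & 7;
    if (e == 0)
        return f;                          /* denormal range: sizes 0..7 are exact */
    return (uint64_t)(8u | f) << (e - 1);  /* implicit leading 1 bit */
}

/* Smallest code whose decoded size is >= n, so the encoded bound
 * is never tighter than the real array. */
unsigned bounds_encode_up(uint32_t n)
{
    uint64_t m = n;
    unsigned e = 1;
    if (n < 8)
        return n;
    while (m >= 16) {
        m = (m + 1) >> 1;                  /* halve, rounding up */
        e++;
    }
    assert(e <= 31);                       /* always fits in 5 bits for 32-bit n */
    return (e << 3) | (unsigned)(m & 7);
}

/* Bytes between the real size and the encodable bound; this is the
 * part that padding has to cover. */
uint64_t bounds_slack(uint32_t n)
{
    return bounds_decode(bounds_encode_up(n)) - n;
}
```

With this layout a 100-byte array gets an encoded bound of 104, so an access at index 100 passes the check. That is the off-by-1 problem: the slack stays under 1/8 of the encoded size, but it is rarely zero.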

This doesn't really work for structs though.
Currently, arrays within structs are not bounds-checked.

Though, the scheme could handle this, as its only "real" requirement is
that the bounds are known at compile time. However, if the bounds are
padded, it would not reliably detect out-of-bounds access.

For my other language, the idea will be that the bounds will not be
padded, in which case the bounds-check will fall back to a handler which
will either verify that the access is actually valid, or raise an
exception (in C mode, if the initial check fails, it immediately
generates a fault).

But, as noted, there are limits to what I can do with a 64-bit pointer
size. More accurate bounds checks would require using 128-bit pointers,
which have a much bigger set of drawbacks.
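As a rough illustration of what living inside a 64-bit pointer means, here is a sketch of a 12-bit tag kept above a 48-bit address, and of the "strip the high bits" step needed before differences or comparisons. The bit positions (tag in bits 48..59) and all of the function names are assumptions for the example, not the real layout, and it works on plain uint64_t values rather than actual pointers.

```c
#include <stdint.h>

#define ADDR_BITS 48
#define ADDR_MASK ((UINT64_C(1) << ADDR_BITS) - 1)

/* Assumed layout: 48-bit address in the low bits, 12-bit bounds tag
 * in bits 48..59. */
uint64_t tag_pointer(uint64_t addr, unsigned tag12)
{
    return (addr & ADDR_MASK) | ((uint64_t)(tag12 & 0xFFFu) << ADDR_BITS);
}

/* Drop the tag and sign-extend the address from bit 47. */
int64_t strip_tag(uint64_t p)
{
    uint64_t a = p & ADDR_MASK;
    int64_t v = (int64_t)a;                /* a < 2^48, so this is exact */
    if (a & (UINT64_C(1) << (ADDR_BITS - 1)))
        v -= (int64_t)1 << ADDR_BITS;
    return v;
}

/* Pointer difference in bytes, ignoring whatever tags are attached. */
int64_t tagged_diff(uint64_t a, uint64_t b)
{
    return strip_tag(a) - strip_tag(b);
}

/* Ordering comparison, ignoring tags. */
int tagged_less(uint64_t a, uint64_t b)
{
    return strip_tag(a) < strip_tag(b);
}
```

Code that is unaware of the tags would subtract the raw 64-bit values instead, and gets garbage whenever the two tags differ, which is the ABI issue mentioned earlier.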

It does generally help detect cases where things go "clearly out of
bounds" though. But, sadly, doesn't do as much to help detect cases
where the compiler itself is buggy.

And it is, at least, "better than nothing".
It is generally used alongside stack canaries and some other features.

>>> Otherwise neither C nor C++ (nor garbage collection like in C#) does
>>> guarantee that life-time of objects in program is well managed and
>>> adding more syntax sugar might only confuse that fact.
>>>
>> Automatic management of object lifetime, or cost-effectively determining
>> this at compile or runtime, is non-trivial. Otherwise, if it were not
>> non-trivial, garbage collection would effectively be a solved issue.
>>
>>
>> In one of my own languages, I had partially addressed the issue by
>> making it semi-explicit:
>> "new TypeDesc" / "new Class(...)", allocate heap object.
>> "new! TypeDesc" / "new! Class(...)", allocate with automatic lifetime.
>> "new(ZoneID) TypeDesc" / "new(ZoneID) Class(...)", allocate within a
>> given zone.
>>
>> Formally, this language does not use a garbage collector by default.
>>
>>
>> In this case, "new!" uses the same underlying mechanism as "alloca()" in
>> my case, which is in turn implemented by internally allocating stuff on
>> the heap and adding it to a linked list associated within the current
>> stack frame (with a callback function to either free the object or to
>> invoke a destructor).
>>
>> So, say:
>> alloca(size);
>> Becomes, essentially:
>> __alloca(&alloca_head, size);
>>
>> And, when the function returns:
>> __alloca_end(&alloca_head);
>> Which frees everything in the list.
>>
>> And, on entry:
>> __alloca_start(&alloca_head);
>> Which initializes the list (mostly sets it to NULL).
>>
>> Where each object essentially has a small hidden header:
>> {
>> void *next; //next object in list
>> void (*doFree)(void *obj); //called to free the object
>> char data[]; //data area for object
>> }
>>
>> Which internally, essentially just calls malloc/free for the backing
>> memory (where the ABI design in this case only really accommodates
>> fixed-size stack frames).
>>
>> For class object types (in my language), this would essentially call its
>> 'delete' handler (which in turn calls the object's destructor method,
>> and then calls free).
>>
>>
>> Experimentally, VLAs also exist, and were implemented with the same
>> underlying mechanism. As can be noted, with this implementation,
>> manually using malloc/free for the array is generally more efficient.
>>
>>
>> Or, at least, something to this effect...
>
> Sounds a bit like std::unique_ptr of C++ but ... the unique_ptr is not
> noticeably different from manual malloc/free in performance.

Dunno.

I don't really have a full C++ implementation in BGBCC (and none of the
C++ standard library).

So, in this case, it is basically a mechanism built into the compiler,
and used to implement "alloca", VLAs, lambdas, and other objects with an
automatic lifetime.
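For reference, the linked-list mechanism described in the quoted text can be sketched in plain C roughly like this. The names (frame_start, frame_alloc, frame_end, auto_release, and the demo_ helpers) are made up for the example; the real runtime calls are the __alloca ones quoted above, and the compiler would emit the start/end calls itself.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hidden header in front of each automatic object. */
struct auto_hdr {
    struct auto_hdr *next;          /* next object in this frame's list */
    void (*do_free)(void *obj);     /* called to free (or destroy) the object */
    unsigned char data[];           /* data area for the object */
};

/* On function entry: start with an empty list. */
void frame_start(struct auto_hdr **head) { *head = NULL; }

/* Default callback: release the backing memory for an object. */
void auto_release(void *obj)
{
    free((unsigned char *)obj - offsetof(struct auto_hdr, data));
}

/* Heap-allocate an object and link it into the frame's list. */
void *frame_alloc(struct auto_hdr **head, size_t size, void (*do_free)(void *))
{
    struct auto_hdr *h = malloc(sizeof *h + size);
    if (!h)
        return NULL;
    h->do_free = do_free ? do_free : auto_release;
    h->next = *head;
    *head = h;
    return h->data;
}

/* On function return: free everything in the list. */
void frame_end(struct auto_hdr **head)
{
    struct auto_hdr *h = *head;
    while (h) {
        struct auto_hdr *next = h->next;   /* read before the callback frees h */
        h->do_free(h->data);
        h = next;
    }
    *head = NULL;
}

/* Demo "destructor": count the call, then release the memory. */
int demo_freed;
void demo_counting_free(void *obj) { demo_freed++; auto_release(obj); }

/* Stand-in for a compiled function; returns how many objects were
 * released by frame_end. */
int demo_function(int n)
{
    struct auto_hdr *head;
    int i, before;
    frame_start(&head);
    for (i = 0; i < n; i++) {
        int *p = frame_alloc(&head, sizeof *p, demo_counting_free);
        if (p)
            *p = i;
    }
    before = demo_freed;                   /* nothing is freed inside the loop */
    frame_end(&head);
    return demo_freed - before;
}
```

Note that allocations made in a loop pile up until the function returns, and every one of them costs a malloc/free pair, which is why plain malloc/free tends to win for arrays.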

The mechanism and behavior are slightly different from the C++ RAII
mechanism though, in that mine deals with "any automatic allocations
within the scope of a function" (in a linked list sense), whereas RAII
deals with it per-object and for each block-scope.

So, say:
    if(cond)
    {
        Foo a;
        ...
    }
Would turn into something like, say:
    if(cond)
    {
        Foo a;
        Foo::ctor(&a);
        ...
        Foo::dtor(&a);
    }

But, this is not exactly how my alloca mechanism works (with all
destructor calls being delayed until the point where the parent function
returns).
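The difference in ordering can be shown with a small C sketch that logs when the constructor and destructor run in each style. Everything here (struct Foo, foo_ctor, foo_dtor, and the trace helpers) is hypothetical and only stands in for what a compiler would generate.

```c
#include <string.h>

struct Foo { char *log; };

void foo_ctor(struct Foo *f, char *log) { f->log = log; strcat(log, "ctor "); }
void foo_dtor(struct Foo *f) { strcat(f->log, "dtor "); }

/* RAII-style lowering: the destructor runs at the end of the block. */
void block_scoped(int cond, char *log)
{
    if (cond) {
        struct Foo a;
        foo_ctor(&a, log);
        strcat(log, "body ");
        foo_dtor(&a);
    }
    strcat(log, "rest ");
}

/* Function-level lowering: the destructor is delayed until the
 * enclosing function is about to return. */
void function_deferred(int cond, char *log)
{
    struct Foo a;
    int live = 0;
    if (cond) {
        foo_ctor(&a, log);
        live = 1;
        strcat(log, "body ");
    }
    strcat(log, "rest ");
    if (live)
        foo_dtor(&a);
}

/* Helpers that run one variant and return its event log. */
const char *trace_block_scoped(int cond)
{
    static char log[64];
    log[0] = '\0';
    block_scoped(cond, log);
    return log;
}

const char *trace_function_deferred(int cond)
{
    static char log[64];
    log[0] = '\0';
    function_deferred(cond, log);
    return log;
}
```

The block-scoped version logs "ctor body dtor rest", while the deferred version logs "ctor body rest dtor", so the object outlives its block in the second case.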

My other language (BS2) has a "zone" system, which operates in a way
vaguely influenced by the "Z_Malloc" system in the Doom engine.

It differs in a few points though:
Z_Malloc uses 8-bit ZoneID's, but this uses 16-bit;
Z_Malloc uses relative comparison for ClearZone, but mine uses bit-mask
and equality ("(zoneid&zonemask)==target", object gets freed).
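A minimal sketch of the mask-and-equality rule, with a finalizer callback, might look like the following. The names (zone_alloc, zone_clear, struct zone_obj) and the single global list are assumptions for the example; this is not the actual BS2 runtime or its C library extension API.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hidden header for a zone-tagged heap object. */
struct zone_obj {
    struct zone_obj *next;
    uint16_t zone_id;                /* 16-bit zone tag */
    void (*finalize)(void *data);    /* optional destructor/finalizer */
    unsigned char data[];
};

static struct zone_obj *zone_list;   /* all live zone objects */

/* Allocate an object tagged with zone_id. */
void *zone_alloc(uint16_t zone_id, size_t size, void (*finalize)(void *))
{
    struct zone_obj *o = malloc(sizeof *o + size);
    if (!o)
        return NULL;
    o->zone_id = zone_id;
    o->finalize = finalize;
    o->next = zone_list;
    zone_list = o;
    return o->data;
}

/* Free every object where (zone_id & mask) == target, running its
 * finalizer first. Returns the number of objects freed. */
int zone_clear(uint16_t mask, uint16_t target)
{
    struct zone_obj **link = &zone_list;
    int freed = 0;
    while (*link) {
        struct zone_obj *o = *link;
        if ((o->zone_id & mask) == target) {
            *link = o->next;         /* unlink before freeing */
            if (o->finalize)
                o->finalize(o->data);
            free(o);
            freed++;
        } else {
            link = &o->next;
        }
    }
    return freed;
}
```

For example, zone_clear(0xFF00, 0x0100) frees everything in zones 0x0100..0x01FF in one pass, while zone_clear(0xFFFF, id) frees exactly one zone. A range-style comparison cannot select by arbitrary bit-fields this way.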

In my language, destroying an object with a ClearZone may also cause
any destructors/finalizers to run.

Had considered possibly supporting reference-counting, but reference
counting is "pretty steep" in some ways.

Some of this is supported in C mode via library extensions (one can
assign both zone-tags and type-tags to malloc'ed objects; and register
finalizers and similar via callback functions).

Well, and admittedly, some parts of the BS2 language work mostly by
transforming the operator into a function call.

So, generally, the "fastest" way to use BS2 is basically to treat it
like it is C (or a limited form of C++), which kinda lessens its advantage.

Main reason though for lackluster C++ support, is that writing a
full-featured C++ compiler looks like too much of an uphill battle for a
single-person project.

Note that BS2 also partly shares the C library in this case (rather than
supplying its own class library). If I did supply a class library, it
would likely be a bit more minimalist if compared with something like
Java or similar, likely mostly a fairly thin wrapper over the C library.

Though, as noted, the language does not impose Java-style structuring,
and top-level structure is more free-form like C and C++ (or, C# if it
allowed declaring functions directly in namespaces or in the toplevel).

Where the current implementation also implicitly assumes that any
toplevel declarations in the global namespace also follow C ABI rules
(no name mangling or similar, ...). Though, formally, one is still
supposed to use the "native" keyword.

Partly this is because requiring all functions to be static methods in a
class was, IMHO, pointless and stupid...

....
