Rocksolid Light

Welcome to novaBBS (click a section below)

mail  files  register  newsreader  groups  login

Message-ID:  

E Pluribus Unix


devel / comp.lang.c / Re: Libraries using longjmp for error handling

Re: Libraries using longjmp for error handling

<86ediee28o.fsf@linuxsc.com>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=29492&group=comp.lang.c#29492

  copy link   Newsgroups: comp.lang.c
Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: tr.17...@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.c
Subject: Re: Libraries using longjmp for error handling
Date: Sun, 01 Oct 2023 09:28:23 -0700
Organization: A noiseless patient Spider
Lines: 220
Message-ID: <86ediee28o.fsf@linuxsc.com>
References: <65SQM.565977$9o89.411905@fx05.ams4> <wwvpm23d9fr.fsf@LkoBDZeT.terraraq.uk> <pan$6f4ce$6edf891e$2b5c40c1$f8f989c1@invalid.invalid> <AE3RM.402236$kCld.195109@fx08.ams4> <871qejf62x.fsf@bsb.me.uk> <86y1gphiur.fsf@linuxsc.com> <877co9d8m2.fsf@bsb.me.uk> <86bkdlghqr.fsf@linuxsc.com> <8734ywbvek.fsf@bsb.me.uk> <86cyy0fee5.fsf@linuxsc.com> <87ttrbclcj.fsf@bsb.me.uk> <86r0mfdng1.fsf@linuxsc.com> <87il7qcsfz.fsf@bsb.me.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: dont-email.me; posting-host="5df80bfa77615176987b6091c1d3373e";
logging-data="1736391"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18qKmSB3d1Nsgjge2wTCxQlPdzv0GQi7D0="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:4f5LwAg6viYffFTpk5HkP6zT6vU=
sha1:QURjTR8eFB9rFUFmWF/pnC+VwjU=
 by: Tim Rentsch - Sun, 1 Oct 2023 16:28 UTC

Ben Bacarisse <ben.usenet@bsb.me.uk> writes:

> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>
>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>
>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>
>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>
>>>>> Tim Rentsch <tr.17687@z991.linuxsc.com> writes:
>>>>>
>>>>>> Ben Bacarisse <ben.usenet@bsb.me.uk> writes:
>>>>>>
>>>>>> [..how should errors be handled..]
>>>>>>
>>>>>>> How would you prefer these sorts of thing to be signalled?
>>>>>>
>>>>>> Let me give just one example.
>>>>>>
>>>>>> I wrote some code not long ago to add a value to a set of values,
>>>>>> where the set is represented using a recursive binary tree structure
>>>>>> (like red-black trees, but a little different). Usually we may
>>>>>> expect that a request to add a value would offer a value not already
>>>>>> included in the set, but certainly it can happen that there is a
>>>>>> call to add a value that is already included, in which case the top
>>>>>> level return value should be the original tree structure.
>>>>>>
>>>>>> I thought it would be easier to write a simple recursive routine
>>>>>> that assumes the to-be-added value is not yet included in the set,
>>>>>> and if that assumption is violated raise an exception that is caught
>>>>>> at the outermost level and simply returns the original argument tree
>>>>>> value. Certainly the code could have been written to handle the
>>>>>> already-present situation at each level of the recursion, but it was
>>>>>> much easier and much cleaner to handle it by raising an exception.
>>>>>
>>>>> This is one of those cases where I just have to take your word for it,
>>>>> (and I am happy to do that).
>>>>
>>>> Responding here to just this one part.
>>>>
>>>> Here is a simpler version of the code, for an ordinary
>>>> binary tree, and without any rebalancing:
>>>>
>>>> module TreeSet = struct
>>>> type 'a treeset = Empty | Node of 'a treeset * 'a * 'a treeset
>>>> exception Present
>>>>
>>>> let add tree element =
>>>> let rec add t e =
>>>> match t with
>>>> | Empty -> Node( Empty, e, Empty )
>>>> | Node( a, v, b ) ->
>>>> if e < v then Node( add a e, v, b ) else
>>>> if e > v then Node( a, v, add b e ) else
>>>> raise Present
>>>> in
>>>> try add tree element with Present -> tree
>>>>
>>>> end
>>>
>>> I'm not getting it. I'd write 'else t' rather than 'else raise
>>> Present' and I could then do away with the wrapping 'add' function.
>>
>> Doing that would result in returning a valid tree, but it would also
>> needlessly replicate all nodes starting from the point where the
>> match was found and going up to the root. The consequence is more
>> cycles used and more demand on the memory allocator.
>
> How can OCaml avoid making the allocations until it knows the exception
> won't be raised? Are they not all still being done?

No, no allocations are done if the to-be-added element is found
(and so an exception is raised). Consider one of the lines where a
Node is being constructed (which is synonymous with an allocation
occurring):

if e < v then Node( add a e, v, b ) else

The call to the Node() constructor doesn't take place until the
recursive call to 'add' returns. But if an equal value is found,
then an exception is raised, and the call to 'add' does not /ever/
return. No return from 'add' means no Node() is constructed.

To say this another way, looking for a point where an insertion
should take place, or where there is an equal value, happens going
/down/ the call chain. Node() allocations, however, happen only
when we are coming back /up/ the call chain. Raising an exception
"short circuits" all of those pending returns: they never happen,
and so no Node()s are constructed.

>> The technique
>> of raising an exception avoids those shortcomings (and also it
>> preserves tree identity, which can be important in some situations).
>
> I agree about the identity. I am not used to relying on the identity in
> situations like this (probably because of Haskell!) but I might find
> some time to work out how to preserve identity without an exception. I
> fear it will be messier that your simple solution.

SPOILER ALERT - at the end of this posting I am including code
for a version of add (called add') that does not use exceptions
and preserves identity when the "new" element is already present.

>>> I worry we are getting away from C, but the point is likely to be
>>> much simpler to make in OCaml (or Caml).
>>
>> I have another example. The example doesn't include any C code, but
>> I will talk about some concerns that pertain to C.
>>
>> Suppose we are writing a compiler for a large programming language.
>> The compiler should make use of available resources, especially
>> memory resources, when it can, but also should be able to compile
>> large programs even when the amount of memory available is much more
>> limited. How much memory is available is not known in advance; we
>> find out when a call to malloc() or realloc() returns NULL. I note
>> in passing that the PL/I(F) compiler from IBM could translate full
>> PL/I even if limited to only 44 K bytes of user memory; to do that
>> it needed 110 physical passes over source text or structures created
>> and stored in disk files.
>>
>> The idea in essence is to have two compilers in one: one taking a
>> conventional approach using regular dynamic memory allocation (in
>> other words malloc() etc), and the other using an IBM-style multiple
>> passes scheme that stores intermediate data in disk files. To
>> compile a program we start the first compiler under an umbrella of
>> setjmp(), which going forward assumes all memory allocations will
>> succeed. To allocate memory we call wrapper functions for malloc()
>> or realloc(), which will do a longjmp() if an allocation fails.
>> Upon receipt of the longjmp() "exception" the outer setjmp() call
>> starts over using the second approach, which is of course much
>> slower but also much less demanding of main memory.
>>
>> To make this work we might have to track malloc()'s and free()'s so
>> that any not-yet-released memory can be reclaimed before beginning
>> the alternate system, but I think that is straightforward and not in
>> need of any further explanation.
>
> I am not convinced it's neater to use longjmp here. I prefer to write
> code where allocation failures just propagate up the call stack. That
> would give me a top-level failure return from the try_fast_compile()
> function call.

There was a time when I would normally handle unusual conditions
locally, and pooh-poohed using exceptions (and setjmp()/longjmp()).
Working in OCaml has caused me (I speculate) to revise my thoughts
on the question.

SPOILER ALERT - continue scrolling only when ready to see the code
for an exception-free version of the add function...

Here it is...

let add' tree element =
let rec add t e =
match t with
| Empty -> Some (Node( Empty, e, Empty ))
| Node( a, v, b ) ->
if e < v then (
match add a e with
| None -> None
| Some a' -> Some( Node( a', v, b ) )
) else if e > v then (
match add b e with
| None -> None
| Some b' -> Some( Node( a, v, b' ) )
) else None
in
match add tree element with
| Some t -> t
| None -> tree

The original version is 10 lines, the revised version 18 lines.

That ratio is higher than what I would expect for an analogous
change to the full version (which is much longer than the simple
non-rebalancing version, because it has more cases to deal with,
and needs to keep the tree balanced), but not a lot higher - maybe
only 40 or 50% longer instead of 80% longer.

SubjectRepliesAuthor
o Libraries using longjmp for error handling (was: Re: More on NNTP

By: Blue-Maned_Hawk on Wed, 27 Sep 2023

61Blue-Maned_Hawk
server_pubkey.txt

rocksolid light 0.9.81
clearnet tor