Re: Memory allocators and reporting the allocation size

From: james.ha...@gmail.com (James Harris)
Newsgroups: comp.lang.misc
Subject: Re: Memory allocators and reporting the allocation size
Date: Tue, 12 Dec 2023 19:03:24 +0000
Organization: A noiseless patient Spider
Lines: 264
Message-ID: <ulaaps$3p37o$1@dont-email.me>
References: <tlqed3$1016q$1@dont-email.me> <tlqpmb$11hul$1@dont-email.me>
<4fbd07fd-e923-4f4e-9a02-ae3d4e6911d8n@googlegroups.com>
<tm1mvg$1u5gj$3@dont-email.me>
In-Reply-To: <tm1mvg$1u5gj$3@dont-email.me>
Content-Language: en-GB
 by: James Harris - Tue, 12 Dec 2023 19:03 UTC

On 28/11/2022 07:11, David Brown wrote:
> On 26/11/2022 14:05, James Harris wrote:
>> On Friday, 25 November 2022 at 16:15:08 UTC, David Brown wrote:
>>> On 25/11/2022 14:02, James Harris wrote:
>>>> I will get back to you guys on other topics but I need to improve
>>>> the memory allocator I use in my compiler and that has led to the
>>>> following query.
>>>>
>>>> Imagine an allocator which carries out plain malloc-type
>>>> allocations (i.e. specified as a number of 8-bit bytes suitably
>>>> aligned for any type) without having to have compatibility with
>>>> malloc, and in a language which is not C.
>>>>
>>>> For the sake of discussion the set of calls could be
>>>>
>>>> m_alloc m_calloc m_realloc m_resize m_free
>>> In a new language, I would not follow the names from C as closely.
>>> For one thing, it could confuse people - they may think they do the
>>> same thing as in C. For another, "calloc" in particular is a silly
>>> name. And you don't need "realloc" and "resize".
>>
>> The idea is that realloc could move the allocation whereas resize
>> could only resize it in place. The latter is not present in C's
>> malloc family.
>>
>
> Where would such a function be useful?  If the application needs to
> expand an object, it needs to expand it - if that means moving it, so be
> it.  I can't see where you would have use for a function that /might/ be
> able to expand an object.  Do you have /real/ use-cases in mind?

Sure. Imagine implementing a rope data structure

https://en.wikipedia.org/wiki/Rope_(data_structure)

or anything which could be in parts and/or of varying size.

>
>
>>>
>>> It can make sense to distinguish between "alloc_zeroed" and
>>> "alloc_uninitialised", for performance reasons.
>>
>> Agreed, but what exactly would you want for the calls? Maybe
>>
>> m_alloc(size) m_alloc_a(size, flags)
>>
>> ?
>>
>
> There could be many different interfaces.  Dmitry also pointed out other
> possible distinctions for allocation of shared memory, unpageable
> memory, and so on.  Not all memory is created equal!
>
> Maybe you want a more object-oriented interface, for supporting several
> different pools or memory allocation choices.

There is use for (and, I would argue, a need for) various different
types of allocator. Some allocators are good for fixed-size objects,
some for space which does not need to be returned until a program
terminates, and some for where releases are in the opposite order from
requests, for example.

But such allocators impose restrictions on how they can be used or how
they can be implemented, e.g. requiring meta space to manage the storage
space.

There is still a need for one approach to be completely general:

* not requiring fixed-size allocations
* not requiring separate data structures to describe data space
* permitting arbitrary order of allocation and deallocation

Once one has a completely general allocator then it can take control of
available address space and other, more specialised allocators can be
built on top of it - e.g. by requesting meta space and data space for an
array-style allocator.
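
As a rough illustration of layering a specialised allocator on a general one, here is a minimal C sketch of an arena for "space which does not need to be returned until a program terminates". The names arena, arena_init, arena_alloc and arena_release are hypothetical, and malloc merely stands in for the general allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical arena: one region obtained from the general allocator,
   handed out front to back and released all at once. */
typedef struct {
    unsigned char *base;   /* start of the region */
    size_t capacity;       /* bytes obtained from the general allocator */
    size_t used;           /* bytes handed out so far */
} arena;

/* Returns 0 on success, -1 if the general allocator refuses the request. */
int arena_init(arena *a, size_t capacity)
{
    a->base = malloc(capacity);   /* malloc stands in for the general allocator */
    a->capacity = (a->base != NULL) ? capacity : 0;
    a->used = 0;
    return (a->base != NULL) ? 0 : -1;
}

/* Bump allocation: no per-object header and no individual free. */
void *arena_alloc(arena *a, size_t size)
{
    const size_t align = _Alignof(max_align_t);
    assert(a->used <= a->capacity);
    size_t start = (a->used + align - 1) / align * align;   /* round up */
    if (start > a->capacity || size > a->capacity - start)
        return NULL;
    a->used = start + size;
    return a->base + start;
}

/* Give the whole region back to the general allocator in one call. */
void arena_release(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

The arena's own bookkeeping is the "meta space" and the region behind base is the "data space"; both come from the general allocator underneath.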

...

>
> The one thing I would avoid at all costs, in any interfaces, is the
> "integer flag" nonsense found in many languages (such as C).  If you
> really want some kind of flags, then at least be sure your language has
> strong type-checked enumerations.

I presume you mean not to combine flags into an integer such as

mem_allocator_call(size, flags)

where flags is such as

ALLOC_DATA | ALLOC_WRITABLE

but what would you use instead of an integer?

>
>>
>>>>
>>>> Would there be any value in giving the programmer a way to find
>>>> out the size of a given allocation? I don't mean the size
>>>> requested but the size allocated (which would be greater than or
>>>> equal to the size requested).
>>>>
>>> No. It is pointless.
>>>
>>> Why would anyone want to know the result of such a function? The
>>> only conceivable thought would be if they had first allocated space
>>> for x units, and then they now want to store y units - calling
>>> "get_real_size" could let them know if they need to call "resize"
>>> or not.
>>>
>>> The answer is that they should simply call "resize". If "resize"
>>> does not need to allocate new memory because the real size is big
>>> enough, it does nothing.
>>
>> It's partly for performance, as Bart's comments have backed up. If
>> code can find out the capacity then it can fill that allocation up to
>> the stated capacity without having to make any other calls and
>> without risking moving what has been stored so far.
>>
>
> There is no performance benefit in real code - and you should not care
> about silly cases.

On the contrary, there is huge potential for real benefit in real code.
There is no advantage in toy code. That's the difference.

>
>>>> I ask because it's simple enough to precede an aligned chunk of
>>>> memory with the allocation size. The allocator may well do that
>>>> anyway in order to help it manage memory. So there seems to be no
>>>> good reason to keep that info from the programmer. The question
>>>> is over whether there's value in allowing the programmer to get
>>>> such info. I am thinking that he could get the value with a call
>>>> such as
>>>>
>>>> m_msize(p)
>>>>
>>>> It seems simple enough, potentially useful, and should be
>>>> harmless. But it's noticeable that the malloc-type calls don't
>>>> have anything similar. So maybe there's a good reason why not.
>>>>
>>>> I thought it might cause implementation issues such as requiring
>>>> a larger header if the allocator wants to search only free
>>>> memory. But whatever implementation is used I think the allocator
>>>> will still need to either store or calculate the size of the
>>>> memory allocated so I am not sure I see a problem with the idea.
>>>>
>>>> What do you guys think? Opinions welcome!
>>>>
>>> I think you should not try to copy C's malloc/free mechanism. In
>>> particular, I do not think the memory allocator should track the
>>> sizes of allocations. "free" should always include the size,
>>> matching the value used in "alloc". (And "resize" should pass the
>>> old size as well as the new size.)
>>
>> Bart said the same and the suggestion surprises me. It seems prone to
>> error.
>>
>
> Your choices with memory management are basically to either trust the
> programmer, or do everything automatically with garbage collection.
> Even if you use something like C++'s RAII mechanisms for allocating and
> deallocating, you still have to trust the programmer to some extent.  If
> you are relying on the programmer to get their pointers right for calls
> to "free", why are you worried that they'll get the size wrong?

The implementation has to know the sizes. There's no point in requiring
the client program to know the sizes as well.

>
>>>
>>> Storing the size of allocation was not too bad in the earliest days of
>>> C, with simpler processors, no caches, no multi-threading, and
>>> greater concern for simple allocation implementations than for
>>> performance or reliability. Modern allocators no longer work with a
>>> simple linked list storing sizes and link pointers at an address
>>> below the address returned to the user, so why follow a similar
>>> interface?
>>>
>>> Programs know the size of memory they requested when they allocated
>>> it. They know, almost invariably, the size when they are freeing
>>> the memory. They know the size of the types, and the size of the
>>> arrays. So having the memory allocator store this too is a waste.
>>
>> That's sometimes the case but not always. Say I wanted to read a line
>> from a file a byte at a time without knowing in advance how long the
>> line would be (a common-enough requirement but one which C programs
>> all too often fudge by defining a maximum line length). The required
>> allocation could not be known in advance.
>>
>
> That is irrelevant.  (On a side note, it makes sense to have library
> calls that make this kind of common function more convenient.)  The
> programmer knows the size of any "malloc" calls or "realloc" calls made
> - that's the size given back to "free".


Re: Memory allocators and reporting the allocation size

From: david.br...@hesbynett.no (David Brown)
Newsgroups: comp.lang.misc
Subject: Re: Memory allocators and reporting the allocation size
Date: Sun, 17 Dec 2023 21:53:13 +0100
Organization: A noiseless patient Spider
Lines: 324
Message-ID: <ulnn3q$3419k$1@dont-email.me>
References: <tlqed3$1016q$1@dont-email.me> <tlqpmb$11hul$1@dont-email.me>
<4fbd07fd-e923-4f4e-9a02-ae3d4e6911d8n@googlegroups.com>
<tm1mvg$1u5gj$3@dont-email.me> <ulaaps$3p37o$1@dont-email.me>
Content-Language: en-GB
In-Reply-To: <ulaaps$3p37o$1@dont-email.me>
 by: David Brown - Sun, 17 Dec 2023 20:53 UTC

On 12/12/2023 20:03, James Harris wrote:
> On 28/11/2022 07:11, David Brown wrote:
>> On 26/11/2022 14:05, James Harris wrote:
>>> On Friday, 25 November 2022 at 16:15:08 UTC, David Brown wrote:
>>>> On 25/11/2022 14:02, James Harris wrote:
>>>>> I will get back to you guys on other topics but I need to improve
>>>>> the memory allocator I use in my compiler and that has led to the
>>>>> following query.
>>>>>
>>>>> Imagine an allocator which carries out plain malloc-type
>>>>> allocations (i.e. specified as a number of 8-bit bytes suitably
>>>>> aligned for any type) without having to have compatibility with
>>>>> malloc, and in a language which is not C.
>>>>>
>>>>> For the sake of discussion the set of calls could be
>>>>>
>>>>> m_alloc m_calloc m_realloc m_resize m_free
>>>> In a new language, I would not follow the names from C as closely.
>>>> For one thing, it could confuse people - they may think they do the
>>>> same thing as in C. For another, "calloc" in particular is a silly
>>>> name. And you don't need "realloc" and "resize".
>>>
>>> The idea is that realloc could move the allocation whereas resize
>>> could only resize it in place. The latter is not present in C's
>>> malloc family.
>>>
>>
>> Where would such a function be useful?  If the application needs to
>> expand an object, it needs to expand it - if that means moving it, so be
>> it.  I can't see where you would have use for a function that /might/ be
>> able to expand an object.  Do you have /real/ use-cases in mind?
>
> Sure. Imagine implementing a rope data structure
>
>   https://en.wikipedia.org/wiki/Rope_(data_structure)
>
> or anything which could be in parts and/or of varying size.

You still don't need a "resize" that is distinct from "realloc". The
parts of your ropes would have a known allocation size (ideally matching
the appropriate units for the main allocator) and a "used" size. Small
changes that can be handled within the existing allocated size for the
rope are easily done without any memory allocation functions, and if you
need more, you have a new block on your rope.
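
To make that concrete, here is a minimal C sketch. The chunk type, the 64-byte CHUNK_CAPACITY and rope_append are all assumptions for illustration, and a flat chain of chunks stands in for the leaves of a real rope. Each chunk has a known allocated size and a "used" size, so small appends never touch the allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { CHUNK_CAPACITY = 64 };   /* assumed allocation unit per chunk */

typedef struct chunk {
    struct chunk *next;
    size_t used;                   /* bytes currently holding text */
    char   data[CHUNK_CAPACITY];   /* known, fixed allocated size */
} chunk;

/* Append n bytes to the chain ending at *tail, adding chunks only when the
   current one is full. Returns 0 on success, -1 if an allocation fails. */
int rope_append(chunk **tail, const char *src, size_t n)
{
    while (n > 0) {
        chunk *c = *tail;
        if (c == NULL || c->used == CHUNK_CAPACITY) {
            chunk *fresh = calloc(1, sizeof *fresh);
            if (fresh == NULL)
                return -1;
            if (c != NULL)
                c->next = fresh;
            *tail = c = fresh;
        }
        assert(c->used < CHUNK_CAPACITY);
        size_t room = CHUNK_CAPACITY - c->used;
        size_t take = (n < room) ? n : room;
        memcpy(c->data + c->used, src, take);   /* fits in place, no allocator call */
        c->used += take;
        src += take;
        n -= take;
    }
    return 0;
}
```

Nothing here ever asks the allocator to grow an existing block: data either fits in the room already known to be there, or a new chunk is added.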

The number of times - even with a rope - that a "resize if possible
without a new allocation" function is useful is negligible.

/Vastly/ more useful, I think, would be to re-think allocation systems
entirely - don't copy C. C's memory management was invented in a time
before caches, and before SMP. Stop trying to tweak C's outdated
choices with hacks that have almost no real-world uses, and start
thinking differently.

For example, have your allocation functions return a tuple of address
and actual available memory in the allocation. For data structures
where you think "resize" might have relevance, store the real available
memory size in the blocks - then you have all the benefits of "resize"
for free.
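
A minimal C sketch of that interface might look like the following. The m_block type, the name m_alloc_block and the 16-byte granule are assumptions made for illustration, and malloc is only a stand-in backing store.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

enum { M_GRANULE = 16 };   /* assumed rounding unit of the allocator */

/* Hypothetical result "tuple": the address plus the usable capacity. */
typedef struct {
    void  *ptr;        /* NULL on failure */
    size_t capacity;   /* bytes actually usable, >= the size requested */
} m_block;

m_block m_alloc_block(size_t size)
{
    m_block b = { NULL, 0 };
    if (size == 0 || size > SIZE_MAX - (M_GRANULE - 1))
        return b;
    /* Round the request up to a whole number of granules. */
    size_t rounded = (size + M_GRANULE - 1) / M_GRANULE * M_GRANULE;
    b.ptr = malloc(rounded);
    if (b.ptr != NULL) {
        b.capacity = rounded;
        assert(b.capacity >= size);
    }
    return b;
}
```

A caller that might grow its data keeps capacity next to its own "used" count and consults the allocator again only once used reaches capacity.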

And don't store the size of the allocation data in hidden parts of the
allocation, as is traditional in C malloc/free implementations. For 95%
of all allocations, the software knows what the allocated size is when
it is time to free the allocation. And for the other 5%, the size can
easily be stored, tracked or calculated when needed. Including
allocation size and pointers to free lists or linked list chains of
allocations in the allocated memory made sense before caches - it does
not make sense now. Keep the metadata separate.
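
As a sketch of what a sized "free" can look like, assume four size classes and a small cache of free blocks per class. All names below are hypothetical, and malloc is only a stand-in backing store (it keeps its own headers, which a real implementation along these lines would avoid).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { MIN_CLASS = 16, CLASSES = 4, CACHE_SLOTS = 8 };   /* 16, 32, 64, 128 bytes */

/* Metadata kept apart from the blocks: a small stack of free blocks per class. */
static void *free_stack[CLASSES][CACHE_SLOTS];
static int   free_count[CLASSES];

/* Map a size to its class index, or -1 if it is too large for this sketch. */
static int class_of(size_t size)
{
    size_t bytes = MIN_CLASS;
    for (int c = 0; c < CLASSES; c++, bytes <<= 1)
        if (size <= bytes)
            return c;
    return -1;
}

void *m_alloc_sized(size_t size)
{
    int c = class_of(size);
    if (c < 0)
        return NULL;
    if (free_count[c] > 0)
        return free_stack[c][--free_count[c]];   /* reuse a cached block */
    return malloc((size_t)MIN_CLASS << c);       /* stand-in backing store */
}

/* The caller passes back the size it asked for, so no header is consulted. */
void m_free_sized(void *p, size_t size)
{
    int c = class_of(size);
    assert(c >= 0);   /* a size this allocator could never have served */
    if (p == NULL || c < 0)
        return;
    if (free_count[c] < CACHE_SLOTS)
        free_stack[c][free_count[c]++] = p;
    else
        free(p);
}
```

The size given to m_free_sized is all the allocator needs to find the right class; nothing is read from memory adjacent to the block.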

>
>>
>>
>>>>
>>>> It can make sense to distinguish between "alloc_zeroed" and
>>>> "alloc_uninitialised", for performance reasons.
>>>
>>> Agreed, but what exactly would you want for the calls? Maybe
>>>
>>> m_alloc(size) m_alloc_a(size, flags)
>>>
>>> ?
>>>
>>
>> There could be many different interfaces.  Dmitry also pointed out other
>> possible distinctions for allocation of shared memory, unpageable
>> memory, and so on.  Not all memory is created equal!
>>
>> Maybe you want a more object-oriented interface, for supporting several
>> different pools or memory allocation choices.
>
> There is use for (and, I would argue, a need for) various different
> types of allocator. Some allocators are good for fixed-size objects,
> some for space which does not need to be returned until a program
> terminates, and some for where releases are in the opposite order from
> requests, for example.

Yes.

>
> But such allocators impose restrictions on how they can be used or how
> they can be implemented, e.g. requiring meta space to manage the storage
> space.

Fixed-size allocators are particularly efficient for their metadata - all
you need is a single bit per allocation.

Metadata is always needed. Putting it within the allocated space is an
outdated and inefficient solution, and does not save space.
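
For illustration, a fixed-size pool with one bit of metadata per slot, held outside the data area, could be sketched in C like this. The slot size, slot count and function names are assumptions, not anything from an existing library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { SLOT_SIZE = 32, SLOT_COUNT = 64 };   /* assumed pool geometry */

/* Data area: nothing but the slots themselves. */
static _Alignas(max_align_t) unsigned char pool_data[SLOT_COUNT * SLOT_SIZE];

/* Metadata: bit i set means slot i is allocated. Lives outside pool_data. */
static uint64_t pool_used;

/* Hand out the lowest-numbered free slot, or NULL if the pool is full. */
void *pool_alloc(void)
{
    for (int i = 0; i < SLOT_COUNT; i++) {
        uint64_t bit = (uint64_t)1 << i;
        if ((pool_used & bit) == 0) {
            pool_used |= bit;
            return pool_data + (size_t)i * SLOT_SIZE;
        }
    }
    return NULL;
}

/* Clear the slot's bit; the slot index is computed from the address alone. */
void pool_free(void *p)
{
    ptrdiff_t offset = (unsigned char *)p - pool_data;
    assert(offset >= 0 && offset < SLOT_COUNT * SLOT_SIZE);
    assert(offset % SLOT_SIZE == 0);
    pool_used &= ~((uint64_t)1 << (offset / SLOT_SIZE));
}
```

The whole of the pool's bookkeeping is a single 64-bit word, and freeing a slot never touches the slot's own memory.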

>
> There is still a need for one approach to be completely general:

Certainly there are uses for general allocators as well as more
specific ones. But often most allocations in a program can be better
handled by specific allocators.

>
> * not requiring fixed-size allocations

Yes.

> * not requiring separate data structures to describe data space

No.

> * permitting arbitrary order of allocation and deallocation

Yes.

>
> Once one has a completely general allocator then it can take control of
> available address space and other, more specialised allocators can be
> built on top of it - e.g. by requesting meta space and data space for an
> array-style allocator.
>

That is certainly possible.

> ...
>
>>
>> The one thing I would avoid at all costs, in any interfaces, is the
>> "integer flag" nonsense found in many languages (such as C).  If you
>> really want some kind of flags, then at least be sure your language has
>> strong type-checked enumerations.
>
> I presume you mean not to combine flags into an integer such as
>
>   mem_allocator_call(size, flags)
>
> where flags is such as
>
>   ALLOC_DATA | ALLOC_WRITABLE
>
> but what would you use instead of an integer?
>

Use proper types - strong enumerations, sets, structs with bits or
flags, etc. /Strong/ types. It all boils down to words of some size in
the end, but the stronger your types as seen by the user in the
programming language, the less possibility there is for error or
misunderstanding.
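
A rough C rendering of the idea follows. C's own enumerations are weakly typed, so this only approximates what a language with strong enumerations could enforce. The names mem_kind, alloc_options and mem_alloc are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical option types replacing an "ALLOC_DATA | ALLOC_WRITABLE" integer. */
typedef enum { MEM_DATA, MEM_CODE } mem_kind;

typedef struct {
    mem_kind kind;       /* what the memory will hold */
    bool     writable;   /* named field instead of a bit in an int */
    bool     zeroed;     /* request zero-filled memory */
} alloc_options;

void *mem_alloc(size_t size, alloc_options opts)
{
    assert(size > 0);
    /* In a real allocator kind and writable would select a pool or page
       protection; this sketch only honours zeroed. */
    return opts.zeroed ? calloc(1, size) : malloc(size);
}
```

A call then reads mem_alloc(n, (alloc_options){ .kind = MEM_DATA, .writable = true }), and passing a bare integer where the options belong is a compile-time error.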

>>
>>>
>>>>>
>>>>> Would there be any value in giving the programmer a way to find
>>>>> out the size of a given allocation? I don't mean the size
>>>>> requested but the size allocated (which would be greater than or
>>>>> equal to the size requested).
>>>>>
>>>> No. It is pointless.
>>>>
>>>> Why would anyone want to know the result of such a function? The
>>>> only conceivable thought would be if they had first allocated space
>>>> for x units, and then they now want to store y units - calling
>>>> "get_real_size" could let them know if they need to call "resize"
>>>> or not.
>>>>
>>>> The answer is that they should simply call "resize". If "resize"
>>>> does not need to allocate new memory because the real size is big
>>>> enough, it does nothing.
>>>
>>> It's partly for performance, as Bart's comments have backed up. If
>>> code can find out the capacity then it can fill that allocation up to
>>> the stated capacity without having to make any other calls and
>>> without risking moving what has been stored so far.
>>>
>>
>> There is no performance benefit in real code - and you should not care
>> about silly cases.
>
> On the contrary, there is huge potential for real benefit in real code.
> There is no advantage in toy code. That's the difference.
>

I completely disagree, and nothing you have suggested here or previously
has given me any reason to suspect that "resize" is useful. You've
given vague ideas about things where you think it might be helpful, but
nothing that is concrete and nothing that - IMHO - would not be better
off handled in a very different way. However, this is /your/ language,
not mine, and if you want to disagree with me on this point then that is
absolutely fine. I think we can call this one a dead donkey, and stop
beating it. There are surely many other points for which my suggestions
or recommendations could be more useful to you, and we should focus there.

