"Inside STL: The string" by Raymond Chen

<uahhha$vri4$1@dont-email.me>

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: lynnmcgu...@gmail.com (Lynn McGuire)
Newsgroups: comp.lang.c++
Subject: "Inside STL: The string" by Raymond Chen
Date: Thu, 3 Aug 2023 19:42:17 -0500
Organization: A noiseless patient Spider
Lines: 12
Message-ID: <uahhha$vri4$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 4 Aug 2023 00:42:18 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="a5bab9f41e2ba458bae30abe29b55bb8";
logging-data="1044036"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19YaAkd90I57Q3vZt0CLTWs+Lvx2ipQnPY="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.13.1
Cancel-Lock: sha1:yiEyDaV2IHjXhFNLppWSND6jcLo=
Content-Language: en-US
 by: Lynn McGuire - Fri, 4 Aug 2023 00:42 UTC

"Inside STL: The string" by Raymond Chen
https://devblogs.microsoft.com/oldnewthing/20230803-00/?p=108532

"You might think that a std::string (and all of its friends in the
std::basic_string family) are basically a vector of characters
internally. But strings are organized differently due to specific
optimizations permitted for strings but not for vectors."

I've always thought the internal buffer was a cool idea.

Lynn

Re: "Inside STL: The string" by Raymond Chen

<uai55v$15u7i$1@dont-email.me>

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: eesn...@osa.pri.ee (Paavo Helde)
Newsgroups: comp.lang.c++
Subject: Re: "Inside STL: The string" by Raymond Chen
Date: Fri, 4 Aug 2023 09:17:34 +0300
Organization: A noiseless patient Spider
Lines: 26
Message-ID: <uai55v$15u7i$1@dont-email.me>
References: <uahhha$vri4$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 4 Aug 2023 06:17:35 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="0bbe3e422c7bcaa9f51345284025e827";
logging-data="1243378"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18kUvpvzxIgAbSbPBUa3en+YtBp+8dLle8="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.13.0
Cancel-Lock: sha1:JgCw/S7NA5i+L3WoNeJQIR4URf4=
Content-Language: en-US
In-Reply-To: <uahhha$vri4$1@dont-email.me>
 by: Paavo Helde - Fri, 4 Aug 2023 06:17 UTC

04.08.2023 03:42 Lynn McGuire wrote:
> "Inside STL: The string" by Raymond Chen
>    https://devblogs.microsoft.com/oldnewthing/20230803-00/?p=108532
>
> "You might think that a std::string (and all of its friends in the
> std::basic_string family) are basically a vector of characters
> internally. But strings are organized differently due to specific
> optimizations permitted for strings but not for vectors."
>
> I've always thought the internal buffer was a cool idea.

You mean small string optimization? Yes, that's nifty. Still, I think it
could be made better.

Current mainstream (64-bit) implementations use an SSO buffer of 16 bytes.
However, when a string is used inside a union which is larger, it could
well use a larger buffer, but there is no way to set this up.
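
A minimal sketch of how one might check whether a given std::string is currently using its small string buffer. The helper name `stored_inline` is hypothetical, and the check assumes the implementation keeps the small buffer inside the string object itself, which the standard does not require. The buffer size also varies between standard libraries, so treat the 16-byte figure as typical rather than guaranteed.

```cpp
#include <functional>
#include <string>

// Hypothetical helper: returns true when the character data lives inside
// the std::string object itself (small buffer in use) and false when it
// lives somewhere else, normally on the heap.
bool stored_inline(const std::string& s) {
    const char* obj_begin = reinterpret_cast<const char*>(&s);
    const char* obj_end = obj_begin + sizeof(std::string);
    const char* data = s.data();
    // std::less provides a total order over pointers, even when they
    // point into unrelated objects.
    const std::less<const char*> before;
    return !before(data, obj_begin) && before(data, obj_end);
}
```

A 1000-character string can never fit inside the string object, so `stored_inline` has to return false for it. For a short string such as "hi" the result depends on the implementation, which is why nothing is promised about it here.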

A polymorphic variant class which I once made is 24 bytes. The last byte
in the class is the variant type tag, which is chosen to be 0 for small
strings, so that I can store zero-terminated small UTF-8 strings of up
to 23 bytes in it. I do not record the string length separately for
small strings as it is cheap to just calculate it by strlen() whenever
needed.
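
The layout described above could look roughly like the following. This is only a sketch under stated assumptions, not the actual class: the names `Variant`, `Tag`, `set_small_string` and `set_integer` are hypothetical, and the integer alternative is added just so that there is a second tag value to contrast with. The only details taken from the description are the 24-byte size, the tag in the last byte, the tag value 0 for small strings, and the use of strlen() for the length.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical tag values; only "0 means small string" comes from the post.
enum class Tag : unsigned char { SmallString = 0, Integer = 1 };

class Variant {
public:
    static constexpr std::size_t kSize = 24;
    static constexpr std::size_t kMaxSmall = kSize - 1;  // 23 bytes of text

    // All zero bytes: tag 0, so the object starts as an empty small string.
    Variant() { std::memset(bytes_, 0, kSize); }

    // Stores s inline when it fits in 23 bytes; returns false otherwise.
    bool set_small_string(const char* s) {
        const std::size_t len = std::strlen(s);
        if (len > kMaxSmall) return false;
        std::memset(bytes_, 0, kSize);  // also sets the tag byte to 0
        std::memcpy(bytes_, s, len);
        return true;
    }

    void set_integer(std::int64_t v) {
        std::memset(bytes_, 0, kSize);
        std::memcpy(bytes_, &v, sizeof v);
        bytes_[kSize - 1] = static_cast<char>(Tag::Integer);
    }

    Tag tag() const {
        return static_cast<Tag>(static_cast<unsigned char>(bytes_[kSize - 1]));
    }
    bool is_small_string() const { return tag() == Tag::SmallString; }

    // For a full 23-byte string the zero tag byte is also the terminator.
    const char* c_str() const {
        assert(is_small_string());
        return bytes_;
    }

    // The length is not stored; it is recomputed on demand.
    std::size_t length() const { return std::strlen(c_str()); }

    std::int64_t integer() const {
        assert(tag() == Tag::Integer);
        std::int64_t v;
        std::memcpy(&v, bytes_, sizeof v);
        return v;
    }

private:
    alignas(8) char bytes_[kSize];
};

static_assert(sizeof(Variant) == 24, "layout assumed by this sketch");
```

One consequence of computing the length with strlen() is that a small string cannot contain embedded zero bytes.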

Re: "Inside STL: The string" by Raymond Chen

<uajl4g$1d4mv$1@dont-email.me>

Path: i2pn2.org!i2pn.org!eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: lynnmcgu...@gmail.com (Lynn McGuire)
Newsgroups: comp.lang.c++
Subject: Re: "Inside STL: The string" by Raymond Chen
Date: Fri, 4 Aug 2023 14:55:58 -0500
Organization: A noiseless patient Spider
Lines: 32
Message-ID: <uajl4g$1d4mv$1@dont-email.me>
References: <uahhha$vri4$1@dont-email.me> <uai55v$15u7i$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 4 Aug 2023 19:56:00 -0000 (UTC)
Injection-Info: dont-email.me; posting-host="a5bab9f41e2ba458bae30abe29b55bb8";
logging-data="1479391"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX18H1YWreelct5L1TyxaNlXd4iliXuLbc8Y="
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.13.1
Cancel-Lock: sha1:YGTsxrhY22shg9AH/2Bsl96200Q=
Content-Language: en-US
In-Reply-To: <uai55v$15u7i$1@dont-email.me>
 by: Lynn McGuire - Fri, 4 Aug 2023 19:55 UTC

On 8/4/2023 1:17 AM, Paavo Helde wrote:
> 04.08.2023 03:42 Lynn McGuire wrote:
>> "Inside STL: The string" by Raymond Chen
>>     https://devblogs.microsoft.com/oldnewthing/20230803-00/?p=108532
>>
>> "You might think that a std::string (and all of its friends in the
>> std::basic_string family) are basically a vector of characters
>> internally. But strings are organized differently due to specific
>> optimizations permitted for strings but not for vectors."
>>
>> I've always thought the internal buffer was a cool idea.
>
> You mean small string optimization? Yes, that's nifty. Still, I think it
> could be made better.
>
> Current mainstream (64-bit) implementations use an SSO buffer of 16 bytes.
> However, when a string is used inside a union which is larger, it could
> well use a larger buffer, but there is no way to set this up.
>
> A polymorphic variant class which I once made is 24 bytes. The last byte
> in the class is the variant type tag, which is chosen to be 0 for small
> strings, so that I can store zero-terminated small UTF-8 strings of up
> to 23 bytes in it. I do not record the string length separately for
> small strings as it is cheap to just calculate it by strlen() whenever
> needed.

We compress large strings of more than 1,000 bytes so this is
interesting to me. Some of our strings go up to a GB in size.

Lynn

Re: "Inside STL: The string" by Raymond Chen

<uatn8e$3a2oc$2@news.xmission.com>

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!xmission!nnrp.xmission!.POSTED.shell.xmission.com!not-for-mail
From: legalize...@mail.xmission.com (Richard)
Newsgroups: comp.lang.c++
Subject: Re: "Inside STL: The string" by Raymond Chen
Date: Tue, 8 Aug 2023 15:33:34 -0000 (UTC)
Organization: multi-cellular, biological
Sender: legalize+jeeves@mail.xmission.com
Message-ID: <uatn8e$3a2oc$2@news.xmission.com>
References: <uahhha$vri4$1@dont-email.me> <uai55v$15u7i$1@dont-email.me>
Reply-To: (Richard) legalize+jeeves@mail.xmission.com
Injection-Date: Tue, 8 Aug 2023 15:33:34 -0000 (UTC)
Injection-Info: news.xmission.com; posting-host="shell.xmission.com:2607:fa18:0:beef::4";
logging-data="3476236"; mail-complaints-to="abuse@xmission.com"
X-Reply-Etiquette: No copy by email, please
Mail-Copies-To: never
X-Newsreader: trn 4.0-test77 (Sep 1, 2010)
Originator: legalize@shell.xmission.com (Richard)
 by: Richard - Tue, 8 Aug 2023 15:33 UTC

[Please do not mail me a copy of your followup]

Paavo Helde <eesnimi@osa.pri.ee> spake the secret code
<uai55v$15u7i$1@dont-email.me> thusly:

>04.08.2023 03:42 Lynn McGuire wrote:
>> I've always thought the internal buffer was a cool idea.
>
>You mean small string optimization? Yes, that's nifty. Still, I think it
>could be made better.

The advantage of string coming from a library and not from the
language is that you can create custom string types specific to your
use case. LLVM/Clang does this for all the internal string
manipulation that they do in order to optimize better than
std::string.
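
As a rough illustration of such a custom type, here is a sketch of a string whose inline capacity is a template parameter, so each use site can pick its own buffer size. The class `InlineString` and all of its members are hypothetical and this is not LLVM's code; LLVM's own type of this kind is `llvm::SmallString<N>`. The sketch is immutable and non-copyable to keep it short.

```cpp
#include <cstddef>
#include <memory>
#include <string_view>

// Hypothetical string type with a caller-chosen inline capacity of N
// characters. Longer text falls back to a heap allocation.
template <std::size_t N>
class InlineString {
public:
    explicit InlineString(std::string_view text) : size_(text.size()) {
        char* dest = inline_;
        if (size_ > N) {  // does not fit: allocate instead
            heap_ = std::make_unique<char[]>(size_ + 1);
            dest = heap_.get();
        }
        text.copy(dest, size_);  // copies size_ characters, no terminator
        dest[size_] = '\0';
    }

    // Copying is left out of the sketch; deleting it also suppresses moves.
    InlineString(const InlineString&) = delete;
    InlineString& operator=(const InlineString&) = delete;

    bool is_inline() const { return heap_ == nullptr; }
    const char* c_str() const { return heap_ ? heap_.get() : inline_; }
    std::size_t size() const { return size_; }
    std::string_view view() const { return {c_str(), size_}; }

private:
    std::size_t size_;
    std::unique_ptr<char[]> heap_;  // null while the text fits inline
    char inline_[N + 1];            // N characters plus the terminator
};
```

With this shape, `InlineString<32>` keeps a 25-character string in the object, while `InlineString<4>` given the same text has to allocate. The trade-off is a larger object in exchange for fewer allocations, and the right N depends on the strings a particular program actually handles.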
--
"The Direct3D Graphics Pipeline" free book <http://tinyurl.com/d3d-pipeline>
The Terminals Wiki <http://terminals-wiki.org>
The Computer Graphics Museum <http://computergraphicsmuseum.org>
Legalize Adulthood! (my blog) <http://legalizeadulthood.wordpress.com>

