Subject - Author
* Re: Embedded numbers sorting function (begging post) - Richard Damon
+* Re: Embedded numbers sorting function (begging post) - Malcolm McLean
|+* Re: Embedded numbers sorting function (begging post) - Lew Pitcher
||`* Re: Embedded numbers sorting function (begging post) - Malcolm McLean
|| +- Re: Embedded numbers sorting function (begging post) - Kaz Kylheku
|| `* Re: Embedded numbers sorting function (begging post) - Tim Rentsch
||  `- Re: Embedded numbers sorting function (begging post) - Malcolm McLean
|`- Re: Embedded numbers sorting function (begging post) - Richard Damon
`- Re: Embedded numbers sorting function (begging post) - Tim Rentsch

Re: Embedded numbers sorting function (begging post)

<DCiRJ.121764$Tr18.72270@fx42.iad>

https://www.novabbs.com/devel/article-flat.php?id=20523&group=comp.lang.c#20523
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!news-out.netnews.com!news.alt.net!fdc2.netnews.com!peer01.ams1!peer.ams1.xlned.com!news.xlned.com!peer02.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx42.iad.POSTED!not-for-mail
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0)
Gecko/20100101 Thunderbird/91.6.1
Subject: Re: Embedded numbers sorting function (begging post)
Content-Language: en-US
Newsgroups: comp.lang.c
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com>
From: Rich...@Damon-Family.org (Richard Damon)
In-Reply-To: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 17
Message-ID: <DCiRJ.121764$Tr18.72270@fx42.iad>
X-Complaints-To: abuse@easynews.com
Organization: Forte - www.forteinc.com
X-Complaints-Info: Please be sure to forward a copy of ALL headers otherwise we will be unable to process your complaint properly.
Date: Tue, 22 Feb 2022 23:20:45 -0500
X-Received-Bytes: 1745
 by: Richard Damon - Wed, 23 Feb 2022 04:20 UTC

On 2/22/22 10:27 AM, Malcolm McLean wrote:
> Anyone got a string comparison function handy which compares strings intelligently by embedded numbers, so that foo2 compares earlier than foo11 ?

I know I have seen them. Basically as you scan the strings you need to
switch between modes. If you hit a digit in both strings you need to
collect the next set of characters and convert them to a number to compare,
and then switch back to letters after that.

The thing you need to decide is how do you want to handle things like:

a02b and a2b, are they the same? or which comes first?

Means you may need to keep track of how long the string was too.

Presumably you are only wanting to process the normal ASCII digits or
you get into all sorts of other complexities.
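
The mode switching described above can be sketched roughly as follows. This is only an illustration: the name embedded_number_compare is hypothetical, digit runs are converted with strtoul (so a run too long for unsigned long saturates), and "a02b" and "a2b" are simply treated as equal, which is one of several defensible answers to the question raised here.

```c
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the thread: compares two strings,
   treating each run of ASCII digits as a single unsigned number.
   Assumes every digit run fits in unsigned long. */
int embedded_number_compare(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            /* Number mode: convert both digit runs and compare the values. */
            char *end_a, *end_b;
            unsigned long na = strtoul(a, &end_a, 10);
            unsigned long nb = strtoul(b, &end_b, 10);
            if (na != nb)
                return na < nb ? -1 : +1;
            /* Equal values ("02" and "2" tie): resume after both runs. */
            a = end_a;
            b = end_b;
        } else {
            /* Letter mode: plain character comparison. */
            unsigned char ca = (unsigned char)*a;
            unsigned char cb = (unsigned char)*b;
            if (ca != cb)
                return ca < cb ? -1 : +1;
            a++;
            b++;
        }
    }
    /* When one string is a prefix of the other, the shorter one sorts first. */
    return (*a != '\0') - (*b != '\0');
}
```

With this sketch foo2 compares before foo11, and a02b ties with a2b; breaking that tie would need the length of the digit run as well.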

Re: Embedded numbers sorting function (begging post)

<72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=20524&group=comp.lang.c#20524
X-Received: by 2002:adf:fe06:0:b0:1e5:95ad:b6bc with SMTP id n6-20020adffe06000000b001e595adb6bcmr92085wrr.191.1645629715560;
Wed, 23 Feb 2022 07:21:55 -0800 (PST)
X-Received: by 2002:a05:622a:1981:b0:2dd:a0ed:ea39 with SMTP id
u1-20020a05622a198100b002dda0edea39mr196038qtc.476.1645629714970; Wed, 23 Feb
2022 07:21:54 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.128.87.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Wed, 23 Feb 2022 07:21:54 -0800 (PST)
In-Reply-To: <DCiRJ.121764$Tr18.72270@fx42.iad>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:3818:3f66:255b:f6e5;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:3818:3f66:255b:f6e5
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com> <DCiRJ.121764$Tr18.72270@fx42.iad>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com>
Subject: Re: Embedded numbers sorting function (begging post)
From: malcolm....@gmail.com (Malcolm McLean)
Injection-Date: Wed, 23 Feb 2022 15:21:55 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Malcolm McLean - Wed, 23 Feb 2022 15:21 UTC

On Wednesday, 23 February 2022 at 04:21:18 UTC, Richard Damon wrote:
> On 2/22/22 10:27 AM, Malcolm McLean wrote:
> > Anyone got a string comparison function handy which compares strings intelligently by embedded numbers, so that foo2 compares earlier than foo11 ?
> I know I have seen them. Basically as you scan the strings you need to
> switch between modes. If you hit a digit in both strings you need to
> collect the next set of characters and convert them to a number to compare,
> and then switch back to letters after that.
>
> The thing you need to decide is how do you want to handle things like:
>
>
> a02b and a2b, are they the same? or which comes first?
>
> Means you may need to keep track of how long the string was too.
>
> Presumably you are only wanting to process the normal ASCII digits or
> you get into all sorts of other complexities.
>
The customers are international so they might use non-ascii digits, but I don't think
we can cater for that.
Profile 1.01 should come after Profile 1.001. Since some people abuse decimal
points to indicate that 3.11 means "version three, minor version eleven", then you
can defend 3.2 coming either after or before 3.11. The system has no way of knowing
which convention is in use.

Re: Embedded numbers sorting function (begging post)

<sv5kvh$k2d$2@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=20525&group=comp.lang.c#20525
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: lew.pitc...@digitalfreehold.ca (Lew Pitcher)
Newsgroups: comp.lang.c
Subject: Re: Embedded numbers sorting function (begging post)
Date: Wed, 23 Feb 2022 15:48:01 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 43
Message-ID: <sv5kvh$k2d$2@dont-email.me>
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com>
<DCiRJ.121764$Tr18.72270@fx42.iad>
<72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Wed, 23 Feb 2022 15:48:01 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="c0e52d0a62b858db636524364719928d";
logging-data="20557"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19ZJTDJsK8wzJn2okqjrKRwe4YSEKsiwW4="
User-Agent: Pan/0.139 (Sexual Chocolate; GIT bf56508
git://git.gnome.org/pan2)
Cancel-Lock: sha1:fujdq7W3PQSDwI3dCOXUtVHr16U=
 by: Lew Pitcher - Wed, 23 Feb 2022 15:48 UTC

On Wed, 23 Feb 2022 07:21:54 -0800, Malcolm McLean wrote:

> On Wednesday, 23 February 2022 at 04:21:18 UTC, Richard Damon wrote:
>> On 2/22/22 10:27 AM, Malcolm McLean wrote:
>> > Anyone got a string comparison function handy which compares strings
>> > intelligently by embedded numbers, so that foo2 compares earlier than
>> > foo11 ?
>> I know I have seen them. Basically as you scan the strings you need to
>> switch between modes. If you hit a digit in both strings you need to
>> collect the next set of characters and convert them to a number to compare,
>> and then switch back to letters after that.
>>
>> The thing you need to decide is how do you want to handle things like:
>>
>>
>> a02b and a2b, are they the same? or which comes first?
>>
>> Means you may need to keep track of how long the string was too.
>>
>> Presumably you are only wanting to process the normal ASCII digits or
>> you get into all sorts of other complexities.
>>
> The customers are international so they might use non-ascii digits, but
> I don't think we can cater for that.
> Profile 1.01 should come after Profile 1.001. Since some people abuse
> decimal points to indicate that 3.11 means "version three, minor version
> eleven", then you can defend 3.2 coming either after or before 3.11. The
> system has no way of knowing which convention is in use.

Given
Profile1
Profile2
where would you place
Profile01
? How about
Profile001
?

--
Lew Pitcher
"In Skills, We Trust"

Re: Embedded numbers sorting function (begging post)

<6354d8c7-a433-4fc8-a6af-bc1e3000f26en@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=20526&group=comp.lang.c#20526
X-Received: by 2002:a5d:588a:0:b0:1e8:b478:e74f with SMTP id n10-20020a5d588a000000b001e8b478e74fmr330959wrf.210.1645633683453;
Wed, 23 Feb 2022 08:28:03 -0800 (PST)
X-Received: by 2002:ac8:5dd3:0:b0:2d7:1db6:9ddc with SMTP id
e19-20020ac85dd3000000b002d71db69ddcmr474438qtx.529.1645633682865; Wed, 23
Feb 2022 08:28:02 -0800 (PST)
Path: i2pn2.org!i2pn.org!weretis.net!feeder8.news.weretis.net!proxad.net!feeder1-2.proxad.net!209.85.128.87.MISMATCH!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Wed, 23 Feb 2022 08:28:02 -0800 (PST)
In-Reply-To: <sv5kvh$k2d$2@dont-email.me>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:a133:5b64:3ae3:4d01;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:a133:5b64:3ae3:4d01
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com>
<DCiRJ.121764$Tr18.72270@fx42.iad> <72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com>
<sv5kvh$k2d$2@dont-email.me>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <6354d8c7-a433-4fc8-a6af-bc1e3000f26en@googlegroups.com>
Subject: Re: Embedded numbers sorting function (begging post)
From: malcolm....@gmail.com (Malcolm McLean)
Injection-Date: Wed, 23 Feb 2022 16:28:03 +0000
Content-Type: text/plain; charset="UTF-8"
 by: Malcolm McLean - Wed, 23 Feb 2022 16:28 UTC

On Wednesday, 23 February 2022 at 15:48:31 UTC, Lew Pitcher wrote:
> On Wed, 23 Feb 2022 07:21:54 -0800, Malcolm McLean wrote:
>
> > On Wednesday, 23 February 2022 at 04:21:18 UTC, Richard Damon wrote:
> >> On 2/22/22 10:27 AM, Malcolm McLean wrote:
> >> > Anyone got a string comparison function handy which compares strings
> >> > intelligently by embedded numbers, so that foo2 compares earlier than
> >> > foo11 ?
> >> I know I have seen them. Basically as you scan the strings you need to
> >> switch between modes. If you hit a digit in both strings you need to
> >> collect the next set of characters and convert them to a number to compare,
> >> and then switch back to letters after that.
> >>
> >> The thing you need to decide is how do you want to handle things like:
> >>
> >>
> >> a02b and a2b, are they the same? or which comes first?
> >>
> >> Means you may need to keep track of how long the string was too.
> >>
> >> Presumably you are only wanting to process the normal ASCII digits or
> >> you get into all sorts of other complexities.
> >>
> > The customers are international so they might use non-ascii digits, but
> > I don't think we can cater for that.
> > Profile 1.01 should come after Profile 1.001. Since some people abuse
> > decimal points to indicate that 3.11 means "version three, minor version
> > eleven", then you can defend 3.2 coming either after or before 3.11. The
> > system has no way of knowing which convention is in use.
> Given
> Profile1
> Profile2
> where would you place
> Profile01
> ?
> How about
> Profile001

I'd say that leading zeroes place you before in the sorting order.
Thus
Profile001
Profile01
Profile1
Profile2

Profile0 would place before Profile001.

Re: Embedded numbers sorting function (begging post)

<20220223103017.563@kylheku.com>

https://www.novabbs.com/devel/article-flat.php?id=20527&group=comp.lang.c#20527
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: 480-992-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.lang.c
Subject: Re: Embedded numbers sorting function (begging post)
Date: Wed, 23 Feb 2022 18:39:25 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 69
Message-ID: <20220223103017.563@kylheku.com>
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com>
<DCiRJ.121764$Tr18.72270@fx42.iad>
<72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com>
<sv5kvh$k2d$2@dont-email.me>
<6354d8c7-a433-4fc8-a6af-bc1e3000f26en@googlegroups.com>
Injection-Date: Wed, 23 Feb 2022 18:39:25 -0000 (UTC)
Injection-Info: reader02.eternal-september.org; posting-host="604ccbf5808049cc8404bfb1ad893045";
logging-data="7555"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19eao/BXtXCeMx3kl2T37Q/Nvn/CSBSugA="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:7Ir2Mh2uaJjOXwMugYaGk2h0me4=
 by: Kaz Kylheku - Wed, 23 Feb 2022 18:39 UTC

On 2022-02-23, Malcolm McLean <malcolm.arthur.mclean@gmail.com> wrote:
> On Wednesday, 23 February 2022 at 15:48:31 UTC, Lew Pitcher wrote:
>> On Wed, 23 Feb 2022 07:21:54 -0800, Malcolm McLean wrote:
>>
>> > On Wednesday, 23 February 2022 at 04:21:18 UTC, Richard Damon wrote:
>> >> On 2/22/22 10:27 AM, Malcolm McLean wrote:
>> >> > Anyone got a string comparison function handy which compares strings
>> >> > intelligently by embedded numbers, so that foo2 compares earlier than
>> >> > foo11 ?
>> >> I know I have seen them. Basically as you scan the strings you need to
>> >> switch between modes. If you hit a digit in both strings you need to
>> >> collect the next set of characters and convert them to a number to compare,
>> >> and then switch back to letters after that.
>> >>
>> >> The thing you need to decide is how do you want to handle things like:
>> >>
>> >>
>> >> a02b and a2b, are they the same? or which comes first?
>> >>
>> >> Means you may need to keep track of how long the string was too.
>> >>
>> >> Presumably you are only wanting to process the normal ASCII digits or
>> >> you get into all sorts of other complexities.
>> >>
>> > The customers are international so they might use non-ascii digits, but
>> > I don't think we can cater for that.
>> > Profile 1.01 should come after Profile 1.001. Since some people abuse
>> > decimal points to indicate that 3.11 means "version three, minor version
>> > eleven", then you can defend 3.2 coming either after or before 3.11. The
>> > system has no way of knowing which convention is in use.
>> Given
>> Profile1
>> Profile2
>> where would you place
>> Profile01
>> ?
>> How about
>> Profile001
>
> I'd say that leading zeroes place you before in the sorting order.
> Thus
> Profile001
> Profile01
> Profile1
> Profile2
>
> Profile0 would place before Profile001.

I'd say that 001 compares equal to 1, and so Profile001 and Profile1 are
considered equivalent keys.

Either numbers are semantic, or else they are not. The users should
make up their mind and follow a numbering scheme accordingly.

Or else, if they want a stable order under both lexicographic and
semantic comparison, they should consistently use leading zeros: 0001 0002 0003.

If these requirements are not suitable, that doesn't mean that
alternative requirements have to be shoehorned into the comparison
function, because comparison functions can be chained together, such
that if one comparison function reports equal, the next one is then
tried to resolve the tie. Sequences can also be sorted by more than
one key, using a stable sort: sort it one way to put Profile001 ahead of
Profile01, and then use a stable sort over a comparison function with
embedded number semantics.
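
The chaining idea can be sketched as below. All names here (numeric_cmp, chained_cmp, sort_names) are hypothetical and chosen for illustration, and digit runs are assumed to fit in unsigned long. Since qsort is not guaranteed to be stable, the sketch chains the comparators inside a single comparison instead of sorting twice.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

typedef int (*str_cmp_fn)(const char *, const char *);

/* First comparator: digit runs compare by value, so "001" ties with "1".
   Assumes each digit run fits in unsigned long. */
int numeric_cmp(const char *a, const char *b)
{
    for (;;) {
        if (isdigit((unsigned char)*a) && isdigit((unsigned char)*b)) {
            char *end_a, *end_b;
            unsigned long na = strtoul(a, &end_a, 10);
            unsigned long nb = strtoul(b, &end_b, 10);
            if (na != nb)
                return na < nb ? -1 : +1;
            a = end_a;
            b = end_b;
        } else {
            if (*a != *b)
                return (unsigned char)*a < (unsigned char)*b ? -1 : +1;
            if (*a == '\0')
                return 0;
            a++;
            b++;
        }
    }
}

/* Tries each comparator in turn until one of them breaks the tie. */
int chained_cmp(const char *a, const char *b)
{
    static const str_cmp_fn chain[] = { numeric_cmp, strcmp };
    for (size_t i = 0; i < sizeof chain / sizeof chain[0]; i++) {
        int r = chain[i](a, b);
        if (r != 0)
            return r;
    }
    return 0;
}

/* qsort adapter: the array elements are pointers to strings. */
static int chained_cmp_adapter(const void *pa, const void *pb)
{
    return chained_cmp(*(const char *const *)pa, *(const char *const *)pb);
}

void sort_names(const char **names, size_t count)
{
    qsort(names, count, sizeof *names, chained_cmp_adapter);
}
```

Under these assumptions Profile001 and Profile1 tie in numeric_cmp, and strcmp then places Profile001 ahead of Profile01 ahead of Profile1.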

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: Embedded numbers sorting function (begging post)

<86mtihgzsj.fsf@linuxsc.com>

https://www.novabbs.com/devel/article-flat.php?id=20530&group=comp.lang.c#20530
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: tr.17...@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.c
Subject: Re: Embedded numbers sorting function (begging post)
Date: Wed, 23 Feb 2022 11:15:24 -0800
Organization: A noiseless patient Spider
Lines: 16
Message-ID: <86mtihgzsj.fsf@linuxsc.com>
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com> <DCiRJ.121764$Tr18.72270@fx42.iad>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: reader02.eternal-september.org; posting-host="14abd13015942dc4cf9591af4aa38923";
logging-data="23647"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/9aV1NufeVqZDXMkxIT11MnarqqjCMI9s="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:lvxQNCZ5UiksotQwcoPk99h16w0=
sha1:X76jxH8AH5dgLWdqDafGAZrxsvk=
 by: Tim Rentsch - Wed, 23 Feb 2022 19:15 UTC

Richard Damon <Richard@Damon-Family.org> writes:

> On 2/22/22 10:27 AM, Malcolm McLean wrote:
>
>> Anyone got a string comparison function handy which compares strings
>> intelligently by embedded numbers, so that foo2 compares earlier than
>> foo11 ?
>
> I know I have seen them. Basically as you scan the strings you need to
> switch between modes. If you hit a digit in both strings you need to
> collect the next set of characters and convert them to a number to
> compare, and then switch back to letters after that.

Converting digit strings to a number is neither desirable nor
necessary. We would like to be able to handle digit strings
that have a value larger than any of the basic types can hold.
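
One way to compare two digit runs without converting them is sketched below: drop leading zeros, let the longer run win, and otherwise let the first differing digit decide. The names digit_run_length and digit_run_compare are hypothetical, and the sketch treats "007" and "7" as equal, which is an assumption rather than something settled in this thread.

```c
#include <stddef.h>
#include <string.h>

/* Length of the run of ASCII digits at the start of s. */
static size_t digit_run_length(const char *s)
{
    size_t n = 0;
    while (s[n] >= '0' && s[n] <= '9')
        n++;
    return n;
}

/* Compares the digit runs at the start of a and b by numeric value,
   without converting them, so runs of any length are handled. */
int digit_run_compare(const char *a, const char *b)
{
    /* Leading zeros do not change the value. */
    while (*a == '0')
        a++;
    while (*b == '0')
        b++;

    size_t len_a = digit_run_length(a);
    size_t len_b = digit_run_length(b);

    /* With leading zeros gone, more digits means a larger value. */
    if (len_a != len_b)
        return len_a < len_b ? -1 : +1;

    /* Same number of digits: the first differing digit decides. */
    int r = memcmp(a, b, len_a);
    return (r > 0) - (r < 0);
}
```

Because only lengths and individual digits are examined, a 33-digit run compares correctly against a 32-digit run even though neither fits in any basic integer type.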

Re: Embedded numbers sorting function (begging post)

<86ilt5gysp.fsf@linuxsc.com>

https://www.novabbs.com/devel/article-flat.php?id=20531&group=comp.lang.c#20531
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!.POSTED!not-for-mail
From: tr.17...@z991.linuxsc.com (Tim Rentsch)
Newsgroups: comp.lang.c
Subject: Re: Embedded numbers sorting function (begging post)
Date: Wed, 23 Feb 2022 11:36:54 -0800
Organization: A noiseless patient Spider
Lines: 151
Message-ID: <86ilt5gysp.fsf@linuxsc.com>
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com> <DCiRJ.121764$Tr18.72270@fx42.iad> <72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com> <sv5kvh$k2d$2@dont-email.me> <6354d8c7-a433-4fc8-a6af-bc1e3000f26en@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Injection-Info: reader02.eternal-september.org; posting-host="14abd13015942dc4cf9591af4aa38923";
logging-data="28856"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1+uwmZhVh1Jtcaq+56NyVgVDyHm7nlbUJk="
User-Agent: Gnus/5.11 (Gnus v5.11) Emacs/22.4 (gnu/linux)
Cancel-Lock: sha1:meHYBx4IteL5Wpzjfhhpoxmb2C4=
sha1:g6Et53IvJ6DZzpudr0KgQGTbluw=
 by: Tim Rentsch - Wed, 23 Feb 2022 19:36 UTC

Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:

> On Wednesday, 23 February 2022 at 15:48:31 UTC, Lew Pitcher wrote:
>
>> On Wed, 23 Feb 2022 07:21:54 -0800, Malcolm McLean wrote:
>>
>>> On Wednesday, 23 February 2022 at 04:21:18 UTC, Richard Damon wrote:
>>>
>>>> On 2/22/22 10:27 AM, Malcolm McLean wrote:
>>>>
>>>>> Anyone got a string comparison function handy which compares strings
>>>>> intelligently by embedded numbers, so that foo2 compares earlier than
>>>>> foo11 ?
>>>>
>>>> I know I have seen them. Basically as you scan the strings you need to
>>>> switch between modes. If you hit a digit in both strings you need to
>>>> collect the next set of characters and convert them to a number to compare,
>>>> and then switch back to letters after that.
>>>>
>>>> The thing you need to decide is how do you want to handle things like:
>>>>
>>>>
>>>> a02b and a2b, are they the same? or which comes first?
>>>>
>>>> Means you may need to keep track of how long the string was too.
>>>>
>>>> Presumably you are only wanting to process the normal ASCII digits or
>>>> you get into all sorts of other complexities.
>>>
>>> The customers are international so they might use non-ascii digits, but
>>> I don't think we can cater for that.
>>> Profile 1.01 should come after Profile 1.001. Since some people abuse
>>> decimal points to indicate that 3.11 means "version three, minor version
>>> eleven", then you can defend 3.2 coming either after or before 3.11. The
>>> system has no way of knowing which convention is in use.
>>
>> Given
>> Profile1
>> Profile2
>> where would you place
>> Profile01
>> ?
>> How about
>> Profile001
>
> I'd say that leading zeroes place you before in the sorting order.
> Thus
> Profile001
> Profile01
> Profile1
> Profile2
>
> Profile0 would place before Profile001.

The function foghorn_compare(), et al, as given below, does what
I think it is you want.

Both clang and gcc will optimize away the recursive calls. If
someone wants a version that is not facially recursive, I think
it shouldn't be too hard to synthesize one based on the function
definitions below (although I must confess I haven't tried to do
so).

(Note: I am taking an exception to my rule about not replying to
a posting from you. I hope you don't mind.)

#include <stddef.h>

typedef const char *String;

static int zero_compare( String, String );
static int nonzero_compare( String, String );
static int nonzero_prefer_p( String, String );
static int nonzero_prefer_q( String, String );

static size_t digits_length( String );
static String skip_digits( String );

static inline _Bool is_digit( unsigned );

/* Entry point: compare characters until both strings reach a digit. */
int
foghorn_compare( String p, String q ){
    return
        *p == 0 && *q == 0 ? 0 :
        *p == 0 ? -1 :
        *q == 0 ? +1 :
        is_digit(*p) && is_digit(*q) ? zero_compare( p, q ) :
        is_digit(*p) ? -1 :
        is_digit(*q) ? +1 :
        *p+0u < *q+0u ? -1 :
        *p+0u > *q+0u ? +1 :
        foghorn_compare( p+1, q+1 );
}

/* Skip leading zeros common to both runs, then decide on any zero left. */
int
zero_compare( String p, String q ){
    return
        *p == '0' && *q == '0' ? zero_compare( p+1, q+1 ) :
        *p == '0' && is_digit( *q ) ? -1 :
        *q == '0' && is_digit( *p ) ? +1 :
        *q == '0' ? -1 :
        *p == '0' ? +1 :
        is_digit(*p) && is_digit(*q) ? nonzero_compare( p, q ) :
        is_digit(*q) ? -1 :
        is_digit(*p) ? +1 :
        foghorn_compare( p, q );
}

/* Compare digit by digit; a run that ends first is the smaller one. */
int
nonzero_compare( String p, String q ){
    _Bool p_not = ! is_digit( *p );
    _Bool q_not = ! is_digit( *q );

    return
        p_not && q_not ? foghorn_compare( p, q ) :
        p_not ? -1 :
        q_not ? +1 :
        *p < *q ? nonzero_prefer_p( p+1, q+1 ) :
        *p > *q ? nonzero_prefer_q( p+1, q+1 ) :
        nonzero_compare( p+1, q+1 );
}

/* p had the smaller digit: p sorts first unless its run is longer. */
int
nonzero_prefer_p( String p, String q ){
    return digits_length( p ) <= digits_length( q ) ? -1 : +1;
}

/* q had the smaller digit: q sorts first unless its run is longer. */
int
nonzero_prefer_q( String p, String q ){
    return digits_length( p ) >= digits_length( q ) ? +1 : -1;
}

/* Number of digits at the start of s. */
size_t
digits_length( String s ){
    return skip_digits( s ) - s;
}

/* Pointer to the first non-digit at or after s. */
String
skip_digits( String s ){
    return is_digit( *s ) ? skip_digits( s+1 ) : s;
}

/* True for '0'..'9'; other values wrap around and fail the test. */
_Bool
is_digit( unsigned c ){
    return c - '0' < 10;
}
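
A version that is not facially recursive might look like the sketch below. It is an independent rewrite that aims to follow the same ordering rules as the functions above; it has not been checked against them exhaustively, and the names natural_compare_iterative, is_ascii_digit and digit_run_length are hypothetical.

```c
#include <stddef.h>

static int is_ascii_digit(char c)
{
    return c >= '0' && c <= '9';
}

/* Length of the run of ASCII digits at the start of s. */
static size_t digit_run_length(const char *s)
{
    size_t n = 0;
    while (is_ascii_digit(s[n]))
        n++;
    return n;
}

int natural_compare_iterative(const char *p, const char *q)
{
    for (;;) {
        /* End of either string: the shorter string sorts first. */
        if (*p == '\0' || *q == '\0')
            return (*p != '\0') - (*q != '\0');

        if (!is_ascii_digit(*p) || !is_ascii_digit(*q)) {
            /* At most one side is a digit; a digit sorts before a non-digit. */
            if (is_ascii_digit(*p))
                return -1;
            if (is_ascii_digit(*q))
                return +1;
            if (*p != *q)
                return (unsigned char)*p < (unsigned char)*q ? -1 : +1;
            p++;
            q++;
            continue;
        }

        /* Both sides start a digit run: skip the zeros they share. */
        while (*p == '0' && *q == '0') {
            p++;
            q++;
        }

        /* One side still has a zero. It sorts first if the other run
           continues with a digit, and last if the other run has ended. */
        if (*p == '0')
            return is_ascii_digit(*q) ? -1 : +1;
        if (*q == '0')
            return is_ascii_digit(*p) ? +1 : -1;

        /* No leading zeros remain: the shorter run is the smaller value,
           and equal lengths are decided by the first differing digit. */
        size_t len_p = digit_run_length(p);
        size_t len_q = digit_run_length(q);
        if (len_p != len_q)
            return len_p < len_q ? -1 : +1;
        for (size_t i = 0; i < len_p; i++)
            if (p[i] != q[i])
                return p[i] < q[i] ? -1 : +1;

        /* The runs are identical: carry on after them. */
        p += len_p;
        q += len_q;
    }
}
```

On the names discussed upthread this sketch orders Profile0, Profile001, Profile01, Profile1, Profile2, and it places foo2 before foo11.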

Re: Embedded numbers sorting function (begging post)

<kqLRJ.29882$dln7.139@fx03.iad>

https://www.novabbs.com/devel/article-flat.php?id=20534&group=comp.lang.c#20534
Path: i2pn2.org!i2pn.org!usenet.goja.nl.eu.org!3.eu.feeder.erje.net!feeder.erje.net!newsfeed.xs4all.nl!newsfeed7.news.xs4all.nl!feeder1.feed.usenet.farm!feed.usenet.farm!peer02.ams4!peer.am4.highwinds-media.com!peer03.iad!feed-me.highwinds-media.com!news.highwinds-media.com!fx03.iad.POSTED!not-for-mail
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0)
Gecko/20100101 Thunderbird/91.6.1
Subject: Re: Embedded numbers sorting function (begging post)
Content-Language: en-US
Newsgroups: comp.lang.c
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com>
<DCiRJ.121764$Tr18.72270@fx42.iad>
<72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com>
From: Rich...@Damon-Family.org (Richard Damon)
In-Reply-To: <72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Lines: 36
Message-ID: <kqLRJ.29882$dln7.139@fx03.iad>
X-Complaints-To: abuse@easynews.com
Organization: Forte - www.forteinc.com
X-Complaints-Info: Please be sure to forward a copy of ALL headers otherwise we will be unable to process your complaint properly.
Date: Thu, 24 Feb 2022 08:07:29 -0500
X-Received-Bytes: 2850
 by: Richard Damon - Thu, 24 Feb 2022 13:07 UTC

On 2/23/22 10:21 AM, Malcolm McLean wrote:
> On Wednesday, 23 February 2022 at 04:21:18 UTC, Richard Damon wrote:
>> On 2/22/22 10:27 AM, Malcolm McLean wrote:
>>> Anyone got a string comparison function handy which compares strings intelligently by embedded numbers, so that foo2 compares earlier than foo11 ?
>> I know I have seen them. Basically as you scan the strings you need to
>> switch between modes. If you hit a digit in both strings you need to
>> collect the next set of characters and convert them to a number to compare,
>> and then switch back to letters after that.
>>
>> The thing you need to decide is how do you want to handle things like:
>>
>>
>> a02b and a2b, are they the same? or which comes first?
>>
>> Means you may need to keep track of how long the string was too.
>>
>> Presumably you are only wanting to process the normal ASCII digits or
>> you get into all sorts of other complexities.
>>
> The customers are international so they might use non-ascii digits, but I don't think
> we can cater for that.
> Profile 1.01 should come after Profile 1.001. Since some people abuse decimal
> points to indicate that 3.11 means "version three, minor version eleven", then you
> can defend 3.2 coming either after or before 3.11. The system has no way of knowing
> which convention is in use.

Dealing with decimals means you need to treat things differently than
with just integers.

1.51 comes after 1.501, yet compared piece by piece 1 equals 1, '.'
equals '.', and 51 comes before 501. So yes, you are going to need to
decide whether '.' introduces a decimal fraction or is just a value
separator.

If '.' is a decimal point, the digits after it can just be sorted
lexically. If it is a separator (as in the version number case), then
you start a new numeric sort.
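
The two readings can be put side by side in a small sketch. Both functions take only the digits that follow the '.', and both names (fraction_compare, component_compare) are hypothetical. The lexical version treats a trailing zero as significant, so "5" sorts before "50"; the numeric version assumes the component fits in unsigned long.

```c
#include <stdlib.h>
#include <string.h>

/* '.' read as a decimal point: the digits after it compare character
   by character, so "51" sorts after "501". */
int fraction_compare(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* '.' read as a separator: the digits after it form a whole number,
   so "51" sorts before "501". Assumes the value fits in unsigned long. */
int component_compare(const char *a, const char *b)
{
    unsigned long na = strtoul(a, NULL, 10);
    unsigned long nb = strtoul(b, NULL, 10);
    return (na > nb) - (na < nb);
}
```

For 3.2 against 3.11, fraction_compare("2", "11") puts 3.2 last while component_compare("2", "11") puts it first, which is the ambiguity described earlier in the thread.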

Re: Embedded numbers sorting function (begging post)

<3c87f750-35a6-46f0-89d3-989a40cff608n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=20627&group=comp.lang.c#20627
X-Received: by 2002:a05:622a:189b:b0:2de:4b91:b1a8 with SMTP id v27-20020a05622a189b00b002de4b91b1a8mr20078510qtc.601.1646130407619;
Tue, 01 Mar 2022 02:26:47 -0800 (PST)
X-Received: by 2002:ac8:7fca:0:b0:2de:8f3d:89be with SMTP id
b10-20020ac87fca000000b002de8f3d89bemr19717801qtk.34.1646130407271; Tue, 01
Mar 2022 02:26:47 -0800 (PST)
Path: i2pn2.org!i2pn.org!eternal-september.org!reader02.eternal-september.org!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.c
Date: Tue, 1 Mar 2022 02:26:47 -0800 (PST)
In-Reply-To: <86ilt5gysp.fsf@linuxsc.com>
Injection-Info: google-groups.googlegroups.com; posting-host=2a00:23a8:400a:5601:10a0:6ae5:28db:399e;
posting-account=Dz2zqgkAAADlK5MFu78bw3ab-BRFV4Qn
NNTP-Posting-Host: 2a00:23a8:400a:5601:10a0:6ae5:28db:399e
References: <0865bc76-8d80-43c1-af79-87c3faf7c9d9n@googlegroups.com>
<DCiRJ.121764$Tr18.72270@fx42.iad> <72affcf8-2435-4ef3-a6ba-367b82252228n@googlegroups.com>
<sv5kvh$k2d$2@dont-email.me> <6354d8c7-a433-4fc8-a6af-bc1e3000f26en@googlegroups.com>
<86ilt5gysp.fsf@linuxsc.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <3c87f750-35a6-46f0-89d3-989a40cff608n@googlegroups.com>
Subject: Re: Embedded numbers sorting function (begging post)
From: malcolm....@gmail.com (Malcolm McLean)
Injection-Date: Tue, 01 Mar 2022 10:26:47 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 175
 by: Malcolm McLean - Tue, 1 Mar 2022 10:26 UTC

On Wednesday, 23 February 2022 at 19:37:32 UTC, Tim Rentsch wrote:
> Malcolm McLean <malcolm.ar...@gmail.com> writes:
>
> > On Wednesday, 23 February 2022 at 15:48:31 UTC, Lew Pitcher wrote:
> >
> >> On Wed, 23 Feb 2022 07:21:54 -0800, Malcolm McLean wrote:
> >>
> >>> On Wednesday, 23 February 2022 at 04:21:18 UTC, Richard Damon wrote:
> >>>
> >>>> On 2/22/22 10:27 AM, Malcolm McLean wrote:
> >>>>
> >>>>> Anyone got a string comparison function handy which compares strings
> >>>>> intelligently by embedded numbers, so that foo2 compares earlier than
> >>>>> foo11 ?
> >>>>
> >>>> I know I have seen them. Basically as you scan the strings you need to
> >>>> switch between modes. If you hit a digit in both strings you need to
> >>>> collect the next set of characters and convert them to a number to compare,
> >>>> and then switch back to letters after that.
> >>>>
> >>>> The thing you need to decide is how do you want to handle things like:
> >>>>
> >>>>
> >>>> a02b and a2b, are they the same? or which comes first?
> >>>>
> >>>> Means you may need to keep track of how long the string was too.
> >>>>
> >>>> Presumably you are only wanting to process the normal ASCII digits or
> >>>> you get into all sorts of other complexities.
> >>>
> >>> The customers are international so they might use non-ascii digits, but
> >>> I don't think we can cater for that.
> >>> Profile 1.01 should come after Profile 1.001. Since some people abuse
> >>> decimal points to indicate that 3.11 means "version three, minor version
> >>> eleven", then you can defend 3.2 coming either after or before 3.11. The
> >>> system has no way of knowing which convention is in use.
> >>
> >> Given
> >> Profile1
> >> Profile2
> >> where would you place
> >> Profile01
> >> ?
> >> How about
> >> Profile001
> >
> > I'd say that leading zeroes place you before in the sorting order.
> > Thus
> > Profile001
> > Profile01
> > Profile1
> > Profile2
> >
> > Profile0 would place before Profile001.
> The function foghorn_compare(), et al, as given below, does what
> I think it is you want.
>
Thanks to everyone who contributed, especially Tim.

Initially I used the code posted by Eric Sosman, but it turned out not to be acceptable.
The problem was that the main program already has a Unicode comparison function
built in, which doesn't handle non-alnum glyphs in any simple way. It also depends on
the language the user chooses, and we don't have access to the source.
So instead of comparing two UTF-8 character pointers in a C function, I had to move to
C++.

So this is what I did. First, I wrote a function which scans a pair of strings for embedded
numbers at equal positions. If it finds a pair of numbers that differ, it returns +1/-1.
Otherwise it returns zero. If the embedded numbers function returns zero, I call the main
program comparison function. The embedded numbers function is static; its interface is too
quirky to expose.
We also wanted a case-insensitive compare, so I called the main program's case conversion
(toUpper in the code below).
There's quite a lot of messing about converting from the main program Unicode format to
UTF-8 and back. This seems to be the way with Unicode. CompareUTF8StringCaseInsensitive
is a trivial wrapper I wrote around the main program compare, which works on ai::UnicodeStrings.

#include <ctype.h>
#include <string>

static inline int nat_isdigit(char a)
{
    return isdigit((unsigned char)a);
}

static inline char nat_toupper(char a)
{
    return toupper((unsigned char)a);
}

static int compare_right(char const* a, char const* b)
{
    int bias = 0;

    /* The longest run of digits wins. That aside, the greatest
       value wins, but we can't know that it will until we've scanned
       both numbers to know that they have the same magnitude, so we
       remember it in BIAS. */
    for (;; a++, b++) {
        if (!nat_isdigit(*a) && !nat_isdigit(*b))
            return bias;
        if (!nat_isdigit(*a))
            return -1;
        if (!nat_isdigit(*b))
            return +1;
        if (*a < *b) {
            if (!bias)
                bias = -1;
        }
        else if (*a > *b) {
            if (!bias)
                bias = +1;
        }
        else if (!*a && !*b)
            return bias;
    }

    return 0;
}

static int compare_left(char const* a, char const* b)
{
    /* Compare two left-aligned numbers: the first to have a
       different value wins. */
    for (;; a++, b++) {
        if (!nat_isdigit(*a) && !nat_isdigit(*b))
            return 0;
        if (!nat_isdigit(*a))
            return -1;
        if (!nat_isdigit(*b))
            return +1;
        if (*a < *b)
            return -1;
        if (*a > *b)
            return +1;
    }

    return 0;
}

/* Returns -1/+1 for the first differing pair of embedded numbers found
   at equal positions, or 0 if the strings diverge before one is found. */
static int numbercompare(const char* a, const char* b)
{
    if (*a == 0 || *b == 0)
        return 0;
    if (nat_isdigit(*a) && nat_isdigit(*b))
    {
        int comp;

        if (*a == '0' || *b == '0')
        {
            comp = compare_left(a, b);
        }
        else
        {
            comp = compare_right(a, b);
        }
        if (comp != 0)
            return comp;
    }
    if (*a != *b)
        return 0;
    return numbercompare(a + 1, b + 1);
}

bool CompareProfileNames(std::string const& namea, std::string const& nameb)
{
    ai::UnicodeString a = ai::UnicodeString::FromUTF8(namea);
    ai::UnicodeString b = ai::UnicodeString::FromUTF8(nameb);
    a.toUpper();
    b.toUpper();
    std::string autf8 = a.as_UTF8();
    std::string butf8 = b.as_UTF8();

    /* Embedded numbers decide first; otherwise defer to the main program compare. */
    int numbercomp = numbercompare(autf8.c_str(), butf8.c_str());
    if (numbercomp != 0)
        return numbercomp < 0;
    return CompareUTF8StringCaseInsensitive(namea, nameb);
}
