devel / comp.lang.forth / Numeric parsing

Numeric parsing

<t2gc87$1h8a$1@gioia.aioe.org>

 by: dxforth - Tue, 5 Apr 2022 03:15 UTC

On 5/04/2022 03:36, Hans Bezemer wrote:
> On Monday, April 4, 2022 at 5:14:52 PM UTC+2, Hans Bezemer wrote:
> I thought, "let's be nice". It took me a few minutes to port. Seems to run ok, but your mileage may vary:
>
> ---8<---
> \ 4tH library - SSCANF - Copyright 2022 J.L. Bezemer

Perhaps not everyone knows C well enough (am I the only one?) for a scanf to be
attractive. Nevertheless your factors would be familiar to anyone who has been
down the 'scan a numeric' path. For the latter audience:

(putback) looks suspiciously like '-1 /string' ... which brings me to

(sign) ( a1 n1 -- f a2 n2)

I'm increasingly seeing Forths which include this function, variously called
SIGN? /SIGN etc., with a stack effect of ( a1 n1 -- a2 n2 f ).

Your app may not need it but a generic /SIGN should probably include an empty
string check. From there it's trivial to define another handy function:

: /NUMBER ( addr u -- addr2 u2 d|ud )
/sign >r 0 0 2swap >number 2swap r> if dnegate then ;

which is a std function in my forth (along with /FLOAT )

A /SIGN that didn't check empty string could come unstuck with:

: >pad ( a u -- a' u ) tuck pad swap cmove pad swap ;

s" -1234" >pad 2drop
s" " >pad /sign .s 4749724 -1 -1

Re: Numeric parsing

<81c65224-bdf3-4a15-8214-c49f41a24ccfn@googlegroups.com>

 by: Hans Bezemer - Tue, 5 Apr 2022 10:05 UTC

On Tuesday, April 5, 2022 at 5:15:23 AM UTC+2, dxforth wrote:
> Perhaps not everyone knows C well enough (am I the only one?) for a scanf to be
> attractive. Nevertheless your factors would be familiar to anyone who has been
> down the 'scan a numeric' path.
Being intimately familiar with your numeric functions (they are part of 4tH's FP suites)
I can concur here. And my thanks for that. BTW, C/STRING should also be familiar to you ;-)
> (putback) looks suspiciously like '-1 /string' ... which brings me to
It is. But I pulled this trick from 4tH's uBasic/4tH interpreter - which also features this
"quick peek & rollback" function.

> Your app may not need it but a generic /SIGN should probably include an empty
> string check. From there it's trivial to define another handy function:
In 4tH practically every string is "guarded" - which is a consequence of using ASCIIZ strings.
If you pull this trick you simply get the NULL byte back. In vanilla Forth this might be a problem.
Do you mean I should "wrap" the (SIGN) function into:

DUP IF <previous sign> ELSE FALSE THEN

It's not rocket science, but I want to know if that's what you mean. It's quite easy to do in %d
and would add an extra check.

> : /NUMBER ( addr u -- addr2 u2 d|ud )
> /sign >r 0 0 2swap >number 2swap r> if dnegate then ;
> which is a std function in my forth (along with /FLOAT )
That's basically what's done in the handling of "%d". Others, like %u, %o, %x do unsigned numbers.

> A /SIGN that didn't check empty string could come unstuck with:
>
> : >pad ( a u -- a' u ) tuck pad swap cmove pad swap ;
>
> s" -1234" >pad 2drop
> s" " >pad /sign .s 4749724 -1 -1

I don't quite know what you mean there, but I lifted the routine from FPIN and this is what I got:
- You FIRST do the "sign flag";
- Then you determine whether a 1 /STRING should be executed.

So I think this will do the job:

[char] d of \ 'd' requires a sign
dup if
(sign) ['] (dec) (number) 2>r swap if negate then (assign)
else 2>r true then endof

This will even signal it as an error (which basically, it is). You could even argue that ANY attempt
(apart from %s) to extract anything from an empty string is futile - and hence: an error.

I'm quite interested what you think of this line of thought.

Hans Bezemer

Re: Numeric parsing

<ae7f8d82-da17-43d7-b036-1e445e73c43an@googlegroups.com>

 by: Hans Bezemer - Tue, 5 Apr 2022 10:16 UTC

On Tuesday, April 5, 2022 at 5:15:23 AM UTC+2, dxforth wrote:
I think quite slowly - so sometimes some thoughts come afterwards.

Of course, I've done quite some testing. There are basically two ways SSCANF stops:
- Either everything that had to be parsed has been parsed;
- Or there is some error along the way.

However - at the very end BOTH these conditions may occur. Now, the basic question is:
Do you want a "permissive" or a "strict" SSCANF? There is something to be said for both.

Now let's apply it to the real world, where data is flawed and work has got to be done. That
would speak for a more permissive approach. You don't want this thing to stop two dozen times
for a trivial error when doing a run with thousands of records.

On the other hand.. ;-)

Hans Bezemer

Re: Numeric parsing

<1b8d2282-5641-42d0-a062-59e241260ba0n@googlegroups.com>

 by: Hans Bezemer - Tue, 5 Apr 2022 11:12 UTC

On Tuesday, April 5, 2022 at 5:15:23 AM UTC+2, dxforth wrote:
Obviously - I wrote that one too ;-)

IMHO, it has a lot more sane behavior:

%15c%d
012345678901234

Unparsed format:
Number of data read: 1
Stack bleed: 1

Contents
b = -.554405041/23002/0/
c$ = 012345678901234

%15c%d
01234567890

Unparsed format:
Number of data read: 1
Stack bleed: 1

Contents
b = -.554405041/23002/0/
c$ = 01234567890

%d%d
01234567890

Unparsed format:
Number of data read: 1
Stack bleed: 1

Contents
a = -.554405041/23002/0/
b = 1234567890

%d,%d
01234567890

Unparsed format: %d
Number of data read: 1
Stack bleed: 1

Hans Bezemer

Re: Numeric parsing

<cc6d21be-5b0f-422f-a1b3-2de463a58d32n@googlegroups.com>

 by: Hans Bezemer - Tue, 5 Apr 2022 13:15 UTC

On Tuesday, April 5, 2022 at 5:15:23 AM UTC+2, dxforth wrote:
Updated Forth version - if some of the words elude you: https://sourceforge.net/p/forth-4th/code/HEAD/tree/trunk/4th.src/lib/easy.4th

BTW, I've already converted the test program as well. Lemme know if there's any interest. It's basic, but it works.

---8<---
\ 4tH library - SSCANF - Copyright 2022 J.L. Bezemer
\ You can redistribute this file and/or modify it under
\ the terms of the GNU General Public License

\ Like sscanf(), but with a few differences
\ - Whitespace in both buffer and format string is largely ignored;
\ - %c with a width REQUIRES a string, %c without REQUIRES a variable;
\ - %s will stop on ANY defined delimiter, not just whitespace;
\ - Returns the unscanned part of the format string;
\ - Variable #SCANF returns the number of assignments made;
\ - Stack diagram of variables is inverted: d c b a s" %a %b %c %d";
\ - On failure, unused variables WILL remain on the stack.

\ Typical use:
\ a b c$ d$ e f g
\ s" %c%c %d%%%s, %4c%c%c" s" ab -12345%This is the end, 543210" sscanf

s" MAX-N" environment? \ query environment
[IF] \ if successful
constant max-n \ create constant MAX-N
[ELSE]
.( Warning: MAX-N undefined) cr
[THEN]

variable (delim) \ delimiter of string
variable (width) \ width of string
variable #scanf \ number of assignments
\ ANS Forth interface
: char- 1 chars - ; ( a -- a-1)
: >zero dup xor ; ( n -- 0)
: ;then postpone exit postpone then ; immediate
: unless postpone 0= postpone if ; immediate
: c/string over >r 1- swap char+ swap r> c@ ; ( a n -- a n-1 c)
\ a few execution tokens
: (dec) decimal ; : (hex) hex ; : (oct) 8 base ! ;
: (putback) 1+ swap char- swap ; ( a n -- a-1 n+1)
: (unumber) 0. 2swap >number 2>r d>s 2r> ; ( a1 n1 -- n3 a2 n2)
: (number) base @ >r execute (unumber) r> base ! ;
: (width!) ['] (dec) (number) rot dup unless max-n + then (width) ! ;
: (delimiter!) dup if c/string >r (putback) r> else dup then (delim) ! ;
: (delimiter?) over [char] ! - 0< over bl = and >r = r> or ;
: (assigned) 1 #scanf +! false ; ( -- f)
: (assign) swap ! (assigned) ; ( x n -- f)
: (place) rot place (assigned) ; ( a1 a2 n2 -- f)
\ get sign flag
: (sign) ( a1 n1 -- f a2 n2)
c/string dup [char] - = dup >r \ is it a minus sign?
if drop else [char] + <> if (putback) then then r> -rot
; \ drop plus sign
\ skip white space
: (skipwhite) ( a1 n1 -- a2 n2)
begin dup while c/string bl > dup >r if (putback) then r> until then
; \ parse buffer string
: (getstr) ( a1 n1 -- a2 n2 f)
over >r begin \ save starting address
dup \ any string left?
while \ if so, still within width?
over r@ - (width) @ < \ did we hit the delimiter?
while \ if so, put back the character
c/string (delim) @ (delimiter?) dup if >r (putback) r> then
until then then over r@ - r> swap \ calculate string dimensions
2swap 2>r (place) 2r> rot \ place string and get flag
;

: (getitem) ( a1 n1 c -- a2 n2 f)
case \ select type specifier
[char] d of \ 'd' requires a sign
(sign) ['] (dec) (number) 2>r swap if negate then (assign) endof
[char] u of ['] (dec) (number) 2>r (assign) endof
[char] x of ['] (hex) (number) 2>r (assign) endof
[char] o of ['] (oct) (number) 2>r (assign) endof
[char] % of c/string [char] % <> -rot 2>r endof
[char] c of \ if default width specified
(width) @ max-n = if \ just parse a single character
c/string -rot 2>r (assign) \ if a width has been specified
else \ take the entire length specified
2dup (width) @ /string 0 max 2>r (width) @ min (place)
then endof \ and put it in a string
endcase 2r> rot
; \ handle type specifiers
: (gettype) ( a1 n1 a2 n2 -- f)
2>r (width!) c/string >r (delimiter!) r> -rot
2r> 2swap 2>r (skipwhite) rot dup [char] s =
if drop (getstr) else over if (getitem) else 0<> then then
2r> rot >r 2swap r> \ restore stack and pull up flag
; \ select character on format string
: (select) ( a1 n1 c a2 n2 -- a3 n3 a4 n4 f)
rot dup >r \ save current character
[char] % = if rdrop (gettype) ;then \ act on type specifier
r@ bl = if (skipwhite) 2swap (skipwhite) 2swap r> >zero ;then
c/string r> <> \ does this character match
;

: sscanf ( xn .. x0 a1 n1 a2 n2 -- a3 n3)
0 #scanf ! 2>r begin dup while c/string 2r> (select) -rot 2>r until then 2rdrop
;
---8<---

Hans Bezemer

Re: Numeric parsing

<t2ing3$1qtg$1@gioia.aioe.org>

 by: dxforth - Wed, 6 Apr 2022 00:39 UTC

On 5/04/2022 20:05, Hans Bezemer wrote:
> On Tuesday, April 5, 2022 at 5:15:23 AM UTC+2, dxforth wrote:
>
>> A /SIGN that didn't check empty string could come unstuck with:
>>
>> : >pad ( a u -- a' u ) tuck pad swap cmove pad swap ;
>>
>> s" -1234" >pad 2drop
>> s" " >pad /sign .s 4749724 -1 -1
>
> I don't quite know what you mean by there, but I lifted the routine from FPIN and this
> is what I got:
> - You FIRST do the "sign flag";
> - Then you determine whether a 1 /STRING should be executed.

I had to look at it twice to see that it handled the empty string case.
In retrospect I wouldn't write it that way today given it potentially
accesses a character that's out of bounds. A more robust definition
would be:

: /SIGN ( a u -- a' u' f )
dup if
over c@ dup [char] + =
swap [char] - = dup >r or
negate /string r> exit
then 0 ;

>
> So I think this will do the job:
>
> [char] d of \ 'd' requires a sign
> dup if
> (sign) ['] (dec) (number) 2>r swap if negate then (assign)
> else 2>r true then endof
>
> This will even signal it as an error (which basically, it is). You could even argue that ANY attempt
> (apart from %s) to extract anything from an empty string is futile - and hence: an error.
>
> I'm quite interested what you think of this line of thought.

Can't speak for sscanf but the default for /NUMBER is to return numeric 0
for an empty string. I would prefer the app decide whether it's acceptable
or not, rather than the routine.
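
To make "the app decides" concrete, here is a minimal sketch. /SIGN and /NUMBER are repeated from this thread so the snippet loads on its own; PARSE-INT is a hypothetical name, not something from DX-Forth or 4tH. It returns the double together with a flag that is true only if the string was non-empty and fully consumed. Note that a lone sign character such as "-" would still pass this particular check.

```forth
\ /SIGN and /NUMBER as posted in this thread
: /SIGN ( a u -- a' u' f )
  dup if
    over c@ dup [char] + =
    swap [char] - = dup >r or
    negate /string r> exit
  then 0 ;

: /NUMBER ( addr u -- addr2 u2 d|ud )
  /sign >r 0 0 2swap >number 2swap r> if dnegate then ;

\ Hypothetical wrapper: the application, not /NUMBER, rejects empty input
: PARSE-INT ( addr u -- d flag )
  dup 0= >r            \ remember whether the input was empty
  /number 2swap nip    \ keep d and the unparsed length u2
  0=                   \ true if nothing was left unparsed
  r> 0= and ;          \ ... and the input was not empty

s" -1234" parse-int . d.   \ displays -1 -1234
s" "      parse-int . d.   \ displays 0 0
s" 12x"   parse-int . d.   \ displays 0 12
```

/NUMBER itself stays permissive; only the wrapper applies the stricter policy.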

Re: Numeric parsing

<11c1721a-ec7a-4c03-afac-add62dae68a0n@googlegroups.com>

 by: Hans Bezemer - Wed, 6 Apr 2022 11:23 UTC

On Wednesday, April 6, 2022 at 2:39:34 AM UTC+2, dxforth wrote:
> I had to look at it twice to see that it handled the empty string case.
Frankly, I had to put it into an editor and write out the stack diagram for every few words
to interpret what was going on. But sometimes I have to do that for my own programs
as well, so I didn't see any point in mentioning that at first. ;-)

> Can't speak for sscanf but the default for /NUMBER is to return numeric 0
> for an empty string. I would prefer the app decide whether it's acceptable
> or not, rather than the routine.
Well, as usual I delved into the standard to see what was written there.

"3.2.1.2 Digit conversion
<blabla>
The value in BASE is the radix for number conversion. A digit has a value ranging from zero to one
less than the contents of BASE. The digit with the value zero corresponds to the character 0.
This representation of digits proceeds through the character set to the decimal value nine
corresponding to the character 9. For digits beginning with the decimal value ten the graphic
characters beginning with the character A are used. This correspondence continues up to and
including the digit with the decimal value thirty-five which is represented by the character Z".

=> The conversion of digits outside this range is implementation defined. <=
I don't quite know what to make of this - but it probably concerns stuff outside radix 2 to 36.

However - NOWHERE is it mentioned what to do with either an empty string - or a string of blanks
(>FLOAT does, however - but that's outside this issue). On the other hand, the behavior of >NUMBER
on an empty string in Gforth IMPLIES that such a conversion is valid, since there is NO way to
check whether such a case is rejected (you'd also get an empty string back when a single S" 0" was fed
to it). Spaces are clearly rejected though.

I'm not gonna quote the standard on >NUMBER again - it's simply ignored there. Overflow is mentioned
as a possible issue, but that's it.

So the standard doesn't respond to the question whether empty strings are valid numbers. I'm afraid
that puts us in a stalemate.

Personally, if a string is empty I don't like the idea of "inventing" numbers when they're obviously not
there. That doesn't mean that there are no "use cases" where this could apply. I've seen enough corporate
data to agree with you on that issue.

Having said that - my 4tH compiler's "NUMBER" fails on an empty string as well. Which means I've been quite
consistent in the last 30 years.

Of course, feel free to plug your /SIGN into any SSCANF version and use it as you like. But given the
alternative, I tend to like the consistent behavior of this more recent version of SSCANF - and I thank you
for the suggestions that made me reconsider - although I may not have accepted and implemented them
verbatim.

Hans Bezemer

Re: Numeric parsing

<2022Apr6.134315@mips.complang.tuwien.ac.at>

 by: Anton Ertl - Wed, 6 Apr 2022 11:43 UTC

Hans Bezemer <the.beez.speaks@gmail.com> writes:

>However - NOWHERE is mentioned what to do with either an empty string
>- or a string of blanks (>FLOAT does, however - but that's outside
>this issue). On the other hand, the behavior of an empty string to
>>NUMBER in Gforth IMPLIES that such conversion is valid.

The word

>NUMBER ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )

is pretty low-level. It just converts as many digits as it finds and
then stops. How to use that is up to you. E.g., if you only want to
accept a string that only contains digits, you check if u2=0; if you
don't want to accept an empty string, you check if u1=0; if you want
to process a non-zero number of digits that may be followed by
something else, you check if u1>u2.
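
The three checks described above could be sketched as follows. The word names are made up for illustration, and BASE is assumed to be decimal.

```forth
\ Hypothetical helper names; only >NUMBER is a standard word here.

\ Accept a string that contains only digits: u2 = 0
: DIGITS-ONLY? ( c-addr u -- f )
  0. 2swap >number nip nip nip 0= ;

\ Additionally refuse the empty string: test u1 = 0 first
: NONEMPTY-DIGITS? ( c-addr u -- f )
  dup 0= if 2drop false exit then
  digits-only? ;

\ At least one digit, possibly followed by something else: u1 > u2
: LEADING-DIGITS? ( c-addr u -- f )
  dup >r 0. 2swap >number nip nip nip r> u< ;

s" 123" digits-only?     .   \ displays -1
s" 12a" digits-only?     .   \ displays 0
s" "    digits-only?     .   \ displays -1 (empty string passes this test)
s" "    nonempty-digits? .   \ displays 0
s" 12a" leading-digits?  .   \ displays -1
s" a12" leading-digits?  .   \ displays 0
```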

>I'm not gonna quote the standard on >NUMBER again - it's simply
>ignored there. Overflow is mentioned as a possible issue, but that's
>it.
>
>So the standard doesn't respond to the question whether empty strings
>are valid numbers.

As I see it, it clearly does. If u1=0, ud2=ud1. If the first
character of c-addr1 u1 is a non-digit, ud2=ud1.
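
A small sketch of that reading: the accumulator passed in as ud1 comes back unchanged when there is nothing to convert. KEEP? is a hypothetical helper that just discards the remaining string; BASE is assumed to be decimal.

```forth
\ Hypothetical helper: run >NUMBER and keep only the resulting double
: KEEP? ( ud1 c-addr u -- ud2 )  >number 2drop ;

123. s" "   keep? d.   \ displays 123  (u1=0, so ud2=ud1)
123. s" x9" keep? d.   \ displays 123  (first character is a non-digit)
123. s" 4x" keep? d.   \ displays 1234 (one digit accumulated onto ud1)
```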

If you want something more high-level than >NUMBER, Gforth currently
offers REC-NUM:

s" -123" rec-num ... clearstack \ <2> #-123 `recognized-num ok
s" $123." rec-num ... clearstack \ <3> #291 #0 `recognized-dnum ok
s" $123x" rec-num ... clearstack \ <1> `notfound ok

These examples demonstrate a number of things that >NUMBER (by itself)
lacks:

* support for negative numbers

* support for number prefixes

* support for doubles

I reformatted your overly long lines.

- anton
--
M. Anton Ertl http://www.complang.tuwien.ac.at/anton/home.html
comp.lang.forth FAQs: http://www.complang.tuwien.ac.at/forth/faq/toc.html
New standard: http://www.forth200x.org/forth200x.html
EuroForth 2021: https://euro.theforth.net/2021

Re: Numeric parsing

<cc38e4a8-44f5-4ce2-8eec-22d2bcc5a190n@googlegroups.com>

 by: Hans Bezemer - Wed, 6 Apr 2022 12:26 UTC

On Wednesday, April 6, 2022 at 2:05:57 PM UTC+2, Anton Ertl wrote:
> >NUMBER ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
>
> is pretty low-level. It just converts as many digits as it finds and
> then stops. How to use that is up to you. E.g., if you only want to
> accept a string that only contains digits, you check if u2=0; if you
> don't want to accept an empty string, you check if u1=0; if you want
> to process a non-zero number of digits that may be followed by
> something else, you check if u1>u2.

I see where you're coming from - and I might even agree with you.
Having said that, I would expect such an explanation in the Rationale -
if only to eradicate any doubt - since the reference to "digit conversion"
doesn't cut it. At least not completely - it leaves room for interpretation
IMHO.

BTW, it was not intended as any criticism of Gforth. It just so happens
that Gforth was positioned as a "reference implementation" and -
as I've stated numerous times - I tend to use it as such to test conforming
behavior. Such was the case here.

As for your reference to REC-NUM - it's not part of the ANS-94 standard.
I won't doubt its usefulness - 4tH's got a variant called "NUMBER" which
roughly equals the same basic function - but IMHO it can't be regarded
as a reference point in this discussion. And given that criterion - neither
can "NUMBER" (which clearly rejects an empty string).

Hans Bezemer

Re: Numeric parsing

<2c096b92-db96-4091-9069-7ff16d11c9ccn@googlegroups.com>

 by: S Jack - Wed, 6 Apr 2022 12:41 UTC

On Tuesday, April 5, 2022 at 7:39:34 PM UTC-5, dxforth wrote:

> Can't speak for sscanf but the default for /NUMBER is to return numeric 0
> for an empty string. I would prefer the app decide whether it's acceptable
> or not, rather than the routine.

Just throw. If the user needs a value for the empty case, he can catch and supply it.
Isn't that the purpose of catch/throw?
--
me
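
One way the throw-and-catch idea might look, as a sketch only. /SIGN and /NUMBER are repeated from earlier in the thread; STRICT-NUMBER and NUMBER-OR-ZERO are hypothetical names, and the choice of throw code -24 ("invalid numeric argument" in the standard's table) is an assumption, not something anyone in the thread specified.

```forth
: /SIGN ( a u -- a' u' f )
  dup if
    over c@ dup [char] + =
    swap [char] - = dup >r or
    negate /string r> exit
  then 0 ;

: /NUMBER ( addr u -- addr2 u2 d|ud )
  /sign >r 0 0 2swap >number 2swap r> if dnegate then ;

\ Hypothetical strict variant: an empty string is an error
: STRICT-NUMBER ( c-addr u -- d )
  dup 0= if -24 throw then     \ -24: invalid numeric argument
  /number 2swap 2drop ;        \ discard the unparsed remainder

\ A caller that wants 0 for the empty case catches and supplies it
: NUMBER-OR-ZERO ( c-addr u -- d )
  ['] strict-number catch
  if 2drop 0 0 then ;          \ on a throw, c-addr u are still occupying the stack

s" -42" number-or-zero d.   \ displays -42
s" "    number-or-zero d.   \ displays 0
```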

Re: Numeric parsing

<d1c6e39a-b923-4d63-8051-addea258f40fn@googlegroups.com>

 by: Hans Bezemer - Wed, 6 Apr 2022 13:17 UTC

On Wednesday, April 6, 2022 at 2:41:15 PM UTC+2, S Jack wrote:
> On Tuesday, April 5, 2022 at 7:39:34 PM UTC-5, dxforth wrote:
>
> > Can't speak for sscanf but the default for /NUMBER is to return numeric 0
> > for an empty string. I would prefer the app decide whether it's acceptable
> > or not, rather than the routine.
> Just throw. If the user needs a value for the empty case, he can catch and supply it.
> Isn't that the purpose of catch/throw?
In my routine you wouldn't even have to CATCH|THROW. Like sscanf(), you can extract the number
of values that would have been assigned and consume the left over variables yourself - that's especially
easy when they're all numerical values:

#expected #sscanf - 0 ?do 0 swap ! loop

HB
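
A minimal sketch of the one-liner above, assuming the addresses of the unassigned variables are still on the stack when the scan returns (that is a reading of the post, not something shown in it). ZERO-REST and the variables are hypothetical names; the count is passed in directly instead of being computed from #expected and #sscanf.

```forth
\ Assumption: n addresses of unassigned variables are on the stack.
variable width   variable height   variable depth

: zero-rest ( addr1 ... addrn n -- )   \ store 0 in each of the n cells
  0 ?do 0 swap ! loop ;

\ Pretend three values were expected but only WIDTH was assigned:
5 width !  7 height !  9 depth !
height depth 2 zero-rest     \ HEIGHT and DEPTH now hold 0, WIDTH is untouched
```

Because ?DO skips the loop when the count is zero, nothing is touched when every expected value was assigned.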

Re: Numeric parsing

<b6231fd1-c521-42a2-b0f6-1dab4aa76af1n@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=17495&group=comp.lang.forth#17495

X-Received: by 2002:a05:622a:14c8:b0:2e1:d626:66ea with SMTP id u8-20020a05622a14c800b002e1d62666eamr7448409qtx.58.1649252581590;
Wed, 06 Apr 2022 06:43:01 -0700 (PDT)
X-Received: by 2002:a05:622a:194:b0:2e1:e733:5798 with SMTP id
s20-20020a05622a019400b002e1e7335798mr7470557qtw.104.1649252581411; Wed, 06
Apr 2022 06:43:01 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!nntp.club.cc.cmu.edu!45.76.7.193.MISMATCH!3.us.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.forth
Date: Wed, 6 Apr 2022 06:43:01 -0700 (PDT)
In-Reply-To: <2c096b92-db96-4091-9069-7ff16d11c9ccn@googlegroups.com>
Injection-Info: google-groups.googlegroups.com; posting-host=79.224.111.239; posting-account=AqNUYgoAAADmkK2pN-RKms8sww57W0Iw
NNTP-Posting-Host: 79.224.111.239
References: <t2gc87$1h8a$1@gioia.aioe.org> <81c65224-bdf3-4a15-8214-c49f41a24ccfn@googlegroups.com>
<t2ing3$1qtg$1@gioia.aioe.org> <2c096b92-db96-4091-9069-7ff16d11c9ccn@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <b6231fd1-c521-42a2-b0f6-1dab4aa76af1n@googlegroups.com>
Subject: Re: Numeric parsing
From: minfo...@arcor.de (minf...@arcor.de)
Injection-Date: Wed, 06 Apr 2022 13:43:01 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 13
 by: minf...@arcor.de - Wed, 6 Apr 2022 13:43 UTC

S Jack schrieb am Mittwoch, 6. April 2022 um 14:41:15 UTC+2:
> On Tuesday, April 5, 2022 at 7:39:34 PM UTC-5, dxforth wrote:
>
> > Can't speak for sscanf but the default for /NUMBER is to return numeric 0
> > for an empty string. I would prefer the app decide whether it's acceptable
> > or not, rather than the routine.
> Just throw. If the user needs a value for the empty case, he can catch and supply it.
> Isn't that the purpose of catch/throw?

The only issue I ever had with >NUMBER was overflow of the ud accumulator.
It had been a pita to debug. Now my >NUMBER throws an exception on overflow.

This is one example where the standard wrap-around behavior of integer math
in Forth is poor.
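
One way such an overflow-checking conversion could look, as a sketch only: the poster's own >NUMBER is not shown, so the words U+C, UD*BASE+ and CHECKED>NUMBER below are hypothetical, and the choice of throw code -11 ("result out of range") is an assumption. Digits are recognised by handing one character at a time to the system's own >NUMBER, so the digit rules stay whatever the host Forth uses.

```forth
\ Unsigned add that also reports wrap-around.
: u+c ( u1 u2 -- sum carry )  over + tuck u> ;

\ ud*BASE+digit; THROWs -11 if the result does not fit in a double cell.
: ud*base+ ( ud digit -- ud' )
  >r                         \ lo hi            R: digit
  base @ um*                 \ lo h.lo h.hi     hi*BASE as a double
  0<> -11 and throw          \ lo h.lo          hi*BASE must fit in one cell
  swap base @ um*            \ h.lo l.lo l.hi   lo*BASE as a double
  rot u+c -11 and throw      \ l.lo hi'         carry out of the high cell
  swap r> u+c 1 and          \ hi' lo' c        add the digit; c is 0 or 1
  rot u+c -11 and throw ;    \ lo' hi''         propagate the carry

\ Same stack effect as >NUMBER, but overflow raises an exception.
: checked>number ( ud1 c-addr1 u1 -- ud2 c-addr2 u2 )
  begin dup while
    over 0 0 rot 1 >number nip nip   \ ud a u digit u2
    if drop exit then                \ not a digit in the current BASE
    rot rot 2>r                      \ ud digit        R: a u
    ud*base+ 2r>                     \ ud' a u
    swap char+ swap 1-               \ step past the converted character
  repeat ;
```

A caller that prefers wrap-around can keep using the standard word; one that wants the error can wrap CHECKED>NUMBER in CATCH.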

Re: Numeric parsing

<t2k6ph$lk2$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=17496&group=comp.lang.forth#17496

Path: i2pn2.org!i2pn.org!aioe.org!7AktqsUqy5CCvnKa3S0Dkw.user.46.165.242.75.POSTED!not-for-mail
From: dxfo...@gmail.com (dxforth)
Newsgroups: comp.lang.forth
Subject: Re: Numeric parsing
Date: Thu, 7 Apr 2022 00:06:41 +1000
Organization: Aioe.org NNTP Server
Message-ID: <t2k6ph$lk2$1@gioia.aioe.org>
References: <t2gc87$1h8a$1@gioia.aioe.org>
<81c65224-bdf3-4a15-8214-c49f41a24ccfn@googlegroups.com>
<t2ing3$1qtg$1@gioia.aioe.org>
<11c1721a-ec7a-4c03-afac-add62dae68a0n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="22146"; posting-host="7AktqsUqy5CCvnKa3S0Dkw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.7.0
Content-Language: en-GB
X-Notice: Filtered by postfilter v. 0.9.2
 by: dxforth - Wed, 6 Apr 2022 14:06 UTC

On 6/04/2022 21:23, Hans Bezemer wrote:
> ...
> I'm not gonna quote the standard on >NUMBER again - it's simply ignored there.

But >NUMBER isn't a complete str-to-int routine - only a parser. I think the
spec is clear what happens when passed an empty string - it's a NOP.
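
A small sketch of what that means for a caller, assuming a standard system: since >NUMBER returns its arguments unchanged for a zero-length string, the application can tell "nothing converted" apart from "converted to zero" by comparing lengths. CONVERTED? is a hypothetical name, not a word from the standard or from this thread.

```forth
\ Convert leading digits; flag is true if at least one character was consumed.
: converted? ( c-addr u -- ud flag )
  dup >r                 \ remember the original length
  0 0 2swap >number      \ ud c-addr2 u2
  nip r> <> ;            \ lengths differ only if something was converted

decimal
s" "   converted? . d.   \ flag is false (0), accumulator is still zero
s" 42" converted? . d.   \ flag is true (-1), accumulator is 42
```

Whether a false flag means "use zero" or "reject the input" is then the application's decision.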

> So the standard doesn't respond to the question whether empty strings are valid numbers.
> I'm afraid that puts us in a stalemate.

Perhaps because unlike >FLOAT, ANS never provided an equivalent routine for ints -
only a primitive for making one.

> Personally, if a string is empty I don't like the idea of "inventing" numbers when they're obviously not
> there. That doesn't mean that there are no "use cases" where this could apply. I've seen enough corporate
> data to agree with you on that issue.
>
> That having said - my 4tH compilers "NUMBER" fails on an empty string as well. Which means I've been quite
> consistent in the last 30 years.

IIRC C provides several such routines each having their own behaviour.

Re: Numeric parsing

<t2k785$1037$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=17497&group=comp.lang.forth#17497

Path: i2pn2.org!i2pn.org!aioe.org!7AktqsUqy5CCvnKa3S0Dkw.user.46.165.242.75.POSTED!not-for-mail
From: dxfo...@gmail.com (dxforth)
Newsgroups: comp.lang.forth
Subject: Re: Numeric parsing
Date: Thu, 7 Apr 2022 00:14:29 +1000
Organization: Aioe.org NNTP Server
Message-ID: <t2k785$1037$1@gioia.aioe.org>
References: <t2gc87$1h8a$1@gioia.aioe.org>
<81c65224-bdf3-4a15-8214-c49f41a24ccfn@googlegroups.com>
<t2ing3$1qtg$1@gioia.aioe.org>
<2c096b92-db96-4091-9069-7ff16d11c9ccn@googlegroups.com>
<b6231fd1-c521-42a2-b0f6-1dab4aa76af1n@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="32871"; posting-host="7AktqsUqy5CCvnKa3S0Dkw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.7.0
X-Notice: Filtered by postfilter v. 0.9.2
Content-Language: en-GB
 by: dxforth - Wed, 6 Apr 2022 14:14 UTC

On 6/04/2022 23:43, minf...@arcor.de wrote:
> S Jack schrieb am Mittwoch, 6. April 2022 um 14:41:15 UTC+2:
>> On Tuesday, April 5, 2022 at 7:39:34 PM UTC-5, dxforth wrote:
>>
>> > Can't speak for sscanf but the default for /NUMBER is to return numeric 0
>> > for an empty string. I would prefer the app decide whether it's acceptable
>> > or not, rather than the routine.
>> Just throw. If the user needs a value for the empty case, he can catch and supply it.
>> Isn't that the purpose of catch/throw?
>
> The only issue I ever had with >NUMBER was overflow of the ud accumulator.
> It had been a pita to debug. Now my >NUMBER throws an exception on overflow.
>
> This is one example where the standard wrap-around behavior of integer math
> in Forth is poor.

Just like the [Turbo] C compiler :)

Re: Numeric parsing

<155eccbe-9450-4141-933b-a49c84f8616an@googlegroups.com>


https://www.novabbs.com/devel/article-flat.php?id=17498&group=comp.lang.forth#17498

X-Received: by 2002:a05:620a:2993:b0:67d:7119:9f19 with SMTP id r19-20020a05620a299300b0067d71199f19mr5908495qkp.494.1649256152448;
Wed, 06 Apr 2022 07:42:32 -0700 (PDT)
X-Received: by 2002:a05:620a:3cc:b0:67b:e77:6f21 with SMTP id
r12-20020a05620a03cc00b0067b0e776f21mr6005387qkm.272.1649256152275; Wed, 06
Apr 2022 07:42:32 -0700 (PDT)
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!1.us.feeder.erje.net!3.us.feeder.erje.net!feeder.erje.net!border1.nntp.dca1.giganews.com!nntp.giganews.com!news-out.google.com!nntp.google.com!postnews.google.com!google-groups.googlegroups.com!not-for-mail
Newsgroups: comp.lang.forth
Date: Wed, 6 Apr 2022 07:42:31 -0700 (PDT)
In-Reply-To: <t2k785$1037$1@gioia.aioe.org>
Injection-Info: google-groups.googlegroups.com; posting-host=82.95.228.79; posting-account=Ebqe4AoAAABfjCRL4ZqOHWv4jv5ZU4Cs
NNTP-Posting-Host: 82.95.228.79
References: <t2gc87$1h8a$1@gioia.aioe.org> <81c65224-bdf3-4a15-8214-c49f41a24ccfn@googlegroups.com>
<t2ing3$1qtg$1@gioia.aioe.org> <2c096b92-db96-4091-9069-7ff16d11c9ccn@googlegroups.com>
<b6231fd1-c521-42a2-b0f6-1dab4aa76af1n@googlegroups.com> <t2k785$1037$1@gioia.aioe.org>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <155eccbe-9450-4141-933b-a49c84f8616an@googlegroups.com>
Subject: Re: Numeric parsing
From: the.beez...@gmail.com (Hans Bezemer)
Injection-Date: Wed, 06 Apr 2022 14:42:32 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 16
 by: Hans Bezemer - Wed, 6 Apr 2022 14:42 UTC

On Wednesday, April 6, 2022 at 4:14:31 PM UTC+2, dxforth wrote:
> > This is one example where the standard wrap-around behavior of integer math
> > in Forth is poor.
> Just like the [Turbo] C compiler :)

I don't mind "wrap around" behavior. Sometimes that just is what the doctor ordered.
Of all the "bit tricks" that I pulled I wouldn't want to see the behavior of VBA, where
every overflow results in "BEEP - I halt your program right there".

BTW, the standard says "An ambiguous condition exists if ud2 overflows during the conversion".
So, you may pretty much do as you see fit.

Yes, >NUMBER is seriously lacking in some respects - and as usual the required functionality
is lacking from the standard - leaving it to the likes of us to come up with our highly different
solutions to the problem. Oh, I really love portability.

Hans Bezemer

Re: Numeric parsing

<t2kb1o$11t5$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=17499&group=comp.lang.forth#17499

Path: i2pn2.org!i2pn.org!aioe.org!7AktqsUqy5CCvnKa3S0Dkw.user.46.165.242.75.POSTED!not-for-mail
From: dxfo...@gmail.com (dxforth)
Newsgroups: comp.lang.forth
Subject: Re: Numeric parsing
Date: Thu, 7 Apr 2022 01:19:20 +1000
Organization: Aioe.org NNTP Server
Message-ID: <t2kb1o$11t5$1@gioia.aioe.org>
References: <t2gc87$1h8a$1@gioia.aioe.org>
<81c65224-bdf3-4a15-8214-c49f41a24ccfn@googlegroups.com>
<t2ing3$1qtg$1@gioia.aioe.org>
<2c096b92-db96-4091-9069-7ff16d11c9ccn@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="34725"; posting-host="7AktqsUqy5CCvnKa3S0Dkw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.7.0
Content-Language: en-GB
X-Notice: Filtered by postfilter v. 0.9.2
 by: dxforth - Wed, 6 Apr 2022 15:19 UTC

On 6/04/2022 22:41, S Jack wrote:
> On Tuesday, April 5, 2022 at 7:39:34 PM UTC-5, dxforth wrote:
>
>> Can't speak for sscanf but the default for /NUMBER is to return numeric 0
>> for an empty string. I would prefer the app decide whether it's acceptable
>> or not, rather than the routine.
>
> Just throw. If the user needs a value for the empty case, he can catch and supply it.
> Isn't that the purpose of catch/throw?

catch/throw is a solution of last resort

Re: Numeric parsing

<t2ligj$10r3$1@gioia.aioe.org>


https://www.novabbs.com/devel/article-flat.php?id=17500&group=comp.lang.forth#17500

Path: i2pn2.org!i2pn.org!aioe.org!7AktqsUqy5CCvnKa3S0Dkw.user.46.165.242.75.POSTED!not-for-mail
From: dxfo...@gmail.com (dxforth)
Newsgroups: comp.lang.forth
Subject: Re: Numeric parsing
Date: Thu, 7 Apr 2022 12:32:52 +1000
Organization: Aioe.org NNTP Server
Message-ID: <t2ligj$10r3$1@gioia.aioe.org>
References: <t2gc87$1h8a$1@gioia.aioe.org>
<81c65224-bdf3-4a15-8214-c49f41a24ccfn@googlegroups.com>
<t2ing3$1qtg$1@gioia.aioe.org>
<2c096b92-db96-4091-9069-7ff16d11c9ccn@googlegroups.com>
<b6231fd1-c521-42a2-b0f6-1dab4aa76af1n@googlegroups.com>
<t2k785$1037$1@gioia.aioe.org>
<155eccbe-9450-4141-933b-a49c84f8616an@googlegroups.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Injection-Info: gioia.aioe.org; logging-data="33635"; posting-host="7AktqsUqy5CCvnKa3S0Dkw.user.gioia.aioe.org"; mail-complaints-to="abuse@aioe.org";
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101
Thunderbird/91.7.0
Content-Language: en-GB
X-Notice: Filtered by postfilter v. 0.9.2
 by: dxforth - Thu, 7 Apr 2022 02:32 UTC

On 7/04/2022 00:42, Hans Bezemer wrote:
> On Wednesday, April 6, 2022 at 4:14:31 PM UTC+2, dxforth wrote:
>> > This is one example where the standard wrap-around behavior of integer math
>> > in Forth is poor.
>> Just like the [Turbo] C compiler :)
>
> I don't mind "wrap around" behavior. Sometimes that just is what the doctor ordered.
> Of all the "bit tricks" that I pulled I wouldn't want to see the behavior of VBA, where
> every overflow results in "BEEP - I halt your program right there".
>
> BTW, the standard says "An ambiguous condition exists if ud2 overflows during the conversion".
> So, you may pretty much do as you see fit.
>
> Yes, >NUMBER is seriously lacking in some respects - and as usual the required functionality
> is lacking from the standard - leaving it to the likes of us to come up with our highly different
> solutions to the problem. Oh, I really love portability.

Standard >NUMBER works with doubles; so it takes some effort on the part of
the user to overflow even on a 16-bit system. For all but safety-critical apps
it probably suffices and 'garbage in - garbage out' applies.

What is problematic is each forth's 'NUMBER' routine as each implementer seems
to have their own idea what constitutes a 'valid' integer. Nor do they
necessarily document these rules. What's for sure is the more strict 'NUMBER'
is, the less flexible and harder to apply it becomes. When I wanted to
implement comma-separated values for a command-line app I balked at the prospect
of having to create yet another parser and more workarounds just so I could use
my glorious and fool-proof NUMBER? . That's when I added /NUMBER which is as
simple as it gets. And the funny thing is, I could probably use it where I
previously used NUMBER? and the user would scarcely know.

That said, I'm not here to promote /NUMBER. What I do think every forth
should have is /SIGN - not least because they already do - in one form or
another. It simply hasn't been standardized - which is a pity given it's
just as fundamental and useful as >NUMBER .
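
For readers who have not met such a word, a sketch of the idea under stated assumptions: the definition of /SIGN is not given in this post, so SKIP-SIGN below is a hypothetical stand-in that strips one optional leading sign and reports whether it was a minus. STR>D is likewise made up to show how it pairs with >NUMBER; DNEGATE comes from the Double-Number word set.

```forth
\ Strip an optional leading + or - ; flag is true when a '-' was removed.
: skip-sign ( c-addr u -- c-addr' u' flag )
  dup 0= if false exit then
  over c@                                   \ c-addr u char
  dup [char] - = if drop swap char+ swap 1- true exit then
  [char] + = if swap char+ swap 1- then
  false ;

\ Signed string to double; flag is true if the whole string was consumed.
: str>d ( c-addr u -- d flag )
  skip-sign >r                  \ c-addr' u'       R: negative?
  0 0 2swap >number nip 0=      \ ud ok?
  r> if >r dnegate r> then ;

decimal
s" -42" str>d . d.              \ flag true, value -42
```

In this sketch an empty string, or a lone sign, converts to zero with a true flag, so accepting or rejecting that case is left to the application.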
