Subject  Author
* Discard extraneous structure returned by parser combinators?  luserdroog
+* Re: Discard extraneous structure returned by parser combinators?  Ben Bacarisse
|`* Re: Discard extraneous structure returned by parser combinators?  luserdroog
| `* Re: Discard extraneous structure returned by parser combinators?  Ben Bacarisse
|  `* Re: Discard extraneous structure returned by parser combinators?  luserdroog
|   `- Re: Discard extraneous structure returned by parser combinators?  luserdroog
+* Re: Discard extraneous structure returned by parser combinators?  Paul Rubin
|`* Re: Discard extraneous structure returned by parser combinators?  luserdroog
| `- Re: Discard extraneous structure returned by parser combinators?  luserdroog
`* Re: Discard extraneous structure returned by parser combinators?  luserdroog
 `- Re: Discard extraneous structure returned by parser combinators?  luserdroog

Discard extraneous structure returned by parser combinators?

<1c28fc1c-ea36-450a-adf7-993fa66f25f9n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=9888&group=comp.lang.functional#9888

Newsgroups: comp.lang.functional
 by: luserdroog - Sun, 14 Nov 2021 04:54 UTC

Is there a systematic way to discard the extra noise that can occur
when using parser combinators? Take the `many` combinator, which
matches zero or more instances of its argument parser: in the case
of zero matches, it still needs to return a value.

In my situation, I'm simulating everything in PostScript because it's
my favorite language. I'm simulating Lisp cons cells as 2-element
arrays. So for this JSON string,

( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report

if I make no special effort, I get a resulting value that looks like this:

OK
[[3 [[4 [[5 [[] []]] []]] []]] []]
remainder:[]

All those little empty arrays need to just go away, but not any of the
important array structure. `many` and `maybe` seem to be the chief
culprits, but then their results are propagated back by `alt`s and
`then`s all the way back to the top.

Do I need to make some kind of out-of-band signal for these "zeros"
that I can filter out later? The obvious problem here is that the array
type is being used for too many things. But there's a paucity of
types in PostScript, sigh. For the JSON application, I have nametype
objects available that don't have a JSON counterpart.

Do I need to rewrite all the combinators to filter out noise values at
every turn?
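
The out-of-band signal asked about here can be pictured with a minimal sketch. It is written in Python rather than PostScript, purely for illustration, and both names are hypothetical: `ZERO` is a sentinel that `many` and `maybe` would return on zero matches, and `strip_zeros` is a single cleanup pass. Because the sentinel is a distinct object, genuine empty arrays survive the filtering.

```python
# Hypothetical sentinel standing in for "matched nothing"; it is distinct
# from a genuine empty list, so the two can be told apart later.
ZERO = object()

def strip_zeros(value):
    """Recursively drop ZERO markers from nested lists, keeping real structure."""
    if isinstance(value, list):
        return [strip_zeros(item) for item in value if item is not ZERO]
    return value

# A result with ZERO where zero-match combinators fired, plus one genuine
# empty list at the end that must be preserved.
noisy = [3, [4, [5, ZERO], ZERO], ZERO, []]
cleaned = strip_zeros(noisy)
print(cleaned)
```

In PostScript terms, a nametype object could plausibly play the role of `ZERO`, since names have no JSON counterpart.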

Re: Discard extraneous structure returned by parser combinators?

<878rxqrak5.fsf@bsb.me.uk>

https://www.novabbs.com/devel/article-flat.php?id=9889&group=comp.lang.functional#9889

 by: Ben Bacarisse - Sun, 14 Nov 2021 16:01 UTC

luserdroog <mijoryx@yahoo.com> writes:

> Is there a systematic way to discard the extra noise that can occur
> when using parser combinators? For example, the `many` combinator
> which matches zero or more instances of its argument parser.
> In the case of zero matches, it still needs to return a value.
>
> In my situation, I'm simulating everything in PostScript because it's
> my favorite language. I'm simulating Lisp cons cells as 2-element
> arrays. So for this JSON string,
>
> ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
>
> if I make no special effort, I get a resulting value that looks like this:
>
> OK
> [[3 [[4 [[5 [[] []]] []]] []]] []]
> remainder:[]
>
> All those little empty arrays need to just go away, but not any of the
> important array structure.

So you want

[[3 [[4 [[5 []]]]]]]

?

> `many` and `maybe` seem to be the chief
> culprits, but then their results are propagated back by `alt`s and
> `then`s all the way back to the top.
>
> Do I need to make some kind of out-of-band signal for these "zeros"
> that I can filter out later? The obvious problem here is that the array
> type is being used for too many things. But there's a paucity of
> types in PostScript, sigh. For the JSON application, I have nametype
> objects available that don't have a JSON counterpart.
>
> Do I need to rewrite all the combinators to filter out noise values at
> every turn?

It's odd to call something that you are returning (presumably) as
noise. Are you using lists as a sort of Maybe monad with [] as Nothing?

I think you'd have to show the code to get anything more concrete as a
reply.

--
Ben.

Re: Discard extraneous structure returned by parser combinators?

<87pmr2uyc9.fsf@nightsong.com>

https://www.novabbs.com/devel/article-flat.php?id=9890&group=comp.lang.functional#9890

 by: Paul Rubin - Sun, 14 Nov 2021 23:11 UTC

luserdroog <mijoryx@yahoo.com> writes:
> Is there a systematic way to discard the extra noise that can occur
> when using parser combinators? For example, the `many` combinator
> which matches zero or more instances of its argument parser.
> In the case of zero matches, it still needs to return a value.

I'd expect 'many' to return a list, an empty list in the case of zero
matches. What is the extra noise? Your PostScript example is
confusing. I'd expect ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse to give
something like [3, [4, [5]]], using square brackets to denote lists.

I didn't know parser combinators were even a thing in PostScript: or are
you trying to implement them? You could look at the Parsec paper to see
how they traditionally worked:

https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/parsec-paper-letter.pdf
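
The expectation described above, that `many` yields a list and an empty list on zero matches, can be sketched as follows. This is a Python illustration and not Parsec's or the poster's implementation; it assumes a hypothetical convention where a parser is a function from `(text, pos)` to either `(result, new_pos)` or `None` on failure.

```python
def char(c):
    """Parser that matches the single character c."""
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1   # (result, new position)
        return None             # failure
    return parse

def many(p):
    """Apply p zero or more times; always succeeds with a possibly empty list."""
    def parse(text, pos):
        results = []
        while True:
            reply = p(text, pos)
            if reply is None:
                return results, pos
            value, new_pos = reply
            if new_pos == pos:  # guard against parsers that consume nothing
                return results, pos
            results.append(value)
            pos = new_pos
    return parse

some_as = many(char("a"))
print(some_as("aab", 0))
print(some_as("b", 0))
```

With this shape, the zero-match case is an ordinary empty list rather than a separate kind of value.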

Re: Discard extraneous structure returned by parser combinators?

<b2f228f9-383c-4578-9847-61a3fc9c7bedn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=9891&group=comp.lang.functional#9891

 by: luserdroog - Mon, 15 Nov 2021 03:27 UTC

On Sunday, November 14, 2021 at 10:01:16 AM UTC-6, Ben Bacarisse wrote:
> luserdroog <mij...@yahoo.com> writes:
>
> > Is there a systematic way to discard the extra noise that can occur
> > when using parser combinators? For example, the `many` combinator
> > which matches zero or more instances of its argument parser.
> > In the case of zero matches, it still needs to return a value.
> >
> > In my situation, I'm simulating everything in PostScript because it's
> > my favorite language. I'm simulating Lisp cons cells as 2-element
> > arrays. So for this JSON string,
> >
> > ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
> >
> > if I make no special effort, I get a resulting value that looks like this:
> >
> > OK
> > [[3 [[4 [[5 [[] []]] []]] []]] []]
> > remainder:[]
> >
> > All those little empty arrays need to just go away, but not any of the
> > important array structure.
> So you want
>
> [[3 [[4 [[5 []]]]]]]
>
> ?

I guess that's the big problem here. I'm not sure what I want. I keep having
to add extra code to clean up and delete the extra stuff. Ultimately the
result should be

[ 3 [ 4 [ 5 ] ] ]

The parser for arrays looks for the left bracket, then ...

/Jarray //begin-array
//value executeonly xthen
//value-separator //value executeonly xthen many then %{ps flatten ps} using
maybe
//end-array thenx
{ %filter-zeros first %ps
} using def

The `executeonly` are in there to prevent infinite recursion if the expanded
code ever gets printed (like in a stack dump while debugging). The /Jarray
parser is one of the components of the //value parser.

Hmm. Initially I had the `then` combinator doing a Lisp-style (append)
operation on my simulated lists, so something like

(a) char (b) char then

would -- if matched by the input -- return

[ (a) [ (b) [] ] ]

which I could then easily massage into

[ (a) (b) ]

But that led me into problems when I wanted to use the combinators
`xthen` and `thenx` which discard one of the two pieces. If the results
are just appended together in a list, then I've lost the information to
peel them back apart. So I changed `then` to just (cons) the pieces
together, and now `xthen` and `thenx` have an easy job.
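
The trade-off just described can be sketched on plain data, in Python for illustration. All four function names are hypothetical and operate on already-parsed results, not on parsers: an appending `then` erases the boundary between the two results, while a pairing `then` keeps it, so that discarding one side is trivial.

```python
def then_append(left, right):
    """`then` that splices both results into one flat list."""
    return list(left) + list(right)

def then_cons(left, right):
    """`then` that keeps the two results as a pair."""
    return (left, right)

def xthen(pair):
    """Discard the left result, keep the right."""
    return pair[1]

def thenx(pair):
    """Keep the left result, discard the right."""
    return pair[0]

left, right = ["a", "b"], ["c"]
flat = then_append(left, right)   # the boundary between left and right is gone
pair = then_cons(left, right)     # the boundary is preserved
print(flat, xthen(pair), thenx(pair))
```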

And extra noise values pop up if I use combinators like `maybe`
and `many` which might succeed with zero matches. So, ...

> > `many` and `maybe` seem to be the chief
> > culprits, but then their results are propagated back by `alt`s and
> > `then`s all the way back to the top.
> >
> > Do I need to make some kind of out-of-band signal for these "zeros"
> > that I can filter out later? The obvious problem here is that the array
> > type is being used for too many things. But there's a paucity of
> > types in PostScript, sigh. For the JSON application, I have nametype
> > objects available that don't have a JSON counterpart.
> >
> > Do I need to rewrite all the combinators to filter out noise values at
> > every turn?
> It's odd to call something that you are returning (presumably) as
> noise. Are you using lists as a sort of Maybe monad with [] as Nothing?
>

Yes, I think that's what I'm doing, clumsily.

> I think you'd have to show the code to get anything more concrete as a
> reply.
>
> --
> Ben.

Re: Discard extraneous structure returned by parser combinators?

<05f0a5b6-c22d-426b-b1e5-b8950dd00e72n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=9892&group=comp.lang.functional#9892

 by: luserdroog - Mon, 15 Nov 2021 03:29 UTC

On Sunday, November 14, 2021 at 5:11:37 PM UTC-6, Paul Rubin wrote:
> luserdroog <mij...@yahoo.com> writes:
> > Is there a systematic way to discard the extra noise that can occur
> > when using parser combinators? For example, the `many` combinator
> > which matches zero or more instances of its argument parser.
> > In the case of zero matches, it still needs to return a value.
> I'd expect 'many' to return a list, an empty list in the case of zero
> matches. What is the extra noise? Your PostScript example is
> confusing. I'd expect ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse to give
> something like [3, [4, [5]]], using square brackets to denote lists.
>
> I didn't know parser combinators were even a thing in PostScript: or are
> you trying to implement them? You could look at the Parsec paper to see
> how they traditionally worked:
>
> https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/parsec-paper-letter.pdf

It's something I've been trying to do in PostScript for a while now.
A lot of the saga is detailed in comp.lang.postscript.
Code is at github.com/luser-dr00g/pcomb/ps

Re: Discard extraneous structure returned by parser combinators?

<87mtm5puix.fsf@bsb.me.uk>

https://www.novabbs.com/devel/article-flat.php?id=9893&group=comp.lang.functional#9893

 by: Ben Bacarisse - Mon, 15 Nov 2021 10:45 UTC

luserdroog <mijoryx@yahoo.com> writes:

> On Sunday, November 14, 2021 at 10:01:16 AM UTC-6, Ben Bacarisse wrote:
>> luserdroog <mij...@yahoo.com> writes:
>>
>> > Is there a systematic way to discard the extra noise that can occur
>> > when using parser combinators? For example, the `many` combinator
>> > which matches zero or more instances of its argument parser.
>> > In the case of zero matches, it still needs to return a value.
>> >
>> > In my situation, I'm simulating everything in PostScript because it's
>> > my favorite language. I'm simulating Lisp cons cells as 2-element
>> > arrays. So for this JSON string,
>> >
>> > ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
>> >
>> > if I make no special effort, I get a resulting value that looks like this:
>> >
>> > OK
>> > [[3 [[4 [[5 [[] []]] []]] []]] []]
>> > remainder:[]
>> >
>> > All those little empty arrays need to just go away, but not any of the
>> > important array structure.
>> So you want
>>
>> [[3 [[4 [[5 []]]]]]]
>>
>> ?
>
> I guess that's the big problem here. I'm not sure what I want. I keep having
> to add extra code to clean up and delete the extra stuff. Ultimately the
> result should be
>
> [ 3 [ 4 [ 5 ] ] ]
>
> The parser for arrays looks for the left bracket, then ...
>
> /Jarray //begin-array
> //value executeonly xthen
> //value-separator //value executeonly xthen many then %{ps flatten ps} using
> maybe
> //end-array thenx
> { %filter-zeros first %ps
> } using def
>
> The `executeonly` are in there to prevent infinite recursion if the expanded
> code ever gets printed (like in a stack dump while debugging). The /Jarray
> parser is one of the components of the //value parser.
>
> Hmm. Initially I had the `then` combinator doing a Lisp-style (append)
> operation on my simulated lists, so something like
>
> (a) char (b) char then
>
> would -- if matched by the input -- return
>
> [ (a) [ (b) [] ] ]
>
> which I could then easily massage into
>
> [ (a) (b) ]

I think you need to pin down what you want. You made a remark about
using two-element arrays as cons cells. In that case, parsing (a) and
(b) in sequence /should/ give [ (a) [ (b) [] ] ]. Massaging that into
something else seems like the wrong strategy.
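
As an illustration of the cons-cell reading, here is a small Python sketch in which two-element lists stand in for the two-element PostScript arrays. The helper names `cons`, `from_items`, and `to_items` are hypothetical. The point is that `[ (a) [ (b) [] ] ]` is already a well-formed list under this representation and can be walked directly.

```python
NIL = []  # the empty-list marker, as in [ (a) [ (b) [] ] ]

def cons(head, tail):
    """Build a two-element cell [head, tail]."""
    return [head, tail]

def from_items(items):
    """Build a cons list from a Python sequence."""
    result = NIL
    for item in reversed(items):
        result = cons(item, result)
    return result

def to_items(cell):
    """Walk a cons list and collect its heads in order."""
    items = []
    while cell != NIL:
        head, cell = cell
        items.append(head)
    return items

ab = from_items(["a", "b"])
print(ab, to_items(ab))
```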

--
Ben.

Re: Discard extraneous structure returned by parser combinators?

<bd2648a8-9be6-4c7c-a6c8-8c5d0a96d74fn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=9894&group=comp.lang.functional#9894

 by: luserdroog - Mon, 15 Nov 2021 22:13 UTC

On Monday, November 15, 2021 at 4:45:12 AM UTC-6, Ben Bacarisse wrote:
> luserdroog <mij...@yahoo.com> writes:
>
> > On Sunday, November 14, 2021 at 10:01:16 AM UTC-6, Ben Bacarisse wrote:
> >> luserdroog <mij...@yahoo.com> writes:
> >>
> >> > Is there a systematic way to discard the extra noise that can occur
> >> > when using parser combinators? For example, the `many` combinator
> >> > which matches zero or more instances of its argument parser.
> >> > In the case of zero matches, it still needs to return a value.
> >> >
> >> > In my situation, I'm simulating everything in PostScript because it's
> >> > my favorite language. I'm simulating Lisp cons cells as 2-element
> >> > arrays. So for this JSON string,
> >> >
> >> > ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
> >> >
> >> > if I make no special effort, I get a resulting value that looks like this:
> >> >
> >> > OK
> >> > [[3 [[4 [[5 [[] []]] []]] []]] []]
> >> > remainder:[]
> >> >
> >> > All those little empty arrays need to just go away, but not any of the
> >> > important array structure.
> >> So you want
> >>
> >> [[3 [[4 [[5 []]]]]]]
> >>
> >> ?
> >
> > I guess that's the big problem here. I'm not sure what I want. I keep having
> > to add extra code to clean up and delete the extra stuff. Ultimately the
> > result should be
> >
> > [ 3 [ 4 [ 5 ] ] ]
> >
> > The parser for arrays looks for the left bracket, then ...
> >
> > /Jarray //begin-array
> > //value executeonly xthen
> > //value-separator //value executeonly xthen many then %{ps flatten ps} using
> > maybe
> > //end-array thenx
> > { %filter-zeros first %ps
> > } using def
> >
> > The `executeonly` are in there to prevent infinite recursion if the expanded
> > code ever gets printed (like in a stack dump while debugging). The /Jarray
> > parser is one of the components of the //value parser.
> >
> > Hmm. Initially I had the `then` combinator doing a Lisp-style (append)
> > operation on my simulated lists, so something like
> >
> > (a) char (b) char then
> >
> > would -- if matched by the input -- return
> >
> > [ (a) [ (b) [] ] ]
> >
> > which I could then easily massage into
> >
> > [ (a) (b) ]
> I think you need to pin down what you want. You made a remark about
> using two-element arrays as cons cells. In that case, parsing (a) and
> (b) in sequence /should/ give [ (a) [ (b) [] ] ]. Massaging that into
> something else seems like the wrong strategy.
>

Yes, I think I may have presented an X/Y problem or started the story
from the middle. In all of my test cases for this version (regexes,
PostScript scanner, JSON parser) I had to write a `fix` function
to convert these lists into arrays that I can work with more simply.

But with each test case, I've had to re-write the `fix`. And it keeps
running into more problems. The final one that "broke the camel's back"
and sent me searching for the deeper design flaw was this:

( [ {"a":4,"b":5}, 6, {"c":"7"}] ) dup JSON-parse report

which yields this:

OK
[[<< /a 4 /b 5 >> [6 << /c (7) >>]] []]
remainder:[]

I can remove the trailing [] easily. But for the extra array in the middle
grouping the 6 and the dictionary, I don't know where to patch the
grammar to remove it. So that seems to show that I'm
doing something wrong deeper down in the organization of the
program.
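
One common remedy for stray grouping in "item, separator, item" rules is a separator-aware combinator that builds the flat list itself; Parsec calls its version `sepBy`. The Python sketch below is an illustration under assumptions, not the pcomb code: the names `char`, `digit`, and `sep_by` are hypothetical, and a parser is assumed to map `(text, pos)` to `(result, new_pos)` or `None`.

```python
def char(c):
    """Parser that matches the single character c."""
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return parse

def digit(text, pos):
    """Parse one decimal digit into an int."""
    if pos < len(text) and text[pos] in "0123456789":
        return int(text[pos]), pos + 1
    return None

def sep_by(p, sep):
    """Parse zero or more p separated by sep; return a flat list of p results."""
    def parse(text, pos):
        first = p(text, pos)
        if first is None:
            return [], pos              # zero items: a plain empty list
        value, pos = first
        items = [value]
        while True:
            after_sep = sep(text, pos)
            if after_sep is None:
                return items, pos
            nxt = p(text, after_sep[1])
            if nxt is None:
                return items, pos       # a trailing separator is left unconsumed
            value, pos = nxt
            items.append(value)
    return parse

digits = sep_by(digit, char(","))
print(digits("4,5,6", 0))
```

Because the separators are dropped and the items are collected inside one combinator, no intermediate pairs are left over to clean up afterwards.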

So, it appears I need to back up and rebuild the pieces more slowly,
making sure that the regex example doesn't need any kind of `fix`
before moving on to the more complicated ones.

On a similar front, I can rewrite `xthen` and `thenx` to do their
jobs without composing off of `then`. That way I don't need to
worry about the result of `then` needing to be taken apart again into
2 pieces.

Re: Discard extraneous structure returned by parser combinators?

<c3c3919b-6912-4abd-8201-c79f4175410dn@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=9895&group=comp.lang.functional#9895

 by: luserdroog - Sun, 21 Nov 2021 00:49 UTC

On Monday, November 15, 2021 at 4:13:31 PM UTC-6, luserdroog wrote:
> On Monday, November 15, 2021 at 4:45:12 AM UTC-6, Ben Bacarisse wrote:
> > luserdroog <mij...@yahoo.com> writes:
> >
> > > On Sunday, November 14, 2021 at 10:01:16 AM UTC-6, Ben Bacarisse wrote:
> > >> luserdroog <mij...@yahoo.com> writes:
> > >>
> > >> > Is there a systematic way to discard the extra noise that can occur
> > >> > when using parser combinators? For example, the `many` combinator
> > >> > which matches zero or more instances of its argument parser.
> > >> > In the case of zero matches, it still needs to return a value.
> > >> >
> > >> > In my situation, I'm simulating everything in PostScript because it's
> > >> > my favorite language. I'm simulating Lisp cons cells as 2-element
> > >> > arrays. So for this JSON string,
> > >> >
> > >> > ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse report
> > >> >
> > >> > if I make no special effort, I get a resulting value that looks like this:
> > >> >
> > >> > OK
> > >> > [[3 [[4 [[5 [[] []]] []]] []]] []]
> > >> > remainder:[]
> > >> >
> > >> > All those little empty arrays need to just go away, but not any of the
> > >> > important array structure.
> > >> So you want
> > >>
> > >> [[3 [[4 [[5 []]]]]]]
> > >>
> > >> ?
> > >
> > > I guess that's the big problem here. I'm not sure what I want. I keep having
> > > to add extra code to clean up and delete the extra stuff. Ultimately the
> > > result should be
> > >
> > > [ 3 [ 4 [ 5 ] ] ]
> > >
> > > The parser for arrays looks for the left bracket, then ...
> > >
> > > /Jarray //begin-array
> > > //value executeonly xthen
> > > //value-separator //value executeonly xthen many then %{ps flatten ps} using
> > > maybe
> > > //end-array thenx
> > > { %filter-zeros first %ps
> > > } using def
> > >
> > > The `executeonly` are in there to prevent infinite recursion if the expanded
> > > code ever gets printed (like in a stack dump while debugging). The /Jarray
> > > parser is one of the components of the //value parser.
> > >
> > > Hmm. Initially I had the `then` combinator doing a Lisp-style (append)
> > > operation on my simulated lists, so something like
> > >
> > > (a) char (b) char then
> > >
> > > would -- if matched by the input -- return
> > >
> > > [ (a) [ (b) [] ] ]
> > >
> > > which I could then easily massage into
> > >
> > > [ (a) (b) ]
> > I think you need to pin down what you want. You made a remark about
> > using two-element arrays as cons cells. In that case, parsing (a) and
> > (b) in sequence /should/ give [ (a) [ (b) [] ] ]. Massaging that into
> > something else seems like the wrong strategy.
> >
> Yes, I think I may have presented an X/Y problem or started the story
> from the middle. In all of my test cases for this version (regexes,
> PostScript scanner, JSON parser) I had to write a `fix` function
> to convert these lists into arrays that I can work with more simply.
[snip]

Ugh. I think it wasn't even an X/Y problem. It was a "Doctor, it hurts when
I move my arm like this." "So don't move your arm like that." problem.

I want the "result" part of the "reply" structure (using new terms following
usage from the Parsec document) to be any of the /usual/ PostScript types:
integer, real, string, boolean, array, dictionary.

But I also need some way to arbitrarily combine or concatenate two
objects regardless of type. My `then` (aka `seq`) combinator needs to
do this. So I made a hack-y function that does the combining. If it has
two arrays, it composes the contents into a longer array. If it has one
array and some other object it extends the array by one and stuffs
the object in the front or back as appropriate. If it has two non-array
objects it makes a new 2-element array to contain them.

So, instead of building `xthen` and `thenx` off of `then` and needing
to cons, car, and cdr the stuff, I can write all 3 of these as a more
general parameterized function.

sequence{ p q u }{
  { /p exec +is-ok {
      next x-xs force /q exec +is-ok {
        next x-xs 3 1 roll /u exec exch consok
      }{
        x-xs 3 2 roll ( after ) exch cons exch cons cons
      } ifelse
    } if } ll } @func
then { {append} sequence }
xthen { {exch pop} sequence }
thenx { {pop} sequence }

append {
  1 index zero eq { exch pop }{
    dup zero eq { pop }{
      1 index type /arraytype eq {
        dup type /arraytype eq { compose }{ one compose } ifelse
      }{
        dup type /arraytype eq { curry }{ cons } ifelse
      } ifelse
    } ifelse
  } ifelse
}

(`@func` is my own non-standard extension to PostScript that takes
a procedure body and a list of parameters and wraps the procedure with
code that defines the arguments in a local dictionary. `ll` is my hack-y
PostScript way of making lambdas with hard-patched parameters; it's
short for `load all literals`.)
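
For readers who don't read PostScript, the same design can be sketched in Python. This is a loose transliteration under assumptions, not the code above: `ZERO` is a hypothetical stand-in for `zero`, error reporting is reduced to returning `None`, and a parser is assumed to map `(text, pos)` to `(result, new_pos)` or `None`.

```python
ZERO = object()  # hypothetical stand-in for the `zero` marker

def append(a, b):
    """Combine two results regardless of type, following the rules described above."""
    if a is ZERO:
        return b
    if b is ZERO:
        return a
    if isinstance(a, list):
        return a + b if isinstance(b, list) else a + [b]
    if isinstance(b, list):
        return [a] + b
    return [a, b]

def sequence(p, q, combine):
    """Run p, then q on the remaining input; merge results with combine(left, right)."""
    def parse(text, pos):
        first = p(text, pos)
        if first is None:
            return None
        left, pos = first
        second = q(text, pos)
        if second is None:
            return None
        right, pos = second
        return combine(left, right), pos
    return parse

def then(p, q):
    return sequence(p, q, append)

def xthen(p, q):
    return sequence(p, q, lambda left, right: right)   # keep the right result

def thenx(p, q):
    return sequence(p, q, lambda left, right: left)    # keep the left result

def char(c):
    """Parser that matches the single character c."""
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return parse

abc = then(then(char("a"), char("b")), char("c"))
bracketed = thenx(xthen(char("["), char("x")), char("]"))
print(abc("abc", 0), bracketed("[x]", 0))
```

Note that under this scheme a result that is itself a list gets spliced into its neighbor rather than nested, so values that must stay intact as arrays would need to be protected in some way before being combined.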

Re: Discard extraneous structure returned by parser combinators?

<6b9152c3-63d0-487d-ab57-a5e3e390bd66n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=9896&group=comp.lang.functional#9896

 by: luserdroog - Sun, 21 Nov 2021 02:17 UTC

On Saturday, November 13, 2021 at 10:54:09 PM UTC-6, luserdroog wrote:
> Is there a systematic way to discard the extra noise that can occur
> when using parser combinators? For example, the `many` combinator
> which matches zero or more instances of its argument parser.
> In the case of zero matches, it still needs to return a value.
>
[snip]
Sigh. I already had this same problem before. It came up when I googled it. [sad trombone]

https://stackoverflow.com/q/55346600/733077

Re: Discard extraneous structure returned by parser combinators?

<2fa45e9b-5835-4fc6-9bfe-4f77045bf746n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=9897&group=comp.lang.functional#9897

Newsgroups: comp.lang.functional
Date: Sat, 20 Nov 2021 19:55:26 -0800 (PST)
In-Reply-To: <05f0a5b6-c22d-426b-b1e5-b8950dd00e72n@googlegroups.com>
References: <1c28fc1c-ea36-450a-adf7-993fa66f25f9n@googlegroups.com>
<87pmr2uyc9.fsf@nightsong.com> <05f0a5b6-c22d-426b-b1e5-b8950dd00e72n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <2fa45e9b-5835-4fc6-9bfe-4f77045bf746n@googlegroups.com>
Subject: Re: Discard extraneous structure returned by parser combinators?
From: mijo...@yahoo.com (luserdroog)
Injection-Date: Sun, 21 Nov 2021 03:55:26 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 23
 by: luserdroog - Sun, 21 Nov 2021 03:55 UTC

On Sunday, November 14, 2021 at 9:29:55 PM UTC-6, luserdroog wrote:
> On Sunday, November 14, 2021 at 5:11:37 PM UTC-6, Paul Rubin wrote:
> > luserdroog <mij...@yahoo.com> writes:
> > > Is there a systematic way to discard the extra noise that can occur
> > > when using parser combinators? For example, the `many` combinator
> > > which matches zero or more instances of its argument parser.
> > > In the case of zero matches, it still needs to return a value.
> > I'd expect 'many' to return a list, an empty list in the case of zero
> > matches. What is the extra noise? Your PostScript example is
> > confusing. I'd expect ( [ 3, [ 4, [ 5 ] ] ] ) JSON-parse to give
> > something like [3, [4, [5]]], using square brackets to denote lists.
> >
> > I didn't know parser combinators were even a thing in PostScript: or are
> > you trying to implement them? You could look at the Parsec paper to see
> > how they traditionally worked:
> >
> > https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/parsec-paper-letter.pdf
> It's something I've been trying to do in PostScript for a while now.
> A lot of the saga is detailed in comp.lang.postscript.
> Code is at github.com/luser-dr00g/pcomb/ps

Oops. Bad link. Here it is:

https://github.com/luser-dr00g/pcomb/tree/master/ps
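
For readers following the quoted exchange, here is a minimal sketch of the behaviour Paul Rubin describes: `many` always succeeds and returns a list, which is empty when there are zero matches. It is written in Python rather than PostScript, and it is not the code from the repository. The parser representation (a function from text and position to a value/position pair, or None on failure) and the helper `char` are assumptions made for this illustration.

```python
# Hypothetical representation: a parser is a function (text, pos) that
# returns (value, new_pos) on success or None on failure.

def char(c):
    """Parser that matches the single character c."""
    def parse(text, pos):
        if pos < len(text) and text[pos] == c:
            return c, pos + 1
        return None
    return parse

def many(p):
    """Match p zero or more times; always succeeds and returns a list."""
    def parse(text, pos):
        values = []
        while True:
            result = p(text, pos)
            if result is None:
                # Zero further matches: return what we have (possibly []).
                return values, pos
            value, new_pos = result
            if new_pos == pos:
                # p succeeded without consuming input; stop to avoid looping forever.
                return values, pos
            values.append(value)
            pos = new_pos
    return parse
```

With this shape, `many(char("a"))("bbb", 0)` gives an empty list and leaves the position at 0, so callers never have to distinguish "no result" from "a result".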

Re: Discard extraneous structure returned by parser combinators?

<5e1fdf27-c25a-4f51-b681-bb645a57eda7n@googlegroups.com>

https://www.novabbs.com/devel/article-flat.php?id=9899&group=comp.lang.functional#9899

Newsgroups: comp.lang.functional
Date: Fri, 3 Dec 2021 21:21:48 -0800 (PST)
In-Reply-To: <6b9152c3-63d0-487d-ab57-a5e3e390bd66n@googlegroups.com>
References: <1c28fc1c-ea36-450a-adf7-993fa66f25f9n@googlegroups.com> <6b9152c3-63d0-487d-ab57-a5e3e390bd66n@googlegroups.com>
User-Agent: G2/1.0
MIME-Version: 1.0
Message-ID: <5e1fdf27-c25a-4f51-b681-bb645a57eda7n@googlegroups.com>
Subject: Re: Discard extraneous structure returned by parser combinators?
From: mijo...@yahoo.com (luserdroog)
Injection-Date: Sat, 04 Dec 2021 05:21:48 +0000
Content-Type: text/plain; charset="UTF-8"
Lines: 32
 by: luserdroog - Sat, 4 Dec 2021 05:21 UTC

On Saturday, November 20, 2021 at 8:17:24 PM UTC-6, luserdroog wrote:
> On Saturday, November 13, 2021 at 10:54:09 PM UTC-6, luserdroog wrote:
> > Is there a systematic way to discard the extra noise that can occur
> > when using parser combinators? For example, the `many` combinator
> > which matches zero or more instances of its argument parser.
> > In the case of zero matches, it still needs to return a value.
> >
> [snip]
> Sigh. I already had this same problem before. It came up when I googled it. [sad trombone]
>
> https://stackoverflow.com/q/55346600/733077

Coda: The rewrite is proceeding well. I got the basic parsers typed up fresh
and simplified from the previous round. And I got the regular expression
parser (test case) written with fewer gyrations than before. And I got the PostScript
scanner (test case) written, but there I did end up needing a `fix` function.

Precisely because of my earlier decision that the result could be a value of any
type or an array of any number of values of any type, when processing this
result ... I do kinda need to know whether the thing is an array or not and
handle the two cases differently.

So, it's the same but different, you know? I need to call `fix` but it makes
sense why I have to do it, and importantly *where* it will need to be done
to get the thing to work right.
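
The thread does not show what `fix` actually does, so the following is only one plausible reading, sketched in Python rather than PostScript: a normalizer that wraps a bare value in a list and passes a list through unchanged, so that downstream code sees one shape. The function `join_chars` is a hypothetical consumer added for illustration.

```python
def fix(result):
    """Normalize a parser result: lists pass through, anything else is wrapped."""
    if isinstance(result, list):
        return result
    return [result]

def join_chars(result):
    """Hypothetical consumer: build a string from one character or a list of them."""
    return "".join(fix(result))
```

Calling `fix` at the point where a result is consumed, rather than inside every combinator, matches the observation above that what matters is knowing *where* the normalization has to happen.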

I suppose the moral is, in this situation I can follow the Lisp stuff a
little less literally. And possibly more broadly, as I've lamented in both
comp.lang.c and comp.lang.postscript.

Every few rewrites I try to clean up the lazy evaluation and be more general
and strategic with it. But every time I just end up sprinkling in calls to `force`
until it works right.
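
To make the `force` idea concrete, here is a small Python sketch of delayed evaluation over a lazy list of input characters. It is an illustration under assumed names (`Thunk`, `lazy_chars`), not a translation of the PostScript code; the only thing taken from the thread is that a `force` call is what turns a suspended computation into a value.

```python
class Thunk:
    """A delayed computation that is evaluated at most once."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def force(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
            self._compute = None  # drop the closure once it has run
        return self._value

def force(x):
    """Evaluate x while it is a Thunk; return any other value unchanged."""
    while isinstance(x, Thunk):
        x = x.force()
    return x

def lazy_chars(text, pos=0):
    """Lazy list of characters: None at the end, else (head, Thunk of the rest)."""
    if pos >= len(text):
        return None
    return (text[pos], Thunk(lambda: lazy_chars(text, pos + 1)))
```

Because `force` is a no-op on anything that is not a thunk, calling it defensively is harmless, which may be part of why sprinkling it in until things work is so tempting.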

