Is it the job of a parser to validate the input data?

<21-08-011@comp.compilers>


https://www.novabbs.com/devel/article-flat.php?id=78&group=comp.compilers#78

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!tr1.eu1.usenetexpress.com!feeder.usenetexpress.com!tr2.iad1.usenetexpress.com!border1.nntp.dca1.giganews.com!nntp.giganews.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: christop...@compiler-resources.com (Christopher F Clark)
Newsgroups: comp.compilers
Subject: Is it the job of a parser to validate the input data?
Date: Thu, 12 Aug 2021 15:10:40 +0300
Organization: Compilers Central
Lines: 52
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-08-011@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="55169"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, syntax
Posted-Date: 12 Aug 2021 11:52:12 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Christopher F Clark - Thu, 12 Aug 2021 12:10 UTC

Roger L Costello <costello@mitre.org> asked:

> There are many data formats which contain things like this:
>
> A number, N
> N occurrences of something
>
> For example, 3 followed by the names of three students:
>
> 3
> John Doe
> Sally Smith
> Judy Jones
>
> I have a question about parsing such data. Is it the job of a parser to ensure
> that the number of student names matches the number? Or, is it the job of the
> parser to merely tokenize whatever is in the input and then create an abstract
> syntax tree containing the tokens?

It is almost always done in the AST creation routines. Not only do you,
as our insightful moderator mentioned, generally get better error
messages that way, but curiously, the features of extracting a number,
turning it into a count, and applying that count (and yes, those might be 3
distinct operations) to determine how many items a list contains have not
been implemented in any parser generator or lexer generator that I have
ever seen. That's a bizarre omission, particularly since it is a
common feature in many languages, such as networking protocols. Doing
fixed counts isn't rare, but doing a count held in a "register" or
"variable" seems not to be done.

The conversion step should generally be deferred to "semantic (aka
action) code or a predicate", as the process is messy and best handled
by some well-tuned code, not by something a lexer/parser generator just
outputs in the hope that it is semantically correct.
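
A minimal sketch of that split, using the student-list example from the question. It is only an illustration, not how any particular parser generator does it: the names `parse_lines`, `build_ast`, `StudentList`, and `CountMismatch` are hypothetical, and the input is assumed to be one count line followed by one name per line. The syntactic step only separates the count token from the items; the conversion and the check happen in the AST-building step, where a specific error message is easy to produce.

```python
from __future__ import annotations

from dataclasses import dataclass


class CountMismatch(ValueError):
    """Raised when the declared count is invalid or disagrees with the items."""


@dataclass
class StudentList:
    declared: int
    names: list[str]


def parse_lines(text: str) -> tuple[str, list[str]]:
    """Syntactic step: split the input into a count token and item tokens."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty input: expected a count line")
    return lines[0], lines[1:]


def build_ast(count_token: str, items: list[str]) -> StudentList:
    """Semantic step: convert the count, then check it against the items."""
    try:
        declared = int(count_token)
    except ValueError:
        raise CountMismatch(f"count {count_token!r} is not an integer") from None
    if declared < 0:
        raise CountMismatch(f"count {declared} is negative")
    if declared != len(items):
        raise CountMismatch(
            f"header declares {declared} names but {len(items)} were found"
        )
    return StudentList(declared, items)
```

Calling `build_ast(*parse_lines(text))` on the three-student example returns a `StudentList`; dropping one of the names raises `CountMismatch` with a message that names both numbers.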

I have it (all 3 steps) on my near-endless to-do list to fix that for
Yacc++, but it isn't near the top of it.

By the way, when working with Michela Becchi on doing a hardware
regular expression engine at Intel, she studied the problem of counted
regular expressions and proposed some interesting implementation
details of how to handle them. Anyone interested in high speed regular
expression implementations would be well advised to look up her papers
on the topic.

--
******************************************************************************
Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------

Re: Is it the job of a parser to validate the input data?

<21-09-001@comp.compilers>


https://www.novabbs.com/devel/article-flat.php?id=80&group=comp.compilers#80

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: luser.dr...@gmail.com (luser droog)
Newsgroups: comp.compilers
Subject: Re: Is it the job of a parser to validate the input data?
Date: Fri, 3 Sep 2021 21:37:53 -0700 (PDT)
Organization: Compilers Central
Lines: 17
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <21-09-001@comp.compilers>
References: <21-08-011@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970"; logging-data="90378"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, semantics
Posted-Date: 04 Sep 2021 16:54:43 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <21-08-011@comp.compilers>
 by: luser droog - Sat, 4 Sep 2021 04:37 UTC

On Thursday, August 12, 2021 at 10:52:15 AM UTC-5, Christopher F Clark wrote:
> It is almost always done in the AST creation routines. Not only do you,
> as our insightful moderator mentioned, generally get better error
> messages that way, but curiously, the features of extracting a number,
> turning it into a count, and applying that count (and yes, those might be 3
> distinct operations) to determine how many items a list contains have not
> been implemented in any parser generator or lexer generator that I have
> ever seen. That's a bizarre omission, particularly since it is a
> common feature in many languages, such as networking protocols. Doing
> fixed counts isn't rare, but doing a count held in a "register" or
> "variable" seems not to be done.
>

I think the omission comes from difficulty in the formalization. Having to
apply such a dynamic count moves the parser from context-free to
context-sensitive, jumping to a new level in the Chomsky hierarchy.
So all the training wheels come off and it all gets scary.
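
For contrast, a hand-written parser can carry the dynamic count in an ordinary variable. The sketch below is only an illustration under assumed conditions: the function name `parse_counted_lists` is hypothetical, and the input is assumed to be a flat, whitespace-separated token stream of the form "N, then N items", repeated. The count read at run time is the only thing that decides where one group ends and the next begins.

```python
def parse_counted_lists(tokens):
    """Parse a token stream of the form: N item*N N item*N ...

    Returns one list of items per group.
    """
    groups = []
    pos = 0
    while pos < len(tokens):
        # The "register": a count that is only known once the input is read.
        count = int(tokens[pos])
        pos += 1
        if count < 0:
            raise ValueError(f"negative count {count} at token {pos - 1}")
        if pos + count > len(tokens):
            raise ValueError(
                f"count {count} at token {pos - 1} exceeds the "
                f"{len(tokens) - pos} tokens that remain"
            )
        # Consume exactly `count` items, whatever they look like.
        groups.append(tokens[pos:pos + count])
        pos += count
    return groups
```

With the input `"2 7 8 1 c".split()`, the tokens `7` and `8` are treated as items rather than counts, because the leading `2` has already said how many items follow.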
