
Subject (Author)
* Improved accuracy in diagnostics. Is it worthwhile? (Ev. Drikos)
+- Re: Improved accuracy in diagnostics. Is it worthwhile? (Kaz Kylheku)
`* Re: Improved accuracy in diagnostics. Is it worthwhile? (Thomas Koenig)
 `- Re: Improved accuracy in diagnostics. Is it worthwhile? (Ev. Drikos)

Improved accuracy in diagnostics. Is it worthwhile?

<22-03-035@comp.compilers>


https://www.novabbs.com/devel/article-flat.php?id=320&group=comp.compilers#320

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: driko...@gmail.com (Ev. Drikos)
Newsgroups: comp.compilers
Subject: Improved accuracy in diagnostics. Is it worthwhile?
Date: Fri, 18 Mar 2022 07:25:40 +0200
Organization: Aioe.org NNTP Server
Lines: 47
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-03-035@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="84428"; mail-complaints-to="abuse@iecc.com"
Keywords: errors, question
Posted-Date: 18 Mar 2022 12:28:46 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Content-Language: en-US
 by: Ev. Drikos - Fri, 18 Mar 2022 05:25 UTC

Hello,

This is mainly a parsing question, but it's also Fortran related.

When I run a syntax check with the command 'fcheck' on the code below,
the error message doesn't contain a '(' in the expected tokens. This
happens due to default actions, although the parser is basically LALR. A
pure LALR parser wouldn't make reductions without examining the lookahead.

Default actions are useful because they save a lot of space in parsing
tables, at the cost of missing expected tokens in the error messages
printed by the command 'fcheck'. This is the relevant BNF rule for the
example given at the end of this message:

implicit-stmt ::=
IMPLICIT implicit-spec-list
| IMPLICIT NONE [ ( [ implicit-none-spec-list ] ) ]
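
To make the effect concrete, here is a minimal sketch of the two parser states involved. It is not Syntaxis output: the state names, the table layout, and the helper `expected_tokens` are all hypothetical, and only the rule itself comes from the BNF above. It assumes ';' is the only token that can follow a complete implicit-stmt.

```python
SHIFT, REDUCE = "shift", "reduce"

# Hypothetical state reached after reading IMPLICIT NONE:
# '(' starts the optional tail, ';' means the statement is complete.
AFTER_NONE = {
    "(": (SHIFT, "open_paren"),
    ";": (REDUCE, "implicit-stmt"),
}

# Hypothetical state reached once implicit-stmt has been reduced.
AFTER_STMT = {";": (SHIFT, "end_of_stmt")}


def expected_tokens(lookahead, default_reduction):
    """Return the tokens reported as expected, or [] if there is no error."""
    state = AFTER_NONE
    action = state.get(lookahead)
    if action is None and default_reduction:
        # Default reduction: reduce without consulting the lookahead.
        action = (REDUCE, "implicit-stmt")
    if action is None:
        # Error detected before any reduction: the whole row is reported.
        return sorted(state)
    if action[0] == SHIFT:
        return []
    # After the reduction the error, if any, surfaces one state later.
    state = AFTER_STMT
    return [] if lookahead in state else sorted(state)


print(expected_tokens("?", default_reduction=True))
print(expected_tokens("?", default_reduction=False))
```

With the default reduction the '(' alternative is already gone by the time the error is noticed, which matches the message shown at the end of this post.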

Disabling default actions for the command 'fcheck' is fairly simple,
just a button click in Syntaxis, but at the moment I can't tell
how many error messages would be improved, whereas a parsing table
increase (50%) would be certain. The command 'fcheck' can be found at
https://github.com/drikosev/Fortran
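
As a rough illustration of where the space saving comes from, the sketch below folds the most frequent reduce action of one table row into a single default entry. The row contents and the 'r3'/'s7' action encoding are made up for the example, and the function `compress_row` is hypothetical. It does not reproduce the 50% figure, which depends on the real grammar.

```python
from collections import Counter


def compress_row(row):
    """Fold the most frequent reduce action of a row into one default entry.

    `row` maps a lookahead token to an action string such as 'r3' (reduce
    by rule 3) or 's7' (shift to state 7). Returns (explicit_entries, default).
    """
    reduces = Counter(act for act in row.values() if act.startswith("r"))
    if not reduces:
        return dict(row), None
    default, _count = reduces.most_common(1)[0]
    explicit = {tok: act for tok, act in row.items() if act != default}
    return explicit, default


# Made-up row: four lookaheads trigger the same reduction, one shifts.
row = {";": "r3", ",": "r3", ")": "r3", "=": "r3", "(": "s7"}
explicit, default = compress_row(row)
print(len(row), "entries before;", len(explicit), "explicit entries plus default", default)
```

The price is the one discussed here: once the four lookaheads are replaced by a default, the row no longer records which tokens were actually valid.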

So far, my approach has been that improved diagnostics shouldn't slow
down the processing of correct programs. Is it worthwhile to improve
diagnostics by disabling default actions in a LALR parser?

Thanks,
Ev. Drikos

----------------------------------------------------------------------
$ cat default-actions.f90 && fcheck default-actions.f90
IMPLICIT NONE ? (type, external)
PRINT *, "Only ';', not a '(', in the expected tokens in diagnostics."
END

default-actions.f90:1: error: syntax:Unexpected: '?'. Expected: ";".

Parsed with Errors: default-actions.f90
$

[When yacc was new and everything had to fit in 64K, small parse tables
were important. Today when people include a megabyte library to get
a four line routine, not so much. -John]

Re: Improved accuracy in diagnostics. Is it worthwhile?

<22-03-036@comp.compilers>


https://www.novabbs.com/devel/article-flat.php?id=321&group=comp.compilers#321

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: 480-992-...@kylheku.com (Kaz Kylheku)
Newsgroups: comp.compilers
Subject: Re: Improved accuracy in diagnostics. Is it worthwhile?
Date: Fri, 18 Mar 2022 16:47:47 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 19
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-03-036@comp.compilers>
References: <22-03-035@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="91710"; mail-complaints-to="abuse@iecc.com"
Keywords: yacc, errors
Posted-Date: 18 Mar 2022 12:50:05 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Kaz Kylheku - Fri, 18 Mar 2022 16:47 UTC

On 2022-03-18, Ev. Drikos <drikosev@gmail.com> wrote:
> Hello,
>
> This is mainly a parsing question but it's also Fortran related as well.
>
> When I make syntax checking with the command 'fcheck' in the code below,
> the error message doesn't contain a '(' in the expected tokens. This
> happens due to default actions, although the parser is basically LALR. A
> pure LALR parser wouldn't make reductions without examining the lookahead.

I think you mean default reductions?

In the case of Yacc, the default action is the rule body { $$ = $1; }

:)
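
For readers unfamiliar with the distinction, here is a minimal sketch of what the Yacc default semantic action does on a value stack, as opposed to a default reduction in the parse table. The function `reduce_values` and the use of `None` for an empty rule are assumptions made for the illustration, not how any particular Yacc implementation is written.

```python
def reduce_values(value_stack, rhs_length, action=None):
    """Pop `rhs_length` semantic values and push the result of the rule."""
    start = len(value_stack) - rhs_length
    args = value_stack[start:]
    del value_stack[start:]
    if action is None:
        # Yacc-style default semantic action: $$ = $1.
        # For an empty right-hand side this sketch simply uses None.
        result = args[0] if args else None
    else:
        result = action(*args)
    value_stack.append(result)
    return result


stack = [1, 2, 3]
reduce_values(stack, 2)                       # default action keeps $1
print(stack)
reduce_values(stack, 2, lambda a, b: a + b)   # explicit action
print(stack)
```

This only affects the value that is computed. Whether the reduction happens without looking at the lookahead is a separate, table-level decision.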

--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal

Re: Improved accuracy in diagnostics. Is it worthwhile?

<22-03-038@comp.compilers>


https://www.novabbs.com/devel/article-flat.php?id=323&group=comp.compilers#323

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.compilers
Subject: Re: Improved accuracy in diagnostics. Is it worthwhile?
Date: Fri, 18 Mar 2022 18:12:15 -0000 (UTC)
Organization: news.netcologne.de
Lines: 36
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-03-038@comp.compilers>
References: <22-03-035@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="46601"; mail-complaints-to="abuse@iecc.com"
Keywords: errors, design, comment
Posted-Date: 18 Mar 2022 14:24:30 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Thomas Koenig - Fri, 18 Mar 2022 18:12 UTC

Ev. Drikos <drikosev@gmail.com> schrieb:

> This is mainly a parsing question but it's also Fortran related as well.

[...]

> So far, my approach has been that improved diagnostics shouldn't slow
> down the processing of correct programs.

With today's computer speeds, this is likely not a very important
consideration any more.

If you are compiling, it is usually a small fraction of time that
is spent in the parsing, and much more in optimization and code
generation. An example: Compiling a 50 k line Fortran program with
"gfortran -O2" takes 17.4 seconds on the computer I type this on.
Checking with "gfortran -fsyntax-only" takes 4.2 seconds. (For
those who want to reproduce: aermod.f90 from the Polyhedron suite).

50k lines for a single source file is already quite a lot (much
longer than most source files for modular programs are likely to
be) and throwing a bit more CPU time at the problem to reduce user
confusion by emitting better error messages is extremely likely
to be a win for the user. Just be careful to avoid anything
worse than O(n log n) in the code size, or somebody will come
along with a test case that takes _really_ long.

(Take the above with a grain of salt for C++ headers.)

> Is it worthwhile to improve
> diagnostics by disabling default actions in a LALR parser?

I would presume so. Run a few benchmarks and find out.
[In my experience, lexing and optimization take most of the
time, and parsing is insignificant. -John]
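
A benchmark of the kind suggested above could look like the following sketch. The helper `time_command` and the `BASELINE` command are hypothetical names introduced here. The baseline only measures process start-up cost, which is worth subtracting when the commands being compared finish quickly.

```python
import statistics
import subprocess
import sys
import time

# Harmless command used to estimate process start-up overhead.
BASELINE = [sys.executable, "-c", "pass"]


def time_command(argv, runs=5):
    """Run `argv` several times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            argv,
            check=False,  # a syntax checker may exit non-zero on purpose
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


print(f"start-up overhead: {time_command(BASELINE, runs=3):.3f} s")
```

Assuming the tools are on the PATH and aermod.f90 is in the current directory, one would compare `time_command(["fcheck", "aermod.f90"])` for builds with and without default reductions, and `time_command(["gfortran", "-fsyntax-only", "aermod.f90"])` as a reference point. A missing executable raises FileNotFoundError.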

Re: Improved accuracy in diagnostics. Is it worthwhile?

<22-03-041@comp.compilers>


https://www.novabbs.com/devel/article-flat.php?id=326&group=comp.compilers#326

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: driko...@gmail.com (Ev. Drikos)
Newsgroups: comp.compilers
Subject: Re: Improved accuracy in diagnostics. Is it worthwhile?
Date: Sat, 19 Mar 2022 19:58:19 +0200
Organization: Aioe.org NNTP Server
Lines: 35
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-03-041@comp.compilers>
References: <22-03-035@comp.compilers> <22-03-038@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="94916"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, performance
Posted-Date: 19 Mar 2022 13:59:18 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Content-Language: en-US
 by: Ev. Drikos - Sat, 19 Mar 2022 17:58 UTC

On 18/03/2022 20:12, Thomas Koenig wrote:
> ...
> If you are compiling, it is usually a small fraction of time that
> is spent in the parsing, and much more in optimization and code
> generation. An example: Compiling a 50 k line Fortran program with
> "gfortran -O2" takes 17.4 seconds on the computer I type this on.
> Checking with "gfortran -fsyntax-only" takes 4.2 seconds. (For
> those who want to reproduce: aermod.f90 from the Polyhedron suite).
> ...

Thanks. Just tested this large file and the runtime overhead seems
to be negligible.

I'll likely try the change, but it took me a while to find another case,
with enumerators (that also lack error recovery now). Although my trial
changes added messages for 43 states, some of them are useless, so
this approach seems to be useful mainly for BNF rules with an optional tail.
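
One possible reading of "an optional tail", offered only as an assumption, is a state whose row mixes a reduction (the rule may end here) with at least one shift (the tail may follow). Under that reading, the candidate states could be picked out as sketched below. The table contents, the state names, and the function `states_with_optional_tail` are hypothetical.

```python
def states_with_optional_tail(table):
    """Return the states whose row holds both a shift and a reduce action.

    `table` maps a state name to a row, and a row maps a lookahead token
    to an action string starting with 's' (shift) or 'r' (reduce).
    """
    picked = []
    for state, row in table.items():
        kinds = {action[0] for action in row.values()}
        if "s" in kinds and "r" in kinds:
            picked.append(state)
    return sorted(picked)


# Made-up table fragment, loosely modelled on the IMPLICIT NONE example.
TABLE = {
    "after_NONE": {"(": "s7", ";": "r3"},   # rule may end or continue
    "after_stmt": {";": "s9"},              # shifts only
    "after_name": {",": "r5", ";": "r5"},   # reduces only
}
print(states_with_optional_tail(TABLE))
```

Restricting explicit lookahead checks to such states would keep most of the table compressed, at the cost of the extra analysis needed to find them.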

Unavoidably, a parser/front-end has to do some guessing on errors,
and this doesn't change easily. So, any improvement without default
state reductions (hello Kaz) will be limited, as in the code below:

-----------------------------------------------------------------------

miniserver:errors suser$ cat enum-1.f90 && fcheck enum-1.f90
ENUM, BIND(C)
ENUMERATOR :: RED => 4, BLUE => 9
ENUMERATOR YELLOW
END ENUM
END
enum-1.f90:2: error: syntax:Unexpected: '=>'. Expected: ",", ";", or "=".

Parsed with Errors: enum-1.f90
miniserver:errors suser$
