Rocksolid Light

Welcome to novaBBS (click a section below)

mail  files  register  newsreader  groups  login

Message-ID:  

Debug is human, de-fix divine.


devel / comp.compilers / Wrestling with phase 1 of a C compiler

SubjectAuthor
* Wrestling with phase 1 of a C compilerluser droog
+* Re: Wrestling with phase 1 of a C compilerluser droog
|`- Re: Wrestling with phase 1 of a C compilerluser droog
`* Wrestling with phase 1 of a C compilerChristopher F Clark
 +* Re: Wrestling with phase 1 of a C compilergah4
 |+* Re: Wrestling with phase 1 of a C compilerChristopher F Clark
 ||`- Re: Wrestling with phase 1 of a C compilergah4
 |`- Re: source languages, was Wrestling with phase 1 of a C compilerGeorge Neuner
 `* Re: Wrestling with phase 1 of a C compilerluser droog
  `- Re: Wrestling with phase 1 of a C compilerluser droog

1
Wrestling with phase 1 of a C compiler

<22-09-001@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=536&group=comp.compilers#536

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: luser.dr...@gmail.com (luser droog)
Newsgroups: comp.compilers
Subject: Wrestling with phase 1 of a C compiler
Date: Wed, 7 Sep 2022 09:47:29 -0700 (PDT)
Organization: Compilers Central
Lines: 52
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-001@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="31723"; mail-complaints-to="abuse@iecc.com"
Keywords: C, parse, question
Posted-Date: 07 Sep 2022 17:01:00 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: luser droog - Wed, 7 Sep 2022 16:47 UTC

At my tedious glacial pace, I have rewritten my parser library
for the umpteen-plus-one'th time only to stall out at an earlier
step than where I stalled out the last time around.

I'm trying to do phase 1 of the C compilation, which is just recognizing
newlines in the input.

The input is modeled as a lazy list which calls fgetc() to produce
integers as needed. Right now I have a tiny parser to recognize
the possible line termination sequences and normalize them to
a single newline.

static parser
position_grammar( void ){
return either( bind( ANY( str("\r\n"),
chr('\r'),
chr('\n') ),
Operator( NIL_, new_line ) ),
item() );
}

static object
new_line( list env, object input ){
return Int('\n');
}

So, I can run this parser and peel out the integer from the result.
And then I'm wrapping the result with this function to couple
each byte with its (row,col) information.

static list
position( object item ){
static int row = 0,
col = 0;
if( valid( eq_int( '\n', item ) ) )
return cons( item, cons( Int( ++ row ), Int( col = 0 ) ) );
else
return cons( item, cons( Int( row ), Int( ++ col ) ) );
}

But ... I guess my problem is the lack of functional programming
tools in the C language, which I already knew, and is nobody's fault
but my own. But I'm not happy with the static variables for row and col.
I don't have monadic sequencing to help route state through my
function graphs.

But ... can I extract the "position counting" part out and do it by
zipping the input stream with an iota stream to provide counting?
This feels like the right direction, but I'm not sure how to reset the
column counter when a newline is recognized. Has anyone navigated
these weeds before and blazed any trails?

Re: Wrestling with phase 1 of a C compiler

<22-09-002@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=537&group=comp.compilers#537

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: luser.dr...@gmail.com (luser droog)
Newsgroups: comp.compilers
Subject: Re: Wrestling with phase 1 of a C compiler
Date: Fri, 9 Sep 2022 20:47:15 -0700 (PDT)
Organization: Compilers Central
Lines: 11
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-002@comp.compilers>
References: <22-09-001@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="72888"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, design, comment
Posted-Date: 10 Sep 2022 14:36:54 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-09-001@comp.compilers>
 by: luser droog - Sat, 10 Sep 2022 03:47 UTC

On Wednesday, September 7, 2022 at 4:01:04 PM UTC-5, luser droog wrote:
> At my tedious glacial pace, I have rewritten my parser library
> for the umpteen-plus-one'th time only to stall out at an earlier
> step than where I stalled out the last time around.

Sorry for the noise. I'm prematurely optimizing, aren't I? The function
that works just fine and is already reasonably short and readable is
just fine for now. Folding and iotas will probably be a fun idea to try --
on the next rewrite. On to possibly important problems....
[It occurred to me that if you want to write in a functional language style,
doing it in C is really painful. -John]

Re: Wrestling with phase 1 of a C compiler

<22-09-003@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=538&group=comp.compilers#538

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: luser.dr...@gmail.com (luser droog)
Newsgroups: comp.compilers
Subject: Re: Wrestling with phase 1 of a C compiler
Date: Sun, 11 Sep 2022 20:15:46 -0700 (PDT)
Organization: Compilers Central
Lines: 39
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-003@comp.compilers>
References: <22-09-001@comp.compilers> <22-09-002@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="15450"; mail-complaints-to="abuse@iecc.com"
Keywords: design
Posted-Date: 11 Sep 2022 23:43:37 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-09-002@comp.compilers>
 by: luser droog - Mon, 12 Sep 2022 03:15 UTC

On Saturday, September 10, 2022 at 1:36:57 PM UTC-5, luser droog wrote:
> On Wednesday, September 7, 2022 at 4:01:04 PM UTC-5, luser droog wrote:
> > At my tedious glacial pace, I have rewritten my parser library
> > for the umpteen-plus-one'th time only to stall out at an earlier
> > step than where I stalled out the last time around.
> Sorry for the noise. I'm prematurely optimizing, aren't I? The function
> that works just fine and is already reasonably short and readable is
> just fine for now. Folding and iotas will probably be a fun idea to try --
> on the next rewrite. On to possibly important problems....
> [It occurred to me that if you want to write in a functional language style,
> doing it in C is really painful. -John]

That's an accurate assessment. But it's pain I invited upon myself.
I'm hoping that a hot fire will forge some strong steel, or that it's
worth doing because it is hard. But it all depends upon my ability
to emulate or simulate the functional stuff as if it were the C runtime
for a Lisp interpreter that happens not to have a read() or eval(),
just print() and constructors and functions for the objects.

The most recent review post links back to previous reviews all the way
back to an "object oriented" implementation.
https://codereview.stackexchange.com/questions/277769/commented-parser-combinators-in-lisp-style-c

For the present task of pulling the state out of the function, the
obvious solution was to rewrite it using pointers and handle the
state variables externally, letting the calling function decide where
they live.

static list
position( object item, int *row, int *col ){
if( valid( eq_int( '\n', item ) ) )
return cons( item, cons( Int( ++ *row ), Int( *col = 0 ) ) );
else
return cons( item, cons( Int( *row ), Int( ++ *col ) ) );
}

For the longer term, I know I want a Monad and maybe Monad Transformers,
too. But I don't understand how they're implemented yet. Maybe I can do some
kind of monadic comprehension or "do" notation with the C preprocessor....

Wrestling with phase 1 of a C compiler

<22-09-004@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=539&group=comp.compilers#539

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: christop...@compiler-resources.com (Christopher F Clark)
Newsgroups: comp.compilers
Subject: Wrestling with phase 1 of a C compiler
Date: Mon, 12 Sep 2022 21:45:54 +0300
Organization: Compilers Central
Lines: 46
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-004@comp.compilers>
References: <22-09-001@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="99560"; mail-complaints-to="abuse@iecc.com"
Keywords: design
Posted-Date: 12 Sep 2022 15:46:44 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Christopher F Clark - Mon, 12 Sep 2022 18:45 UTC

While I hate being extremely pessimistic, you are going in absolutely the
wrong direction.

If you want a functional language, use one. Don't try to turn C into a
functional language. Even if you succeed, you will have made a mess that
no one (probably not even you) will actually understand.

I say that from experience. Early on in writing Yacc++, we did macros that
allowed us to write object-oriented C, which had C++-like semantics (ala
1986, pre-templates, pre STL, etc.) with nice ABC (abstract base classes,
plus some inheritance and polymorphism). It worked. However, it was a
challenge to understand. Our wisest move was at 2.0 rewriting everything
in C++ and not trying to have a "C" layer with similar functionality. The
code was an order-of-magnitude simpler and even making a translation that
did C# was relatively trivial from that code.

C is a nice small imperative language. It's fine for expressing those
kinds of semantics. The C preprocessor is both simple and powerful, but it
doesn't change the nature of C. You cannot really do "compilation" in the
C preprocessor. And, if you really want a parser combinator library, you
want it to do some compilation. Otherwise, you have just masked the fact
that you are doing hand-written recursive descent. And, if you want to do
hand-written recursive descent, do it. Don't put lipstick on a pig and
take it to the opera.

There is a reason your attempts are failing. You don't have good closures
or co-routines in C. You cannot get them easily. You cannot really
interact well with the function stack in C. And, your efforts to put the
state behind pointers, while necessary only get you part of the way there.

So, if you are lucky, you will have something that looks like a parser
combinator language, but which actually has either
- broken semantics (i.e. cases that don't work properly and don't warn you
that they don't)
or
- is very complicated.

--
******************************************************************************

Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris

Re: Wrestling with phase 1 of a C compiler

<22-09-005@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=540&group=comp.compilers#540

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gah...@u.washington.edu (gah4)
Newsgroups: comp.compilers
Subject: Re: Wrestling with phase 1 of a C compiler
Date: Mon, 12 Sep 2022 13:01:21 -0700 (PDT)
Organization: Compilers Central
Lines: 23
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-005@comp.compilers>
References: <22-09-001@comp.compilers> <22-09-004@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="41271"; mail-complaints-to="abuse@iecc.com"
Keywords: history, PL/I, comment
Posted-Date: 12 Sep 2022 18:26:57 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-09-004@comp.compilers>
 by: gah4 - Mon, 12 Sep 2022 20:01 UTC

On Monday, September 12, 2022 at 12:46:47 PM UTC-7, christoph...@compiler-resources.com wrote:

(snip)

> C is a nice small imperative language. It's fine for expressing those
> kinds of semantics. The C preprocessor is both simple and powerful, but it
> doesn't change the nature of C. You cannot really do "compilation" in the
> C preprocessor.

On of the earlier languages I knew, and maybe still favorite, is PL/I.

PL/I does have a powerful preprocessor, though I don't know so many
actually using its power. It even has preprocessor procedures, if you
need them.

The most use of the power I have seen, is one for unrolling DO loops,
which unrolls smaller loops, but not larger ones, with either a
preprocessor %DO, or a real DO.

I do remember, though, first knowing abuot no preprocessor in Java,
and the trend of PL/I to C to Java, in decreasing preprocessor power.
[PL/I had its charms but I wouldn't want to write functional code
in it, either. -John]

Re: Wrestling with phase 1 of a C compiler

<22-09-006@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=541&group=comp.compilers#541

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: christop...@compiler-resources.com (Christopher F Clark)
Newsgroups: comp.compilers
Subject: Re: Wrestling with phase 1 of a C compiler
Date: Tue, 13 Sep 2022 14:55:18 +0300
Organization: Compilers Central
Lines: 45
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-006@comp.compilers>
References: <22-09-001@comp.compilers> <22-09-004@comp.compilers> <22-09-005@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="31319"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, history, macros
Posted-Date: 14 Sep 2022 16:25:35 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Christopher F Clark - Tue, 13 Sep 2022 11:55 UTC

I never used the PL/I preprocessor myself. However, at my first job (at
SofTech), my mentor (Carl Martin) made some PL/I macros that *loosely*
translated a subset of Jovial into Multics PL/I. It was still PL/I
semantics, but Jovial syntax. That allowed us to write our Jovial
compilers in that subset of Jovial and get it working before it could
become self-hosting.

But notice that these were mostly minor syntactic changes and no error
checking, no semantic changes, etc. It was only used for one project and
one team and two compilers (one targeting Multics and the other
targeting the Interdata 8/32). So, the fact that it wasn't very robust
wasn't an issue.

This is very different than trying to make a functional library in an
imperative language.

----------

On a related note, I have heard stories from C++ compiler
implementors about the various template libraries that have been created
which attempt to do "Turing machine" style (NP-complete) computations via
types and parameters, where the users wonder why the compilation process
takes much longer than running the resultant program.

This is the kind of mess one makes when one sees a hammer and treats screws
as nails. Yes, with enough force you can pound a screw into wood (or a
wall) but the result is NOT a good fastener. Screws are threaded for a
reason and nails are not for the complementary reason. Different semantics.

I rarely create my own C++ templates nor Rust macros for that reason. It
is too easy to create things that look clever but are essentially
undecipherable. As the saying goes: "With great power comes great
responsibility."

Kind regards,
Chris

--
******************************************************************************

Chris Clark email: christopher.f.clark@compiler-resources.com
Compiler Resources, Inc. Web Site: http://world.std.com/~compres
23 Bailey Rd voice: (508) 435-5016
Berlin, MA 01503 USA twitter: @intel_chris
------------------------------------------------------------------------------

Re: source languages, was Wrestling with phase 1 of a C compiler

<22-09-007@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=542&group=comp.compilers#542

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!rocksolid2!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gneun...@comcast.net (George Neuner)
Newsgroups: comp.compilers
Subject: Re: source languages, was Wrestling with phase 1 of a C compiler
Date: Wed, 14 Sep 2022 16:03:22 -0400
Organization: A noiseless patient Spider
Lines: 28
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-007@comp.compilers>
References: <22-09-001@comp.compilers> <22-09-004@comp.compilers> <22-09-005@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="32394"; mail-complaints-to="abuse@iecc.com"
Keywords: PL/I, history, Lisp
Posted-Date: 14 Sep 2022 16:26:33 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: George Neuner - Wed, 14 Sep 2022 20:03 UTC

On Mon, 12 Sep 2022 13:01:21 -0700 (PDT), gah4 <gah4@u.washington.edu>
wrote:

> :
>PL/I does have a powerful preprocessor, though I don't know so many
>actually using its power. It even has preprocessor procedures, if you
>need them.
> :

Back in the day I would have reached for Lisp ... certainly for rapid
prototyping and/or experimentation with new compilation techniques.

The trouble with Lisp in the (distant) past was the high cost of a
workstation capable of running it acceptably. That no longer is an
issue, so Lisp can be an excellent choice for compiler development.

For various reasons I prefer Scheme over Lisp, so for a modern
"batteries-included" Scheme environment I would reach for Racket.

Certainly mileage varies, but unless you are hell bent on maximum
performance [how many people *really* derive benefit from being able
to compile 10K lines/second/core?], in my opinion almost any modern
HLL would be a better choice than C for writing a compiler.

George

Re: Wrestling with phase 1 of a C compiler

<22-09-008@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=543&group=comp.compilers#543

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: luser.dr...@gmail.com (luser droog)
Newsgroups: comp.compilers
Subject: Re: Wrestling with phase 1 of a C compiler
Date: Wed, 14 Sep 2022 14:31:48 -0700 (PDT)
Organization: Compilers Central
Lines: 99
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-008@comp.compilers>
References: <22-09-001@comp.compilers> <22-09-004@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="97850"; mail-complaints-to="abuse@iecc.com"
Keywords: parse,design
Posted-Date: 15 Sep 2022 12:17:45 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-09-004@comp.compilers>
 by: luser droog - Wed, 14 Sep 2022 21:31 UTC

On Monday, September 12, 2022 at 2:46:47 PM UTC-5, christoph...@compiler-resources.com wrote:
> While I hate being extremely pessimistic, you are going in absolutely the
> wrong direction.
>
> If you want a functional language, use one. Don't try to turn C into a
> functional language. Even if you succeed, you will have made a mess that
> no one (probably not even you) will actually understand.
>

You may be right. But, but but ... I'm in too deep. And I've adopted a "larger
scale methodology" where every time it becomes a mess, I stop. I go back
to the papers by Hutton et al. I make new prototypes in my favorite dynamic
language. And then I go back and merge the new ideas with the best parts
of the old code and produce a new draft. It's getting better.

And my last comment about using the preprocessor might have given a
false clue about what I've been doing. I'm using a few preprocessor tricks
(PP_NARG, X-Macros) but their usage has also evolved through the rewrites.
And I daresay the usage of preprocessor in my code is fairly tame and
readable. Particularly if I were to throw in a link to my SO post which explains
exactly how I use the X-Macros:

https://stackoverflow.com/questions/6635851/real-world-use-of-x-macros/6636596#6636596

> I say that from experience. Early on in writing Yacc++, we did macros that
> allowed us to write object-oriented C, which had C++-like semantics (ala
> 1986, pre-templates, pre STL, etc.) with nice ABC (abstract base classes,
> plus some inheritance and polymorphism). It worked. However, it was a
> challenge to understand. Our wisest move was at 2.0 rewriting everything
> in C++ and not trying to have a "C" layer with similar functionality. The
> code was an order-of-magnitude simpler and even making a translation that
> did C# was relatively trivial from that code.
>
> C is a nice small imperative language. It's fine for expressing those
> kinds of semantics. The C preprocessor is both simple and powerful, but it
> doesn't change the nature of C. You cannot really do "compilation" in the
> C preprocessor. And, if you really want a parser combinator library, you
> want it to do some compilation. Otherwise, you have just masked the fact
> that you are doing hand-written recursive descent. And, if you want to do
> hand-written recursive descent, do it. Don't put lipstick on a pig and
> take it to the opera.
>

You're right. I'm not planning on doing it in the preprocessor, but compilation
is one of the big challenges I haven't yet conquered. I have the book by Burge
which has a good description of the algorithms I'll need (I think). But my
in-memory representation might be too clumsy to use directly. I may need to
introduce a symbolic intermediate representation that's easier to do rewrites
upon.

You're right that (for now) it's just recursive descent in disguise. But already
it's a pretty guise. The C interface is fairly lightweight and transparent,
iff you can accept the practice of typedefing pointers and treating
certain types as having reference semantics.

> There is a reason your attempts are failing.

Perhaps I'm in too deep to see it this way. But it seems to me that each
draft has failed in a slightly different manner.

>You don't have good closures or co-routines in C.

I think my closures are pretty good. Serviceable, at least. It's an object that
couples a function pointer with it's saved environment.

I don't have co-routines, but I have lists where a node can be a "suspension"
which can transparently (?) invoke the closure to produce elements of
the list lazily. So I have a kind of continuation passing which serves many
of the same uses that I would have for co-routines (unless I'm wrong).

> You cannot get them easily. You cannot really
> interact well with the function stack in C.

That's true. But I can build up graphs of functions and then call the
the thing to run all the doodads.

>And, your efforts to put the
> state behind pointers, while necessary only get you part of the way there.

That's also true. For the present case, I'll also need to dynamically allocate
integer objects and keep them in the environment for the calling
function which is one of these closures, a suspension function that converts
the first element of a stream and returns a list with a suspension in the cdr.

That's the only way I'll get row and column counters to exist on a per-file
basis.

> So, if you are lucky, you will have something that looks like a parser
> combinator language, but which actually has either
> - broken semantics (i.e. cases that don't work properly and don't warn you
> that they don't)
> or
> - is very complicated.
>

If it fails, it fails. I'll try to learn something from it and try something else.
You may indeed be right. But I've steered past the island of the proprocessor
sirens. It's a danger of different kind. Function spaghetti, or maybe
"Function Orzo"(tm).

Re: Wrestling with phase 1 of a C compiler

<22-09-009@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=544&group=comp.compilers#544

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gah...@u.washington.edu (gah4)
Newsgroups: comp.compilers
Subject: Re: Wrestling with phase 1 of a C compiler
Date: Wed, 14 Sep 2022 15:40:33 -0700 (PDT)
Organization: Compilers Central
Lines: 26
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-009@comp.compilers>
References: <22-09-001@comp.compilers> <22-09-004@comp.compilers> <22-09-005@comp.compilers> <22-09-006@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="98209"; mail-complaints-to="abuse@iecc.com"
Keywords: performance, history
Posted-Date: 15 Sep 2022 12:18:26 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-09-006@comp.compilers>
 by: gah4 - Wed, 14 Sep 2022 22:40 UTC

On Wednesday, September 14, 2022 at 1:25:39 PM UTC-7, christoph...@compiler-resources.com wrote:

(snip)

> On a related note, I have heard stories from C++ compiler
> implementors about the various template libraries that have been created
> which attempt to do "Turing machine" style (NP-complete) computations via
> types and parameters, where the users wonder why the compilation process
> takes much longer than running the resultant program.

Many years ago, there was a story about a Fortran benchmark program,
very complicated with lots of statement functions evaluating many
complicated expressions.

(I probably don't have to mention the problems with
designing good benchmarks here. This was years ago.)

Then it was run though the IBM OS/360 Fortran H compiler.
Among others, it expands statement functions inline, and does
constant expression evaluation. It did the whole thing at
compile time, except printing out a single number.

Fortran now requires many complicated constant expressions
be evaluated at compile time, and some have come up with some,
though not intentionally, that evaluate very slowly. (OK, maybe
some were intentional.)

Re: Wrestling with phase 1 of a C compiler

<22-09-010@comp.compilers>

  copy mid

https://www.novabbs.com/devel/article-flat.php?id=545&group=comp.compilers#545

  copy link   Newsgroups: comp.compilers
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: luser.dr...@gmail.com (luser droog)
Newsgroups: comp.compilers
Subject: Re: Wrestling with phase 1 of a C compiler
Date: Thu, 15 Sep 2022 20:11:15 -0700 (PDT)
Organization: Compilers Central
Lines: 87
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-09-010@comp.compilers>
References: <22-09-001@comp.compilers> <22-09-004@comp.compilers> <22-09-008@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="9865"; mail-complaints-to="abuse@iecc.com"
Keywords: parse, design
Posted-Date: 20 Sep 2022 11:17:38 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-09-008@comp.compilers>
 by: luser droog - Fri, 16 Sep 2022 03:11 UTC

On Thursday, September 15, 2022 at 11:17:50 AM UTC-5, luser droog wrote:
> On Monday, September 12, 2022 at 2:46:47 PM UTC-5, christoph...@compiler-resources.com wrote:

> >And, your efforts to put the
> > state behind pointers, while necessary only get you part of the way there.
> That's also true. For the present case, I'll also need to dynamically allocate
> integer objects and keep them in the environment for the calling
> function which is one of these closures, a suspension function that converts
> the first element of a stream and returns a list with a suspension in the cdr.
>
> That's the only way I'll get row and column counters to exist on a per-file
> basis.

So here's the rest of it. This ought to do the whole phase 1 of the C compilation
process while also supplementing each byte with its row and column numbers.
And it holds the state in the local environment in the closure,
although it has to create a new environment for each iteration because I don't
have a function for updating definitions (--supposed to be "functional" after all,
as much as practical).

(the header file defines the names POS_ROW, POS_COL, and POS_INPUT in an enum.)

static fSuspension force_chars_with_positions;
static list position( object item, int *row, int *col );
static parser position_grammar( void );
static fOperator new_line;

list
chars_with_positions( list input ){
return Suspension( env( NIL_, 3,
Symbol(POS_ROW), Int( 0 ),
Symbol(POS_COL), Int( 0 ),
Symbol(POS_INPUT), input ),
force_chars_with_positions );
}

list
force_chars_with_positions( list ev ){
list input = assoc_symbol( POS_INPUT, ev );
integer row = assoc_symbol( POS_ROW, ev );
integer col = assoc_symbol( POS_COL, ev );

static parser position_parser;
if( ! position_parser ) position_parser = position_grammar();
object result = parse( position_parser, input );
if( not_ok( result ) ) return rest( rest( result ) );

object payload = rest( result );
list pos = position( first( payload ), &row->Int.i, &col->Int.i );
return cons( pos,
Suspension( env( NIL_, 3,
Symbol(POS_ROW), row,
Symbol(POS_COL), col,
Symbol(POS_INPUT), rest( payload ) ),
force_chars_with_positions ) );
}

static list
position( object item, int *row, int *col ){
if( valid( eq_int( '\n', item ) ) )
return cons( item, cons( Int( ++ *row ), Int( *col = 0 ) ) );
else
return cons( item, cons( Int( *row ), Int( ++ *col ) ) );
}

static parser
position_grammar( void ){
return either( bind( ANY( str("\r\n"),
chr('\r'),
chr('\n') ),
Operator( NIL_, new_line ) ),
item() );
}

static object
new_line( list env, object input ){
return Int('\n');
}

I think it's pretty nice and readable, while hiding some of the magic.

One big unsolved issue is how to use a GC in C without access to the
stack. My only solution is to only call the GC from the top level, or
otherwise carefully cultivating the root set and calling from near the
top level where the root set is easy to manage. But as this is off topic
here, I'd invite any thoughts about user space GC over in comp.lang.c.

1
server_pubkey.txt

rocksolid light 0.9.8
clearnet tor