
Open source C compiler using Regular Expressions

<28c7430e-ca37-4e04-92bc-bea6b1348744n@googlegroups.com>

Newsgroups: comp.std.c
 by: sasho648 - Wed, 1 Sep 2021 16:28 UTC

It uses PCRE2 to parse the C file by matching one huge regex, composed of several .regex files stitched together by one Perl script (main.pl). There are currently about 94 callouts placed inside it, which invoke C++ code that reads the named capture groups and calls the LLVM APIs to construct a program.
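
The idea of stitching fragments into one pattern and letting handlers read named capture groups can be sketched roughly as follows. This is only an illustration in Python's standard `re` module, not the project's code: `re` has no PCRE2-style callouts, so a plain function call after each match stands in for one, and the names `FRAGMENTS`, `on_declaration` and `compile_source` are hypothetical. The output is IR-like text rather than real LLVM API calls.

```python
import re

# Hypothetical grammar fragments, standing in for the separate .regex files.
FRAGMENTS = {
    "ident": r"[A-Za-z_]\w*",
    "type": r"int|char",
}

# Stitch the fragments into a single pattern with named capture groups.
DECL = re.compile(
    r"\s*(?P<type>{type})\s+(?P<name>{ident})\s*;".format(**FRAGMENTS)
)

# Assumed mapping from C type keywords to LLVM-style integer types.
IR_TYPES = {"int": "i32", "char": "i8"}


def on_declaration(groups, out):
    """Stand-in for a callout: read named groups, emit one IR-like line."""
    out.append(f"@{groups['name']} = global {IR_TYPES[groups['type']]} 0")


def compile_source(src):
    """Match declarations one after another and collect the emitted lines."""
    out = []
    pos = 0
    while pos < len(src):
        m = DECL.match(src, pos)
        if m is None:
            break  # unsupported input: stop instead of guessing
        on_declaration(m.groupdict(), out)
        pos = m.end()
    return out


print(compile_source("int x; char c;"))
```

In the real project the handler would presumably be invoked by the regex engine in the middle of the match, which this sketch does not reproduce.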

https://github.com/6a4h8/cparser2/tree/wip

This is an open source compiler that uses regular expressions and mainly targets C89 (as given in the FIPS PUB 160 PDF document).

The backend was originally a huge C switch, which I recently converted into C++ virtual functions. There are two sets of them: one for parsing (these can alter the match) and one for producing output.

The parsing one is mainly used for typedefs, since they require context-sensitive parsing inside functions.
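
Why typedefs need context can be shown with the classic `T * p;` case: the same text is a pointer declaration if `T` names a type and a multiplication otherwise. The sketch below is a simplified Python illustration, not the project's C++ hooks. The names `TYPEDEF`, `STAR` and `classify` are hypothetical, only `typedef int NAME;` is recognised, and scoping inside functions is left out entirely.

```python
import re

# Hypothetical patterns: a typedef of int, and the ambiguous "a * b;" form.
TYPEDEF = re.compile(r"\s*typedef\s+int\s+(?P<alias>[A-Za-z_]\w*)\s*;")
STAR = re.compile(
    r"\s*(?P<left>[A-Za-z_]\w*)\s*\*\s*(?P<right>[A-Za-z_]\w*)\s*;"
)


def classify(src):
    """Classify each statement, tracking typedef names seen so far."""
    typedef_names = set()
    result = []
    pos = 0
    while pos < len(src):
        m = TYPEDEF.match(src, pos)
        if m:
            # Parse-time side effect: remember the new type name.
            typedef_names.add(m["alias"])
            result.append(("typedef", m["alias"]))
            pos = m.end()
            continue
        m = STAR.match(src, pos)
        if m is None:
            break  # anything else is outside this sketch
        # The regex alone cannot decide; the recorded context does.
        if m["left"] in typedef_names:
            kind = "pointer declaration"
        else:
            kind = "multiplication"
        result.append((kind, m["left"], m["right"]))
        pos = m.end()
    return result


print(classify("typedef int T; T * p; a * b;"))
```

The point is only that the decision depends on state collected earlier in the parse, which is the kind of thing a hook that can alter the match is suited for.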

Currently it doesn't implement: initialization, conditional evaluation with the logical operators (WIP), incomplete types, un-prototyped functions.

Most importantly it doesn't support attributes and preprocessor directives.

It does implement: everything else, hopefully.

Check out the WIP branch (last worked on under Windows). Invocation:

cparser main.pl in_src.c

Expected output (LLVM bitcode and IR representation):

in_src.c.bc
in_src.c.ll

It can be debugged if you uncomment the end of line 6 in main.h. This will produce two output.txt files and significantly slow down the compilation process.

Re: Open source C compiler using Regular Expressions

<ecf48113-ca57-474f-b1e1-02ad58624cbbn@googlegroups.com>

 by: Benjamin Williams (Hodgez) - Mon, 12 Jun 2023 05:55 UTC

Absolute mad lad. I love it. I will have to give it a try later to see how it all works.

Re: Open source C compiler using Regular Expressions

<bb9385ca-d08c-47e9-84a7-587d06cb0eefn@googlegroups.com>

 by: sasho648 - Mon, 12 Jun 2023 07:46 UTC

On Monday, June 12, 2023 at 8:55:35 AM UTC+3, Benjamin Williams (Hodgez) wrote:
> Absolute mad lad. I love it. I will have to give it a try later to see how it all works.
Just FYI - it's now on https://github.com/AnFunctionArray/cllvmbackend (the actual Perl/regex part is a git submodule). As for the "mad lad" part, you'll be happy to hear that this version is also multithreaded, because it turned out that this way was actually faster, with the bottleneck otherwise being the regex engine (that was last time - I've not checked out the latest Perl updates). You need these env vars:

MAXTHREADS=8
MINLEN=50000
SILENT=1

Otherwise the syntax is the same:

regularc ./parse.pl ./bulk/tests/test.c

But generally, last time it had some issues, since I was trying it for different purposes (for which there is the non-standard INTPROM env var). However, at a certain point in the past I also had success compiling the C donut program with slight modifications (mainly removing what needs the preprocessor - line concatenation and comments).

Re: Open source C compiler using Regular Expressions

<0e36e5b3-4326-4312-9671-138c64e17018n@googlegroups.com>

 by: sasho648 - Mon, 12 Jun 2023 07:49 UTC

On Monday, June 12, 2023 at 10:46:16 AM UTC+3, sasho648 wrote:
> On Monday, June 12, 2023 at 8:55:35 AM UTC+3, Benjamin Williams (Hodgez) wrote:
> > Absolute mad lad. I love it. I will have to give it a try later to see how it all works.
> Just FYI - it's now on https://github.com/AnFunctionArray/cllvmbackend (the actual Perl/regex part is a git submodule). As for the "mad lad" part, you'll be happy to hear that this version is also multithreaded, because it turned out that this way was actually faster, with the bottleneck otherwise being the regex engine (that was last time - I've not checked out the latest Perl updates). You need these env vars:
>
> MAXTHREADS=8
> MINLEN=50000
> SILENT=1
>
> Otherwise the syntax is the same:
>
> regularc ./parse.pl ./bulk/tests/test.c
>
> But generally, last time it had some issues, since I was trying it for different purposes (for which there is the non-standard INTPROM env var). However, at a certain point in the past I also had success compiling the C donut program with slight modifications (mainly removing what needs the preprocessor - line concatenation and comments).
Faster - that's for **very large** files - otherwise it's the same.
