Fortran to C/C++ translation: a running example.

<22-05-032@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=389&group=comp.compilers#389
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: rockbren...@gmail.com (Rock Brentwood)
Newsgroups: comp.compilers
Subject: Fortran to C/C++ translation: a running example.
Date: Mon, 16 May 2022 12:27:30 -0700 (PDT)
Organization: Compilers Central
Lines: 59
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-05-032@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="7120"; mail-complaints-to="abuse@iecc.com"
Keywords: translator, history, comment
Posted-Date: 16 May 2022 15:53:06 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Rock Brentwood - Mon, 16 May 2022 19:27 UTC

The classic text-based computer game Zork / dungeon was originally devised on
MIT computers in a LISP-offshoot (MDL), and translated to Fortran 77 by an
"Anonymous" author. Some time later an enterprising soul converted a version
of the Fortran edition of Zork into C ... pre-ANSI C ... with the aid of an
earlier version of "f2c", but left no detailed paper trail behind on the
actual translation process and stages.

I think this is the kind of project our moderator would really like.

It's been retranslated from Fortran (with the aid of a later version of "f2c")
here:

https://github.com/LydiaMarieWilliamson/zork-fortran

Every intermediate stage of the process is archived in the history log and
commit history. This was carried out in tandem with a revision of the Fortran
source itself (as Fortran 2018 no longer supports all of Fortran 77), and an
upward revision of the 1991 translation into C99. Both the newer C
translation, from 2021, and the 2021 revision of the older 1991 C translation
have converged on the same result.

A key issue that arises, which led to a later revision of the Fortran standard,
is the lack of information required to distinguish between parameters that are
input-only, output-only, or input/output. That has to be inferred, which requires
either transparency of library functions (here: the functions in the f2c
library or whatever is written in its place) or I/O specifications in the
library functions. So, a "strength reduction" step is required to lift
input/output parameters (the default) to input-only or output-only.
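
To make the distinction concrete, here is a minimal C sketch of how a signature might look before and after the argument roles have been inferred. The routine SCLADD and both C function names are hypothetical and are not taken from the Zork source; the first form only approximates the by-address style that f2c emits.

```c
/* Hypothetical routine, not from the Zork source.
 * Assumed Fortran original:  SUBROUTINE SCLADD(N, X, TOTAL)
 *   N     is only read        -> input-only
 *   X     is only read        -> input-only
 *   TOTAL is read and updated -> input/output
 */

/* By-address translation: every argument is a pointer, so the
 * signature does not say which arguments may be modified. */
void scladd_by_address(int *n, int *x, int *total)
{
    int i;
    for (i = 0; i < *n; ++i)
        *total += x[i];
}

/* After inference: inputs become values or const pointers, and
 * only the input/output argument remains a plain pointer. */
void scladd(int n, const int *x, int *total)
{
    for (int i = 0; i < n; ++i)
        *total += x[i];
}
```

Both versions compute the same result; the second simply records in the type what had to be inferred by reading the body of the routine and of everything it calls.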

A similar issue arises with locals, which are "static", by default, in Fortran
(or the Fortran equivalent of "static"). A "strength reduction" step is
required to lift non-static locals to bona fide "auto" locals.
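
A minimal C sketch of the two cases, using made-up function names: one local whose value really must survive between calls, and one that is always assigned before it is read and so can become an ordinary automatic variable.

```c
/* Hypothetical functions, for illustration only. */

/* This local must stay static: the next call depends on the
 * value left behind by the previous one. */
int next_id(void)
{
    static int id = 0;   /* retained across calls */
    id += 1;
    return id;
}

/* This local is assigned before every read, so nothing depends on
 * retention and the "static" of a literal translation can be dropped. */
int square_plus_one(int x)
{
    int tmp;             /* assumed to be "static int tmp;" before the step */
    tmp = x * x;
    return tmp + 1;
}
```

Deciding which case applies needs the same kind of whole-routine reading as the parameter question: every path to a read of the local has to pass through an assignment first.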

Another key issue is the aliasing that goes on with "equivalence" constructs.
There is no good uniform translation for this into C ... it actually better
fits C++, where you have reference types available. There's really no good
reason why those have been left out of C, when other things which appeared
first in C++, like "const", "bool" or function prototypes, found their way
into C.

However, a substantial chunk of use-cases for equivalence constructs can be
carved out as "enum" types, so there was a strength reduction step for this,
too.

Perhaps the moderator will have more to say about the intricacies of Fortran
translation. In the meanwhile, another project has already been staged for
conversion to C++ - LAPACK

https://github.com/LydiaMarieWilliamson/lapack

but is in a holding pattern for now. This one will more heavily involve the
synthesis of "template" types. To date, ongoing attempts, elsewhere, have been
mostly limited to creating C or C++ shells for the Fortran core, rather than a
conversion of the core, itself.
[It's been at least 20 years since I've done any sort of Fortran translation
so for this maze of twisty little passages, I'm afraid you're on your own.
I'm always surprised in translation exercises how many ways that languages
that look superficially the same are different in ways that make the translation
much harder. -John]

Re: Fortran to C/C++ translation: a running example.

<22-05-036@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=392&group=comp.compilers#392
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.compilers
Subject: Re: Fortran to C/C++ translation: a running example.
Date: Tue, 17 May 2022 14:59:15 -0000 (UTC)
Organization: news.netcologne.de
Lines: 80
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-05-036@comp.compilers>
References: <22-05-032@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="72082"; mail-complaints-to="abuse@iecc.com"
Keywords: Fortran, history
Posted-Date: 17 May 2022 11:41:04 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Thomas Koenig - Tue, 17 May 2022 14:59 UTC

Rock Brentwood <rockbrentwood@gmail.com> wrote:
[...]

> A key issue that arises, which led to later revision in the Fortran standard,
> is the lack of information required to distinguish between parameters that are
> input-only, output-only, input/output.

Nit: In Fortran, "parameters" are what you would call "constants"
in another language. Arguments to functions or subroutines are
called "dummy arguments", which are then associated with "actual
arguments" on the caller's side.

> That has to be inferred, which requires
> either transparency of library functions (here: the functions in the f2c
> library or whatever is written in its place) or I/O specifications in the
> library functions. So, a "strength reduction" step is required to lift
> input/output parameters (the default) to input-only or output-only.

"Strength reduction" is a term normally used for something else,
for example when replacing multiplication (as in a loop for array
processing) by addition.
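
For reference, a small C sketch of strength reduction in that usual sense. The function names and the strided-array setup are made up for illustration.

```c
#include <stddef.h>

/* Before: one multiplication per iteration to form the index. */
long sum_strided_mul(const long *a, size_t count, size_t stride)
{
    long sum = 0;
    for (size_t i = 0; i < count; ++i)
        sum += a[i * stride];
    return sum;
}

/* After strength reduction: the multiplication is replaced by a
 * running offset that is advanced by addition. */
long sum_strided_add(const long *a, size_t count, size_t stride)
{
    long sum = 0;
    size_t offset = 0;
    for (size_t i = 0; i < count; ++i) {
        sum += a[offset];
        offset += stride;
    }
    return sum;
}
```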

It's a question of the semantics of the code. For something like
(C side)

aux_var = 5;
foo (&aux_var);

you can almost certainly rewrite foo to take a value argument.
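
A sketch of that rewrite, assuming a foo that only reads its argument. The body of foo and the function names here are invented for illustration.

```c
/* Hypothetical callee: it reads its argument and never stores through it. */

/* Reference-style form, as a Fortran-to-C translator would emit it. */
int foo_ref(int *n)
{
    return *n * 2;
}

/* Rewritten to take a value; valid only because foo never assigns to *n. */
int foo_val(int n)
{
    return n * 2;
}

/* Caller before: a named temporary exists only so its address can be taken. */
int caller_before(void)
{
    int aux_var = 5;
    return foo_ref(&aux_var);
}

/* Caller after: the temporary disappears. */
int caller_after(void)
{
    return foo_val(5);
}
```

If foo did assign to its dummy argument, the temporary would be doing real work (protecting the constant 5) and the rewrite would change behavior.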

> A similar issue arises with locals, which are "static", by default, in Fortran
> (or the Fortran equivalent of "static"). A "strength reduction" step is
> required to lift non-static locals to bona fide "auto" locals.

The FORTRAN language never guaranteed that variables would keep their
data unless SAVE was specified, but many compilers did it anyway, so the
code may indeed assume so.

Some experimentation on the Fortran side can help there. Compiling
the code with -frecursive and/or with one of the -finit-integer
and -finit-real options (I'm talking gfortran options here, but
other compilers have similar) will help you find trouble spots.
If you happen to have access to nagfor, they have a -C=all option
which will find very many bugs in code that people think correct,
even more with -C=undefined.

> Another key issue is the aliasing that goes on with "equivalence" constructs.

> There is no good uniform translation for this into C ...

The question is - what is equivalence used for? Something sane?

Generally, C's unions are a good match for Fortran's EQUIVALENCE,
with the same problem of undefined behavior if the unions are
used for type punning.
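
As a rough illustration of a benign case, the sketch below overlays two scratch arrays that are never live at the same time, a typical memory-saving use of EQUIVALENCE. The Fortran fragment in the comment and all names are made up for this example. Each member is read only while it is the one most recently written, so no type punning takes place.

```c
/* Assumed Fortran, for illustration only:
 *       INTEGER IWORK(4)
 *       REAL    RWORK(4)
 *       EQUIVALENCE (IWORK, RWORK)
 */
union scratch {
    int   iwork[4];
    float rwork[4];
};

int scratch_demo(void)
{
    union scratch s;
    int   isum = 0;
    float rsum = 0.0f;

    /* Phase 1: the storage holds integers. */
    for (int i = 0; i < 4; ++i) s.iwork[i] = i + 1;
    for (int i = 0; i < 4; ++i) isum += s.iwork[i];

    /* Phase 2: the same storage is reused for reals. */
    for (int i = 0; i < 4; ++i) s.rwork[i] = 0.5f;
    for (int i = 0; i < 4; ++i) rsum += s.rwork[i];

    return isum + (int)rsum;
}
```

Reading s.iwork after writing s.rwork (or the reverse) would be the type-punning case that the caveat above is about.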

> it actually better
> fits C++, where you have reference types available. There's really no good
> reason why those have been left out of C, when other things which appeared
> first in C++, like "const", "bool" or function prototypes, found their way
> into C.
>
> However, a substantial chunk of use-cases for equivalence constructs can be
> carved out as "enum" types, so there was a strength reduction step for this,
> too.
>
> Perhaps the moderator will have more to say about the intricacies of Fortran
> translation. In the meanwhile, another project has already been staged for
> conversion to C++ - LAPACK
>
> https://github.com/LydiaMarieWilliamson/lapack
>
> but is in a holding pattern for now. This one will more heavily involve the
> synthesis of "template" types. To date, ongoing attempts, elsewhere, have been
> mostly limited to creating C or C++ shells for the Fortran core, rather than a
> conversion of the core, itself.

Fortran has guarantees on the semantics which are quite well tuned for
optimization. Converting it into C or C++ may well lose execution
speed.

Re: Fortran to C/C++ translation: a running example.

<22-05-038@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=394&group=comp.compilers#394
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: lydiamar...@gmail.com (Lydia Marie Williamson)
Newsgroups: comp.compilers
Subject: Re: Fortran to C/C++ translation: a running example.
Date: Fri, 20 May 2022 16:34:48 -0700 (PDT)
Organization: Compilers Central
Lines: 47
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-05-038@comp.compilers>
References: <22-05-032@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="47840"; mail-complaints-to="abuse@iecc.com"
Keywords: Fortran, C, history
Posted-Date: 21 May 2022 11:54:43 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-05-032@comp.compilers>
 by: Lydia Marie Williamson - Fri, 20 May 2022 23:34 UTC

On Monday, May 16, 2022 at 2:53:09 PM UTC-5, Rock Brentwood wrote:
> Another key issue is the aliasing that goes on with "equivalence" constructs.
> There is no good uniform translation for this into C ... it actually better
> fits C++, where you have reference types available. There's really no good
> reason why those have been left out of C, when other things which appeared
> first in C++, like "const", "bool" or function prototypes, found their way
> into C.
>
> However, a substantial chunk of use-cases for equivalence constructs can be
> carved out as "enum" types, so there was a strength reduction step for this,
> too.

This is not exactly correct. It's "common blocks" that were handled in this
way.

In the Fortran source of Zork/dungeon, the "equivalence" statements and
"common blocks" were used together, so it's easy to get the issue confused. I
don't know if their being used together is something that always happened in
Fortran, or if it was just particular to this program.

> In the meanwhile, another project has already been staged for
> conversion to C++ - LAPACK
>
> https://github.com/LydiaMarieWilliamson/lapack
>
> but is in a holding pattern for now.

There were several stages to the translation, one of which involved
regularizing and normalizing the Fortran, itself.
This is also on the local machines here.
But while that was happening, LAPACK came back alive, and is out on GitHub and
being actively maintained again.
Originally, it was (mostly) inert.

> [It's been at least 20 years since I've done any sort of Fortran translation
> so for this maze of twisty little passages, I'm afraid you're on your own.
> I'm always surprised in translation exercises how many ways that languages
> that look superficially the same are different in ways that make the
> translation much harder. -John]

Things would be easier going into C++, instead of C, since it already has
aliasing, operator overloading, re-defineable array indexing, and
call-by-reference. This inclusion of more Fortran-friendly features into C++
was apparently done intentionally.
[It was not unusual to use common and equivalence together, particularly when memory
was tight. But equivalence is like a union, not an enum. -John]

Re: Fortran to C/C++ translation: a running example.

<22-05-041@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=397&group=comp.compilers#397
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: gah...@u.washington.edu (gah4)
Newsgroups: comp.compilers
Subject: Re: Fortran to C/C++ translation: a running example.
Date: Sat, 21 May 2022 09:31:45 -0700 (PDT)
Organization: Compilers Central
Lines: 25
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-05-041@comp.compilers>
References: <22-05-032@comp.compilers> <22-05-038@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="61175"; mail-complaints-to="abuse@iecc.com"
Keywords: Fortran
Posted-Date: 21 May 2022 13:25:58 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-05-038@comp.compilers>
 by: gah4 - Sat, 21 May 2022 16:31 UTC

On Saturday, May 21, 2022 at 8:54:47 AM UTC-7, Lydia Marie Williamson wrote:

(snip on COMMON and EQUIVALENCE)

> This is not exactly correct. It's "common blocks" that were handled in this
> way.

> In the Fortran source of Zork/dungeon, the "equivalence" statements and
> "common blocks" were used together, so it's easy to get the issue confused. I
> don't know if their being used together is something that always happened in
> Fortran, or if it was just particular to this program.

COMMON and EQUIVALENCE are closely related in the Fortran standard,
and in the implementation by compilers. A variable equivalenced to a
variable in common is also in common. Such a variable can extend the
length of the common block, but only at the end, not the beginning.
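
A hedged C sketch of the "extend at the end" case follows. The common block /BLK/ and its members are invented for illustration, with all variables assumed to be default integers.

```c
#include <stddef.h>

/* Assumed Fortran, for illustration only:
 *       COMMON /BLK/ A, B
 *       INTEGER C(3)
 *       EQUIVALENCE (B, C(1))
 * C(1) overlays B, while C(2) and C(3) extend the block past B,
 * so the block grows from 2 to 4 integers.
 * EQUIVALENCE (A, C(2)) would not be allowed: C(1) would have to
 * sit in front of A, extending the block at its beginning.
 */
struct blk {
    int a;
    union {
        int b;
        int c[3];
    } tail;
};

/* Length of the block in units of int, assuming no padding is inserted. */
size_t blk_length_in_ints(void)
{
    return sizeof(struct blk) / sizeof(int);
}
```

Comparing sizeof and offsetof values like these against the expected layout plays roughly the role that the compiler's variable map used to play.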

It used to be that compilers would print out a variable map, with the
address, or offset, of each variable, and its length and type. That was
often useful to be sure that the compiler did what you thought it did.
Also, it would include the length of each common block, again good
to check to be sure they agree with what you expect.

The Fortran standard has a C interoperability feature that explains
how Fortran features and C features work together.

Re: Fortran to C/C++ translation: a running example.

<22-05-042@comp.compilers>

https://www.novabbs.com/devel/article-flat.php?id=398&group=comp.compilers#398
Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: tkoe...@netcologne.de (Thomas Koenig)
Newsgroups: comp.compilers
Subject: Re: Fortran to C/C++ translation: a running example.
Date: Sat, 21 May 2022 17:24:37 -0000 (UTC)
Organization: news.netcologne.de
Lines: 62
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-05-042@comp.compilers>
References: <22-05-032@comp.compilers> <22-05-038@comp.compilers>
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="61499"; mail-complaints-to="abuse@iecc.com"
Keywords: Fortran, storage
Posted-Date: 21 May 2022 13:26:54 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
 by: Thomas Koenig - Sat, 21 May 2022 17:24 UTC

Lydia Marie Williamson <lydiamariewilliamson@gmail.com> wrote:
> On Monday, May 16, 2022 at 2:53:09 PM UTC-5, Rock Brentwood wrote:
>> Another key issue is the aliasing that goes on with "equivalence" constructs.
>> There is no good uniform translation for this into C ... it actually better
>> fits C++, where you have reference types available. There's really no good
>> reason why those have been left out of C, when other things which appeared
>> first in C++, like "const", "bool" or function prototypes, found their way
>> into C.
>>
>> However, a substantial chunk of use-cases for equivalence constructs can be
>> carved out as "enum" types, so there was a strength reduction step for this,
>> too.
>
> This is not exactly correct. It's "common blocks" that were handled in this way.
>
> In the Fortran source of Zork/dungeon, the "equivalence" statements and
> "common blocks" were used together, so it's easy to get the issue confused. I
> don't know if their being used together is something that always happened in
> Fortran, or if it was just particular to this program.

Fortran has the concept of storage association - under certain
circumstances, the ordering of variables is prescribed by the
standard.

COMMON blocks are one example of this. Taking an example from the
original Fortran source code:

      COMMON /SYNTAX/ VFLAG,DOBJ,DFL1,DFL2,DFW1,DFW2,
     &                IOBJ,IFL1,IFL2,IFW1,IFW2

This declares a common block /SYNTAX/ with 11 named variables
(all of them integers due to an IMPLICIT INTEGER (A-Z) earlier in
all files), which have to be contiguous in memory.

The next line

      INTEGER SYN(11)

declares an integer array with 11 elements.

Finally, the statement

      EQUIVALENCE (VFLAG, SYN)

tells the compiler that SYN (that is, its first element) and VFLAG
have the same address.

So, you can now use SYN(1) to refer to VFLAG, SYN(2) to DOBJ and so on.

Why is this done? I see only one use case, in np3.for

      DO 10 I=1,11
C                                       !CLEAR SYNTAX.
        SYN(I)=0
10    CONTINUE

simply to create a shortcut for clearing the syntax.

This is a benign (and standard-conforming) way of using COMMON
and EQUIVALENCE. Equivalent C code might create a 'struct syntax'
and clear it with a memset, or have 11 individual variables and
zero them individually.
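
The first option might look like the sketch below. The member names follow the COMMON /SYNTAX/ statement above; the instance name syntax_blk and the function name clear_syntax are hypothetical.

```c
#include <string.h>

/* Mirrors COMMON /SYNTAX/: eleven integers, in declaration order. */
struct syntax {
    int vflag, dobj, dfl1, dfl2, dfw1, dfw2;
    int iobj, ifl1, ifl2, ifw1, ifw2;
};

/* Hypothetical name for the single shared instance of the block. */
struct syntax syntax_blk;

/* Replaces the DO 10 loop that zeroes SYN(1) through SYN(11). */
void clear_syntax(void)
{
    memset(&syntax_blk, 0, sizeof syntax_blk);
}
```

With this form the SYN array and the EQUIVALENCE are no longer needed, since clearing was their only use.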
