
Subject                                     Author
* Programming language similarity           Derek Jones
+- Re: Programming language similarity      Derek Jones
+* Re: Programming language similarity      Fernando
|`- Re: Programming language similarity     Derek Jones
+* Re: Programming language similarity      Jan Ziak
|`* Re: Programming language similarity     Derek Jones
| `* Re: Programming language similarity    gah4
|  `- Re: Programming language similarity   Derek Jones
`- Re: Programming language similarity      Meshach Mitchell

Programming language similarity

<22-04-012@comp.compilers>

Path: i2pn2.org!i2pn.org!weretis.net!feeder6.news.weretis.net!news.misty.com!news.iecc.com!.POSTED.news.iecc.com!nerds-end
From: der...@NOSPAM-knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Programming language similarity
Date: Mon, 25 Apr 2022 00:00:40 +0100
Organization: Compilers Central
Lines: 15
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-012@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="64575"; mail-complaints-to="abuse@iecc.com"
Keywords: question, comment
Posted-Date: 24 Apr 2022 22:49:00 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
Content-Language: en-US
 by: Derek Jones - Sun, 24 Apr 2022 23:00 UTC

All,

There has been remarkably little work that tries to measure
programming language similarity.

Yes, there are many multi-language runtime benchmark comparisons, and
people extract data from Wikipedia to make dubious claims.

Does anybody know of other kinds of attempts at measuring language
similarity?

Here is one approach:
https://shape-of-code.com/2022/04/24/programming-language-similarity-based-on-their-traits/
[That seems awfully simplistic. Fortran and PL/I both have FORMAT statements that look
superficially similar but the semantics are very different. -John]

Re: Programming language similarity

<22-04-013@comp.compilers>

From: der...@NOSPAM-knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Re: Programming language similarity
Date: Mon, 25 Apr 2022 08:59:44 +0100
Organization: Compilers Central
Lines: 18
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-013@comp.compilers>
References: <22-04-012@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="52400"; mail-complaints-to="abuse@iecc.com"
Keywords: design, semantics
Posted-Date: 25 Apr 2022 12:17:24 EDT
Content-Language: en-US
In-Reply-To: <22-04-012@comp.compilers>
 by: Derek Jones - Mon, 25 Apr 2022 07:59 UTC

John,

> https://shape-of-code.com/2022/04/24/programming-language-similarity-based-on-their-traits/
> [That seems awfully simplistic.  Fortran and PL/I both have FORMAT statements that look
> superficially similar but the semantics are very different. -John]

Many keywords have different meanings, e.g., the do keyword in Fortran and in C.

Even binary operators differ, e.g., binary plus used for string concatenation.

The blog post uses a token based approach, which does not require
lots of time to gather the data.
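
A minimal sketch of what a token-based comparison could look like, assuming each language is reduced to a set of keywords and that Jaccard distance is the measure (the blog post may well use a different one). The keyword sets below are small illustrative subsets, not complete lists.

```python
# Illustrative subsets only; each language reserves more words than shown.
KEYWORDS = {
    "C": {"if", "else", "while", "for", "do", "return", "switch", "case", "struct"},
    "Java": {"if", "else", "while", "for", "do", "return", "switch", "case", "class"},
    "Python": {"if", "else", "while", "for", "return", "class", "def", "elif"},
}


def jaccard_distance(a, b):
    """Return 1 - |a & b| / |a | b|: 0.0 for identical sets, 1.0 for disjoint ones."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(keywords):
    """Pairwise distances keyed by (language, language)."""
    names = sorted(keywords)
    return {
        (x, y): jaccard_distance(keywords[x], keywords[y])
        for x in names
        for y in names
    }


if __name__ == "__main__":
    for pair, dist in distance_matrix(KEYWORDS).items():
        print(pair, round(dist, 3))
```

A matrix like this only says that two languages share spelling, which is exactly the limitation raised in the moderator's note: a shared keyword says nothing about shared semantics.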

A semantics based approach requires lots of head scratching. I made a
start by collecting information on function definitions (mostly forms
of argument passing). The semantic traits I looked at tended to have a
small number of characteristics, so some form of aggregation is needed
to create significant differences.

Re: Programming language similarity

<22-04-014@comp.compilers>

From: prone...@gmail.com (Fernando)
Newsgroups: comp.compilers
Subject: Re: Programming language similarity
Date: Mon, 25 Apr 2022 04:24:38 -0700 (PDT)
Organization: Compilers Central
Lines: 25
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-014@comp.compilers>
References: <22-04-012@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="52691"; mail-complaints-to="abuse@iecc.com"
Keywords: design, semantics
Posted-Date: 25 Apr 2022 12:17:55 EDT
In-Reply-To: <22-04-012@comp.compilers>
 by: Fernando - Mon, 25 Apr 2022 11:24 UTC

Hi Derek,

Your repository is very nice! Can I use the "language info" part in the class
on programming language paradigms? It will be nice to give students some idea
about the number of keywords in different programming languages, for
instance.

By the way, perhaps you should consider also comparing the languages with
regards to the static and the dynamic aspects of their type systems, e.g.:
typing discipline (static, dynamic, gradual?), type verification (inference,
annotations, mixed?), type enforcement (weak, strong), static type equivalence
(nominal, structural, mixed?), etc. That might lead to very different trees.
For instance, in your keyword tree, Java and JavaScript are close, but they
are very different semantically.

> Does anybody know of other kinds of attempts at measuring language
> similarity?

About that: I don't know of other studies. There is the article on Wikipedia
(Programming Languages Comparison), but it does not cite a paper with a
comparative study.

Regards,

Fernando

Re: Programming language similarity

<22-04-016@comp.compilers>

From: 0xe2.0x9...@gmail.com (Jan Ziak)
Newsgroups: comp.compilers
Subject: Re: Programming language similarity
Date: Mon, 25 Apr 2022 06:00:12 -0700 (PDT)
Organization: Compilers Central
Lines: 17
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-016@comp.compilers>
References: <22-04-012@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="57922"; mail-complaints-to="abuse@iecc.com"
Keywords: design
Posted-Date: 25 Apr 2022 12:33:39 EDT
In-Reply-To: <22-04-012@comp.compilers>
 by: Jan Ziak - Mon, 25 Apr 2022 13:00 UTC

On Monday, April 25, 2022 at 4:49:03 AM UTC+2, Derek Jones wrote:
> All,
>
> There has been remarkably little work that tries to measure
> programming language similarity.
>
> Yes, there are many multi-language runtime benchmark comparisons, and
> people extract data from Wikipedia to make dubious claims.
>
> Does anybody know of other kinds of attempts at measuring language
> similarity? ...

Just some "food for thought" on a conceptually similar topic:

Denis Roegel: A brief survey of 20th century logical notations (https://hal.inria.fr/hal-02340520/document)

-atom

Re: Programming language similarity

<22-04-017@comp.compilers>

From: meshach....@gmail.com (Meshach Mitchell)
Newsgroups: comp.compilers
Subject: Re: Programming language similarity
Date: Mon, 25 Apr 2022 12:06:02 -0400
Organization: Compilers Central
Lines: 54
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-017@comp.compilers>
References: <22-04-012@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="58444"; mail-complaints-to="abuse@iecc.com"
Keywords: design
Posted-Date: 25 Apr 2022 12:34:48 EDT
In-Reply-To: <22-04-012@comp.compilers>
 by: Meshach Mitchell - Mon, 25 Apr 2022 16:06 UTC

I could see how that could be interesting as an academic pursuit, but I
think the dearth of exploration here is most likely because pretty much
anyone in a position to do that already knows that every Turing-complete
language is equivalent. The comparison, therefore, would be a comparison of
placement of syntactic sugar. I have trouble visualizing a real-world use
for such a comparison, by which I mean, what is the problem that I would be
able to solve by knowing which languages are similar? In the current
environment, anywhere you would work already has a whole tech stack
mapped out.

I have actually thought about this, and vaguely remember looking up
articles on the subject. The article you linked is interesting, but I agree
with your analysis; semantic similarity has some value but IMO what really
matters is "supported patterns", i.e., what a language provides "for free".
Now, TINSTAAFL, so there is no real "free" but there is some optimization
done by a language [compiler, interpreter] to support statements
represented in the grammar. An example that comes to mind is in JavaScript
(I know, I *know*, but I have a family, and we need to eat.) Early
implementations of async in JS used the *Promise* object to implement
asynchronous execution, but newer versions of the language use *async* and
*await* keywords. The former piggy-backs on the existing OO architecture,
while the latter, implemented as keywords, is available to lower level
abstraction and optimization.
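
The contrast between a library object and language keywords can be sketched as follows. This is a Python analogue rather than JavaScript, using asyncio futures with a callback in place of *Promise* chaining; fetch_value and the doubling step are hypothetical stand-ins for real asynchronous work.

```python
import asyncio


async def fetch_value():
    """Hypothetical stand-in for an asynchronous operation."""
    await asyncio.sleep(0)  # yield to the event loop once
    return 21


async def with_callback():
    """Library-level style: chain the next step by attaching a callback."""
    loop = asyncio.get_running_loop()
    done = loop.create_future()
    task = asyncio.ensure_future(fetch_value())
    # The continuation is an ordinary function handed to an ordinary object.
    task.add_done_callback(lambda t: done.set_result(t.result() * 2))
    # This final await only bridges the result back to the caller.
    return await done


async def with_await():
    """Keyword-level style: the language suspends and resumes the coroutine."""
    value = await fetch_value()
    return value * 2


def run_both():
    """Run both styles to completion and return their results as a pair."""
    return asyncio.run(with_callback()), asyncio.run(with_await())


if __name__ == "__main__":
    print(run_both())
```

Both functions compute the same result; the difference is whether the continuation is visible to the language implementation as syntax or hidden inside a callback passed to a library object.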

We've been doing this long enough that a number of "higher level" patterns
have emerged. The aforementioned asynchronous (threaded, maybe?) execution
is one. *Events* also come to mind, which are generally implemented as good
old-fashioned polling under the hood or function registration and
hash-lookup. What is actually happening in the machine translates to vastly
different computation cost, and seems to me to be non-trivial. I think a
meaningful categorization could be done based on this idea of language
"provisions" over language semantics, and some deeper analysis of how
exactly a language [compiler, interpreter] implements what necessarily
boils down to syntactic sugar.
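
A minimal sketch of the two event mechanisms mentioned above, assuming nothing about any particular language runtime. EventRegistry and poll are hypothetical names used only for illustration.

```python
from collections import defaultdict


class EventRegistry:
    """Function registration plus hash lookup: handlers are stored per event name."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def on(self, name, handler):
        """Register a handler for the named event."""
        self._handlers[name].append(handler)

    def emit(self, name, payload):
        """Look up the handlers for `name` and call each with the payload."""
        # .get avoids creating empty entries for event names nobody registered.
        return [handler(payload) for handler in self._handlers.get(name, [])]


def poll(queue, handler):
    """Polling: repeatedly check a queue and handle whatever is pending."""
    handled = []
    while queue:
        handled.append(handler(queue.pop(0)))
    return handled


if __name__ == "__main__":
    registry = EventRegistry()
    registry.on("click", lambda payload: payload + 1)
    print(registry.emit("click", 1))
    print(poll([1, 2], lambda item: item * 2))
```

With registration, work happens only when an event is emitted; with polling, the loop has to keep checking the queue whether or not anything has arrived, which is where the difference in computation cost comes from.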

To answer your actual question, No, I don't know of other attempts, but I
can understand the scarcity. Hope my thoughts have some value.

-- Meshach Mitchell

On Sun, Apr 24, 2022 at 10:49 PM Derek Jones <derek@nospam-knosof.co.uk>
wrote:

> All,
>
> There has been remarkably little work that tries to measure
> programming language similarity.
>
> Yes, there are many multi-language runtime benchmark comparisons, and
> people extract data from Wikipedia to make dubious claims.
>
> Does anybody know of other kinds of attempts at measuring language
> similarity?

Re: Programming language similarity

<22-04-018@comp.compilers>

From: der...@NOSPAM-knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Re: Programming language similarity
Date: Mon, 25 Apr 2022 19:35:43 +0100
Organization: Compilers Central
Lines: 41
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-018@comp.compilers>
References: <22-04-012@comp.compilers> <22-04-014@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="90391"; mail-complaints-to="abuse@iecc.com"
Keywords: design
Posted-Date: 25 Apr 2022 14:53:24 EDT
Content-Language: en-US
In-Reply-To: <22-04-014@comp.compilers>
 by: Derek Jones - Mon, 25 Apr 2022 18:35 UTC

Fernando,

> Your repository is very nice! Can I use the "language info" part in the class
> on programming language paradigms? It will be nice to give students some idea

Please do. The code is under a GPL license.

> about the number of keywords in different programming languages, for
> instance.

I was surprised by the diversity of words used.

> By the way, perhaps you should consider also comparing the languages with
> regards to the static and the dynamic aspects of their type systems, e.g.:
> typing discipline (static, dynamic, gradual?), type verification (inference,
> annotations, mixed?), type enforcement (weak, strong), static type equivalence
> (nominal, structural, mixed?), etc. That might lead to very different trees.

I looked into building a tree based on allowed implicit type conversions,
with the hope of coming up with a measure of strong/weak typing.

A list of implicit conversions performed by a language seems like a
good start. But this approach makes Fortran 77 look like it's strongly
typed; there are fewer implicit conversions than other languages
because it supports fewer types, e.g., no enums or pointers. C's
relatively large number of integer types, and the corresponding
implicit conversions, make it look weakly typed compared to languages
with fewer integer types (and hence fewer implicit conversions).
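
The counting problem can be illustrated with a small sketch. The two toy languages below are hypothetical and not taken from any specification, and dividing by the number of ordered type pairs is just one possible normalization, offered as an assumption rather than as the approach actually tried.

```python
def conversion_density(types, implicit_conversions):
    """Fraction of ordered (source, target) type pairs that convert implicitly."""
    n = len(types)
    possible = n * (n - 1)  # ordered pairs of distinct types
    if possible == 0:
        return 0.0
    return len(implicit_conversions) / possible


# Hypothetical language with two types, each converting implicitly to the other.
SMALL = {
    "types": ["int", "real"],
    "conversions": {("int", "real"), ("real", "int")},
}

# Hypothetical language with four integer types and widening conversions only.
LARGE = {
    "types": ["i8", "i16", "i32", "i64"],
    "conversions": {
        ("i8", "i16"), ("i8", "i32"), ("i8", "i64"),
        ("i16", "i32"), ("i16", "i64"), ("i32", "i64"),
    },
}

if __name__ == "__main__":
    for name, lang in (("SMALL", SMALL), ("LARGE", LARGE)):
        raw = len(lang["conversions"])
        density = conversion_density(lang["types"], lang["conversions"])
        print(name, raw, density)
```

By raw count LARGE has three times as many implicit conversions as SMALL, yet SMALL converts between every pair of its types while LARGE converts between only half of its pairs, so the ranking flips once the number of types is taken into account.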

The characteristics you list might be combined in some
meaningful way, such that a type 'distance' tree could be constructed.
Lots of careful reading of language specifications would be needed to
figure out the details.
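
One simple way such a combination might start is sketched below: each language becomes a record of categorical traits, and the distance is the fraction of traits on which two languages differ. The trait names follow the list quoted above; LangA, LangB and LangC and their values are placeholders, since real values would need the careful reading of specifications just mentioned.

```python
TRAITS = ("discipline", "verification", "enforcement", "equivalence")

# Placeholder languages with made-up trait values, for illustration only.
LANGUAGES = {
    "LangA": {"discipline": "static", "verification": "annotations",
              "enforcement": "strong", "equivalence": "nominal"},
    "LangB": {"discipline": "static", "verification": "inference",
              "enforcement": "strong", "equivalence": "structural"},
    "LangC": {"discipline": "dynamic", "verification": "mixed",
              "enforcement": "weak", "equivalence": "structural"},
}


def trait_distance(a, b, traits=TRAITS):
    """Fraction of traits on which the two trait records differ (0.0 to 1.0)."""
    differing = sum(1 for t in traits if a[t] != b[t])
    return differing / len(traits)


def closest_pair(languages):
    """Return the pair of language names with the smallest trait distance."""
    names = sorted(languages)
    pairs = [(x, y) for i, x in enumerate(names) for y in names[i + 1:]]
    return min(pairs, key=lambda p: trait_distance(languages[p[0]], languages[p[1]]))


if __name__ == "__main__":
    print(closest_pair(LANGUAGES))
```

Repeatedly merging the closest pair is the basis of agglomerative clustering, so a distance of this kind could feed a tree-building step; every trait is weighted equally here, which is itself an assumption that would need justifying.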

> About that: I don't know of other studies. There is the article on Wikipedia
> (Programming Languages Comparison), but it does not cite a paper with a
> comparative study.

Some of the Yes/No classifications on this page are somewhat surprising
(at least to me)
https://en.wikipedia.org/wiki/Comparison_of_programming_languages

Re: Programming language similarity

<22-04-019@comp.compilers>

From: der...@NOSPAM-knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Re: Programming language similarity
Date: Mon, 25 Apr 2022 20:51:30 +0100
Organization: Compilers Central
Lines: 17
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-019@comp.compilers>
References: <22-04-012@comp.compilers> <22-04-016@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="22931"; mail-complaints-to="abuse@iecc.com"
Keywords: history
Posted-Date: 25 Apr 2022 16:54:55 EDT
Content-Language: en-US
In-Reply-To: <22-04-016@comp.compilers>
 by: Derek Jones - Mon, 25 Apr 2022 19:51 UTC

Jan,

> Denis Roegel: A brief survey of 20th century logical notations (https://hal.inria.fr/hal-02340520/document)

This is an interesting collection of decisions made by authors
over 120 years.

What makes somebody choose a particular set of symbols?
My guess is that their past experience is a major factor,
i.e., the use of symbols they had previously been exposed to.

Of course it could be something as mundane as the characters
available on their typewriter, or to the printer of the journal
the work was published in.

Then again, academics do love to do their own thing. Perhaps
the decisions are based on the need to be different.

Re: Programming language similarity

<22-04-020@comp.compilers>

From: gah...@u.washington.edu (gah4)
Newsgroups: comp.compilers
Subject: Re: Programming language similarity
Date: Mon, 25 Apr 2022 14:58:12 -0700 (PDT)
Organization: Compilers Central
Lines: 25
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-020@comp.compilers>
References: <22-04-012@comp.compilers> <22-04-016@comp.compilers> <22-04-019@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="56753"; mail-complaints-to="abuse@iecc.com"
Keywords: design, history
Posted-Date: 25 Apr 2022 18:57:01 EDT
X-submission-address: compilers@iecc.com
X-moderator-address: compilers-request@iecc.com
X-FAQ-and-archives: http://compilers.iecc.com
In-Reply-To: <22-04-019@comp.compilers>
 by: gah4 - Mon, 25 Apr 2022 21:58 UTC

On Monday, April 25, 2022 at 1:54:58 PM UTC-7, Derek Jones wrote:

(snip)

> What makes somebody choose a particular set of symbols?
> My guess is that their past experience is a major factor,
> i.e., the use of symbols they had previously been exposed to.

Early Fortran was limited by the number of characters available
on the IBM 026 keypunch. They redefined some of the punch
codes with different symbols for scientific use, as that was
easier than designing a whole new machine.

Much of that was then fixed with EBCDIC in S/360, where
an 8-bit code allowed, and pretty much required, that the symbols
sharing a punch code be separated. In any case, the characters (with new punches)
were kept. (And new compilers have an option to accept
the old punch codes.)

I do remember punching ALGOL programs on the 026, where
you had to use the multipunch key, along with big charts on
the wall, to get the needed characters.

In any case, character set limitations stay with us long after
the reason for the limitation has gone.

Re: Programming language similarity

<22-04-022@comp.compilers>

From: der...@NOSPAM-knosof.co.uk (Derek Jones)
Newsgroups: comp.compilers
Subject: Re: Programming language similarity
Date: Tue, 26 Apr 2022 00:50:23 +0100
Organization: Compilers Central
Lines: 9
Sender: news@iecc.com
Approved: comp.compilers@iecc.com
Message-ID: <22-04-022@comp.compilers>
References: <22-04-012@comp.compilers> <22-04-016@comp.compilers> <22-04-019@comp.compilers> <22-04-020@comp.compilers>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Info: gal.iecc.com; posting-host="news.iecc.com:2001:470:1f07:1126:0:676f:7373:6970";
logging-data="98154"; mail-complaints-to="abuse@iecc.com"
Keywords: history
Posted-Date: 25 Apr 2022 22:40:48 EDT
Content-Language: en-US
In-Reply-To: <22-04-020@comp.compilers>
 by: Derek Jones - Mon, 25 Apr 2022 23:50 UTC

gah4,

> In any case, character set limitations stay with us long after
> the reason for the limitation has gone.

More than you probably wanted to know about character set
history still being with us
https://archive.org/details/mackenzie-coded-char-sets
