
Re: What to use for finding as many syntax errors as possible.

<mailman.556.1665330610.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19694&group=comp.lang.python#19694
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: avi.e.gr...@gmail.com (Avi Gross)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Sun, 9 Oct 2022 11:49:55 -0400
Lines: 30
Message-ID: <mailman.556.1665330610.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
In-Reply-To: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
X-Content-Filtered-By: Mailman/MimeDel 2.1.39
X-BeenThere: python-list@python.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: General discussion list for the Python programming language
<python-list.python.org>
List-Unsubscribe: <https://mail.python.org/mailman/options/python-list>,
<mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive: <https://mail.python.org/pipermail/python-list/>
List-Post: <mailto:python-list@python.org>
List-Help: <mailto:python-list-request@python.org?subject=help>
List-Subscribe: <https://mail.python.org/mailman/listinfo/python-list>,
<mailto:python-list-request@python.org?subject=subscribe>
X-Mailman-Original-Message-ID: <CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
X-Mailman-Original-References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
 by: Avi Gross - Sun, 9 Oct 2022 15:49 UTC

Antoon,

There likely are such programs out there, but is there any universal agreement
on how to figure out where a new safe zone of code starts, so that error
checking can resume?

For example, in a file full of function definitions, a checker might find an
error in function 1, try to find the end of that function, and resume checking
at the next function. But what if a function defines local functions within it?
And what if the mistake in one line of code would still allow checking the next
line, rather than skipping it all?

My guess is that finding 100 errors might turn out to be misleading. If you
fix just the first, many others would go away. If you spell a variable name
wrong when declaring it, a dozen uses of the right name may cause errors.
Should you fix the first or change all later ones?
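
To make the "safe zone" question concrete, here is a minimal sketch of one naive recovery strategy. It is not a tool anyone in this thread proposed: the function name collect_syntax_errors and the "replace the offending line with pass and retry" approach are assumptions made purely for illustration. It uses only the built-in compile() and the lineno and msg attributes of SyntaxError.

```python
def collect_syntax_errors(source, max_errors=20):
    """Naive recovery sketch: report an error, neutralise that line, retry.

    Returns a list of (line_number, message) tuples. Later entries may be
    false positives caused by the crude recovery step.
    """
    lines = source.splitlines()
    errors = []
    while lines and len(errors) < max_errors:
        try:
            compile("\n".join(lines) + "\n", "<checked>", "exec")
        except SyntaxError as exc:
            lineno = exc.lineno or 1
            errors.append((lineno, exc.msg))
            idx = min(lineno, len(lines)) - 1
            stripped = lines[idx].strip()
            if stripped in ("", "pass"):
                break  # nothing left to neutralise here; give up
            # Keep the indentation, drop the content of the offending line.
            indent = lines[idx][: len(lines[idx]) - len(lines[idx].lstrip())]
            lines[idx] = indent + "pass"
        else:
            break  # the remaining text compiles
    return errors


sample = "a = 1\nb = = 2\nc = 3\nd = 4 +\ne = 5\n"
for lineno, msg in collect_syntax_errors(sample):
    print(f"line {lineno}: {msg}")
```

With independent one-line mistakes like these, the sketch reports each of them. When the broken line is something like a def header, the retry tends to produce follow-on indentation errors instead, which is exactly the "misleading count" concern raised above.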

On Sun, Oct 9, 2022, 6:11 AM Antoon Pardon <antoon.pardon@vub.be> wrote:

> I would like a tool that tries to find as many syntax errors as possible
> in a python file. I know there is the risk of false positives when a
> tool tries to recover from a syntax error and proceeds but I would
> prefer that over the current python strategy of quitting after the first
> syntax error. I just want a tool for syntax errors. No style
> enforcements. Any recommendations?
> -- Antoon Pardon

Re: What to use for finding as many syntax errors as possible.

<ti169d$qntd$1@dont-email.me>

https://www.novabbs.com/devel/article-flat.php?id=19754&group=comp.lang.python#19754
Path: i2pn2.org!i2pn.org!eternal-september.org!reader01.eternal-september.org!.POSTED!not-for-mail
From: michael....@gmail.com (Michael F. Stemper)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Mon, 10 Oct 2022 08:21:41 -0500
Organization: A noiseless patient Spider
Lines: 31
Message-ID: <ti169d$qntd$1@dont-email.me>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Mon, 10 Oct 2022 13:21:50 -0000 (UTC)
Injection-Info: reader01.eternal-september.org; posting-host="65808dbab928aa08508cae4732febbde";
logging-data="876461"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/A1bWDGKweJ1sorvryc1LLuaXtsBw2Kv0="
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101
Thunderbird/68.10.0
Cancel-Lock: sha1:hHQurom0fm0prSHRLvW95BCbo6A=
Content-Language: en-US
In-Reply-To: <mailman.556.1665330610.20444.python-list@python.org>
 by: Michael F. Stemper - Mon, 10 Oct 2022 13:21 UTC

On 09/10/2022 10.49, Avi Gross wrote:
> Anton
>
> There likely are such programs out there but are there universal agreements
> on how to figure out when a new safe zone of code starts where error
> testing can begin?
>
> For example a file full of function definitions might find an error in
> function 1 and try to find the end of that function and resume checking the
> next function. But what if a function defines local functions within it?
> What if the mistake in one line of code could still allow checking the next
> line rather than skipping it all?
>
> My guess is that finding 100 errors might turn out to be misleading. If you
> fix just the first, many others would go away. If you spell a variable name
> wrong when declaring it, a dozen uses of the right name may cause errors.
> Should you fix the first or change all later ones?

How does one declare a variable in Python? Sometimes it'd be nice to
be able to have declarations and have any undeclared variable flagged.

When I was writing F77 for a living, I'd (temporarily) put:
      IMPLICIT CHARACTER*3
at the beginning of a program or subroutine that I was modifying,
in order to have any typos flagged.

I'd love it if there was something similar that I could do in Python.

--
Michael F. Stemper
87.3% of all statistics are made up by the person giving them.

Re: What to use for finding as many syntax errors as possible.

<jqj1otFf6lsU4@mid.individual.net>

https://www.novabbs.com/devel/article-flat.php?id=19764&group=comp.lang.python#19764
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail
From: boblat...@yahoo.com (Robert Latest)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: 10 Oct 2022 17:06:37 GMT
Lines: 7
Message-ID: <jqj1otFf6lsU4@mid.individual.net>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
X-Trace: individual.net fVjAndOEo+hspHTBpmhGuQ5qL8icU8TxYtllfVEVz0O0mqxKTw
Cancel-Lock: sha1:tz3A0gx/OQTtEvv/clhtshiVrT4=
User-Agent: slrn/1.0.3 (Linux)
 by: Robert Latest - Mon, 10 Oct 2022 17:06 UTC

Michael F. Stemper wrote:
> How does one declare a variable in python? Sometimes it'd be nice to
> be able to have declarations and any undeclared variable be flagged.

To my knowledge, the closest to that is using __slots__ in class definitions.
Many a time have I assigned to misspelled class members until I discovered
__slots__.
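
A small sketch of the __slots__ behaviour described above may help. The class name Config and its attribute names are invented for illustration. The point is only that assigning to a name not listed in __slots__ raises AttributeError instead of silently creating a new attribute. Note that this is a run-time check, not a compile-time one.

```python
class Config:
    # Only these attribute names may be assigned on instances.
    __slots__ = ("timeout", "retries")

    def __init__(self):
        self.timeout = 30
        self.retries = 3


cfg = Config()
cfg.timeout = 60          # fine: "timeout" is a declared slot

try:
    cfg.timeuot = 90      # misspelled attribute name
except AttributeError as exc:
    slot_error = str(exc)
else:
    slot_error = None

print(slot_error)
```

Without __slots__, the misspelled assignment would succeed and cfg.timeout would quietly keep its old value.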

Re: What to use for finding as many syntax errors as possible.

<mailman.629.1665440550.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19779&group=comp.lang.python#19779
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: cs...@cskk.id.au (Cameron Simpson)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 09:22:25 +1100
Lines: 29
Message-ID: <mailman.629.1665440550.20444.python-list@python.org>
References: <ti169d$qntd$1@dont-email.me>
<Y0SbIQ5ZytV1AIR4@cskk.homeip.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii; format=flowed
Mail-Followup-To: python-list@python.org
Content-Disposition: inline
In-Reply-To: <ti169d$qntd$1@dont-email.me>
User-Agent: Mutt/2.2.7 (2022-08-07)
X-Mailman-Original-Message-ID: <Y0SbIQ5ZytV1AIR4@cskk.homeip.net>
X-Mailman-Original-References: <ti169d$qntd$1@dont-email.me>
 by: Cameron Simpson - Mon, 10 Oct 2022 22:22 UTC

On 09/10/2022 10.49, Avi Gross wrote:
>>My guess is that finding 100 errors might turn out to be misleading. If you
>>fix just the first, many others would go away. If you spell a variable name
>>wrong when declaring it, a dozen uses of the right name may cause errors.
>>Should you fix the first or change all later ones?

Just on this point: these are semantic errors, not syntax errors. Linters do
an OK job of spotting these. Antoon is after _syntax errors_.
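
The distinction can be seen directly in a short sketch (the variable names are invented for illustration): a misspelled name parses without complaint and only fails when the code runs.

```python
import ast

# 'cuont' is a deliberate typo for 'count'.
source = "count = 1\nprint(cuont)\n"

# Parsing succeeds: a misspelled name is not a syntax error.
ast.parse(source)

# The problem only shows up at run time, as a NameError.
try:
    exec(compile(source, "<example>", "exec"), {})
except NameError as exc:
    message = str(exc)
else:
    message = None

print(message)
```

A syntax-only checker therefore has nothing to say about this file, while a linter that tracks names can flag it without running anything.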

On 10Oct2022 08:21, Michael F. Stemper <michael.stemper@gmail.com> wrote:
>How does one declare a variable in python? Sometimes it'd be nice to
>be able to have declarations and any undeclared variable be flagged.

Linters do pretty well at this. They can trace names and compare each use
against the name's first definition/assignment. This is not perfect: some
constructs are correct but unclear to a static analysis, and one of my
linters occasionally says "possible undefine use" to me because there may
be a path to use before set. It is particularly handy for typos, which
often show up as "use before set" or "set and not used".
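
As a rough illustration of the idea (not how pyflakes or pylint are actually implemented), the sketch below uses the standard ast module to list names that are read somewhere but never bound anywhere in the file. The function name undefined_names is made up, and the check deliberately ignores scopes and ordering, so it is far cruder than a real linter.

```python
import ast
import builtins


def undefined_names(source) -> set:
    """Return names that are loaded but never bound anywhere in the source.

    Crude sketch: ignores scoping, ordering, 'except ... as' targets, etc.
    """
    bound = set(dir(builtins))
    loaded = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, (ast.Store, ast.Del)):
                bound.add(node.id)
            else:
                loaded.append(node.id)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.alias):
            # "import a.b" binds "a"; "import a as b" binds "b".
            bound.add((node.asname or node.name).split(".")[0])
    return {name for name in loaded if name not in bound}


sample = "total = 0\nfor item in [1, 2]:\n    totl = total + item\nprint(totel)\n"
print(undefined_names(sample))
```

Here "totel" is reported because nothing ever assigns it. The misspelled "totl" is a "set and not used" case, which this sketch does not attempt to detect.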

>I'd love it if there was something similar that I could do in python.

Have you used any lint programmes? My "lint" script runs pyflakes and
pylint.

Cheers,
Cameron Simpson <cs@cskk.id.au>

Re: What to use for finding as many syntax errors as possible.

<mailman.631.1665445350.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19781&group=comp.lang.python#19781
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: lis...@tompassin.net (Thomas Passin)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Mon, 10 Oct 2022 18:25:42 -0400
Lines: 38
Message-ID: <mailman.631.1665445350.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
<1a7b14b1-5648-89a1-af03-771b9170d37a@tompassin.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101
Thunderbird/102.3.2
Content-Language: en-US
In-Reply-To: <ti169d$qntd$1@dont-email.me>
X-Mailman-Original-Message-ID: <1a7b14b1-5648-89a1-af03-771b9170d37a@tompassin.net>
X-Mailman-Original-References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
 by: Thomas Passin - Mon, 10 Oct 2022 22:25 UTC

On 10/10/2022 9:21 AM, Michael F. Stemper wrote:
> On 09/10/2022 10.49, Avi Gross wrote:
>> Anton
>>
>> There likely are such programs out there but are there universal agreements
>> on how to figure out when a new safe zone of code starts where error
>> testing can begin?
>>
>> For example a file full of function definitions might find an error in
>> function 1 and try to find the end of that function and resume checking the
>> next function.  But what if a function defines local functions within it?
>> What if the mistake in one line of code could still allow checking the next
>> line rather than skipping it all?
>>
>> My guess is that finding 100 errors might turn out to be misleading. If you
>> fix just the first, many others would go away. If you spell a variable name
>> wrong when declaring it, a dozen uses of the right name may cause errors.
>> Should you fix the first or change all later ones?
>
> How does one declare a variable in python? Sometimes it'd be nice to
> be able to have declarations and any undeclared variable be flagged.
>
> When I was writing F77 for a living, I'd (temporarily) put:
>       IMPLICIT CHARACTER*3
> at the beginning of a program or subroutine that I was modifying,
> in order to have any typos flagged.
>
> I'd love it if there was something similar that I could do in python.

The Leo editor (https://github.com/leo-editor/leo-editor) will notify
you of undeclared variables (and some syntax errors) each time you save
your (Python) file.

RE: What to use for finding as many syntax errors as possible.

<mailman.632.1665454152.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19782&group=comp.lang.python#19782
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From:
Newsgroups: comp.lang.python
Subject: RE: What to use for finding as many syntax errors as possible.
Date: Mon, 10 Oct 2022 22:09:06 -0400
Lines: 108
Message-ID: <mailman.632.1665454152.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
<008201d8dd16$718fd420$54af7c60$@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <ti169d$qntd$1@dont-email.me>
X-Mailer: Microsoft Outlook 16.0
Content-Language: en-us
Thread-Index: AQJGir9yaZ0DEwGC3Gj3x2nt5vIbEwHsTY14AVeIrSwBuP6d/q0FKpJQ
X-Mailman-Original-Message-ID: <008201d8dd16$718fd420$54af7c60$@gmail.com>
X-Mailman-Original-References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
 by: - Tue, 11 Oct 2022 02:09 UTC

Michael,

A reasonable question. Python lets you initialize variables but has no
explicit declarations. Languages differ, and I juggle attributes of many of
them in my mind. I was reacting to the original question not as a matter of
whether and how Python should report many possible errors all at once, but of
how ANY language can be expected to do this well. Many other languages do have
a variable declaration phase, or an optional declaration, or perhaps just a
need to declare a function prototype so it can be used by others even if the
formal function creation happens later in the code.

But what I meant in a Python context was something like this:

Wronk = who cares # this should fail
...
if (Wronk > 5): ...
...
Wronger = Wronk + 1
...
X = minimum(Wronk, Wronger, 12)

The first line does not parse, so you have an error. And since the line
makes no sense, Wronk is never initialized to anything. Later code may use
it in various ways, and some of those uses may be reported as errors for an
assortment of reasons. Then at some point the code does provide a value for
Wronk, and suddenly the code beyond that has no apparent errors. The above
examples are not meant to be real, just to give a taste of how programs with
holes in them, for any reason, may not be analyzed consistently. The only
relatively guaranteed test for sanity has to start at the top and encounter
no errors or missing parts from any cause, I/O errors included.

And I suggest there are some things sort of declared in python such as:

import numpy as np

Yes, that brings in code from a module if it works and initializes a
variable called np to sort of point at the module or its namespace or
whatever, depending on the language. It is an assignment but also a way to
let the program know things. If the above is:

import grumpy as np

Then what happens if the code tries to find a file named "grumpy" somewhere
and cannot locate it and this is considered a syntax error rather than a
run-time error for whatever reason? Can you continue when all kinds of
functionality is missing and code asking to make a np.array([1,2,3]) clearly
fails?

Many of us here are talking past each other.

Yes, it would be nice to get lots of info, and arguably we may eventually
have machine-learning or AI programs, a bit like SPAM detectors, that look
for commonly found patterns and try to fix common errors in your program, or
at least apply a temporary patch so they can continue searching for more
errors. In the best case this could guess right every time. But if you
allowed it to actually fix your code, it might be like people who let their
spelling be corrected, do not proofread properly, and send out something
embarrassing or just plain wrong!

And it will compile or be interpreted without complaint albeit not do
exactly what it is supposed to!

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com@python.org> On
Behalf Of Michael F. Stemper
Sent: Monday, October 10, 2022 9:22 AM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On 09/10/2022 10.49, Avi Gross wrote:
> Anton
>
> There likely are such programs out there but are there universal
> agreements on how to figure out when a new safe zone of code starts
> where error testing can begin?
>
> For example a file full of function definitions might find an error in
> function 1 and try to find the end of that function and resume
> checking the next function. But what if a function defines local
> functions within it?
> What if the mistake in one line of code could still allow checking the
> next line rather than skipping it all?
>
> My guess is that finding 100 errors might turn out to be misleading.
> If you fix just the first, many others would go away. If you spell a
> variable name wrong when declaring it, a dozen uses of the right name may
> cause errors.
> Should you fix the first or change all later ones?

How does one declare a variable in python? Sometimes it'd be nice to be able
to have declarations and any undeclared variable be flagged.

When I was writing F77 for a living, I'd (temporarily) put:
IMPLICIT CHARACTER*3
at the beginning of a program or subroutine that I was modifying, in order
to have any typos flagged.

I'd love it if there was something similar that I could do in python.

--
Michael F. Stemper
87.3% of all statistics are made up by the person giving them.
--
https://mail.python.org/mailman/listinfo/python-list

Re: What to use for finding as many syntax errors as possible.

<mailman.633.1665456113.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19783&group=comp.lang.python#19783
Path: i2pn2.org!i2pn.org!aioe.org!feeder1.feed.usenet.farm!feed.usenet.farm!news-out.netnews.com!news.alt.net!fdc2.netnews.com!fu-berlin.de!uni-berlin.de!not-for-mail
From: ros...@gmail.com (Chris Angelico)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 13:41:40 +1100
Lines: 26
Message-ID: <mailman.633.1665456113.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
<008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
In-Reply-To: <008201d8dd16$718fd420$54af7c60$@gmail.com>
 by: Chris Angelico - Tue, 11 Oct 2022 02:41 UTC

On Tue, 11 Oct 2022 at 13:10, <avi.e.gross@gmail.com> wrote:
> If the above is:
>
> import grumpy as np
>
> Then what happens if the code tries to find a file named "grumpy" somewhere
> and cannot locate it and this is considered a syntax error rather than a
> run-time error for whatever reason? Can you continue when all kinds of
> functionality is missing and code asking to make a np.array([1,2,3]) clearly
> fails?

That's not a syntax error. Syntax is VERY specific. It is an error in
Python to attempt to add 1 to "one", it is an error to attempt to look
up the upper() method on None, it is an error to try to use a local
variable you haven't assigned to yet, and it is an error to open a
file that doesn't exist. But not one of these is a *syntax* error.
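
To illustrate the distinction, here is a minimal sketch. The snippet labels, the helper name and the module name grumpy_module_that_does_not_exist are made up for this example, and the file path is assumed not to exist. Each snippet gets through compile() without complaint, so none of them is a syntax error; the failure only shows up once the compiled code is executed.

```python
# Each snippet is syntactically valid; the error appears only when it runs.
# All names and paths here are illustrative assumptions.
SNIPPETS = {
    "add": '1 + "one"',
    "upper": "None.upper()",
    "local": "def f():\n    print(x)\n    x = 1\nf()",
    "open": 'open("/no/such/dir/no-such-file.txt")',
    "import": "import grumpy_module_that_does_not_exist as np",
}


def runtime_error_name(source):
    """Compile first (a SyntaxError would surface here), then execute."""
    code = compile(source, "<snippet>", "exec")
    try:
        exec(code, {})
    except Exception as err:
        return type(err).__name__
    return None


results = {label: runtime_error_name(src) for label, src in SNIPPETS.items()}
for label, name in results.items():
    print(f"{label:7} compiles fine, fails at run time with {name}")
```

The failed import lands in the same group as the others: the statement itself is well-formed, and the missing module is only discovered when the import is executed.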

Syntax errors are detected at the parsing stage, before any code gets
run. The vast majority of syntax errors are grammar errors, where the
code doesn't align with the parseable text of a Python program.
(Non-grammatical parsing errors include using a "nonlocal" statement
with a name that isn't found in any surrounding scope, using "await"
in a non-async function, and attempting to import braces from the
future.)
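
As a rough sketch of that last group, the helper below (stage_of_failure and the case labels are names invented for this example) reports whether a piece of source is rejected while building the syntax tree or only later, when the tree is compiled. It assumes CPython 3.8 or later; other versions or implementations may draw the line differently.

```python
import ast

# Hypothetical test cases: three non-grammatical errors and one grammar error.
CASES = {
    "nonlocal": "def f():\n    def g():\n        nonlocal x\n",
    "await": "def f():\n    await g()\n",
    "braces": "from __future__ import braces\n",
    "grammar": "print(def f[return)]\n",
}


def stage_of_failure(source):
    """Return 'parse', 'compile', or None if the source is accepted."""
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return "parse"
    try:
        compile(tree, "<case>", "exec")
    except SyntaxError:
        return "compile"
    return None


stages = {label: stage_of_failure(src) for label, src in CASES.items()}
for label, stage in stages.items():
    print(f"{label:9} rejected at the {stage} stage")
```

All four raise SyntaxError, but only the last one is a grammar problem; the first three produce a perfectly good tree and are refused afterwards.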

ChrisA

RE: What to use for finding as many syntax errors as possible.

<mailman.635.1665458688.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19785&group=comp.lang.python#19785
Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!fu-berlin.de!uni-berlin.de!not-for-mail
From:
Newsgroups: comp.lang.python
Subject: RE: What to use for finding as many syntax errors as possible.
Date: Mon, 10 Oct 2022 23:24:42 -0400
Lines: 63
Message-ID: <mailman.635.1665458688.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me> <008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: 7bit
In-Reply-To: <CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
 by: - Tue, 11 Oct 2022 03:24 UTC

I stand corrected Chris, and others, as I pay the sin tax.

Yes, there are many kinds of errors that logically fall into different
categories or phases of evaluation of a program. Some can be determined by a
more static analysis, almost on a line-by-line (or "statement" or
"expression", ...) basis. Others need to sort of simulate some things and
look back and forth to detect possible incompatibilities. Yet others can
only be detected at run time, and there are likely way more categories
depending on the language.

But when I run the Python interpreter on code, aren't many such phases done
interleaved and at once as various segments of code are parsed and examined
and perhaps compiled into block code and eventually executed?

So is the OP asking for something other than a Python Interpreter that
normally halts after some kind of error? Tools like a linter may indeed fit
that mold.

This may limit some of the objections about an error making it hard for the
parser to find a recovery point to continue from, since no code is being run
and no harmful side effects happen when only the analysis continues.

Time to go read some books about modern ways to evaluate a language based on
more mathematical rules including more precisely what is syntax versus ...

Suggestions?


Re: What to use for finding as many syntax errors as possible.

<mailman.637.1665460527.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19787&group=comp.lang.python#19787
Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: ros...@gmail.com (Chris Angelico)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 14:55:12 +1100
Lines: 152
Message-ID: <mailman.637.1665460527.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
<008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
In-Reply-To: <00b001d8dd21$01475160$03d5f420$@gmail.com>
 by: Chris Angelico - Tue, 11 Oct 2022 03:55 UTC

On Tue, 11 Oct 2022 at 14:26, <avi.e.gross@gmail.com> wrote:
>
> I stand corrected Chris, and others, as I pay the sin tax.
>
> Yes, there are many kinds of errors that logically fall into different
> categories or phases of evaluation of a program and some can be determined
> by a more static analysis almost on a line by line (or "statement" or
> "expression", ...) basis and others need to sort of simulate some things
> and look back and forth to detect possible incompatibilities and yet others
> can only be detected at run time and likely way more categories depending on
> the language.
>
> But when I run the Python interpreter on code, aren't many such phases done
> interleaved and at once as various segments of code are parsed and examined
> and perhaps compiled into block code and eventually executed?

Hmm, depends what you mean. Broadly speaking, here's how it goes:

0) Early pre-parse steps that don't really matter to most programs,
like checking character set. We'll ignore these.
1) Tokenize the text of the program into a sequence of
potentially-meaningful units.
2) Parse those tokens into some sort of meaningful "sentence".
3) Compile the syntax tree into actual code.
4) Run that code.
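
The four numbered steps can be driven one at a time from Python itself. This is only a sketch of the conceptual pipeline, not a description of how CPython is organized internally; the sample source and the "<pipeline>" file name are arbitrary.

```python
import ast
import io
import tokenize

source = 'def f():\n    print("Hello, world", 1 >= 2)\n    return True\n'

# Step 1: tokenize the text into a flat sequence of units.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Step 2: parse the text into an abstract syntax tree.
tree = ast.parse(source)

# Step 3: compile the tree into a code object (bytecode).
code_obj = compile(tree, "<pipeline>", "exec")

# Step 4: run the code object, then call the function it defined.
namespace = {}
exec(code_obj, namespace)
result = namespace["f"]()

print(len(tokens), "tokens;", type(tree).__name__, "tree; f() returned", result)
```

A syntax error can only come out of steps 1 to 3; whatever goes wrong in step 4 is a run-time error.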

Example:
>>> code = """def f():
...     print("Hello, world", 1>=2)
...     print(Ellipsis, ...)
...     return True
... """
>>>

In step 1, all that happens is that a stream of characters (or bytes,
depending on your point of view) gets broken up into units.

>>> import tokenize
>>> for t in tokenize.tokenize(iter(code.encode().split(b"\n")).__next__):
...     print(tokenize.tok_name[t.exact_type], t.string)

It's pretty spammy, but you can see how the compiler sees the text.
Note that, at this stage, there's no real difference between the NAME
"def" and the NAME "print" - there are no language keywords yet.
Basically, all you're doing is figuring out punctuation and stuff.
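
A small sketch of that point, using io.BytesIO to supply the readline callable (a choice made for this example, not the one-liner above). It lists every NAME token next to what keyword.iskeyword() says about it, to show that the tokenizer itself does not separate keywords from ordinary names.

```python
import io
import keyword
import tokenize

source = 'def f():\n    print(Ellipsis, ...)\n    return True\n'
tokens = list(tokenize.tokenize(io.BytesIO(source.encode()).readline))

# Every word is a plain NAME token; the tokenizer does not mark keywords.
names = [
    (tok.string, keyword.iskeyword(tok.string))
    for tok in tokens
    if tok.type == tokenize.NAME
]
for text, is_kw in names:
    print(f"NAME {text!r:12} keyword according to the grammar: {is_kw}")

# The literal "..." is the one piece here that gets a token type of its own.
ellipsis_types = [
    tokenize.tok_name[tok.exact_type] for tok in tokens if tok.string == "..."
]
print(ellipsis_types)
```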

Step 2 is what we'd normally consider "parsing". (It may well happen
concurrently and interleaved with tokenizing, and I'm giving a
simplified and conceptualized pipeline here, but this is broadly what
Python does.) This compares the stream of tokens to the grammar of a
Python program and attempts to figure out what it means. At this
point, the linear stream turns into a recursive syntax tree, but it's
still very abstract.

>>> import ast
>>> ast.dump(ast.parse(code))
"Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[],
args=[], kwonlyargs=[], kw_defaults=[], defaults=[]),
body=[Expr(value=Call(func=Name(id='print', ctx=Load()),
args=[Constant(value='Hello, world'), Compare(left=Constant(value=1),
ops=[GtE()], comparators=[Constant(value=2)])], keywords=[])),
Expr(value=Call(func=Name(id='print', ctx=Load()),
args=[Name(id='Ellipsis', ctx=Load()), Constant(value=Ellipsis)],
keywords=[])), Return(value=Constant(value=True))],
decorator_list=[])], type_ignores=[])"

(Side point: I would rather like to be able to
pprint.pprint(ast.parse(code)) but that isn't a thing, at least not
currently.)
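
For what it's worth, ast.dump() accepts an indent argument in Python 3.9 and later, which gets close to a pretty-printed tree. A minimal sketch, assuming such a version is available:

```python
import ast

code = 'def f():\n    print("Hello, world", 1 >= 2)\n    return True\n'

# indent= requires Python 3.9+; each nested node goes on its own line.
pretty = ast.dump(ast.parse(code), indent=4)
print(pretty)
```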

This is where the vast majority of SyntaxErrors come from. Your code
is a sequence of tokens, but those tokens don't mean anything. It
doesn't make sense to say "print(def f[return)]" even though that'd
tokenize just fine. The trouble with the notion of "keeping going
after finding an error" is that, when you find an error, there are
almost always multiple possible ways that this COULD have been
interpreted differently. It's as likely to give nonsense results as
actually useful ones.
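
A quick way to see that the compiler does not keep going: the sketch below feeds compile() a made-up two-line source with one mistake per line and records what comes back. Only a single SyntaxError is raised, and it refers to the first bad line.

```python
# Two independent mistakes, one on each line (illustrative source).
source = "a = 1 +\nb = = 2\n"

reported = None
try:
    compile(source, "<two-errors>", "exec")
except SyntaxError as err:
    # Only one exception is raised; the second line is never reported.
    reported = (err.lineno, err.msg)

print(reported)
```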

(Note that, in contrast to the tokenization stage, this version
distinguishes between the different types of word. The "def" has
resulted in a FunctionDef node, the "print" is a Name lookup, and both
"..." and "True" have now become Constant nodes - previously, "..."
was a special Ellipsis token, but "True" was just a NAME.)

Step 3: the abstract syntax tree gets compiled into actual runnable
code. This is where that small handful of other SyntaxErrors come
from. With these errors, you absolutely _could_ carry on and report
multiple; but it's not very likely that there'll actually *be* more
than one of them in a file. Here's some perfectly valid AST parsing:

>>> ast.dump(ast.parse("from __future__ import the_past"))
"Module(body=[ImportFrom(module='__future__',
names=[alias(name='the_past')], level=0)], type_ignores=[])"
>>> ast.dump(ast.parse("from __future__ import braces"))
"Module(body=[ImportFrom(module='__future__',
names=[alias(name='braces')], level=0)], type_ignores=[])"
>>> ast.dump(ast.parse("def f():\n\tdef g():\n\t\tnonlocal x\n"))
"Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[],
args=[], kwonlyargs=[], kw_defaults=[], defaults=[]),
body=[FunctionDef(name='g', args=arguments(posonlyargs=[], args=[],
kwonlyargs=[], kw_defaults=[], defaults=[]),
body=[Nonlocal(names=['x'])], decorator_list=[])],
decorator_list=[])], type_ignores=[])"

If you were to try to actually compile those to bytecode, they would fail:

>>> compile(ast.parse("from __future__ import braces"), "-", "exec")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "-", line 1
SyntaxError: not a chance

And finally, step 4 is actually running the compiled bytecode. Any
errors that happen at THIS stage are going to be run-time errors, not
syntax errors (a SyntaxError raised at run time would be from
compiling other code).
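A quick sketch of that parenthetical: the outer program below compiles cleanly, and the SyntaxError only appears when it runs and tries to compile a second piece of source via exec(). The names inner_source and outer are made up for the example.

```python
inner_source = "1 +"  # not valid Python on its own

# The outer program is syntactically fine, so this step succeeds.
outer = compile("exec(inner_source)", "<outer>", "exec")

caught = None
try:
    # Running the outer code compiles inner_source, and that is what fails.
    exec(outer, {"inner_source": inner_source})
except SyntaxError as exc:
    caught = exc

print(type(caught).__name__, "from", caught.filename)
```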

> So is the OP asking for something other than a Python Interpreter that
> normally halts after some kind of error? Tools like a linter may indeed fit
> that mold.

Yes, but linters are still going to go through the same process laid
out above. So if you have a huge pile of code that misuses "await" in
non-async functions, sure! Maybe a linter could half-compile the code,
then probe it repeatedly until it gets past everything. That's not
exactly a common case, though. More likely, you'll have parsing
errors, and the only way to "move past" a parsing error is to guess at
what token should be added or removed to make it "kinda work".

Alternatively, you'll get some kind of messy heuristics to try to
restart parsing part way down, but that's pretty imperfect too.
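As a rough illustration of that kind of heuristic (not how any real linter does it, as far as I know), here is a sketch that replaces each offending line with "pass" and parses again. The function name collect_syntax_errors and the max_errors limit are invented for this example.

```python
import ast

def collect_syntax_errors(source, max_errors=10):
    """Crude recovery: swap each offending line for 'pass' and parse again."""
    lines = source.splitlines()
    errors = []
    while len(errors) < max_errors:
        try:
            ast.parse("\n".join(lines) + "\n")
        except SyntaxError as exc:
            lineno = exc.lineno
            if lineno is None or not 1 <= lineno <= len(lines):
                break
            old = lines[lineno - 1]
            if old.strip() == "pass":
                break  # already replaced once; give up rather than loop
            errors.append((lineno, exc.msg))
            indent = old[: len(old) - len(old.lstrip())]
            lines[lineno - 1] = indent + "pass"
        else:
            break  # the (patched) source now parses
    return errors

broken = "a = 1 +\nb = 2\nc = = 3\n"
found = collect_syntax_errors(broken)
for lineno, msg in found:
    print(lineno, msg)
```

It works on tidy one-line mistakes like these, but an unclosed bracket or a broken "def" line will make it report follow-on errors that are artifacts of the patching, which is exactly the imperfection described above.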

> This may limit some of the objections of when an error makes it hard for the
> parser to find some recovery point to continue from as no code is being run
> and no harmful side effects happen by continuing just an analysis.

It's pretty straightforward to ensure that no code is run - just
compile it without running it. It's still possible to attack the
compiler itself, but far less concerning than running arbitrary code.
Attacks on the compiler are usually deliberate; code you don't want to
run yet might be a perfectly reasonable call to os.unlink()...
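A minimal sketch of compiling without running, using a harmless list append as a stand-in for something like os.unlink(). The names side_effects and "<checked>" are just placeholders.

```python
side_effects = []
source = "side_effects.append('ran')\n"

# compile() checks the syntax and builds a code object, but executes nothing.
code_obj = compile(source, "<checked>", "exec")
ran_after_compile = list(side_effects)

# Only an explicit exec() actually runs the code.
exec(code_obj, {"side_effects": side_effects})

print(ran_after_compile, side_effects)
```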

> Time to go read some books about modern ways to evaluate a language based on
> more mathematical rules including more precisely what is syntax versus ...
>
> Suggestions?
>

I'd recommend looking at Python's compile() function, the ast and
tokenize modules, and everything that they point to.

ChrisA

RE: What to use for finding as many syntax errors as possible.

<mailman.639.1665472255.20444.python-list@python.org>


https://www.novabbs.com/devel/article-flat.php?id=19789&group=comp.lang.python#19789

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From:
Newsgroups: comp.lang.python
Subject: RE: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 03:10:51 -0400
Lines: 230
Message-ID: <mailman.639.1665472255.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me> <008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
<011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
Mime-Version: 1.0
Content-Type: text/plain;
charset="us-ascii"
Content-Transfer-Encoding: 7bit
X-Trace: news.uni-berlin.de xMYDxW2sK6Cb2jtEWH5Qmwza8+PPavRnL6X12WL9/vQQ==
Return-Path: <avi.e.gross@gmail.com>
X-Original-To: python-list@python.org
Delivered-To: python-list@mail.python.org
Authentication-Results: mail.python.org; dkim=pass
reason="2048-bit key; unprotected key"
header.d=gmail.com header.i=@gmail.com header.b=M4zP9toy;
dkim-adsp=pass; dkim-atps=neutral
In-Reply-To: <CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
X-Mailer: Microsoft Outlook 16.0
Content-Language: en-us
Thread-Index: AQJGir9yaZ0DEwGC3Gj3x2nt5vIbEwHsTY14AVeIrSwBuP6d/gL8Y5TaAmG366kB9dwfUwIwz53+rLlX12A=
X-BeenThere: python-list@python.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: General discussion list for the Python programming language
<python-list.python.org>
List-Unsubscribe: <https://mail.python.org/mailman/options/python-list>,
<mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive: <https://mail.python.org/pipermail/python-list/>
List-Post: <mailto:python-list@python.org>
List-Help: <mailto:python-list-request@python.org?subject=help>
List-Subscribe: <https://mail.python.org/mailman/listinfo/python-list>,
<mailto:python-list-request@python.org?subject=subscribe>
X-Mailman-Original-Message-ID: <011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
X-Mailman-Original-References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me> <008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
 by: - Tue, 11 Oct 2022 07:10 UTC

Thanks for a rather detailed explanation of some of what we have been
discussing, Chris. The overall outline is about what I assumed was there but
some of the details were, to put it politely, fuzzy.

I see a resemblance to how a web page is loaded and run. The two are very
different in detail, but at some level not so much.

I mean a typical web page is read in as HTML with various keyword regions
expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
often cleanly nested in others. The browser makes nodes galore in some kind
of tree format with an assortment of objects whose attributes or methods
represent aspects of what it sees. The resulting treelike structure has
names like DOM.

To a certain approximation, this tree starts a certain way but is regularly
being manipulated (or perhaps a copy is) as it regularly is looked at to see
how to display it on the screen at the moment based on the current tree
contents and another set of rules in Cascading Style Sheets. But bits and
pieces of JavaScript are also embedded or imported that can read aspects of
the tree (and more) and modify the contents and arrange for all kinds of
asynchronous events when bits of code are invoked such as when you click a
button or hover or when an image finishes loading or every 100 milliseconds.
It can insert new objects into the DOM too. And of course there can be
interactions with restricted local storage as well as with servers and code
running there.

It is quite a mess but in some ways I see analogies. Your program reads a
stream of data and looks for tokens and eventually turns things into a tree
of sorts that represents relationships to a point. Additional structures
eventually happen at run time that let you store collections of references
to variables such as environments or namespaces and the program derived from
the trees makes changes as it goes and in a language like Python can even
possibly change the running program in some ways.

These are not at all the same thing but share a certain set of ideas and
methods and can be very powerful as things interact. In the web case, the
CSS may search for regions with some class or ID or that are the third
element of a bullet list and more, using powerful tools like jQuery, and
make changes. A CSS rule that previously ignored some region as not having a
particular class, might start including it after a JavaScript segment is
aroused while waiting on an event listener for say a mouse hovering over an
area and then changes that part of the DOM (like a node) to be in that
class. Suddenly the area on your screen changes background or whatever the
CSS now dictates. We have multiple systems written in an assortment of
"languages" that complement each other. Some running programs, especially
ones that use asynchronous methods like threads or callbacks on events, such
as a GUI, can effectively do similar things.

In effect the errors in the web situation have such analogies too as in what
happens if a region of HTML is not well-formed or uses a keyword not
recognized. This becomes even more interesting in XML where anything can be
a keyword and you often need other kinds of files (often also in ML) to
define what the XML can be like and what restrictions it may have such as
can a <BOOK> have multiple authors but only one optional publication date
and so on. It can be fascinating and highly technical. So I am up for a
challenge of studying anything from early compilers for languages of my
youth to more recent ways including some like what you show.

I have time to kill and this might be more fun than other things, for a
while.

There was a guy around a few years ago who suggested he would create a
system where you could create a series of some kind of configuration files
for ANY language and his system would then compile or run programs for each
and every such language. Was that on this forum? Whatever happened to him?

But although what he promised seemed a bit too much, I can see from your
comments below how in some ways a limited amount of that might be done for
some subset of languages which can be parsed and manipulated as described.

-----Original Message-----
From: Python-list <python-list-bounces+avi.e.gross=gmail.com@python.org> On
Behalf Of Chris Angelico
Sent: Monday, October 10, 2022 11:55 PM
To: python-list@python.org
Subject: Re: What to use for finding as many syntax errors as possible.

On Tue, 11 Oct 2022 at 14:26, <avi.e.gross@gmail.com> wrote:
>
> I stand corrected Chris, and others, as I pay the sin tax.
>
> Yes, there are many kinds of errors that logically fall into different
> categories or phases of evaluation of a program and some can be
> determined by a more static analysis almost on a line by line (or
> "statement" or "expression", ...) basis and others need to sort of
> simulate some things and look back and forth to detect possible
> incompatibilities and yet others can only be detected at run time and
> likely way more categories depending on the language.
>
> But when I run the Python interpreter on code, aren't many such phases
> done interleaved and at once as various segments of code are parsed
> and examined and perhaps compiled into block code and eventually executed?

Hmm, depends what you mean. Broadly speaking, here's how it goes:

0) Early pre-parse steps that don't really matter to most programs, like
checking character set. We'll ignore these.
1) Tokenize the text of the program into a sequence of
potentially-meaningful units.
2) Parse those tokens into some sort of meaningful "sentence".
3) Compile the syntax tree into actual code.
4) Run that code.

Example:
>>> code = """def f():
...     print("Hello, world", 1>=2)
...     print(Ellipsis, ...)
...     return True
... """
>>>

In step 1, all that happens is that a stream of characters (or bytes,
depending on your point of view) gets broken up into units.

>>> for t in tokenize.tokenize(iter(code.encode().split(b"\n")).__next__):
...     print(tokenize.tok_name[t.exact_type], t.string)

It's pretty spammy, but you can see how the compiler sees the text.
Note that, at this stage, there's no real difference between the NAME "def"
and the NAME "print" - there are no language keywords yet.
Basically, all you're doing is figuring out punctuation and stuff.
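For anyone who wants to try this outside the REPL, here is a standalone variation on the loop above. It is a sketch that uses tokenize.generate_tokens() on a string rather than the bytes-based call quoted above, and the names tokens, kinds and names are only illustrative.

```python
import io
import tokenize

code = '''def f():
    print("Hello, world", 1>=2)
    print(Ellipsis, ...)
    return True
'''

# generate_tokens() takes a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(code).readline))

for tok in tokens:
    print(tokenize.tok_name[tok.exact_type], repr(tok.string))

# Every word, keyword or not, comes out as a NAME token.
names = [tok.string for tok in tokens if tok.type == tokenize.NAME]
kinds = {tokenize.tok_name[tok.exact_type] for tok in tokens}
```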

Step 2 is what we'd normally consider "parsing". (It may well happen
concurrently and interleaved with tokenizing, and I'm giving a simplified
and conceptualized pipeline here, but this is broadly what Python does.)
This compares the stream of tokens to the grammar of a Python program and
attempts to figure out what it means. At this point, the linear stream turns
into a recursive syntax tree, but it's still very abstract.

>>> import ast
>>> ast.dump(ast.parse(code))
"Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[], args=[],
kwonlyargs=[], kw_defaults=[], defaults=[]),
body=[Expr(value=Call(func=Name(id='print', ctx=Load()),
args=[Constant(value='Hello, world'), Compare(left=Constant(value=1),
ops=[GtE()], comparators=[Constant(value=2)])], keywords=[])),
Expr(value=Call(func=Name(id='print', ctx=Load()), args=[Name(id='Ellipsis',
ctx=Load()), Constant(value=Ellipsis)], keywords=[])),
Return(value=Constant(value=True))],
decorator_list=[])], type_ignores=[])"

(Side point: I would rather like to be able to
pprint.pprint(ast.parse(code)) but that isn't a thing, at least not
currently.)
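On that side point, ast.dump() accepts an indent argument from Python 3.9 onward, which gets fairly close to a pretty-printed tree. A small sketch, assuming 3.9 or later:

```python
import ast

code = '''def f():
    return True
'''

# indent=4 puts each nested node on its own line (Python 3.9+).
dumped = ast.dump(ast.parse(code), indent=4)
print(dumped)
```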

This is where the vast majority of SyntaxErrors come from. Your code is a
sequence of tokens, but those tokens don't mean anything. It doesn't make
sense to say "print(def f[return)]" even though that'd tokenize just fine.
The trouble with the notion of "keeping going after finding an error" is
that, when you find an error, there are almost always multiple possible ways
that this COULD have been interpreted differently. It's as likely to give
nonsense results as actually useful ones.

(Note that, in contrast to the tokenization stage, this version
distinguishes between the different types of word. The "def" has resulted in
a FunctionDef node, the "print" is a Name lookup, and both "..." and "True"
have now become Constant nodes - previously, "..."
was a special Ellipsis token, but "True" was just a NAME.)

Step 3: the abstract syntax tree gets compiled into actual runnable code. This
is where that small handful of other SyntaxErrors come from. With these
errors, you absolutely _could_ carry on and report multiple; but it's not
very likely that there'll actually *be* more than one of them in a file.
Here's some perfectly valid AST parsing:

>>> ast.dump(ast.parse("from __future__ import the_past"))
"Module(body=[ImportFrom(module='__future__',
names=[alias(name='the_past')], level=0)], type_ignores=[])"
>>> ast.dump(ast.parse("from __future__ import braces"))
"Module(body=[ImportFrom(module='__future__',
names=[alias(name='braces')], level=0)], type_ignores=[])"
>>> ast.dump(ast.parse("def f():\n\tdef g():\n\t\tnonlocal x\n"))
"Module(body=[FunctionDef(name='f', args=arguments(posonlyargs=[], args=[],
kwonlyargs=[], kw_defaults=[], defaults=[]), body=[FunctionDef(name='g',
args=arguments(posonlyargs=[], args=[], kwonlyargs=[], kw_defaults=[],
defaults=[]), body=[Nonlocal(names=['x'])], decorator_list=[])],
decorator_list=[])], type_ignores=[])"


Re: What to use for finding as many syntax errors as possible.

<mailman.642.1665487591.20444.python-list@python.org>


https://www.novabbs.com/devel/article-flat.php?id=19792&group=comp.lang.python#19792

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: gweathe...@uchc.edu (Weatherby,Gerard)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 11:26:15 +0000
Lines: 28
Message-ID: <mailman.642.1665487591.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me> <008201d8dd16$718fd420$54af7c60$@gmail.com>
<SA1PR14MB585536EFE7DC13C44675A347B9239@SA1PR14MB5855.namprd14.prod.outlook.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="Windows-1252"
Content-Transfer-Encoding: quoted-printable
X-Trace: news.uni-berlin.de O1qqDXPlfA4wHYZ/OcY6Mgy5uoejAUZbBxpuo60iCPjw==
Return-Path: <prvs=02831dc2d6=gweatherby@uchc.edu>
X-Original-To: python-list@python.org
Delivered-To: python-list@mail.python.org
Authentication-Results: mail.python.org; dkim=pass
reason="2048-bit key; unprotected key"
header.d=uchc.edu header.i=@uchc.edu header.b=NgrmkPBP;
dkim-adsp=pass; dkim-atps=neutral
Thread-Topic: What to use for finding as many syntax errors as possible.
Thread-Index: AQHY28d4QvFojmYVl0OnmgRkGxvKKq4GNf6AgAH5wZCAAEWTAIAAmrZM
In-Reply-To: <008201d8dd16$718fd420$54af7c60$@gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-BeenThere: python-list@python.org
X-Mailman-Version: 2.1.39
Precedence: list
List-Id: General discussion list for the Python programming language
<python-list.python.org>
List-Unsubscribe: <https://mail.python.org/mailman/options/python-list>,
<mailto:python-list-request@python.org?subject=unsubscribe>
List-Archive: <https://mail.python.org/pipermail/python-list/>
List-Post: <mailto:python-list@python.org>
List-Help: <mailto:python-list-request@python.org?subject=help>
List-Subscribe: <https://mail.python.org/mailman/listinfo/python-list>,
<mailto:python-list-request@python.org?subject=subscribe>
X-Mailman-Original-Message-ID: <SA1PR14MB585536EFE7DC13C44675A347B9239@SA1PR14MB5855.namprd14.prod.outlook.com>
X-Mailman-Original-References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me> <008201d8dd16$718fd420$54af7c60$@gmail.com>
 by: Weatherby,Gerard - Tue, 11 Oct 2022 11:26 UTC

Sure it does. They’re optional and not enforced at runtime, but I find them useful when writing code in PyCharm:

import os
from os import DirEntry

de: DirEntry
for de in os.scandir('/tmp'):
    print(de.name)

de = 7
print(de)

Predeclaring de lets me use tab completion on DirEntry fields and methods.
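The "not enforced at runtime" part can be seen with a smaller sketch. The function double is made up for the example; the point is only that the interpreter records the annotations and otherwise ignores them.

```python
def double(n: int) -> int:
    return n * 2

# Passing a str violates the annotation, but Python does not check it.
result = double("ab")
print(result)

# The annotations are stored on the function for tools and IDEs to read.
print(double.__annotations__)
```

A type checker or an IDE would flag the call, which is where the annotations earn their keep.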

From: Python-list <python-list-bounces+gweatherby=uchc.edu@python.org> on behalf of avi.e.gross@gmail.com <avi.e.gross@gmail.com>
Date: Monday, October 10, 2022 at 10:11 PM
To: python-list@python.org <python-list@python.org>
Subject: RE: What to use for finding as many syntax errors as possible.

Michael,

A reasonable question. Python lets you initialize variables but has no
explicit declarations.

Re: What to use for finding as many syntax errors as possible.

<mailman.643.1665488361.20444.python-list@python.org>


https://www.novabbs.com/devel/article-flat.php?id=19793&group=comp.lang.python#19793

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: ros...@gmail.com (Chris Angelico)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 22:39:07 +1100
Lines: 86
Message-ID: <mailman.643.1665488361.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
<008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
<011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
<CAPTjJmpVqM0vzmhUWnSy8LxtC3UK0Mg0uncch2bY7zOdkbo10w@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
X-Trace: news.uni-berlin.de nf6px+qzTI2Ofr9oVIY0xgIZMHMWZ4OlqLGHjnf9pKkA==
Return-Path: <rosuav@gmail.com>
X-Original-To: python-list@python.org
Delivered-To: python-list@mail.python.org
Authentication-Results: mail.python.org; dkim=pass
reason="2048-bit key; unprotected key"
header.d=gmail.com header.i=@gmail.com header.b=WE5TFGLP;
dkim-adsp=pass; dkim-atps=neutral
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112;
h=to:subject:message-id:date:from:in-reply-to:references:mime-version
:from:to:cc:subject:date:message-id:reply-to;
bh=CLqVZX6temRa27H6JxAJLs5yNj8uWZYfN2X2x9elJ9E=;
b=WE5TFGLPwT+41RBs3GeD6ASUeWKNUwv2DP1BMvMOv+VBpROL8FFevA/da8006G5wWp
0sDGn6D95IHhWDizqkP55wubHDoP/B8iLGyxv3H//Kp4Yr2G0J6V/bPOrTCYhvedt+JF
4JAm2zXyywBz8TzOyd5WLvhtb3duBjOazzoswqBU3rSlDj7RIJOoVpCBc2l8J9d5/FiX
P3X9G+lT/7BsEdouYsZoyPmP0SP8h67lw5ywZLMqm4bio2Mo79w1nlTdT/br6lfnOgoD
Ytr7tZzAG4br1uv+jHHQ2aLq7lZzArbBCtalfE/Hdjh4mid/WcOCbDGDEeZOIMYk5HcX
0XHw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
d=1e100.net; s=20210112;
h=to:subject:message-id:date:from:in-reply-to:references:mime-version
:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to;
bh=CLqVZX6temRa27H6JxAJLs5yNj8uWZYfN2X2x9elJ9E=;
b=RWWjf2yrTh98RMplPK52Dd7Pr5DGH8sqzj3/EBbY70OGAbx9JAjLnkBC7a2B2WFdIg
DZFU8edbBxPvmeXioV+AOfplIW+aNO/xtohVd8I3DB9ITC5+N3BiXabYA91spRprKWha
OrdlkawtuKhYyRGx2wB5Kt80YbGbkNYlHD8nobzaEeV0Qrqjo5G0qAVFB3gcKODK96zX
Gvap3vTgRdkAt/Dtg12XnE0DYjne9sp28RC7Ukm0EYIcHosn7igU5NZFa2pHc6WisCof
KXSzfzRjpJiG11g40e3lLATboZixBHetYQmr1omkK4CbRcI0ECZvXYqjNtUvxZ1Wtvv6
r70Q==
X-Gm-Message-State: ACrzQf2USZrDS/RtUswOqFJAlsEQ649COZRyP/BJsdw1iyJWcb+rq+1W
o1pbadGRoi71CwTBnAr8aDQdMUaPVHddI4EZ34FKhwJAMuE=
X-Google-Smtp-Source: AMsMyM4JRWRTWZb5ZN/fZmbW6+36hCSOwI+mgt90P7SY9QR4lrsgal3NDxuWMd/0miHgx3KDEem8Woj764rXEDL0TvM=
X-Received: by 2002:a17:907:80d:b0:73d:a576:dfbd with SMTP id
wv13-20020a170907080d00b0073da576dfbdmr18388171ejb.402.1665488359377; Tue, 11
Oct 2022 04:39:19 -0700 (PDT)
In-Reply-To: <011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
 by: Chris Angelico - Tue, 11 Oct 2022 11:39 UTC

On Tue, 11 Oct 2022 at 18:12, <avi.e.gross@gmail.com> wrote:
>
> Thanks for a rather detailed explanation of some of what we have been
> discussing, Chris. The overall outline is about what I assumed was there but
> some of the details were, to put it politely, fuzzy.
>
> I see resemblances to something like how a web page is loaded and operated.
> I mean very different but at some level not so much.
>
> I mean a typical web page is read in as HTML with various keyword regions
> expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
> often cleanly nested in others. The browser makes nodes galore in some kind
> of tree format with an assortment of objects whose attributes or methods
> represent aspects of what it sees. The resulting treelike structure has
> names like DOM.

Yes. The basic idea of "tokenize, parse, compile" can be used for
pretty much any language - even English, although its grammar is a bit
more convoluted than most programming languages, with many weird
backward compatibility features! I'll parse your last sentence above:

LETTERS The
SPACE
LETTERS resulting
SPACE
.... you get the idea
LETTERS like
SPACE
LETTERS DOM
FULLSTOP # or call this token PERIOD if you're American
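
A minimal sketch of that tokenizing step in Python, assuming the three token names from the listing above. The regular expressions and the `tokenize_sentence` helper are illustrative assumptions, not anything from a real English parser:

```python
import re

# Token names follow the listing above; the patterns themselves are assumptions.
TOKEN_SPEC = [
    ("LETTERS", r"[A-Za-z]+"),
    ("SPACE", r" +"),
    ("FULLSTOP", r"\."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize_sentence(text):
    """Yield (token_name, matched_text) pairs; raise on an unknown character."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        # lastgroup is the name of the alternative that matched.
        yield match.lastgroup, match.group()
        pos = match.end()


tokens = list(tokenize_sentence("The resulting treelike structure has names like DOM."))
for name, value in tokens:
    print(name, value)
```

Anything the patterns do not cover (a comma, a digit) is rejected at this stage, before any grammar is involved.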

Now, we can group those tokens into meaningful sets.

Sentence(type=Statement,
    subject=Noun(name="structure", addenda=[
        Article(type=The),
        Adjective(name="treelike"),
    ]),
    verb=Verb(type=Being, name="has", addenda=[]),
    object=Noun(name="name", plural=True, addenda=[
        Adjective(phrase=Phrase(verb=Verb(name="like"), object=Noun(name="DOM"))),
    ]),
)

Grammar nerds will probably dispute some of the awful shorthanding I
did here, but I didn't want to devise thousands of AST nodes just for
this :)
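
The same "tokenize, parse, compile" sequence can be watched on Python itself with the standard library. This is only a sketch: the one-line `source` string and the `price` value are made up for illustration.

```python
import ast
import io
import tokenize

source = "total = price * 2\n"  # hypothetical one-line program

# Stage 1: tokenize. Each token has a type name and the text it covers.
toks = [
    (tokenize.tok_name[tok.type], tok.string)
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
]

# Stage 2: parse. The tokens are grouped into an abstract syntax tree.
tree = ast.parse(source)
print(ast.dump(tree.body[0]))

# Stage 3: compile. The tree becomes a code object that can be executed.
code = compile(tree, "<example>", "exec")
namespace = {"price": 21}
exec(code, namespace)
print(namespace["total"])
```

A syntax error stops this pipeline at stage 2, which is why the stock parser only ever reports the first one.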

> To a certain approximation, this tree starts a certain way but is regularly
> being manipulated (or perhaps a copy is) as it regularly is looked at to see
> how to display it on the screen at the moment based on the current tree
> contents and another set of rules in Cascading Style Sheets.

Yep; the DOM tree is initialized from the HTML (usually - it's
possible to start a fresh tree with no HTML) and then can be
manipulated afterwards.

> These are not at all the same thing but share a certain set of ideas and
> methods and can be very powerful as things interact.

Oh absolutely. That's why there are languages designed to help you
define other languages.

> In effect the errors in the web situation have such analogies too as in what
> happens if a region of HTML is not well-formed or uses a keyword not
> recognized.

Aaaaand they're horribly horribly messy, due to a few decades of
sloppy HTML programmers and the desire to still display the page even
if things are messed up :) But, again, there's a huge difference
between syntactic errors (like omitting a matching angle bracket) and
semantic errors (a keyword not known, like using <spam> when you
should have used <span>). In the latter case, you can still build a
DOM tree, but you have an unknown element; in the former case, you
have to guess at what the author meant, just to get anything going at
all.
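
The semantic half of that distinction can be sketched with Python's `html.parser`: an unknown element such as `<spam>` still arrives as an ordinary start tag, so a tree can be built and the tag flagged afterwards. The `KNOWN_ELEMENTS` set and the `TagCollector` class are assumptions for the example, not a real list of HTML elements.

```python
from html.parser import HTMLParser

# Assumption: a tiny stand-in for the real list of HTML element names.
KNOWN_ELEMENTS = {"p", "span"}


class TagCollector(HTMLParser):
    """Record every start tag the parser reports, known or not."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append(tag)


collector = TagCollector()
collector.feed("<p>Hello <spam>world</spam></p>")
collector.close()

# The markup is syntactically fine, so the check is a plain name lookup.
unknown = [tag for tag in collector.start_tags if tag not in KNOWN_ELEMENTS]
print(unknown)
```

A missing angle bracket is a different matter: the parser has to decide for itself where the tag ends, and what it reports depends on that guess.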

> There was a guy around a few years ago who suggested he would create a
> system where you could create a series of some kind of configuration files
> for ANY language and his system would then compile or run programs for each
> and every such language? Was that on this forum? What ever happened to him?

That was indeed on this forum, and I have no idea what happened to
him. Maybe he realised that all he'd invented was the Unix shebang?

ChrisA

Re: What to use for finding as many syntax errors as possible.

<mailman.649.1665512510.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19800&group=comp.lang.python#19800

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: lis...@tompassin.net (Thomas Passin)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 14:11:56 -0400
Lines: 21
Message-ID: <mailman.649.1665512510.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me> <008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
<011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
<4612662b-f15e-777d-bbea-7c8906ac1de3@tompassin.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
 by: Thomas Passin - Tue, 11 Oct 2022 18:11 UTC

On 10/11/2022 3:10 AM, avi.e.gross@gmail.com wrote:
> I see resemblances to something like how a web page is loaded and operated.
> I mean very different but at some level not so much.
>
> I mean a typical web page is read in as HTML with various keyword regions
> expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
> often cleanly nested in others. The browser makes nodes galore in some kind
> of tree format with an assortment of objects whose attributes or methods
> represent aspects of what it sees. The resulting treelike structure has
> names like DOM.

To bring things back to the context of the original post, actual web
browsers are extremely tolerant of HTML syntax errors (including
incorrect nesting of tags) in the documents they receive. They usually
recover silently from errors and are able to display the rest of the
page. Usually they manage this correctly. The OP would like to have a
parser or checker that could do the same, plus giving an output showing
where each of the errors happened.

I can imagine such a parser also reporting which lines it had to skip
before it was able to recover.
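
A very crude sketch of such a checker for Python source, assuming it is acceptable to replace each offending line with `pass` and try again. The `find_syntax_errors` name and the recovery strategy are illustrative assumptions; this is not how CPython or any existing linter works, and later reports can be artefacts of the replacement rather than real mistakes.

```python
def find_syntax_errors(source, max_errors=20):
    """Return (line_number, message) for each syntax error that is found.

    Recovery is naive: the reported line is replaced by ``pass`` at the same
    indentation and the whole source is compiled again.
    """
    lines = source.splitlines()
    errors = []
    skipped = set()  # line numbers that were replaced in order to recover
    while len(errors) < max_errors:
        try:
            compile("\n".join(lines) + "\n", "<source>", "exec")
            break  # everything that is left compiles
        except SyntaxError as exc:
            errors.append((exc.lineno, exc.msg))
            lineno = exc.lineno
            if lineno is None or not 1 <= lineno <= len(lines) or lineno in skipped:
                break  # no usable location, or replacing the line did not help
            bad = lines[lineno - 1]
            indent = bad[: len(bad) - len(bad.lstrip())]
            lines[lineno - 1] = indent + "pass"
            skipped.add(lineno)
    return errors


sample = "a = 1 +\nb = 2\nc = 3 +\n"
report = find_syntax_errors(sample)
for lineno, message in report:
    print(f"line {lineno}: {message}")
```

The `skipped` set is exactly the "lines it had to skip" information; returning it alongside the errors would be a small change.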

Re: What to use for finding as many syntax errors as possible.

<mailman.654.1665518464.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19806&group=comp.lang.python#19806

Path: i2pn2.org!i2pn.org!news.uzoreto.com!fu-berlin.de!uni-berlin.de!not-for-mail
From: ros...@gmail.com (Chris Angelico)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Wed, 12 Oct 2022 07:00:50 +1100
Lines: 25
Message-ID: <mailman.654.1665518464.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
<008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
<011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
<4612662b-f15e-777d-bbea-7c8906ac1de3@tompassin.net>
<CAPTjJmr2JFpbGoTbPaM5Xw4Jr06XSLtsg51Nj8MRDt_iBtgOqA@mail.gmail.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
In-Reply-To: <4612662b-f15e-777d-bbea-7c8906ac1de3@tompassin.net>
 by: Chris Angelico - Tue, 11 Oct 2022 20:00 UTC

On Wed, 12 Oct 2022 at 05:23, Thomas Passin <list1@tompassin.net> wrote:
>
> On 10/11/2022 3:10 AM, avi.e.gross@gmail.com wrote:
> > I see resemblances to something like how a web page is loaded and operated.
> > I mean very different but at some level not so much.
> >
> > I mean a typical web page is read in as HTML with various keyword regions
> > expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
> > often cleanly nested in others. The browser makes nodes galore in some kind
> > of tree format with an assortment of objects whose attributes or methods
> > represent aspects of what it sees. The resulting treelike structure has
> > names like DOM.
>
> To bring things back to the context of the original post, actual web
> browsers are extremely tolerant of HTML syntax errors (including
> incorrect nesting of tags) in the documents they receive. They usually
> recover silently from errors and are able to display the rest of the
> page. Usually they manage this correctly.

Having had to debug tiny errors in HTML pages that resulted in
extremely weird behaviour, I'm not sure that I agree that they usually
manage correctly. Fundamentally, they guess, and guesswork is never
reliable.

ChrisA

Re: What to use for finding as many syntax errors as possible.

<mailman.660.1665522579.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19814&group=comp.lang.python#19814

Path: i2pn2.org!i2pn.org!aioe.org!news.uzoreto.com!fu-berlin.de!uni-berlin.de!not-for-mail
From: lis...@tompassin.net (Thomas Passin)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 17:09:27 -0400
Lines: 33
Message-ID: <mailman.660.1665522579.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me> <008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
<011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
<4612662b-f15e-777d-bbea-7c8906ac1de3@tompassin.net>
<CAPTjJmr2JFpbGoTbPaM5Xw4Jr06XSLtsg51Nj8MRDt_iBtgOqA@mail.gmail.com>
<618720ac-e43b-5f71-93b7-8d418132a3eb@tompassin.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
In-Reply-To: <CAPTjJmr2JFpbGoTbPaM5Xw4Jr06XSLtsg51Nj8MRDt_iBtgOqA@mail.gmail.com>
 by: Thomas Passin - Tue, 11 Oct 2022 21:09 UTC

On 10/11/2022 4:00 PM, Chris Angelico wrote:
> On Wed, 12 Oct 2022 at 05:23, Thomas Passin <list1@tompassin.net> wrote:
>>
>> On 10/11/2022 3:10 AM, avi.e.gross@gmail.com wrote:
>>> I see resemblances to something like how a web page is loaded and operated.
>>> I mean very different but at some level not so much.
>>>
>>> I mean a typical web page is read in as HTML with various keyword regions
>>> expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
>>> often cleanly nested in others. The browser makes nodes galore in some kind
>>> of tree format with an assortment of objects whose attributes or methods
>>> represent aspects of what it sees. The resulting treelike structure has
>>> names like DOM.
>>
>> To bring things back to the context of the original post, actual web
>> browsers are extremely tolerant of HTML syntax errors (including
>> incorrect nesting of tags) in the documents they receive. They usually
>> recover silently from errors and are able to display the rest of the
>> page. Usually they manage this correctly.
>
> Having had to debug tiny errors in HTML pages that resulted in
> extremely weird behaviour, I'm not sure that I agree that they usually
> manage correctly. Fundamentally, they guess, and guesswork is never
> reliable.

Still, browsers generally do a very decent job of recovery, even though
perfection isn't possible. The OP wants to get help with problems in
his files even if it isn't perfect, and I think that's reasonable to
wish for. The link to a post about the lezer parser in a recent message
on this thread is partly about how a real, practical parser can do some
error correction in mid-flight, for the purposes of a programming editor
(as opposed to one that has to build a correct program).
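To make the "keep going after the first error" idea concrete, here is a rough sketch that uses only the built-in compile(). It is not how lezer or any real error-recovering parser works. It just patches the reported line with "pass" and tries again, which is exactly the kind of guesswork discussed above, so later reports may be artifacts of earlier patches. The names collect_syntax_errors and max_errors are made up for this illustration.

```python
def collect_syntax_errors(source, max_errors=20):
    """Return a list of (line_number, message) pairs found by repeated compiling.

    Crude recovery: after each SyntaxError, the offending line is replaced
    by 'pass' at the same indentation and the source is compiled again.
    """
    lines = source.splitlines()
    found = []
    patched = set()
    while len(found) < max_errors:
        try:
            compile("\n".join(lines) + "\n", "<checked>", "exec")
        except SyntaxError as exc:
            lineno = exc.lineno
            found.append((lineno, exc.msg))
            # Stop if the error cannot be located, or if patching the
            # line did not help (same line reported a second time).
            if lineno is None or lineno in patched:
                break
            if not 1 <= lineno <= len(lines):
                break
            text = lines[lineno - 1]
            indent = text[: len(text) - len(text.lstrip())]
            lines[lineno - 1] = indent + "pass"
            patched.add(lineno)
        else:
            # The (patched) source compiles, so there is nothing more to report.
            break
    return found
```

Calling collect_syntax_errors(open("somefile.py").read()) gives one entry per compile attempt that failed. The first entry is trustworthy; treat the rest as hints.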

Re: What to use for finding as many syntax errors as possible.

<mailman.661.1665524733.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19815&group=comp.lang.python#19815

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: lis...@tompassin.net (Thomas Passin)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Tue, 11 Oct 2022 17:45:25 -0400
Lines: 38
Message-ID: <mailman.661.1665524733.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me> <008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
<011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
<4612662b-f15e-777d-bbea-7c8906ac1de3@tompassin.net>
<CAPTjJmr2JFpbGoTbPaM5Xw4Jr06XSLtsg51Nj8MRDt_iBtgOqA@mail.gmail.com>
<618720ac-e43b-5f71-93b7-8d418132a3eb@tompassin.net>
<ff62b500-8606-925f-3789-1b1d40c04bbf@tompassin.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
In-Reply-To: <618720ac-e43b-5f71-93b7-8d418132a3eb@tompassin.net>
 by: Thomas Passin - Tue, 11 Oct 2022 21:45 UTC

On 10/11/2022 5:09 PM, Thomas Passin wrote:
<snip>
> The OP wants to get help with problems in
> his files even if it isn't perfect, and I think that's reasonable to
> wish for.  The link to a post about the lezer parser in a recent message
> on this thread is partly about how a real, practical parser can do some
> error correction in mid-flight, for the purposes of a programming editor
> (as opposed to one that has to build a correct program).

One editor that seems to do what the OP wants is Visual Studio Code. It
marks apparent errors - not just syntax errors - and is not limited to
one per file. Sometimes it can even suggest corrections. I personally
dislike the visual clutter the markings impose, but I imagine I could
get used to it.

VSC uses a Microsoft language server called "Pylance" - see

https://devblogs.microsoft.com/python/announcing-pylance-fast-feature-rich-language-support-for-python-in-visual-studio-code/

Of course, you don't get something complex for free, and in this case
the cost is having to run a separate server to do all this analysis on
the fly. However, VSC handles all of that behind the scenes so you
don't have to.

Personally, I'd most likely go for a decent programming editor that can
be set up to run an external program on the current file, and use that
to run a checker such as pyflakes from time to time. You could also run
it whenever you save a file. Even if it only showed one error at a
time, it would make quick work of correcting mistakes. And it wouldn't
need to trigger an entire tool chain each time.
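As an illustration of the kind of small helper an editor could call on save, here is a minimal sketch using only the built-in compile(), so it reports syntax errors only, and only the first one per file. It is not what EditPlus, Leo or pyflakes do internally. The function name first_syntax_error is made up for this example.

```python
def first_syntax_error(path):
    """Return (line_number, message) for the first syntax error in the
    file at 'path', or None if the file compiles cleanly."""
    with open(path, encoding="utf-8") as handle:
        source = handle.read()
    try:
        # compile() parses the source without running it and without
        # writing any .pyc file.
        compile(source, path, "exec")
    except SyntaxError as exc:
        return exc.lineno, exc.msg
    return None
```

An editor "tool" entry would then print the result for the current file, for example as "name:line: message", which most editors can turn into a clickable location.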

My editor of choice for setting up helper "tools" like this on Windows
is EditPlus (non-free but cheap and very worth it), and I have both
py_compile and pyflakes set up this way in it. However, as I mentioned
in an earlier post, the Leo Editor
(https://github.com/leo-editor/leo-editor) does this for you
automatically when you save, so it's very convenient. That's what I
mostly work in.

Re: What to use for finding as many syntax errors as possible.

<mailman.695.1665622136.20444.python-list@python.org>

https://www.novabbs.com/devel/article-flat.php?id=19882&group=comp.lang.python#19882

Path: i2pn2.org!i2pn.org!news.swapon.de!fu-berlin.de!uni-berlin.de!not-for-mail
From: hjp-pyt...@hjp.at (Peter J. Holzer)
Newsgroups: comp.lang.python
Subject: Re: What to use for finding as many syntax errors as possible.
Date: Thu, 13 Oct 2022 02:48:54 +0200
Lines: 44
Message-ID: <mailman.695.1665622136.20444.python-list@python.org>
References: <a95001e1-b60e-63ce-d363-30481d115282@vub.be>
<CABaFrRayYinb9Cd1w=iAz1NWmt6MOzam1yEYXPaAMjREriVUAg@mail.gmail.com>
<mailman.556.1665330610.20444.python-list@python.org>
<ti169d$qntd$1@dont-email.me>
<008201d8dd16$718fd420$54af7c60$@gmail.com>
<CAPTjJmoLYGUSPHOu5QPom39K3C5Xzk4N1VQB-wqK-kLZWqck2A@mail.gmail.com>
<00b001d8dd21$01475160$03d5f420$@gmail.com>
<CAPTjJmpJj7MGA5U-fqCrM-mdayw+_K9MdsfOzBe1HU8=NHWCBw@mail.gmail.com>
<011e01d8dd40$9896ba50$c9c42ef0$@gmail.com>
<4612662b-f15e-777d-bbea-7c8906ac1de3@tompassin.net>
<20221013004854.bkfxunyqznxp3prx@hjp.at>
Mime-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha512;
protocol="application/pgp-signature"; boundary="q6zvzpmelffmdxtc"
In-Reply-To: <4612662b-f15e-777d-bbea-7c8906ac1de3@tompassin.net>
 by: Peter J. Holzer - Thu, 13 Oct 2022 00:48 UTC
Attachments: signature.asc (application/pgp-signature)

On 2022-10-11 14:11:56 -0400, Thomas Passin wrote:
> To bring things back to the context of the original post, actual web
> browsers are extremely tolerant of HTML syntax errors (including incorrect
> nesting of tags) in the documents they receive.

HTML5 actually specifies exactly how to recover from errors. So, since
every sequence of bytes results in a well-defined DOM tree, you might
argue (a bit tongue in cheek) that there are no syntax errors in HTML5.
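A small sketch of that tolerance, using the standard library's html.parser. Note the assumption: this module is only a lenient tokenizer and does not implement the HTML5 tree-construction rules (a third-party library such as html5lib aims to do that). The example only shows that mis-nested tags are reported without any exception. TagLogger and tag_events are names invented for this illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start tags, end tags and non-blank text in the order seen."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.events.append(("data", text))


def tag_events(markup):
    """Feed markup to the parser and return the recorded events."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events
```

For "<p><b>bold <i>both</b> italic</i>" the </b> event arrives before </i> and <p> is never closed, yet nothing is raised. Deciding what tree that should become is the part the HTML5 specification pins down.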

hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
