
devel / comp.lang.python / Re: lxml parsing with validation and target?

From: rob...@reportlab.com (Robin Becker)
Newsgroups: comp.lang.python
Subject: Re: lxml parsing with validation and target?
Date: Wed, 3 Nov 2021 14:59:29 +0000
Message-ID: <mailman.173.1635951572.23718.python-list@python.org>
References: <0723b9fd-62a8-cb87-98e7-d0966dfb4e98@everest.reportlab.co.uk>
<f0687e81-bf3a-e072-edea-d4f0a8641f9b@everest.reportlab.co.uk>
 by: Robin Becker - Wed, 3 Nov 2021 14:59 UTC

On 02/11/2021 12:55, Robin Becker wrote:
> I'm having a problem using lxml.etree to make a treebuilding parser that validates; I have test code where invalid xml
> is detected and an error raised when the line below target=ET.TreeBuilder(), is commented out.
>
..........

I managed to overcome this problem by using the non-targeted parser, which returns an _ElementTree object. I can then
convert it to a tuple tree using code like this:

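    class TT:
        """Convert an lxml tree into nested (tag, attrib, content, sourceline) tuples."""

        def __call__(self, tree):
            # Test for None explicitly: truth-testing an element depends on
            # whether it has children, so "not tree" can misfire.
            if tree is None:
                return None
            # The first item yielded by iter() is the root element.
            return self.maketuple(next(tree.iter()))

        def maketuple(self, e):
            return (e.tag, e.attrib or None, self.content(e), e.sourceline)

        def content(self, e):
            t = e.text
            kids = list(e)  # getchildren() is deprecated in lxml
            if not kids and t is None:
                return None
            r = []
            if t is not None:
                r.append(t)
            for c in kids:
                r.append(self.maketuple(c))
                # Text that follows a child is stored as that child's tail.
                if c.tail is not None:
                    r.append(c.tail)
            return r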
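
For anyone who wants to try the approach end to end, here is a minimal sketch of the two steps described above: parse with a validating parser that has no target, then walk the resulting tree into tuples. The sample document, the names VALID, INVALID, to_tuple and parse_validated, and the use of an internal DTD with dtd_validation=True are all assumptions made for illustration; the original post does not show how the validation was set up. The converter here is a function rather than a class and copies attributes into a plain dict, which differs slightly from the code above.

```python
from lxml import etree

# Hypothetical sample document with an internal DTD (not from the original post).
VALID = b"""<!DOCTYPE doc [
<!ELEMENT doc (#PCDATA|b)*>
<!ELEMENT b (#PCDATA)>
]>
<doc>hello <b>world</b>!</doc>"""

# Same document, but <i> is not declared in the DTD, so validation should fail.
INVALID = VALID.replace(b"<b>world</b>", b"<i>world</i>")


def to_tuple(e):
    """Return (tag, attrib-or-None, content-or-None, sourceline) for element e."""
    kids = list(e)
    if not kids and e.text is None:
        content = None
    else:
        content = []
        if e.text is not None:
            content.append(e.text)
        for c in kids:
            content.append(to_tuple(c))
            # Text after a child element is held in the child's tail.
            if c.tail is not None:
                content.append(c.tail)
    return (e.tag, dict(e.attrib) or None, content, e.sourceline)


def parse_validated(data):
    """Parse bytes with DTD validation (no target=...) and convert to tuples."""
    parser = etree.XMLParser(dtd_validation=True)
    root = etree.fromstring(data, parser)  # raises XMLSyntaxError if invalid
    return to_tuple(root)


if __name__ == "__main__":
    print(parse_validated(VALID))
    try:
        parse_validated(INVALID)
    except etree.XMLSyntaxError as exc:
        print("rejected:", exc)
```

Because the tuples are built only after the parser has returned, an invalid document never reaches the conversion step; the cost is that the whole element tree is held in memory first, which a target-based parser would avoid.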

--
Robin Becker
