Re: preserving entities with lxml

From: rob...@reportlab.com (Robin Becker)
Newsgroups: comp.lang.python
Subject: Re: preserving entities with lxml
Date: Thu, 13 Jan 2022 09:58:09 +0000
Message-ID: <mailman.168.1642067892.3079.python-list@python.org>
 by: Robin Becker - Thu, 13 Jan 2022 09:58 UTC

On 13/01/2022 09:29, Dieter Maurer wrote:
> Robin Becker wrote at 2022-1-13 09:13 +0000:
>> On 12/01/2022 20:49, Dieter Maurer wrote:
>> ...
>>> Apparently, the `resolve_entities=False` was not effective: otherwise,
>>> your tree content should have more structure (especially some
>>> entity reference children).
>>>
>> except that the tree knows not to expand the entities using ET.tostring so in some circumstances resolve_entities=False
>> does work.
>
> I think this is a misunderstanding: `tostring` will represent the text character `&` as `&amp;`.

aaahhhh,

Thanks, I see now. So tostring is actually re-escaping some of the entities which are normally expanded on input. If that
means resolve_entities=False is not doing anything here, then I guess there's no need to use it at all. The initial transform

& --> &amp;

does what I need as it is reversed on output of the tree fragments.
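
To make Dieter's point concrete, here is a minimal sketch. It uses the standard library's xml.etree.ElementTree rather than lxml (an assumption made only so it runs anywhere; lxml escapes text the same way in this case), and the sample strings are invented for illustration. It shows the parser expanding `&amp;` to `&` on input and tostring writing it back out as `&amp;`, and what the initial transform does to an otherwise undefined entity like `&nbsp;`.

```python
import xml.etree.ElementTree as ET

# 1. A predefined entity is expanded on input and re-escaped on output.
root = ET.fromstring("<p>fish &amp; chips</p>")
parsed_text = root.text                              # 'fish & chips'
serialized = ET.tostring(root, encoding="unicode")   # '<p>fish &amp; chips</p>'

# 2. The initial transform & --> &amp; hides an undefined entity from the
#    parser: it arrives in the tree as the literal characters "&nbsp;".
source = "<p>a&nbsp;b</p>"                 # hypothetical input fragment
escaped = source.replace("&", "&amp;")     # '<p>a&amp;nbsp;b</p>'
frag = ET.fromstring(escaped)
frag_text = frag.text                      # 'a&nbsp;b' (plain text, not an entity node)
frag_out = ET.tostring(frag, encoding="unicode")

print(repr(parsed_text), serialized)
print(repr(frag_text), frag_out)
```

Note that the serialized fragment carries the escaped form (`&amp;nbsp;`), so whatever consumes the output decides whether that needs one more un-escaping pass.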

I wonder what resolve_entities is actually used for, then. All the docs seem to say is

> resolve_entities - replace entities by their text value (default: True)

I assumed False would mean that entities would pass through the parse unexpanded.
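
For what it's worth, a small sketch of the one case where the flag seems to make a visible difference: an entity that the document itself declares in its DTD. The entity name `greet` and the sample document are made up for illustration, and this assumes lxml is installed. With the default parser the entity is replaced by its text value; with resolve_entities=False the tree keeps an entity reference child (the extra structure Dieter mentioned) and tostring writes the reference back out.

```python
from lxml import etree

# Hypothetical document with an internally declared entity.
xml = b'<!DOCTYPE doc [<!ENTITY greet "hello">]><doc>&greet; world</doc>'

# Default parser: the entity is replaced by its text value.
expanded = etree.fromstring(xml)
expanded_text = expanded.text          # 'hello world'

# resolve_entities=False: the reference survives as a child node.
parser = etree.XMLParser(resolve_entities=False)
kept = etree.fromstring(xml, parser)
kept_children = len(kept)              # one entity reference child
kept_out = etree.tostring(kept)        # serialized with &greet; intact

print(repr(expanded_text))
print(kept_children, kept_out)
```

An entity with no declaration at all (such as a bare `&nbsp;` in a document with no DTD) is a different situation, which would explain why the flag appeared to do nothing there.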
--
Robin Becker
