Re: imaplib: is this really so unwieldy?

From: tjre...@udel.edu (Terry Reedy)
Newsgroups: comp.lang.python
Subject: Re: imaplib: is this really so unwieldy?
Date: Tue, 25 May 2021 18:23:34 -0400
Lines: 68
Message-ID: <mailman.348.1621981552.3087.python-list@python.org>
References: <21fb6c5f-97a4-654b-887f-2c31a549bcbe@adminart.net>
<hd6qag98c37mvqurlu3mfcvie38o63kn6n@4ax.com>
<d0e29810-858a-8a32-fda6-a68c63224606@mrabarnett.plus.com>
<s8jtd7$e0d$1@ciao.gmane.io>
 by: Terry Reedy - Tue, 25 May 2021 22:23 UTC

On 5/25/2021 1:25 PM, MRAB wrote:
> On 2021-05-25 16:41, Dennis Lee Bieber wrote:

>>     In Python 3, strings are UNICODE, using 1, 2, or 4 bytes PER
>> CHARACTER

This is CPython 3.3+ specific. Before that, it depended on the OS. I
believe MicroPython uses utf-8 for strings.

>> (I don't recall if there is a 3-byte version).

There isn't. It would save space but cost time.

>> If your input bytes are all
>> 7-bit ASCII, then they map directly to a 1-byte per character string.

If your input bytes all have the upper bit 0 and they are interpreted as
encoding ascii characters, then they map to overhead + 1 byte per char:

>>> import sys
>>> sys.getsizeof(b''.decode('ascii'))
49
>>> sys.getsizeof(b'a'.decode('ascii'))
50
>>> sys.getsizeof(11*b'a'.decode('ascii'))
60
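
A minimal sketch of the same measurement, assuming CPython 3.3 or later. The absolute sizes vary by version and platform, so it reports the overhead and the growth per character separately; `ascii_cost` is a name made up for this illustration.

```python
import sys

def ascii_cost(n):
    """Return (overhead, bytes_per_char) for an n-character ASCII str."""
    empty = b''.decode('ascii')
    text = (n * b'a').decode('ascii')
    # Size of the empty string is the fixed per-object overhead.
    overhead = sys.getsizeof(empty)
    # Whatever is left is spent on the characters themselves.
    per_char = (sys.getsizeof(text) - overhead) / n
    return overhead, per_char

overhead, per_char = ascii_cost(11)
print(overhead, per_char)
```

The overhead figure will likely differ from the 49 shown above on other CPython versions; the per-character figure is the part that should stay the same.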

>> If
>> they contain any 8-bit upper half character they may map into a 2-byte
>> per character string.

See below.

> In CPython 3.3+:
>
> U+0000..U+00FF are stored in 1 byte.
> U+0100..U+FFFF are stored in 2 bytes.
> U+010000..U+10FFFF are stored in 4 bytes.

In CPython's Flexible String Representation all characters in a string
are stored with the same number of bytes, depending on the largest
codepoint.

>>> sys.getsizeof('\U00011111')
80
>>> sys.getsizeof('\U00011111'*2)
84
>>> sys.getsizeof('a\U00011111')
84
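
To see that one wide character widens the whole string, here is a rough sketch, again assuming CPython 3.3+. The helper `bytes_per_char` is hypothetical; it compares two repetitions of the same string so that the fixed header cancels out.

```python
import sys

def bytes_per_char(s):
    """Estimate the storage width per character of a non-empty str."""
    # Both operands are freshly built, so only the character data differs.
    return (sys.getsizeof(s * 3) - sys.getsizeof(s * 2)) // len(s)

samples = (
    'a' * 100,                 # largest codepoint is ASCII
    'a' * 99 + '\xe9',         # largest codepoint is in U+0080..U+00FF
    'a' * 99 + '\u20ac',       # largest codepoint is in U+0100..U+FFFF
    'a' * 99 + '\U00011111',   # largest codepoint is above U+FFFF
)
for sample in samples:
    print(hex(ord(max(sample))), bytes_per_char(sample))
```

In each sample 99 of the 100 characters are plain 'a', yet the reported width follows the single largest codepoint.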

>> Bytes in Python 3 are just a binary stream, which needs an
>> encoding to produce characters.

Or any other Python object.
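
As a small illustration of bytes becoming something other than text, the sketch below reads the same eight bytes as numbers. The byte values and the little-endian layout are assumptions chosen for the example.

```python
import struct

# Eight arbitrary bytes, chosen for this example.
raw = b'\x01\x00\x00\x00\x00\x00\x80\x3f'

# Interpret the first four bytes as a little-endian unsigned integer.
as_int = int.from_bytes(raw[:4], 'little')

# Interpret all eight bytes as a little-endian uint32 followed by a float32.
number, value = struct.unpack('<If', raw)

print(as_int, number, value)
```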

>> Use the wrong encoding (say ISO-Latin-1) when the
>> data is really UTF-8 will result in garbage.

So does decoding bytes as text when the bytes encode something else,
such as an image ;-).

--
Terry Jan Reedy
