Re: imaplib: is this really so unwieldy?

From: hw...@adminart.net (hw)
Newsgroups: comp.lang.python
Subject: Re: imaplib: is this really so unwieldy?
Date: Tue, 25 May 2021 19:21:39 +0200
Message-ID: <mailman.342.1621963308.3087.python-list@python.org>
 by: hw - Tue, 25 May 2021 17:21 UTC

On 5/25/21 11:38 AM, Cameron Simpson wrote:
> On 25May2021 10:23, hw <hw@adminart.net> wrote:
>> I'm about to do stuff with emails on an IMAP server and wrote a program
>> using imaplib which, so far, gets the UIDs of the messages in the
>> inbox:
>>
>>
>> #!/usr/bin/python
>
> I'm going to assume you're using Python 3.

Python 3.9.5

>> import imaplib
>> import re
>>
>> imapsession = imaplib.IMAP4_SSL('imap.example.com', port = 993)
>>
>> status, data = imapsession.login('user', 'password')
>> if status != 'OK':
>>     print('Login failed')
>>     exit
>
> Your "exit" won't do what you want. I expect this code to raise a
> NameError exception here (you've not defined "exit"). That _will_ abort
> the programme, but in a manner indicating that you've used an unknown
> name. You probably want:
>
> sys.exit(1)
>
> You'll need to import "sys".

Oh ok, it seemed to be fine. Would it be the right way to do it with
sys.exit()? Having to import another library just to end a program
might not be ideal.
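
For what it's worth, a bare `exit` on a line by itself only names the helper object that the `site` module installs for interactive use; it does not call it, so the script simply carries on. A minimal sketch of the difference, where `check_login` is a hypothetical helper made up for illustration:

```python
import sys

def check_login(status):
    """Hypothetical helper: stop the program unless the login status is 'OK'."""
    if status != 'OK':
        print('Login failed')
        # sys.exit(1) raises SystemExit with code 1.
        # A bare `exit` here (no parentheses) would be a no-op expression.
        sys.exit(1)
    return True
```

As far as I can tell from the imaplib source, `login()` raises `imaplib.IMAP4.error` on a failed login rather than returning a non-'OK' status, so a check like this may never trigger in practice.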

>> messages = imapsession.select(mailbox = 'INBOX', readonly = True)
>> typ, msgnums = imapsession.search(None, 'ALL')
>
> I've done little with IMAP. What's in msgnums here? Eg:
>
> print(type(msgnums), repr(msgnums))
>
> just so we all know what we're dealing with here.

<class 'list'> [b'']

>> message_uuids = []
>> for number in str(msgnums)[3:-2].split():
>
> This is very strange. Did you see the example at the end of the module
> docs? It has this code:
>
> import getpass, imaplib
>
> M = imaplib.IMAP4()
> M.login(getpass.getuser(), getpass.getpass())
> M.select()
> typ, data = M.search(None, 'ALL')
> for num in data[0].split():
>     typ, data = M.fetch(num, '(RFC822)')
>     print('Message %s\n%s\n' % (num, data[0][1]))
> M.close()
> M.logout()

Yes, and I don't understand it. 'print(msgnums)' prints:

[b'']

when there are no messages and

[b'1 2 3 4 5']

when there are some. So I was guessing that it might be an array containing a
single string and that referring to the first element of the array yields a
string on which split() can be used. But 'print(msgnums[0].split())' prints

[b'1', b'2', b'3', b'4', b'5']

so I can only guess what that's supposed to mean: maybe an array of many
bytes? The documentation[1] clearly says: "The message_set options to
commands below is a string [...]"
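
To make the shapes concrete, here is a small sketch that uses the sample value printed above instead of a live server. It assumes the message numbers are plain ASCII digits, which is how they appear in the output shown; `data` is a stand-in for the second element returned by `search()`.

```python
# Stand-in for the `data` list returned by IMAP4.search(); no server involved.
data = [b'1 2 3 4 5']

# data is a list holding one bytes object. bytes.split() splits on whitespace
# and returns a list of bytes objects, one per message number.
numbers = data[0].split()
print(numbers)

# Decode each item if str is wanted (assumes ASCII digits).
as_text = [n.decode('ascii') for n in numbers]
print(as_text)

# An empty mailbox gives [b''], and b''.split() is an empty list.
empty = [b''][0].split()
print(empty)
```

So the b prefix just marks each item as a bytes object; the list itself is an ordinary Python list.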

I also need to work with message uids rather than message numbers
because the numbers can easily change. There doesn't seem to be a way
to do that with this library in python.
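
For what it's worth, the imaplib docs do list an `IMAP4.uid(command, arg, ...)` method that issues a command with UIDs instead of message numbers. A rough sketch, not tested against a real server here: `list_uids` is a hypothetical helper, and `session` is assumed to be a logged-in connection on which `select()` has already been called.

```python
def list_uids(session):
    """Hypothetical helper: return the UIDs in the selected mailbox as str.

    `session` is assumed to be a logged-in imaplib.IMAP4 (or IMAP4_SSL)
    object with a mailbox already selected.
    """
    # UID SEARCH returns the same shape as search(): a status and [b'uid uid ...'].
    status, data = session.uid('SEARCH', None, 'ALL')
    if status != 'OK':
        raise RuntimeError('UID SEARCH failed: %r' % (data,))
    # Assumes the UIDs are ASCII digits separated by whitespace.
    return [uid.decode('ascii') for uid in data[0].split()]
```

A single message could then be fetched by UID with something like `session.uid('FETCH', uid, '(RFC822)')`.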

So it's all guesswork, and I gave up after a while and programmed what I
wanted in perl. The documentation of this library sucks, and there are
worlds between it and the documentation for the libraries I used with perl.

That doesn't mean I don't want to understand why this is so unwieldy.
It's all nice and smooth in perl.

[1]: https://docs.python.org/3/library/imaplib.html

> It is just breaking apart data[0] into strings which were separated by
> whitespace in the response. And then using those same strings as keys
> for the .fetch() call. That doesn't seem complex, and in fact is blind
> to the format of the "message numbers" returned. It just takes what it
> is handed and uses those to fetch each message.

That's not what the documentation says.

>> status, data = imapsession.fetch(number, '(UID)')
>> if status == 'OK':
>>     match = re.match('.*\(UID (\d+)\)', str(data))
> [...]
>> It's working (with Cyrus), but I have the feeling I'm doing it all
>> wrong because it seems so unwieldy.
>
> IMAP's quite complex. Have you read RFC2060?
>
> https://datatracker.ietf.org/doc/html/rfc2060.html

Yes, I referred to it and it didn't become any more clear in combination
with the documentation of the python library.

> The imaplib library is probably a fairly basic wrapper for the
> underlying protocol which provides methods for the basic client requests
> and conceals the asynchronicity from the user for ease of (basic) use.

Skip Montanaro seems to say that the byte problem comes from the change
from python 2 to 3 and there is a better library now:
https://pypi.org/project/IMAPClient/

But the documentation seems even more sparse than the one for imaplib.
Is it a general thing with python that libraries are not well documented?

>> Apparently the functions of imaplib return some kind of bytes while
>> expecting strings as arguments, like message numbers must be strings.
>> The documentation doesn't seem to say if message UIDs are supposed to
>> be integers or strings.
>
> You can go a long way by pretending that they are opaque strings. That
> they may be numeric in content can be irrelevant to you. Treat them as
> strings.

That's what I ended up doing.

>> So I'm forced to convert stuff from bytes to strings (which is weird
>> because bytes are bytes)
>
> "bytes are bytes" is tautological.

which is a good thing

> You're getting bytes for a few
> reasons:
>
> - the imap protocol largely talks about octets (bytes), but says they're
> text. For this reason a lot of stuff you pass as client parameters are
> strings, because strings are text.
>
> - text may be encoded as bytes in many ways, and without knowing the
> encoding, you can't extract text (strings) from bytes
>
> - the imaplib library may date from Python 2, where the str type was
> essentially a byte sequence. In Python 3 a str is a sequence of
> Unicode code points, and you translate to/from bytes if you need to
> work with bytes.
>
> Anyway, the IMAP responses are bytes containing text. You get a lot of
> bytes.

Well, ok, but it's not helpful that b is being inserted like everywhere,
and I have to keep asking myself what I'm looking at because bytes are
bytes.

Since the documentation is so bad, I had to figure it out by trial and
error and by printing stuff and making guesses and assumptions. That's
just not the way to program something.

> When you go:
>
> text = str(data)
>
> that is _assuming_ a particular text encoding stored in the data. You
> really ought to specify an encoding here. If you've not specified the
> CHARSET for things, 'ascii' would be a conservative choice. The IMAP RFC
> talks about what to expect in section 4 (Data Formats). There's quite a
> lot of possible response formats and I can understand imaplib not
> getting deeply into decoding these.

UTF-8 has been the default for quite a while now. Why doesn't it just use that?
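
One detail that may explain the `[3:-2]` slicing earlier: in Python 3, calling `str()` on a bytes object without an encoding does not decode it but returns its repr, including the `b'...'` wrapper. A small sketch with made-up sample values:

```python
raw = b'1 2 3'                       # made-up sample, not real server output

as_repr = str(raw)                   # the repr, with the b'' wrapper included
decoded = raw.decode('ascii')        # actual decoding with an explicit encoding
also_decoded = str(raw, 'utf-8')     # str() decodes only when given an encoding
print(as_repr, decoded, also_decoded)

# ASCII is strict: bytes outside 0-127 raise an error instead of being guessed at.
try:
    b'caf\xc3\xa9'.decode('ascii')
    strict = False
except UnicodeDecodeError:
    strict = True
print(strict)
```

Passing the encoding explicitly keeps the assumption visible in the code, whichever encoding turns out to be the right one for a given response.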

>> and to use regular expressions to extract the message-uids from what
>> the functions return (which I shouldn't have to because when I'm asking
>> a function to give me a uid, I expect it to return a uid).
>
> No, you're asking the IMAP _protocol_ to return you UIDs. The module
> itself doesn't parse what you ask for in the fetch results, and
> therefore it can't decode the response (data bytes) into some higher
> level thing (such as UIDs in your case, but you can ask for all sorts of
> weird stuff with IMAP).

Then its documentation should at least specify what the library does.
And perhaps it shouldn't specify that some of its functions expect their
parameters to be strings rather than what other functions return,
requiring guesswork and conversions of the data because the functions
kinda aren't compatible with each other.

> So having passed '(UID)' to the FETCH request, you now need to parse
> the response.

First I have to guess what the response might be ... And once I managed
that, there's still no way to do something with a message by its uid.
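
As a sketch of the parsing step, assuming a FETCH response item looks like `b'1 (UID 4827)'` (a made-up value in the shape the regular expression above expects; real servers may return other shapes), a bytes pattern can be applied without converting to str first. `extract_uid` is a hypothetical helper:

```python
import re

# Bytes pattern, written as a raw literal so the backslashes reach the regex engine.
UID_PATTERN = re.compile(rb'\(UID (\d+)\)')

def extract_uid(item):
    """Hypothetical helper: pull the UID out of one FETCH response item.

    Returns the UID as str, or None if the item does not contain one.
    """
    match = UID_PATTERN.search(item)
    return match.group(1).decode('ascii') if match else None

print(extract_uid(b'1 (UID 4827)'))
```

Using `search()` instead of `match()` with a leading `.*` keeps the pattern short, and returning None leaves it to the caller to decide what an unparseable item means.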

>> This so totally awkward and unwieldy and involves so much overhead
>> that I must be doing this wrong. But am I? How would I do this right?
>
> Well, you _could_ get immersed in the nitty gritty of the IMAP protocol
> and the imaplib module, _or_ you could see if someone else has done some
> work to make this easier by writing a higher level library. A search at
> pypi.org for "imap" found a lot of stuff. The package named "imap-tools"
> looks promising. Try that.


Re: imaplib: is this really so unwieldy?

From: greg.ew...@canterbury.ac.nz (Greg Ewing)
Newsgroups: comp.lang.python
Subject: Re: imaplib: is this really so unwieldy?
Date: Wed, 26 May 2021 10:35:39 +1200
Message-ID: <ih5cdsFoapsU1@mid.individual.net>
 by: Greg Ewing - Tue, 25 May 2021 22:35 UTC

On 26/05/21 5:21 am, hw wrote:
> On 5/25/21 11:38 AM, Cameron Simpson wrote:
>> You'll need to import "sys".
>
> Having to import another library just to end a program
> might not be ideal.

The sys module is built-in, so the import isn't really
loading anything, it's just giving you access to a
namespace.

But if you prefer, you can get the same result without
needing an import using

raise SystemExit(1)

--
Greg
