Re: imaplib: is this really so unwieldy?

From: cs...@cskk.id.au (Cameron Simpson)
Newsgroups: comp.lang.python
Subject: Re: imaplib: is this really so unwieldy?
Date: Wed, 26 May 2021 08:25:49 +1000
 by: Cameron Simpson - Tue, 25 May 2021 22:25 UTC

On 25May2021 19:21, hw <hw@adminart.net> wrote:
>On 5/25/21 11:38 AM, Cameron Simpson wrote:
>>On 25May2021 10:23, hw <hw@adminart.net> wrote:
>>>if status != 'OK':
>>> print('Login failed')
>>> exit
>>
>>Your "exit" won't do what you want. A bare "exit" only names the exit
>>helper which the site module installs; without the parentheses nothing
>>is called, so the programme just carries on past it. (Under "python -S"
>>the name isn't defined at all and you'd get a NameError.) You probably
>>want:
>>
>> sys.exit(1)
>>
>>You'll need to import "sys".
>
>Oh ok, it seemed to be fine. Would it be the right way to do it with
>sys.exit()? Having to import another library just to end a program
>might not be ideal.

To end a programme early, yes. (sys.exit() actually just raises a
particular exception, SystemExit, BTW.)

I usually write a distinct main function, so in that case one can just
"return". After all, what seems an end-of-life circumstance in a
standalone script like yours is just an "end this function" circumstance
when viewed as a function, and that also lets you _call_ the main
programme from some outer thing. Wouldn't want that outer thing
cancelled, if it exists.
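
A minimal sketch of both points, for illustration only: the main() here is a made-up stand-in (its argument check and messages are assumptions, not anything from the original script). It shows an early "return" doing the job of exit, and that sys.exit() raises SystemExit, which an outer caller can catch:

```python
import sys

def main(argv):
    """Hypothetical main programme: returns an exit status instead of exiting."""
    if len(argv) < 2:
        print("usage: prog name", file=sys.stderr)
        return 2          # an early "end of programme" is just a return
    print("hello,", argv[1])
    return 0

# An outer caller can use main() directly and carry on afterwards.
status = main(["prog"])

# sys.exit() raises SystemExit. Uncaught, that ends the interpreter,
# but a caller can catch it like any other exception.
try:
    sys.exit(main(["prog", "world"]))
except SystemExit as e:
    exit_code = e.code
```

Because main() only returns, nothing in it can cancel whatever called it.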

My usual boilerplate for a module with a main programme looks like this:

    import sys
    ......
    def main(argv):
        ... main programme, return like any other function ...
    .... other code for the module - functions, classes etc ...
    if __name__ == '__main__':
        sys.exit(main(sys.argv))

which (a) puts main() up the top where it can be seen, (b) makes main()
an ordinary function like any other, (c) lets me just import that module
elsewhere and (d) means no globals - everything's local to main().

The __name__ boilerplate at the bottom is the magic which figures out if
the module was imported (__name__ will be the import module name) or
invoked from the command line like:

python -m my_module cmd-line-args...

in which case __name__ has the special value '__main__'. A historic
mechanism which you will convince nobody to change.

You'd be surprised how useful it is to make almost any standalone
programme a module like this - in the medium term it almost always pays
off for me. Even just the discipline of shoving all the formerly-global
variables in the main function brings lower-bugs benefits.

>>I've done little with IMAP. What's in msgnums here? Eg:
>> print(type(msgnums), repr(msgnums))
>>just so we all know what we're dealing with here.
>
><class 'list'> [b'']
>
>>>message_uuids = []
>>>for number in str(msgnums)[3:-2].split():
>>
>>This is very strange. [...]
>Yes, and I don't understand it. 'print(msgnums)' prints:
>
>[b'']
>
>when there are no messages and
>
>[b'1 2 3 4 5']

Chris has addressed this. msgnums is a list of the data components of
the IMAP response. By going str(msgnums) you're not getting "the message
numbers as text", you're getting what printing a list prints. Which is
roughly Python code: the brackets and the repr() of each list member.

Notice that the example code accessed msgnums[0] - that is the first
data component, a bytes. That you _can_ convert to a string (under
assumptions about the encoding).

By getting the "str" form of a list, you're forced into the weird [3:-2]
hack to trim the ends. But it is just a hack for a transcription
mistake, not a sane parse.

>So I was guessing that it might be an array containing a single
>string and that referring to the first element of the array turns into
>a string with which split() can be used. But 'print(msgnums[0].split())'
>prints
>
>[b'1', b'2', b'3', b'4', b'5']

msgnums[0] is bytes. You can do most str things with bytes (because that
was found to be often useful) but you get bytes back from those
operations as you'd hope.
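
To make that concrete, here is a small sketch which needs no server. The msgnums value is a hand-written stand-in for the data part of a search response, copied from the printout above; the variable names are just for illustration:

```python
# Stand-in for the data part of an imaplib search response (no live server).
msgnums = [b'1 2 3 4 5']

# str() of the list is its printed form, brackets and b'' included.
printed = str(msgnums)

# The [3:-2] hack trims "[b'" from the front and "']" from the back.
hack = printed[3:-2].split()

# The direct route: take the first data component (a bytes) and split it.
numbers = msgnums[0].split()                     # list of bytes

# Decode first if str values are wanted; message numbers are ASCII digits.
as_text = msgnums[0].decode('ascii').split()     # list of str

# The empty mailbox case needs no special handling.
none_found = [b''][0].split()
```

The hack and the direct route produce the same strings here, but only the direct route says what it means.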

>so I can only guess what that's supposed to mean: maybe an array of
>many bytes? The documentation[1] clearly says: "The message_set
>options to commands below is a string [...]"

But that is the parameter to the _call_: your '(UID)' parameter.

>I also need to work with message uids rather than message numbers
>because the numbers can easily change. There doesn't seem to be a way
>to do that with this library in python.

By asking for UIDs you're getting uids. Do they not work in subsequent
calls?
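
For what it's worth, imaplib does have a uid() method which issues the UID form of a command, so a sketch along these lines should work. The function name is made up, and it assumes "conn" is an imaplib.IMAP4 or IMAP4_SSL object which has already done login() and select(); I have not run it against a real server:

```python
def fetch_all_by_uid(conn):
    """Return {uid: raw_message_bytes} for the selected mailbox.

    Assumption: conn is a logged-in imaplib.IMAP4 / IMAP4_SSL object
    on which select() has already been called.
    """
    # UID SEARCH returns UIDs rather than message sequence numbers.
    typ, data = conn.uid('SEARCH', None, 'ALL')
    if typ != 'OK':
        raise RuntimeError('UID SEARCH failed: %r' % (data,))
    messages = {}
    for uid in data[0].split():
        # UID FETCH takes the same UIDs back as the message set.
        typ, parts = conn.uid('FETCH', uid, '(RFC822)')
        if typ != 'OK':
            continue
        for part in parts:
            # Message bodies arrive as (header, body) tuples; other
            # items such as the closing b')' are plain bytes.
            if isinstance(part, tuple):
                messages[uid] = part[1]
    return messages
```

As in the docs example, the only parsing is the .split(); the UIDs are handed back exactly as received.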

>So it's all guesswork, and I gave up after a while and programmed what
>I wanted in perl. The documentation of this library sucks, and there
>are worlds between it and the documentation for the libraries I used
>with perl.

I think you're better off looking for another Python imap library. The
imaplib was basic functionality to (a) access the protocol in basic form
and (b) conceal the async stuff, since IMAP is an asynchronous protocol.

You can in fact subclass it to do better things. Other libraries might
do that, or they might have written their own protocol implementations.

>That doesn't mean I don't want to understand why this is so unwieldy.
>It's all nice and smooth in perl.

But using what library? Something out of CPAN? Those are third party
libraries, not Perl's presupplied stuff. The equivalent for Python is
pypi.org. Look there.

>>It is just breaking apart data[0] into strings which were separated by
>>whitespace in the response. And then using those same strings as keys
>>for the .fetch() call. That doesn't seem complex, and in fact is blind
>>to the format of the "message numbers" returned. It just takes what it
>>is handed and uses those to fetch each message.
>
>That's not what the documentation says.

The _example code_ is blind to them, whatever the semantics of the docs.
It just gets the uids and fetches with them. Aside from .split(),
there's no parsing or deep understanding of the uid.

>>IMAP's quite complex. Have you read RFC2060?
>>
>> https://datatracker.ietf.org/doc/html/rfc2060.html
>
>Yes, I referred to it and it didn't become any more clear in
>combination with the documentation of the python library.

IMAP's like that :-)

>>The imaplib library is probably a fairly basic wrapper for the
>>underlying protocol which provides methods for the basic client requests
>>and conceals the asynchronicity from the user for ease of (basic) use.
>
>Skip Montanaro seems to say that the byte problem comes from the
>change from python 2 to 3 and there is a better library now:
>https://pypi.org/project/IMAPClient/

And someone mentioned imaplib2. There are several choices.

>But the documentation seems even more sparse than the one for imaplib.
>Is it a general thing with python that libraries are not well
>documented?

That depends on the library - it is of course at the whim of the
developer. Heavily used powerful libraries are usually well documented,
eg numpy. A random hacker's published module? Might have nothing.

Wrappers for protocols like IMAP might have a bit of doco and expect the
user to infer stuff from knowledge of the protocol itself.

[...]
>>Anyway, the IMAP responses are bytes containing text. You get a lot of
>>bytes.
>
>Well, ok, but it's not helpful that b is being inserted like
>everywhere,

Only when you _print_ them. That "b" is an indicator that this is a bytes
object being printed in a stringlike form because that is often a useful
representation. Nothing's inserting a "b" in the data itself.

>and I have to keep asking myself what I'm looking at because bytes are
>bytes.

If you've got b'abc', that is a printout of a 3 byte string. _Not_ the
string itself.
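
A quick illustration, using a literal value rather than anything from a server. The 'ascii' codec is an assumption that suits this sample; real message data may need a different one:

```python
data = b'abc'

length = len(data)            # 3: the "b" and the quotes are not in the data
first = data[0]               # 97: indexing a bytes gives an int, here ord('a')

shown = repr(data)            # the printout form, "b'abc'"
also_shown = str(data)        # str() with no encoding gives the same printout

text = data.decode('ascii')   # decoding is what produces the str 'abc'
```

So the "b" only ever appears in the printout; decoding gives the text with no trace of it.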

>Since the documentation is so bad, I had to figure it out by trial and
>error and by printing stuff and making guesses and assumptions.
>That's just not the way to program something.

No. But _many_ modules are what the original author needed to get
something done, and neither complete nor perfectly documented. Life's
too short. Well used modules tend to become more complete, elegant and
documented over time, _if_ people other than the author use them.

>>When you go:
>>
>> text = str(data)
>>
>>that does not decode anything: in Python 3, str() of a bytes object
>>just gives its printed form, b'...' and all. To get text you need
>>data.decode(...), and you really ought to specify an encoding there.
>>If you've not specified the CHARSET for things, 'ascii' would be a
>>conservative choice. The IMAP RFC
>>talks about what to expect in section 4 (Data Formats). There's quite a
>>lot of possible response formats and I can understand imaplib not
>>getting deeply into decoding these.
>
>UTF8 is the default since quite a while now. Why doesn't it just use that?

