Which is best in Python: urllib2, PycURL or mechanize?

Question

OK, so I need to download some web pages using Python, and I did a quick investigation of my options.

Included with Python:

urllib - it seems to me that I should use urllib2 instead. urllib has no cookie support and handles HTTP/FTP/local files only (no SSL)

urllib2 - complete HTTP/FTP client; supports most needed things like cookies, but does not support all HTTP verbs (only GET and POST; no TRACE, etc.)

Full featured:

mechanize - can use/save Firefox/IE cookies, can take actions like following the second link on a page, actively maintained (0.2.5 released in March 2011)

PycURL - supports everything curl does (FTP, FTPS, HTTP, HTTPS, GOPHER, TELNET, DICT, FILE and LDAP). The bad news: not updated since Sep 9, 2008 (7.19.0)

New possibilities:

urllib3 - supports connection re-using/pooling and file posting

Deprecated (a.k.a. use urllib/urllib2 instead):

httplib - HTTP/HTTPS only (no FTP)

httplib2 - HTTP/HTTPS only (no FTP)

The first thing that strikes me is that urllib/urllib2/PycURL/mechanize are all pretty mature solutions that work well. mechanize and PycURL ship with a number of Linux distributions (e.g. Fedora 13) and BSDs, so installation is typically a non-issue (so that's good).

urllib2 looks good, but I'm wondering why PycURL and mechanize both seem very popular. Is there something I am missing (i.e. if I use urllib2, will I paint myself into a corner at some point)? I'd really like some feedback on the pros/cons of these things so I can make the best choice for myself.

Edit: added note on verb support in urllib2

What does "best" mean? Best with respect to what? Fastest? Largest? Best use of Cookies? What do you need to do?
httplib isn't "deprecated". It is a lower-level module that urllib2 is built on top of. You can use it directly, but it is easier via urllib2.
What Corey said, e.g. urllib3 is a layer on top of httplib. Also, httplib2 is not deprecated - in fact it's newer than urllib2 and fixes problems like connection reuse (same with urllib3).
There is a newer library called requests. See docs.python-requests.org/en/latest/index.html

score 22 · Answer 1

urllib2 is found in every Python install everywhere, so it is a good base upon which to start.
PycURL is useful for people already used to using libcurl; it exposes more of the low-level details of HTTP, and it gains any fixes or improvements applied to libcurl.
mechanize is used to persistently drive a connection much like a browser would.

It's not a matter of one being better than the other; it's a matter of choosing the appropriate tool for the job.
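
To make the urllib2 option a bit more concrete, here is a minimal sketch of the two points the question raises about it: cookie handling, and the GET/POST-only limitation. It assumes the common workaround of overriding get_method() on a Request subclass. MethodRequest, make_cookie_opener and the example.com URL are made-up names for illustration, not part of urllib2. The try/except import is only there so the same code also runs on Python 3, where urllib2 became urllib.request.

```python
# Works on Python 2 (urllib2/cookielib) and Python 3 (urllib.request/http.cookiejar).
try:
    import urllib2 as request_lib          # Python 2 name used in this thread
    import cookielib as cookie_lib
except ImportError:
    import urllib.request as request_lib   # same API, renamed in Python 3
    import http.cookiejar as cookie_lib


class MethodRequest(request_lib.Request):
    """Hypothetical helper: a Request whose HTTP verb is chosen by the caller."""

    def __init__(self, url, verb=None, **kwargs):
        # Explicit base-class call, because Request is an old-style class on Python 2.
        request_lib.Request.__init__(self, url, **kwargs)
        self._verb = verb

    def get_method(self):
        # The opener calls get_method() to decide which verb to send. Fall back
        # to the default (GET, or POST when data is present) if none was given.
        if self._verb is not None:
            return self._verb
        return request_lib.Request.get_method(self)


def make_cookie_opener():
    """Return (opener, jar); cookies set by responses are kept in jar and resent."""
    jar = cookie_lib.CookieJar()
    opener = request_lib.build_opener(request_lib.HTTPCookieProcessor(jar))
    return opener, jar


# Building these objects does not touch the network; only opener.open() would.
opener, jar = make_cookie_opener()
put_request = MethodRequest("http://example.com/resource", verb="PUT", data=b"payload")
# response = opener.open(put_request)  # uncomment to actually send the request
```

If this much glue is all you need, urllib2 is probably enough. If you find yourself writing a lot more of it (form filling, link following, unusual protocols), that is roughly where mechanize or PycURL start to pay off.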

