Wednesday, March 16, 2022

Removing Dead Batteries From the Python Standard Library

PEP 594 (via Hacker News):

Back in the early days of Python, the interpreter came with a large set of useful modules. This was often referred to as “batteries included” philosophy and was one of the cornerstones to Python’s success story. Users didn’t have to figure out how to download and install separate packages in order to write a simple web server or parse email.

Times have changed. With the introduction of PyPI (née Cheeseshop), setuptools, and later pip, it became simple and straightforward to download and install packages.

[…]

On the other hand, Python’s standard library is piling up with cruft, unnecessary duplication of functionality, and dispensable features.

[…]

The modules in this PEP have been selected for deprecation because their removal is either least controversial or most beneficial.

I’m going to miss cgi/cgitb. CGI isn’t high-performance, but it makes it simple to deploy an endpoint as a single file. There doesn’t seem to be an obvious replacement.

Update (2022-03-23): Jake Edge (via Hacker News):

Comparing that table with the one in our article on the introduction of the PEP shows that the broad strokes are the same, but the details have changed somewhat. The removals were meant to be largely non-controversial, so if good reasons to keep a module were raised—and the maintenance burden was low—it was retained.

Chris Siebenmann:

Some of our CGIs are purely informational; they present some dynamic information on a web page, and don’t take any parameters or otherwise particularly interact with people. These CGIs tend to use cgitb so that if they have bugs, we have some hope of catching things. When these CGIs were written, cgitb was the easy way to do something, but these days I would log tracebacks to syslog using my good way to format them.
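
As a rough illustration of that alternative (not Siebenmann’s actual code, which isn’t shown here), a cgitb-free CGI could catch exceptions itself, log the traceback one line at a time, and return a generic error page. The names format_traceback, run_cgi, and make_syslog_logger are hypothetical, and the /dev/log socket path is an assumption that holds on typical Linux systems.

```python
import logging
import logging.handlers
import sys
import traceback


def format_traceback(exc):
    """Return the traceback for exc as a list of single-line strings."""
    chunks = traceback.format_exception(type(exc), exc, exc.__traceback__)
    # Each chunk may hold several lines; syslog copes better with one per message.
    return [line for chunk in chunks for line in chunk.rstrip("\n").split("\n")]


def run_cgi(main, logger):
    """Run main(); on failure, log the traceback and emit a generic 500 page."""
    try:
        main()
    except Exception as exc:
        for line in format_traceback(exc):
            logger.error("%s", line)
        # Assumes main() had not already written its own headers.
        sys.stdout.write("Status: 500 Internal Server Error\r\n")
        sys.stdout.write("Content-Type: text/plain\r\n\r\n")
        sys.stdout.write("Internal error; details were logged.\n")


def make_syslog_logger(name="mycgi", address="/dev/log"):
    """Build a logger that writes to the local syslog socket (path is an assumption)."""
    logger = logging.getLogger(name)
    logger.addHandler(logging.handlers.SysLogHandler(address=address))
    return logger
```

A script would then end with something like run_cgi(main, make_syslog_logger()). Unlike cgitb, this shows the visitor nothing about the failure, which is usually what you want on a public page.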

[…]

Others of our CGIs are interactive, such as the CGIs we use for our self-serve network access registration systems. These CGIs need to extract information from submitted forms, so of course they use the ever-popular cgi.FieldStorage class. As far as I know there is and will be no standard library replacement for this, so in theory we will have to do something here. Since we don’t want file uploads, it actually isn’t that much work to read and parse a standard POST body, or we could just keep our own copy of cgi.py and use it in perpetuity.
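
To give a sense of how little work that might be, here is a minimal sketch of reading a urlencoded POST body with urllib.parse.parse_qs instead of cgi.FieldStorage. It is an assumption-laden example, not the code from the quoted post: read_form is a hypothetical helper, the 64 KiB cap and the 100-field limit are arbitrary choices, and multipart/form-data (file uploads) is deliberately rejected.

```python
from urllib.parse import parse_qs

MAX_BODY = 64 * 1024  # arbitrary cap for this sketch; tune for your forms


def read_form(environ, stream, max_body=MAX_BODY):
    """Parse an application/x-www-form-urlencoded POST body into a dict of lists."""
    if environ.get("REQUEST_METHOD") != "POST":
        raise ValueError("expected a POST request")
    # Ignore parameters such as "; charset=utf-8" after the media type.
    ctype = environ.get("CONTENT_TYPE", "").split(";")[0].strip().lower()
    if ctype != "application/x-www-form-urlencoded":
        raise ValueError("unsupported content type")
    try:
        length = int(environ.get("CONTENT_LENGTH", ""))
    except ValueError:
        raise ValueError("missing or invalid Content-Length") from None
    if not 0 <= length <= max_body:
        raise ValueError("Content-Length out of range")
    # Read exactly the declared number of bytes; never trust the client for more.
    body = stream.read(length).decode("utf-8")
    return parse_qs(body, keep_blank_values=True, max_num_fields=100)
```

In a real CGI the call would be read_form(os.environ, sys.stdin.buffer), with import os and import sys added. Every failure raises ValueError (including undecodable bytes and too many fields), so the caller can answer with a single 400 response.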

Chris Siebenmann:

Unfortunately there are some dark sides to cgi.FieldStorage (apart from any bugs it may have), and in fairness I should discuss them. Overall, cgi.FieldStorage is probably safe for internal usage, but I would be a bit wary of exposing it to the Internet in hostile circumstances. The ultimate problem is that in the name of convenience and just working, cgi.FieldStorage is pretty trusting of its input, and on the general web one of the big rules of security is that your input is entirely under the control of an attacker.

1 Comment

My preference, Tcl, which has always been a "little language", suffers needlessly because of its lack of included batteries, IMO, so it's really fascinating to read the discussions. I just hope Python doesn't over-minimalise, even though I'm not a fan of the language. And I agree that Python really benefits by being useful out of the box, and having a package manager you could use to install useful modules. Unfortunately, though better than nothing, I think PIP is terrible (especially for binary modules) and given the choice I always prefer to use OS packages/ports to avoid its many frailties.
