Friday, June 23, 2017

Modern Python Dictionaries

Raymond Hettinger (slides):

Python’s dictionaries are stunningly good. Over the years, many great ideas have combined together to produce the modern implementation in Python 3.6.

This fun talk uses pictures and little bits of pure python code to explain all of the key ideas and how they evolved over time.

Includes newer features such as key-sharing, compaction, and versioning.

Brandon Rhodes:

Since my “Mighty Dictionary” talk at PyCon 2010, the Python dictionary has evolved dramatically. Come learn about all of the the improvements, up to and including the re-architecture that has just landed with Python 3.6! The talk will discuss iterable views, the dictionary’s dedicated comprehension syntax, random key ordering, the special key-sharing dictionary designed to underlie object collections, and, most famously of all, the new “compact dictionary” that cuts dictionary storage substantially — and carries a fascinating side-effect.

Each new feature that the talk discusses will be motivated by considering the trade-offs inherent in hash table data structure design, and followed up with hints about how you can now use the dictionary even more effectively in your own code!

Via Jake VanderPlas (via Hacker News):

One piece both mention is the addition in Python 3.6 of a private dictionary version to aid CPython optimization efforts […] “every dictionary has a version number, and elsewhere in memory a master version counter.”

PEP 509:

Python is hard to optimize because almost everything is mutable: builtin functions, function code, global variables, local variables, … can be modified at runtime. Implementing optimizations respecting the Python semantics requires to detect when “something changes”: we will call these checks “guards”.

The speedup of optimizations depends on the speed of guard checks. This PEP proposes to add a private version to dictionaries to implement fast guards on namespaces.

Previously: PyPy’s Hash Table Implementation, Python’s Dictionary Implementation.

2 Comments RSS · Twitter

Just watched Raymond Hettinger's talk. Most of it is basic algorithm for dictionary key internals, but it's a good summary, up to a few rather new concepts at the end. He didn't explain how Python 3.6 managed to end up with keys that are ordered in the original insertion order, though, or did he?

Ah - it turns out that the space-saving step of using another indirection is also the key to having ordered keys, as explained in Brandon Rhodes' talk.

Leave a Comment