Thursday, April 17, 2025

Performance of the Python 3.14 Tail-Call Interpreter

Nelson Elhage (via Hacker News):

Unfortunately, as I will document in this post, these impressive performance gains turned out to be primarily due to inadvertently working around a regression in LLVM 19. When benchmarked against a better baseline (such as GCC, clang-18, or LLVM 19 with certain tuning flags), the performance gain drops to 1-5% or so depending on the exact setup.

[…]

Historically, the optimization of replicating the bytecode dispatch into each opcode has been cited to speed up interpreters anywhere from 20% to 100%. However, on modern processors with improved branch predictors, more recent work finds a much smaller speedup, on the order of 2-4%.
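To make "replicating the bytecode dispatch into each opcode" concrete, here is a minimal sketch in C. It is not CPython's code: the three-opcode bytecode and the function names `run_switch` and `run_threaded` are invented for illustration, and the second function relies on the GCC/Clang "labels as values" extension (computed goto). The first version funnels every opcode through one shared dispatch point, while the second gives each opcode its own copy of the dispatch jump.

```c
#include <stddef.h>

/* Hypothetical three-opcode bytecode, invented for this sketch. */
enum { OP_INC, OP_DEC, OP_HALT };

/* Central dispatch: after each opcode, control returns to the one
 * shared switch, so a single indirect branch serves all opcodes. */
static int run_switch(const unsigned char *code) {
    int acc = 0;
    for (size_t pc = 0;; pc++) {
        switch (code[pc]) {
        case OP_INC:  acc++; break;
        case OP_DEC:  acc--; break;
        case OP_HALT: return acc;
        default:      return acc;  /* treat unknown opcodes as halt */
        }
    }
}

/* Replicated dispatch: each opcode body ends with its own indirect
 * jump to the next handler. Uses the GCC/Clang computed-goto
 * extension and assumes the program contains only valid opcodes. */
static int run_threaded(const unsigned char *code) {
    static void *const targets[] = { &&do_inc, &&do_dec, &&do_halt };
    int acc = 0;
    size_t pc = 0;

#define DISPATCH() goto *targets[code[pc++]]

    DISPATCH();
do_inc:
    acc++;
    DISPATCH();
do_dec:
    acc--;
    DISPATCH();
do_halt:
    return acc;

#undef DISPATCH
}
```

Both functions compute the same result for a well-formed program; the only difference is where the indirect branch lives. With one branch per opcode, an older branch predictor could learn per-opcode successor patterns, which is the usual explanation for the large historical speedups quoted above.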

[…]

Still, nix was clearly enormously helpful here, and on net it definitely made this kind of multi-version exploration and debugging much saner than any other approach I can imagine.

1 Comment


Python is slow.

It's just an iron law of nature. :)
