Base64 Encoding and Decoding With SIMD Instructions
Daniel Lemire (via Hacker News):
Alfred Klomp showed a few years ago that you could do much better using vector instructions. Wojciech Muła, myself and a few others (namely Howard and Kurz) decided to seriously revisit the problem. Muła has a web page on the topic.
We found that, in the end, you could speed things up by a factor of ten and use about 0.2 cycles per byte on recent Intel processors using vector instructions. That’s still more than a copy, but much less likely to ever be a bottleneck. I should point out that this 0.2 cycles per byte includes error handling: the decoder must decode and validate the input (e.g., if illegal characters are found, the decoding should be aborted).
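To make the "decode and validate" requirement concrete, here is a minimal scalar sketch in Python. It is not the vectorized method from the paper; it only illustrates the work a decoder has to do per character: map each input byte to a 6-bit value through a lookup table, and abort as soon as an illegal character appears. The names `DECODE_TABLE` and `decode_base64` are made up for this illustration, and the sketch assumes the standard RFC 4648 alphabet with `=` padding. It does not check that unused trailing bits are zero.

```python
import string

# Standard base64 alphabet (RFC 4648): A-Z, a-z, 0-9, '+', '/'.
ALPHABET = (
    string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
).encode("ascii")

# 256-entry lookup table: byte value -> 6-bit value, or -1 for illegal input.
DECODE_TABLE = [-1] * 256
for value, char in enumerate(ALPHABET):
    DECODE_TABLE[char] = value


def decode_base64(data: bytes) -> bytes:
    """Decode padded base64, raising ValueError on the first illegal character."""
    if len(data) % 4 != 0:
        raise ValueError("input length must be a multiple of 4")

    # Strip at most two trailing '=' padding characters.
    if data.endswith(b"=="):
        padding = 2
    elif data.endswith(b"="):
        padding = 1
    else:
        padding = 0
    body = data[: len(data) - padding]

    out = bytearray()
    for start in range(0, len(body), 4):
        chunk = body[start : start + 4]
        acc = 0
        for position, byte in enumerate(chunk):
            value = DECODE_TABLE[byte]
            if value < 0:
                # Validation: abort decoding on any character outside the alphabet.
                raise ValueError(f"illegal character at offset {start + position}")
            acc = (acc << 6) | value
        # Left-align a partial final group to 24 bits, then keep the
        # bytes that carry data: n characters yield n - 1 bytes.
        n = len(chunk)
        acc <<= 6 * (4 - n)
        out += acc.to_bytes(3, "big")[: n - 1]
    return bytes(out)
```

For example, `decode_base64(b"TWFu")` returns `b"Man"`, while `decode_base64(b"TW!u")` raises `ValueError` because `!` is not in the alphabet. A vectorized decoder has to do the same lookup and validation, only on many bytes per instruction.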
Our research code is available so you can reproduce our results. Our paper is available from arXiv and has been accepted for publication by ACM Transactions on the Web.