Saturday, November 12, 2016

Uber’s JSON Compression

Todd Hoff:

The whole experience is described in loving detail in the article: How Uber Engineering Evaluated JSON Encoding and Compression Algorithms to Put the Squeeze on Trip Data. They came up with a matrix of 10 encoding protocols (Thrift, Protocol Buffers, Avro, MessagePack, etc) and 3 compression libaries (Snappy, zlib, Bzip2). The target environment was Python. Uber went to an IDL approach to define and verify their JSON protocol, so they ended up only considering IDL solutions.

[…]

The conclusion: MessagePack with zlib. Encoding time: 4231 ms. Decoding: 715 ms. There was a 78% reduction in size relative to the JSON zlib combination.

[…]

Something to consider: don’t use JSON for messaging. The compression/decompression times are still dog slow. If you are going to use an IDL, which every grown up project eventually moves to for reliability and security reasons, consider not using JSON for messaging. Go for a binary protocol from the start.

Comments RSS · Twitter

Leave a Comment