Tuesday, September 5, 2023

How Various Image Formats Compress One-Pixel Images

Jon Sneyers (2016, tweet, Reddit):

However, actual image formats tend to have a “header” that contains quite a bit more information. First of all, the first few bytes of any image format contain a fixed identifier that is only there to say “Hey! I’m a file in this particular file format!”. This fixed sequence of bytes is also known as the magic number.
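
To make the magic-number idea concrete, here is a minimal sketch that identifies a file by its leading bytes. The signatures are the well-known ones for each format; the `sniff_format` helper and the `SIGNATURES` table are hypothetical names for illustration and are not from Sneyers's post.

```python
from typing import Optional

# Well-known leading byte signatures ("magic numbers").
# The table and the function name are illustrative, not from the quoted post.
SIGNATURES = [
    ("PNG", b"\x89PNG\r\n\x1a\n"),
    ("JPEG", b"\xff\xd8\xff"),
    ("GIF", b"GIF87a"),
    ("GIF", b"GIF89a"),
    ("BMP", b"BM"),
]


def sniff_format(data: bytes) -> Optional[str]:
    """Guess an image format from the first bytes of a file."""
    for name, magic in SIGNATURES:
        if data.startswith(magic):
            return name
    # WebP lives in a RIFF container: "RIFF", a 4-byte size, then "WEBP".
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "WebP"
    return None


print(sniff_format(b"\x89PNG\r\n\x1a\n" + b"\x00" * 8))
```

A real sniffer would read only the first dozen or so bytes of the file rather than the whole thing.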


Headers can contain all sorts of meta-information about an image. Some of it is format-specific information to indicate what kind of subformat is used, and is necessary to decode the pixels correctly. Some of it might not be necessary to decode the pixels, but is still useful to know how to render them – e.g. color profiles, orientation, gamma, or dots-per-inch.


Besides headers, image formats may have other kinds of “overhead”. They may contain all kinds of markers and checksums, intended to make the format more robust in case of transmission errors or other forms of corruption. Also, sometimes some kind of padding is required, to ensure that the data gets aligned properly.

One-pixel images – the smallest possible images – reveal exactly how much “overhead” there is in an image format.
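
As a rough illustration of that overhead, the sketch below assembles a one-pixel PNG by hand and compares the file size with the single byte of pixel data it carries. It assumes an 8-bit grayscale pixel and default zlib settings, so it is not necessarily the smallest PNG possible, and the helper names (`png_chunk`, `one_pixel_png`) are made up for this example.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the 8-byte magic number


def png_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Wrap data as a PNG chunk: length, type, data, CRC-32 checksum."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)


def one_pixel_png(gray: int = 0) -> bytes:
    """Build a 1x1, 8-bit grayscale PNG (an assumption made for this sketch)."""
    # IHDR: width, height, bit depth, color type, compression, filter, interlace.
    ihdr = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    # One scanline: a filter-type byte (0 = none) followed by the pixel value.
    idat = zlib.compress(bytes([0, gray]), 9)
    return (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", ihdr)
        + png_chunk(b"IDAT", idat)
        + png_chunk(b"IEND", b"")
    )


png = one_pixel_png(128)
print(f"{len(png)} bytes on disk for 1 byte of pixel data")
```

Each of the three chunks pays 12 bytes for its length, type, and checksum fields before any of its own data is counted, which is the kind of marker-and-checksum overhead described above.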

Jon Sneyers:

In this second part of the blog post, we go to the other extreme: extremely predictable images.

The most predictable image is a large rectangle in a single color.


The uncompressed PBM format obviously has a file size that is (asymptotically) linear in the number of pixels (1 bit per pixel in this case). But JPEG and lossy WebP are also linear in the number of pixels (quadratic in the width of the square) – just with a better constant factor. In other words, they seem to have some inevitable cost per pixel. For JPEG, it looks like you need at least 2 bits per 8×8 macroblock.
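
The arithmetic behind those two linear curves can be sketched as follows. This assumes the raw (`P4`) variant of PBM with a minimal header, and it takes the quoted figure of 2 bits per 8×8 block at face value as a floor on the JPEG scan data, ignoring JPEG headers entirely. Both function names are hypothetical.

```python
def pbm_raw_size(width: int, height: int) -> int:
    """Size in bytes of a raw (P4) PBM: a short text header plus packed bits."""
    header = f"P4\n{width} {height}\n".encode("ascii")
    row_bytes = (width + 7) // 8  # each row is padded to a whole byte
    return len(header) + row_bytes * height


def jpeg_block_floor_bytes(width: int, height: int, bits_per_block: int = 2) -> int:
    """Lower bound on scan data, assuming a fixed cost per 8x8 block.

    The default of 2 bits per block is the estimate from the quoted post,
    not a figure from the JPEG specification.
    """
    blocks = ((width + 7) // 8) * ((height + 7) // 8)
    return (blocks * bits_per_block + 7) // 8


for side in (256, 1024, 4096):
    print(side, pbm_raw_size(side, side), jpeg_block_floor_bytes(side, side))
```

Both columns grow by roughly a factor of 16 each time the side length quadruples, which is what "linear in the number of pixels" means here; only the constant differs.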


The PNG curve goes in a more or less straight line, with some ‘coughs’ and ‘jumps’ around powers of two (1024, 2048, 4096) which might be due to the changing behavior of the underlying zlib compression at such boundary points.
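
One way to poke at the zlib side of this is to compress the scanline data of a solid-color square directly and watch how the output grows. This is only a sketch under stated assumptions: 8-bit grayscale, filter type 0 on every row, and zlib level 9. Real PNG encoders pick their own filters, bit depths, and settings, so the numbers will not reproduce the curve from the post. The function name is hypothetical.

```python
import zlib


def solid_square_stream_sizes(side: int, gray: int = 255) -> tuple:
    """Return (raw, compressed) byte counts for a solid square's scanlines."""
    # Each scanline is a filter-type byte (0 = none) followed by `side` pixels.
    scanline = b"\x00" + bytes([gray]) * side
    raw = scanline * side
    return len(raw), len(zlib.compress(raw, 9))


for side in (256, 512, 1024, 2048):
    raw_size, packed_size = solid_square_stream_sizes(side)
    print(f"{side:5d}px  raw={raw_size:8d}  deflated={packed_size:6d}")
```

Because a single deflate match can cover at most 258 bytes, even perfectly repetitive input cannot compress to a constant size, which fits the steadily rising PNG curve.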
