Understanding image compression

There are many ways to compress an image before serving it on the web, and plenty of tools that can help with that. But how does image compression actually work? And why are there so many types of image compression?

In this talk, Andi will explain how different image compression algorithms work, which cases each is best suited to, and why we need all of them instead of a single compression algorithm.

Andi Tjong, Developer, Atlassian

There are many different forms of image compression. Andi is going to look at

  • the different formats
  • how important image compression is
  • why the different types exist
  • what their use cases are

Starting with the question “how important is it?”.

Example: an RGB photo with ~2 million pixels. Uncompressed this would be ~6 MB, but with JPEG compression it could be ~40 KB – a ~99% saving.
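
The arithmetic behind that example can be checked quickly. This is a minimal sketch, assuming 3 bytes per pixel (8 bits for each of the R, G and B channels) and decimal units; the pixel count and JPEG size are the rounded figures from the talk, and the function names are made up for illustration.

```python
# Assumption: 8 bits per channel, 3 channels (RGB), no alpha channel.
BYTES_PER_PIXEL = 3


def uncompressed_size(pixels: int, bytes_per_pixel: int = BYTES_PER_PIXEL) -> int:
    """Raw size in bytes of an uncompressed bitmap."""
    return pixels * bytes_per_pixel


def saving(original: int, compressed: int) -> float:
    """Fraction of the original size removed by compression."""
    return 1 - compressed / original


raw_bytes = uncompressed_size(2_000_000)  # about 6 MB
jpeg_bytes = 40_000                       # about 40 KB (rounded figure from the talk)
print(f"raw: {raw_bytes / 1e6:.1f} MB, saving: {saving(raw_bytes, jpeg_bytes):.1%}")
```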

About half of web page data is images, so reducing that size has a significant effect.

Why are there so many compression algorithms? It’s partly history – people find new and better ways to compress data, or have a specific use case to solve. There also isn’t a universal definition of compression – the phrase “first million digits of pi” is a compression of the 1 million actual digits! It’s a different representation of the same information. You need to understand the content before you can understand how best to compress it.

Image formats – will focus on the most common formats PNG, JPG and GIF.

Graphics Interchange Format (GIF)

  • created in 1987, last update in 1989
  • good for low colour count images – logos, cartoons
  • bad for photos, as reducing them to so few colours introduces a lot of noise
  • uses quantisation (lossy) and LZW compression (lossless)
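
The LZW step named in the list above can be sketched as follows. This is a simplified illustration, not the GIF encoder itself: it returns plain integer codes and leaves out the variable-width code packing, clear codes and end-of-information codes that the GIF format adds on top. The function names are hypothetical.

```python
def lzw_compress(data: bytes) -> list[int]:
    """Return LZW codes for `data`; the dictionary starts with all 256 byte values."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate            # keep extending the known sequence
        else:
            codes.append(table[current])   # emit the longest known sequence
            table[candidate] = next_code   # learn the new, longer sequence
            next_code += 1
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> bytes:
    """Rebuild the original bytes, reconstructing the dictionary on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        out += entry
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)


# Long runs of two palette indices, similar to a row of a flat-colour logo.
row = bytes([7] * 40 + [3] * 40)
codes = lzw_compress(row)
print(len(row), "bytes ->", len(codes), "codes")
```

Rows with long runs of the same palette index need far fewer codes than bytes, which fits the point that GIF suits logos and cartoons better than photos.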

Quantisation reduces images to 256 colours; it may also apply dithering to soften the hard edges quantisation can introduce. When GIF was created, this was suitable for the kind of hardware people were using.
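
As a rough illustration of quantisation, the sketch below maps 24-bit RGB pixels onto a fixed 256-entry palette. The 3-3-2 bit split and the function names are assumptions chosen for brevity; GIF encoders generally build a palette adapted to each image, and dithering is not shown here.

```python
def quantise_332(pixel: tuple[int, int, int]) -> int:
    """Map a 24-bit RGB pixel to an 8-bit palette index (3 bits R, 3 bits G, 2 bits B)."""
    r, g, b = pixel
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def palette_colour(index: int) -> tuple[int, int, int]:
    """Return the representative RGB colour for a palette index."""
    r = (index >> 5) & 0b111
    g = (index >> 2) & 0b111
    b = index & 0b11
    # Scale each level back to the 0..255 range.
    return (r * 255 // 7, g * 255 // 7, b * 255 // 3)


pixels = [(255, 0, 0), (250, 5, 3), (0, 0, 255)]
indices = [quantise_332(p) for p in pixels]
# The two slightly different reds collapse onto the same palette entry.
print(indices, [palette_colour(i) for i in indices])
```

Nearby colours landing on the same palette entry is what produces the hard edges (banding) that dithering tries to soften.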

The other big use case for GIF is animation – but the images get very big. In fact a lot of “gifs” you see online are actually videos.

GIFs only have intraframe compression, while videos can use interframe compression – the video only stores the difference between successive frames.
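
The idea of storing only the difference between frames can be sketched with a toy example. This is a deliberately naive illustration with hypothetical function names: real video codecs use motion compensation and residual coding rather than a list of changed pixels.

```python
Frame = list[int]  # one palette index or grey value per pixel


def frame_delta(previous: Frame, current: Frame) -> list[tuple[int, int]]:
    """Return (position, new_value) pairs for the pixels that changed."""
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(previous, current))
        if old != new
    ]


def apply_delta(previous: Frame, delta: list[tuple[int, int]]) -> Frame:
    """Rebuild the current frame from the previous frame plus the stored changes."""
    frame = list(previous)
    for position, value in delta:
        frame[position] = value
    return frame


frame_1 = [0] * 100          # a blank 10x10 frame
frame_2 = list(frame_1)
frame_2[42] = 9              # a single pixel changes in the next frame
delta = frame_delta(frame_1, frame_2)
print(len(frame_2), "pixels in the frame,", len(delta), "stored in the delta")
```

When little changes between frames, the delta is far smaller than a second full frame, which is the saving that intraframe-only compression misses.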

So in the end, you should probably use video for animation and PNG for static images.

The Portable Network Graphics (PNG) format was created in 1996 to get around the LZW compression patent. It has several modes:

  • Indexed, aka PNG8
  • Greyscale
  • RGB, aka PNG24
  • Greyscale + alpha
  • RGB + alpha, aka PNG32

Key PNG use cases:

  • saving lossless images – PNG uses filtering + deflate compression (the same algorithm as gzip); both steps are lossless
  • images with a low colour count – PNG8 replaces GIF for small static images, with smaller file sizes. Use this format if you can, as you get good results.
  • partial transparency – PNG32 or PNG8 + alpha. GIFs can’t do partial transparency, PNG can.
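
The filtering + deflate combination mentioned in the list above can be sketched with Python's standard zlib module. This is a simplified illustration: it applies only the "Sub" filter to every row, whereas a PNG encoder chooses among five filter types per scanline and records the choice in a leading byte. The gradient test data and function names are assumptions.

```python
import zlib


def sub_filter(row: bytes, bpp: int = 1) -> bytes:
    """PNG 'Sub' filter: each byte minus the byte `bpp` positions to its left, modulo 256."""
    return bytes(
        (row[i] - (row[i - bpp] if i >= bpp else 0)) % 256
        for i in range(len(row))
    )


def sub_unfilter(filtered: bytes, bpp: int = 1) -> bytes:
    """Reverse the Sub filter, recovering the original bytes exactly."""
    out = bytearray()
    for i, value in enumerate(filtered):
        left = out[i - bpp] if i >= bpp else 0
        out.append((value + left) % 256)
    return bytes(out)


# Hypothetical test image: a smooth 256x64 greyscale gradient, one byte per pixel.
rows = [bytes((x + 7 * y) % 256 for x in range(256)) for y in range(64)]
raw = b"".join(rows)
filtered = b"".join(sub_filter(r) for r in rows)

print("deflate only:     ", len(zlib.compress(raw)), "bytes")
print("filter + deflate: ", len(zlib.compress(filtered)), "bytes")
```

Filtering does not shrink the data by itself. It turns a smooth gradient into long runs of identical small values, which deflate then compresses much better, and both steps can be reversed exactly.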

Joint Photographic Experts Group (JPEG)

  • created in 1992
  • uses a lossy compression method
  • based on Discrete Cosine Transform (DCT)

JPG is good for photographic images and other images with high colour counts.
Don’t use JPG for low colour count images as it introduces artefacts/noise; use PNG instead.

How JPG works

  • colour model conversion
  • chroma subsampling
  • block splitting – splits the image into 8×8 blocks of pixel values
  • DCT – represents each 8×8 block as a weighted sum of 64 fixed cosine patterns
  • quantisation
  • entropy coding
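
The first two steps in the list above can be sketched as follows. The conversion uses the full-range YCbCr formulas commonly used with JPEG files (JFIF); the 4:2:0 averaging scheme and the function names are assumptions for illustration, since the talk does not say which subsampling ratio is used.

```python
def rgb_to_ycbcr(r: float, g: float, b: float) -> tuple[float, float, float]:
    """Convert full-range RGB (0..255) to luma (Y) and two chroma channels (Cb, Cr)."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_420(plane: list[list[float]]) -> list[list[float]]:
    """Average each 2x2 block of a chroma plane; assumes even width and height."""
    return [
        [
            (plane[y][x] + plane[y][x + 1] + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
            for x in range(0, len(plane[0]), 2)
        ]
        for y in range(0, len(plane), 2)
    ]


# A 4x4 chroma plane shrinks to 2x2, a quarter of the samples.
cb_plane = [[float(x + y) for x in range(4)] for y in range(4)]
small = subsample_420(cb_plane)
print(len(cb_plane) * len(cb_plane[0]), "samples ->", len(small) * len(small[0]))
```

Only the chroma planes are reduced; the luma plane stays at full resolution because the eye is more sensitive to brightness detail than to colour detail.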

In short, it tries to remove the details that we don’t miss as much. But because it works in blocks, it adds noise to images with crisp lines – hard edges are approximated poorly within each block once the fine detail has been discarded.
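
To see why crisp lines are costly, the sketch below runs a DCT and a quantisation step on two 8×8 blocks: a flat one and one with a hard edge. It is a minimal illustration, assuming a single uniform quantisation step of 16 where JPEG uses a per-frequency table; the helper names are hypothetical.

```python
import math

N = 8  # JPEG block size


def dct_2d(block: list[list[float]]) -> list[list[float]]:
    """Orthonormal 2-D DCT-II of an 8x8 block of level-shifted samples."""
    def alpha(k: int) -> float:
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (
                        block[x][y]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            out[u][v] = alpha(u) * alpha(v) * total
    return out


def quantise(coeffs: list[list[float]], step: int = 16) -> list[list[int]]:
    """Uniform quantisation (an assumption; JPEG uses a table of per-frequency steps)."""
    return [[round(c / step) for c in row] for row in coeffs]


def nonzero(quantised: list[list[int]]) -> int:
    """Count the coefficients that survive quantisation."""
    return sum(1 for row in quantised for c in row if c != 0)


# Samples are shifted by -128 so they are centred on zero, as JPEG does.
flat = [[200 - 128] * N for _ in range(N)]
edge = [[(255 if y >= 3 else 0) - 128 for y in range(N)] for _ in range(N)]

print("flat block:", nonzero(quantise(dct_2d(flat))), "non-zero coefficient(s)")
print("edge block:", nonzero(quantise(dct_2d(edge))), "non-zero coefficient(s)")
```

The flat block needs a single coefficient, while the hard edge spreads energy across several frequencies. Quantising those coefficients more coarsely is what produces the visible noise around crisp lines.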

What about other image formats? WebP, JPEG 2000, HEIC, AVIF, BPG… they have better compression so why aren’t they being used? Basically, they’re not widely supported in browsers.

By understanding the different image compression algorithms, you can understand which format to use (and which tools can help you reduce file sizes).