Understanding image compression

Andi Tjong at Code 2020

Transcript

(upbeat music) - Hi everyone, my name is Andi and I'm a developer at Atlassian.

So today I'm gonna be talking about image compression. So to begin with, there are actually a lot of ways you can compress an image before saving it on a website.

So for example, by using a different image format, you're actually using a different image compression. However, how important is image compression? And why are there so many types of image format or compression? What are their use cases and how do they work? So these questions will be the main focus of my talk today. I'm gonna start with the first question, which is, how important is image compression? So imagine I want to send you this picture of a cat, and this cat picture contains roughly 2 million pixels.

And then if I'm sending all these pixels as raw binary values, which means I'm storing each pixel value as one byte, it's gonna roughly equal something like this. So one pixel will be equal to one byte per channel, and since it's using RGB, that means there are three channels.

So in total, I will need to send six megabytes to send this picture.

However, if I'm using JPEG compression to compress this image, I can save this image as small as 40.3 kilobytes. So that's roughly around 99.35% of savings, and that's a really big saving.
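The arithmetic behind these figures can be checked with a short sketch. The pixel count is the rounded "roughly 2 million" from the talk and the kilobyte is assumed to be 1000 bytes, so the result lands near the quoted saving rather than exactly on it.

```python
# Back-of-the-envelope size of the uncompressed cat picture.
PIXELS = 2_000_000        # "roughly 2 million pixels" (rounded)
CHANNELS = 3              # red, green, blue
BYTES_PER_CHANNEL = 1     # one byte per channel value

raw_bytes = PIXELS * CHANNELS * BYTES_PER_CHANNEL
jpeg_bytes = 40.3 * 1000  # 40.3 kilobytes, assuming 1 kB = 1000 bytes

# Fraction of the raw size that no longer has to be sent.
saving = 1 - jpeg_bytes / raw_bytes

print(f"raw size: {raw_bytes / 1_000_000:.1f} MB")
print(f"saving:   {saving:.2%}")
```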

So that's why image compression is important, because you can save a lot of money and data by doing compression before you send it. And on top of that, 50% of the bytes on web pages are actually coming from images.

So that's a big proportion, and if we manage to optimise this part, we will be able to reduce the web page size significantly. Moving on to the next question, why are there many types of image format? So as you can see here on this screen, these are some of the image formats that you may have seen before, and as you can see, there are quite a few of them. So every single one of these image formats is using a different combination of data compression algorithms.

So which means there are more data compression algorithms out there.

So the question is, why are there multiple compression algorithms? So the first reason is historical. As time goes on, people invent better ways to store data, and then formats are also created for specific use cases. And part of the reason for this is that there is no universal compression algorithm. Compression is both an art and an artificial intelligence problem.

It requires you to understand the content if you want to maximise the compression. And if the compressor is actually intelligent enough, it can understand the content.

It can compress this 1-million-digit-long number into five English words, which is "first million digits of pi". Now this counts as compression because I can transform 1 million digits into five words, which means I'm reducing the size that I need to save this number by using another representation. And by understanding the content, you will have more opportunity to actually minimise the size of the content that you want to compress.

So that's why compression is kinda like an AI problem. So I'm gonna move on to the next section, which is image formats.

So in this talk, I'm gonna be focusing mostly on the most popular image formats on the web.

So as you can see here on the slides, I have PNG, JPEG and GIF.

So these are the image formats that I'll be focusing on. So the first image format is GIF, Graphics Interchange Format. I'm gonna be talking about the use cases of GIF, and then when not to use it.

And then why that is.

So a bit of history of GIF.

GIF was created in 1987, and then the latest release was in 1989.

So that's like 31 years ago.

So GIF is actually a pretty old image format. So the first use case of GIF is low colour count images, for example, a logo or a cartoon, because it's not so good for high colour count images because of the noise.

To give you an example here.

I have a comparison of an image with a low colour count and one with a high colour count. As you can see here on the left side, the low colour count image has sharp colours, while on the right side, it's pretty noisy. And why is that, why is GIF only good for a low colour count? To answer this question, I'm gonna go through a bit of how GIF works. So in GIF, there are two steps.

The first one is Quantization.

And then the next one is the LZW algorithm, which is a simple lossless data compression algorithm. Now in the first step, the quantization, GIF will convert a lot of colours into a maximum of 256 colours inside the palette. So it roughly looks something like this.

If I have a picture with 6 million colours, it's going to convert it into a maximum of 256 colours inside the palette. So that's why GIF is not so good for high colour count images, because it converts them into just a maximum of 256 colours. And on top of that, there is another step that we can apply in GIF, which is called dithering.

So it's going to apply intentional noise to the image. So this is our image previously.

And then if we apply Dithering, it's going to look something like this.

It matches the original image more closely, but at the same time it introduces this noise. So this is the reason why high colour count images in GIF have noise.
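As a rough illustration of these two ideas, here is a minimal sketch. It is not what any particular GIF encoder does: the fixed 3-3-2 bit palette and the one-dimensional error diffusion are simplifying assumptions (real encoders usually build a palette adapted to the image and diffuse the error in two dimensions), and all function names are made up for this example.

```python
def quantize_332(rgb):
    """Map a 24-bit colour to one of 256 palette indexes (3+3+2 bits)."""
    r, g, b = rgb
    return (r >> 5) << 5 | (g >> 5) << 2 | (b >> 6)


def palette_colour(index):
    """Return the 24-bit colour that a palette index stands for."""
    r = (index >> 5) & 0b111
    g = (index >> 2) & 0b111
    b = index & 0b11
    return (r * 255 // 7, g * 255 // 7, b * 255 // 3)


def to_black_white(row, dither):
    """Reduce a row of grey values to a two-colour palette (0 or 255).

    With dither=True the rounding error of each pixel is pushed onto
    the next pixel, which is the intentional noise described above.
    """
    out, error = [], 0
    for value in row:
        wanted = value + (error if dither else 0)
        chosen = 255 if wanted >= 128 else 0
        error = wanted - chosen
        out.append(chosen)
    return out


# Two slightly different colours collapse into the same palette entry.
print(quantize_332((200, 100, 50)), quantize_332((210, 110, 60)))

# A flat dark-grey row: without dithering it turns fully black, with
# dithering it becomes a black/white mix whose average stays near 100.
row = [100] * 64
plain = to_black_white(row, dither=False)
dithered = to_black_white(row, dither=True)
print(sum(plain) / len(plain), sum(dithered) / len(dithered))
```

The dithered row is closer to the original on average, but every single pixel is now either black or white, which is where the visible noise comes from.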

And the reason for quantization is simply because GIF was created in the early computer ages.

So hardware wasn't that good back then.

So that's why.

And back then 256 colours was good enough, but nowadays it's not a limitation anymore. And the other use case for GIF is probably the most popular one.

You want to use it for animation.

However, keep in mind when you're using GIF for animation, the file size is much bigger compared to the other format.

In fact, some GIF-like images on the websites that you have seen, for example, on Reddit or Giphy, they're actually video or WebP, because the GIF size is just so big.

So, this is an example of the similar looking image in GIF and MP4 format.

As you can see here, the GIF is 18 times bigger than MP4.

So that's a lot of size wasted by saving animation in GIF. And the reason for this is because GIF only has this concept called intraframe compression, where every frame is recorded individually, and then it only compresses internally inside the frame. So the frames don't talk to each other, while in the video formats, there is this concept called interframe compression, where it only stores the differences between frames. So that's why, by doing this, video formats save a lot of size. So always try to use another format if you want to store your animation.
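The difference between the two approaches can be sketched with synthetic frames. This is only an illustration under stated assumptions: `zlib` stands in for the real codecs (GIF uses LZW and video formats use far more elaborate motion prediction), and the frames are a made-up noisy background with a small white square moving across it.

```python
import random
import zlib

WIDTH, HEIGHT, FRAMES = 64, 64, 10


def make_frames(seed=0):
    """Noisy static background with a 4x4 white square moving right."""
    rng = random.Random(seed)
    background = bytearray(rng.randrange(256) for _ in range(WIDTH * HEIGHT))
    frames = []
    for t in range(FRAMES):
        frame = bytearray(background)
        for y in range(4):
            for x in range(4):
                frame[y * WIDTH + t * 4 + x] = 255
        frames.append(bytes(frame))
    return frames


def intraframe_size(frames):
    """Compress every frame on its own, as described for GIF."""
    return sum(len(zlib.compress(frame)) for frame in frames)


def interframe_size(frames):
    """Compress the first frame, then only the change from frame to frame."""
    total = len(zlib.compress(frames[0]))
    for previous, current in zip(frames, frames[1:]):
        delta = bytes((c - p) & 0xFF for p, c in zip(previous, current))
        total += len(zlib.compress(delta))
    return total


frames = make_frames()
print("intraframe:", intraframe_size(frames), "bytes")
print("interframe:", interframe_size(frames), "bytes")
```

Because almost every pixel is unchanged between frames, each delta is mostly zeros and compresses to very little, while the per-frame approach pays for the whole background again every time.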

So to summarise everything about GIF, GIF is an old image format.

So it has the maximum of 256 colours in the quantization step because of legacy issues. It has animation support, however always prefer other image formats if possible. So use WebP or video for animation, and for other use cases, try to use PNG, because PNG was actually created to replace GIF. Which brings us to PNG, the next image format that I'm going to talk about. So PNG, Portable Network Graphics. A bit of history again: it was created in 1996, and it was actually created because of patent issues with GIF.

So GIF was using the LZW compression algorithm, and it had a patent issue.

So that's why PNG was created as the non-patented replacement.

So before I jump into the use cases of PNG, there is this thing that I need to tell you about, PNG modes.

So there are actually five different types of PNG inside PNG itself, and that's called the pixel format, which is what is stored per pixel, and there are five of them.

So the first one we have index, which is maximum 256 colours.

This is normally called PNG-8, and this is pretty much similar to GIF, where you have an image and then you have a palette with a maximum of 256 colours. The second one is the grayscale image, where every pixel will be equal to a number from zero up to 255.

So that represents a grayscale image.

The third one is RGB, with a maximum of 16 million colours. And this is normally called PNG-24.

So every pixel in this image format will be equal to a red, plus green, plus blue channel.

And then the fourth one is grayscale plus alpha. The fifth one is RGB plus alpha, which is RGBA or PNG-32.

So it's simply adding an alpha channel, which is the transparency channel, on top of the existing pixel mode.
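To make the five modes concrete, the sketch below tabulates how many bytes each one stores per pixel before any compression. It assumes 8 bits per channel throughout (PNG also allows other bit depths) and a full 256-entry RGB palette for the indexed mode; the dictionary and function names are invented for this example.

```python
# Bytes stored per pixel in each mode, assuming 8 bits per channel.
BYTES_PER_PIXEL = {
    "indexed (PNG-8)": 1,        # one palette index
    "grayscale": 1,              # one grey level, 0 to 255
    "RGB (PNG-24)": 3,           # red + green + blue
    "grayscale + alpha": 2,      # grey level + transparency
    "RGB + alpha (PNG-32)": 4,   # red + green + blue + transparency
}

PALETTE_BYTES = 256 * 3  # up to 256 palette entries of 3 bytes each


def raw_size(mode, pixels):
    """Uncompressed pixel data size in bytes for the given mode."""
    size = BYTES_PER_PIXEL[mode] * pixels
    if mode.startswith("indexed"):
        size += PALETTE_BYTES  # the palette is stored once per image
    return size


for mode in BYTES_PER_PIXEL:
    print(f"{mode:22} {raw_size(mode, 2_000_000):>10,} bytes")
```

For the 2-million-pixel example from earlier, the indexed mode starts from roughly a quarter of the data that RGB plus alpha does, before the compression step even runs.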

Now, when do you want to use PNG? Use case number one is when you want to save an image in a lossless format. So, which means there is no data discarded when you are compressing the image.

And the reason for that is because this is how PNG works.

PNG has two steps.

The first one is filtering and then the other one is deflate compression. So deflate compression is actually the same algorithm that is used in GZIP compression and zip files.

So that's why if you're trying to zip or GZIP a single PNG file, it's not going to work well, because they are both using the same algorithm inside. So these two data compression algorithms, they're lossless compression.

So that's why the output will be lossless.

So if you want to have a lossless image, then use PNG.
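The two steps can be sketched with Python's `zlib` module, which implements deflate. This is a simplified illustration rather than a PNG encoder: it assumes an 8-bit grayscale image, applies only the "Sub" filter (each byte minus the byte to its left), and leaves out the per-row filter-type byte and the other filter types that real PNG files use.

```python
import zlib

WIDTH, HEIGHT = 256, 64
# A horizontal gradient: every row runs from 0 (black) to 255 (white).
rows = [bytes(range(WIDTH)) for _ in range(HEIGHT)]


def sub_filter(row):
    """Replace each byte with its difference from the byte on its left."""
    return bytes((row[i] - (row[i - 1] if i else 0)) & 0xFF
                 for i in range(len(row)))


def sub_unfilter(filtered):
    """Undo sub_filter by adding the differences back up."""
    out, left = bytearray(), 0
    for delta in filtered:
        left = (left + delta) & 0xFF
        out.append(left)
    return bytes(out)


raw = b"".join(rows)
filtered = b"".join(sub_filter(row) for row in rows)

raw_deflated = zlib.compress(raw)            # deflate only
filtered_deflated = zlib.compress(filtered)  # filtering, then deflate

# Both steps are reversible, so the original pixels come back exactly.
unpacked = zlib.decompress(filtered_deflated)
restored = [sub_unfilter(unpacked[y * WIDTH:(y + 1) * WIDTH])
            for y in range(HEIGHT)]

print("deflate only:    ", len(raw_deflated), "bytes")
print("filter + deflate:", len(filtered_deflated), "bytes")
print("lossless:", restored == rows)
```

On a smooth gradient the filtered rows are almost all the same small number, which deflate handles very well, and decoding returns exactly the bytes that went in.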

The next use case for PNG is images with a low colour count.

For example, a logo or a cartoon.

However, this is best if you are using PNG-8 format for this.

So it's actually replacing GIF's use cases for static images, simply because it has similar characteristics, which is a maximum of 256 colours, with a smaller file size. So this is the same looking image compared between GIF and PNG-8.

And as you can see here, the GIF is 2.8 times bigger than the PNG-8. And this is simply because PNG compresses better than GIF, and GIF is older so it has more limitations in its algorithm. The next use case for PNG is probably one of the most popular ones, which is partial or full transparency.

However, this only applies for PNG-32 or PNG-8 plus alpha format.

So this is what I mean by partial transparency. And then this is the opposite, which is called binary transparency, which is zero or one transparency, and GIF only supports the latter one.

So which means a pixel can either be transparent or not transparent, while in PNG that's not the case. And this is simply because PNG has an alpha channel. So it has zero up to 255 transparency levels.
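What the alpha value does when a pixel is drawn over a background can be sketched as a simple blend. The `composite` function below is a hypothetical helper using the usual "alpha over" formula, not something taken from the talk or from a specific library.

```python
def composite(foreground, background, alpha):
    """Blend one RGB pixel over another; alpha is 0 to 255."""
    a = alpha / 255
    return tuple(round(a * f + (1 - a) * b)
                 for f, b in zip(foreground, background))


RED, BLUE = (255, 0, 0), (0, 0, 255)

# Binary transparency: a pixel is either fully hidden or fully shown.
print(composite(RED, BLUE, 0))    # only the background is visible
print(composite(RED, BLUE, 255))  # only the foreground is visible

# Partial transparency: any level in between mixes the two colours.
print(composite(RED, BLUE, 128))
```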

Now there are five pixel modes for PNG, which one do you want to use for the website? So I have a similar looking image here.

One is in the PNG-32, and then the other one is in the PNG-8.

And if you see here, the size of PNG-32 is 2.5 times bigger than PNG-8.

So, always try to use PNG-8 for web purposes, because it has the smallest file size. And PNG-32 is not only used for web purposes. So that's why it's bigger.

For example, if you want to use it for 3D modelling, it's better to use PNG-32.

So that's why, if we want to optimise for web purposes, use PNG-8. And in fact, tools like ImageAlpha,

if you have seen it before, that's what it's doing.

It's going to convert your PNG-24 or PNG-32 image into the PNG-8 plus alpha format. So that's why it can keep the size down for you. However, this is a lossy operation.

Just keep that in mind, because you are limiting the total colours from 16 million to just 256 colours.

So to summarise everything about PNG, PNG has five modes.

And PNG-8 is always the smallest file size. So always try to use it as much as possible for web purposes. It supports lossless compression.

It has partial transparency support.

However it doesn't have animation like GIF. There is another format called APNG.

However it's not that popular, so you can use WebP or video for animation. The next one I have is JPEG, Joint Photographic Experts Group.

So a bit of history again, JPEG was created in 1992, and it uses a lossy compression method. So that means during the compression process, there is some data that is discarded.

The core of JPEG is called the Discrete Cosine Transform, or DCT.

And it was created in 1972 by Nasir Ahmed. So, use cases for JPEG.

You always want to use JPEG for photographic images, for images with a high colour count, because for low colour count images it's not a particularly good format, for example, for a logo or a cartoon, simply because the file size is bigger and then there are noise artefacts.

So here's another example of JPEG versus PNG. As you can see here in the photographic looking image, the JPEG size is 14 times smaller than PNG-8.

So that's a lot of difference in terms of size when you are saving photographic image.

So you always want to use JPEG for that.

While on the other hand for low colour count image like this, as you can see here, JPEG is 1.3 times bigger than PNG-8.

It's not really that big a difference, but if you zoom in, you will actually see there is noise in the JPEG image while it doesn't exist in the PNG image.

So always try to use PNG if you want to use it for low colour count image.

But why is that, why is JPEG only good for a high colour count and not for a low colour count? So to answer that I'm gonna go through how it works, and this is how JPEG works, which is more complicated than the other image formats. But we are just gonna focus mainly on these three, which are Block Splitting, Discrete Cosine Transform and Quantization.

So block splitting is pretty straightforward. JPEG will take an image and then it will split it into a bunch of eight by eight pixel blocks.

And then for every eight by eight pixel block, it's going to transform it into numbers.

So zero equals black and 255 equals white.

And then the next step is Discrete Cosine Transform, which is the core of JPEG.

So in Discrete Cosine Transform, there is this pattern table and it contains 64 total patterns.

So by using a combination of these patterns, layered on top of each other, it will be able to construct our original eight by eight pixel block.

So this is how it works.

First, we will take the first pattern multiplied by a number, take the second pattern multiplied by another number, and so on and so on for all the patterns.

And then after the multiplication, JPEG is going to sum all of them, and then it's going to become our original eight by eight pixel block.

So using this formula.

So this formula is basically the Discrete Cosine Transform formula.

We will be able to get this number.

So it roughly looks something like this, where we have a number for every single pattern. And then these numbers are going to be put inside a matrix called the DCT coefficient matrix.
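The idea of one number per pattern can be sketched in plain Python using the standard 8 by 8 DCT-II formula. The sample block is made up, the function names are invented for this example, and the step where real JPEG encoders subtract 128 from every pixel before the transform is left out to keep the sketch short.

```python
import math

N = 8  # JPEG works on 8 by 8 pixel blocks


def scale(k):
    """Normalisation factor of the DCT: 1/sqrt(2) for k = 0, else 1."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def pattern(u, v, x, y):
    """Value of pattern (u, v) from the pattern table at pixel (x, y)."""
    return (math.cos((2 * x + 1) * u * math.pi / (2 * N))
            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))


def dct(block):
    """Find the number that belongs to each of the 64 patterns."""
    return [[0.25 * scale(u) * scale(v)
             * sum(block[x][y] * pattern(u, v, x, y)
                   for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def idct(coeffs):
    """Multiply every pattern by its number and sum them all up."""
    return [[0.25 * sum(scale(u) * scale(v) * coeffs[u][v]
                        * pattern(u, v, x, y)
                        for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


# A made-up smooth block with values between 0 (black) and 255 (white).
block = [[(x * 20 + y * 10) % 256 for y in range(N)] for x in range(N)]

coeffs = dct(block)    # the DCT coefficient matrix
rebuilt = idct(coeffs)  # the weighted sum of all 64 patterns

worst = max(abs(rebuilt[x][y] - block[x][y])
            for x in range(N) for y in range(N))
print("top-left coefficient:", round(coeffs[0][0], 3))
print("largest rebuild error:", worst)
```

Up to floating point rounding, summing the weighted patterns gives back the original block, so the transform on its own does not discard anything.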

Now, at this point, it's going to move on to the next step, which is quantization.

So quantization is the process to reduce high frequency components because our eyes are less sensitive to that. So this is what I mean by that.

So if you see in a previous pattern table, the more you go to the bottom right.

The more detailed it is.

So it equals to high frequency and our eyes are less sensitive to things on the bottom right.

So in a sense, it's trying to eliminate the value on the bottom right here. So, it's going to eliminate the value in the red circle and trying to make it into zero.

And it's pretty much doing it by doing a division and rounding off.

So this is our original DCT coefficient matrix. And then it's going to divide it by specific constants that will depend on your JPEG quality.

And then it's simply doing a division, dividing the top left value by the other top left value, and then rounding off, and then doing it for all the values until the bottom right value.

And then we will end up with numbers, something like this, which are called quantized DCT coefficients. And as you can see here, there are a lot of zeros on the bottom right. So this is actually the intent of the quantization process. It's trying to eliminate the values in the bottom right corner.
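The divide-and-round step can be sketched as follows. The divisor table is the example luminance table from the JPEG specification, which encoders typically scale up or down with the quality setting. The coefficient matrix is not taken from a real image: it is a made-up stand-in whose values simply shrink towards the bottom right, as they tend to for photographic blocks.

```python
# Example luminance quantization table from the JPEG specification.
LUMA_TABLE = [
    [16, 11, 10, 16, 24, 40, 51, 61],
    [12, 12, 14, 19, 26, 58, 60, 55],
    [14, 13, 16, 24, 40, 57, 69, 56],
    [14, 17, 22, 29, 51, 87, 80, 62],
    [18, 22, 37, 56, 68, 109, 103, 77],
    [24, 35, 55, 64, 81, 104, 113, 92],
    [49, 64, 78, 87, 103, 121, 120, 101],
    [72, 92, 95, 98, 112, 100, 103, 99],
]

# Hypothetical DCT coefficients: large top left, small bottom right.
coeffs = [[400.0 / (1 + u + v) ** 2 for v in range(8)] for u in range(8)]

# Quantization: divide each coefficient by its table entry and round.
quantized = [[round(coeffs[u][v] / LUMA_TABLE[u][v]) for v in range(8)]
             for u in range(8)]

# A decoder can only multiply back, so the rounded-off part is gone.
dequantized = [[quantized[u][v] * LUMA_TABLE[u][v] for v in range(8)]
               for u in range(8)]

zeros = sum(row.count(0) for row in quantized)
for row in quantized:
    print(row)
print("zeros:", zeros, "of 64")
```

The bigger divisors sit towards the bottom right of the table, which is why the high frequency coefficients are the ones that end up as zero.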

Now, at this point, I'm going to go back to our original question, which is why JPEG is good for photographic images. Because as you can see here, if I zoom in on the original eight by eight pixel block, it's kind of like an array of random numbers. However, after it's gone through the JPEG algorithm, it becomes something like this, where the numbers are much smaller and there are a lot of repeated zeros.

So which means it has better compression.

So by doing this for every block, JPEG manages to save a lot of space when saving photographic images.

And then the other way around is also true.

For the low colour count image.

When you zoom into the low colour count image, as you can see here, the array of numbers of a low colour count image is pretty well arranged already.

There are only two values, either zero or 255.

And then if I put these numbers into the same process, it's going to end up something like this, where the result is actually more random than the original one. So more random data will result in worse compression. So that's why JPEG has worse compression for low colour count images.

And remember the noise that I mentioned in the beginning about the JPEG compression algorithm. So this is simply because JPEG works with blocks. It doesn't understand the concept of a line.

So as you can see here in this image, the colour of the line will be scattered across the block. So that's the reason why the noise happens in the JPEG compression algorithm.

So to summarise everything, JPEG is a lossy image format, and the loss actually happens in the quantization step, when we are rounding off the values. By rounding off, we are discarding some of the decimals, which means it's a lossy compression.

And the best use case for JPEG is high colour count images, simply because it can create patterns from random numbers, which results in better compression.

So always use JPEG for photographic image.

And how about other image formats? There are actually some image formats with better compression coming up.

So for example, we have WebP. WebP was created in 2010 by Google, and it's actually a Swiss army knife image format. It can replace the usage of JPEG, PNG and GIF for web purposes.

And it can be smaller in file size too.

And other than WebP, we also have other things like JPEG 2000, HEIC, AVIF, BPG and so on and so on.

So these new image formats have better compression.

However, why are we not using them? So if you notice the pattern here, the most common image formats have been there since 1980s and 1990s.

So this applies for JPEG, PNG and GIF.

Because these newer image formats that have better compression, they are not supported natively by browsers. And it's going to take a while.

For example, WebP: this is the WebP support. Up until now, WebP hasn't been supported in Safari.

And it's only while I'm recording this talk that we have seen that Safari is going to support WebP in the next release, version 14.

So yeah, that's the reason why we are not using the better image compression. And the conclusion for this talk:

There is no universal compression algorithm, because compression is an AI problem.

So that's why, because there is no universal one.

Each image format is created for a specific use case. So choose the best one available for your use case. And by knowing these compressions, you actually know the best way to utilise specific image formats.

So for example, for PNG, you want to use PNG-8 for the web purposes and to help you with that use the online tools to help you optimise images.

So those that I mentioned before, like ImageAlpha, and then there are others like ImageOptim, pngquant, and so on and so on. They are there to help you optimise your images for web purposes, because images are not only used for the web, images can be used outside of the web, but if you want to use them for web purposes, it's better that you optimise them using the tools available out there. So that's pretty much my talk today.

And thank you.
