jcmp: My image compression format (w/ wavelets)
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
This repo is archived. You can view files and clone it, but cannot push or open issues/pull-requests.

42 lines
4.3 KiB

\section{Image Compression}
\label{sec:img}
As indicated in section~\ref{sec:intro} we would like to use the wavelets for compression. If the image is appropriately transformed to the wavelet domain we can throw away coefficients with a small absolute value. These values influence the image only subtely. In this section we will investigate the possibilities of using this method for compression and look at the different difficulties. We will restrict ourselves to gray images.
\subsection{Compression}
In this subsection we will consider an image $X$ to be a real valued $n \times m$-matrix: $X = \{x_{i,j} \| 0 \leq i \leq n,\, 0 \leq j \leq m\}$ (again $n$ and $m$ are powers of two). To determine the compression rate we have to make some assumptions on sizes of certain types. We assume a real value to be 8 bytes, and an index to be 4 bytes. This assumption restricts the size of the matrix ($0 \leq n < 2^32$ and $0 \leq m < 2^32$). Thus the original image $X$ has a size of $8 \times mn$ bytes, as we can store the matrix as an contiguous array (we ignore the fact that we should also store the size). As discussed in section~\ref{sec:dau} we can apply the Daubechies wavelet transform in both directions, which results in a matrix $Y = \{y_{i,j} \| 0 \leq i \leq n,\, 0 \leq j \leq m\}$ (again of the same size). The idea is to throw away all element $y_{i,j}$ with $|y_{i,j}| \leq \tau$ for some threshold parameter $\tau$. The result is the compressed image:
\[ C = \{ (y_{i,j}, i, j) \| 0 \leq i \leq n, 0 \leq j \leq m, |y_{i,j}| \leq \tau \}. \]
Note that we now have to store the location of the coefficients bigger than the threshold $\tau$. Assume that we did not throw away $c$ of the $nm$ coefficients. Then the compressed size is $(8+4+4) \times c = 16 \times c$. So this compression becomes useful when $c \leq \frac{1}{2}nm$. This might seem like a big loss in quality. But as we can see in figure~\ref{fig:compression}, the wavelets are capable of capturing the important shapes and edges of the image.
\todo{
\label{fig:compression}
Make images with no compression, $c = \frac{1}{2}$, $c = \frac{1}{4}$ and $c = \frac{1}{8}$.
}
As the wavelet transform is invertible we can decompress $C$ to obtain a approximation of $X$.
\subsection{Practical difficulties}
We made a big assumption in the previous subsection which does not hold at all in the real world. Namely that an image $X$ is a real valued $n \times m$-matrix. Most images are quantized to take values in $\[0, 255\]$, i.e. images consists of pixels of a single byte. This means that the size of an image $X$ is simply $nm$ bytes. Our algorithm only works for real valued data, so we can convert these bytes to the reals and perform our algorithm to obtain $Y$. In figure~\ref{fig:wavelet_distribution} we see how the values of $X$ and $Y$ are distributed. The values of $X$ are nicely distributed, whereas $Y$ has a totally different distribution. Also note that a lot of the coefficients are concentrated around $0$, this means that we can throw away a lot. However this blow-up from 1 byte to 8 bytes is still to big, so we would like to quantize the remaining values too. For a suitable quantization $f: \R -> \[0, 255\]$ the compressed image is now:
\[ C = \{ (f(y_{i,j}), i, j) \| 0 \leq i \leq n, 0 \leq j \leq m, |y_{i,j}| \leq \tau \}, \]
with a size of $9c$ instead of $mn$. In figure~\ref{fig:comrpession_algo} the different steps of the algorithm are depicted.
\todo{
\label{fig:compression_algo}
Make images of the distribution of $X \to Y \to Y' \to f(Y')$.
}
The only thing left is to specify the quantization function $f$. We want it to have three properties. First of all we would like it to be invertible in order to decompress the image. Of course this is impossible, but we can ask for a function $g$ such that $|y - g(f(y))|$ is somewhat minimized. Secondly, we want $f$ to make the distribution of $Y$ more uniform. And at last we do not want $f$ to account the values we will throw away. In figure~\ref{fig:quantization} three such functions are considered. We will explain the linear quantization and refer to the source code for the other two variants.
\todo{
Explain linear quantization
}
\todo{
\label{fig:quantization}
Make images with linear, squareroot and log quantization
}
In the next section we will see how well this compression performs in terms of running time.