[Report] Fixes some LaTeX issues. Adds image in intro

master
Joshua Moerman 11 years ago
parent
commit
bf31455b5f
  1. wavelet_report/Makefile (17 changes)
  2. wavelet_report/dau.tex (4 changes)
  3. wavelet_report/images.tex (46 changes)
  4. wavelet_report/images/fourier_concept.svg (3137 changes)
  5. wavelet_report/img.tex (2 changes)
  6. wavelet_report/intro.tex (42 changes)
  7. wavelet_report/par.tex (16 changes)
  8. wavelet_report/preamble.tex (6 changes)

17
wavelet_report/Makefile

@ -1,5 +1,5 @@
.PHONY: report
.PHONY: report fast images
# We don't want to pollute the root dir, so we use a build dir
# http://tex.stackexchange.com/questions/12686/how-do-i-run-bibtex-after-using-the-output-directory-flag-with-pdflatex-when-f
@ -11,3 +11,18 @@ report:
	pdflatex -output-directory=build report.tex
	pdflatex -output-directory=build report.tex
	cp build/report.pdf ./
fast:
	mkdir -p build
	cp references.bib build/
	pdflatex -output-directory=build report.tex
	cp build/report.pdf ./
images:
	mkdir -p build
	pdflatex -output-directory=build images.tex
	pdflatex -output-directory=build images.tex
	scp build/images.pdf moerman@stitch.science.ru.nl:~/wvlt_images.pdf
	ssh moerman@stitch.science.ru.nl 'pdf2svg wvlt_images.pdf wvlt_images.svg'
	scp moerman@stitch.science.ru.nl:~/wvlt_images.svg ./images.svg

4
wavelet_report/dau.tex

@ -84,12 +84,12 @@ When implementing this transform, we don't have to perform the even-odd sort. In
Assume we have a function \texttt{apply\_wn\_pn(x, n, s)} which computes $W_n P_n (x_0, x_s, \ldots, x_{s(n-1)})$ in place\footnote{Implementing this is not so hard, but it wouldn't make this section nicer.}. The whole algorithm then can nicely be expressed as
\begin{lstlistings}
\begin{lstlisting}
wavelet(x, n) =
for i = 1 to n/4
apply\_wn\_pn(x, n/i, i)
i = i*2
\end{lstlistings}
\end{lstlisting}
For future reference we also define the following computation: \texttt{apply\_wn(x, y0, y1, n, s)} which computes $W_n (x_0, \ldots, y_0, y_1)$. Note that \texttt{apply\_wn\_pn(x, n, s)} can now be expressed as \texttt{apply\_wn(x, x0, xs, n, s)}.
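The loop in the listing doubles the stride `i` on every round, so the sequence of calls can be enumerated directly. The following is a minimal Python sketch, assuming `n` is a power of two; `wavelet_schedule` is a hypothetical helper that only lists the `(length, stride)` arguments that would be passed to `apply_wn_pn`, it does not implement the transform itself.

```python
def wavelet_schedule(n):
    """List the (length, stride) pairs passed to apply_wn_pn for input size n."""
    if n < 4 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 4")
    schedule = []
    i = 1
    while i <= n // 4:
        # One round: transform n/i elements that lie i positions apart.
        schedule.append((n // i, i))
        i *= 2
    return schedule
```

For `n = 16` this gives three rounds, with lengths 16, 8 and 4 and strides 1, 2 and 4.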

46
wavelet_report/images.tex

@ -0,0 +1,46 @@
\documentclass[a4paper, 11pt]{amsart}
\input{style}
\input{preamble}
\title{Parallel wavelet transform}
\author{Joshua Moerman}
\begin{document}
\tikzstyle{plain_line}=[]
\begin{figure}
\centering
\begin{subfigure}[b]{0.5\textwidth}
\begin{tikzpicture}
\begin{groupplot}[group style={group size=1 by 4}, clip=false, yticklabels={,,}, height=3cm, width=\textwidth, xmin=0, xmax=128, ymin=-1, ymax=1, domain=0:128]
\nextgroupplot
\addplot[plain_line] coordinates {(0,0) (1,0) (1,1) (2,1) (2,0) (128,0)}; \legend{$e_1$}
\nextgroupplot \addplot[plain_line] coordinates {(0,0) (2,0) (2,1) (3,1) (3,0) (128,0)}; \legend{$e_2$}
\nextgroupplot \addplot[plain_line] coordinates {(0,0) (3,0) (3,1) (4,1) (4,0) (128,0)}; \legend{$e_3$}
\nextgroupplot \addplot[plain_line] {0.8*sin(1*360*x/128) + 0.2*sin(3*360*x/128) + 0.08*sin(5*360*x/128)};
\end{groupplot}
\end{tikzpicture}
\caption{Representing a signal on the standard basis.}
\end{subfigure}~
\begin{subfigure}[b]{0.5\textwidth}
\begin{tikzpicture}
\begin{groupplot}[group style={group size=1 by 4}, yticklabels={,,}, height=3cm, width=\textwidth, xmin=0, xmax=128, ymin=-1, ymax=1, domain=0:128]
\nextgroupplot \addplot[plain_line] {sin(1*360*x/128)}; \legend{$f_1$}
\nextgroupplot \addplot[plain_line] {sin(3*360*x/128)}; \legend{$f_3$}
\nextgroupplot \addplot[plain_line] {sin(5*360*x/128)}; \legend{$f_5$}
\nextgroupplot \addplot[plain_line] {0.8*sin(1*360*x/128) + 0.2*sin(3*360*x/128) + 0.08*sin(5*360*x/128)};
\end{groupplot}
\end{tikzpicture}
\caption{Representing a signal on the Fourier basis.}
\end{subfigure}
\caption{We can represent the same signal in different bases. Note that the Fourier representation is smaller in this case.}
\label{fig:basicplot}
\end{figure}
$$ 0.088 + 0.174 \times 0.257 $$
$$ 0.798 \times 0.201 + 0.081 $$
$$ = \ldots + $$
\end{document}

3137
wavelet_report/images/fourier_concept.svg

File diff suppressed because it is too large


2
wavelet_report/img.tex

@ -18,7 +18,7 @@ As the wavelet transform is invertible we can decompress $C$ to obtain a approxi
\subsection{Practical difficulties}
We made a big assumption in the previous subsection which does not hold at all in the real world, namely that an image $X$ is a real-valued $n \times m$-matrix. Most images are quantized to take values in $\[0, 255\]$, i.e. images consist of single-byte pixels. This means that the size of an image $X$ is simply $nm$ bytes. Our algorithm only works for real-valued data, so we can convert these bytes to the reals and perform our algorithm to obtain $Y$. In figure~\ref{fig:wavelet_distribution} we see how the values of $X$ and $Y$ are distributed. The values of $X$ are nicely distributed, whereas $Y$ has a totally different distribution. Also note that a lot of the coefficients are concentrated around $0$; this means that we can throw away a lot. However, this blow-up from 1 byte to 8 bytes is still too big, so we would like to quantize the remaining values too. For a suitable quantization $f: \R \to \[0, 255\]$ the compressed image is now:
We made a big assumption in the previous subsection which does not hold at all in the real world, namely that an image $X$ is a real-valued $n \times m$-matrix. Most images are quantized to take values in $[0, 255]$, i.e. images consist of single-byte pixels. This means that the size of an image $X$ is simply $nm$ bytes. Our algorithm only works for real-valued data, so we can convert these bytes to the reals and perform our algorithm to obtain $Y$. In figure~\ref{fig:wavelet_distribution} we see how the values of $X$ and $Y$ are distributed. The values of $X$ are nicely distributed, whereas $Y$ has a totally different distribution. Also note that a lot of the coefficients are concentrated around $0$; this means that we can throw away a lot. However, this blow-up from 1 byte to 8 bytes is still too big, so we would like to quantize the remaining values too. For a suitable quantization $f: \R \to [0, 255]$ the compressed image is now:
\[ C = \{ (f(y_{i,j}), i, j) \mid 0 \leq i < n, 0 \leq j < m, |y_{i,j}| \geq \tau \}, \]
with a size of $9c$ instead of $mn$. In figure~\ref{fig:comrpession_algo} the different steps of the algorithm are depicted.
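The thresholding and quantization step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the report's implementation: it assumes that the coefficients kept are those with magnitude at least $\tau$ (the small ones near $0$ are the ones thrown away), that the factor $9$ stands for one value byte plus two 4-byte indices, and `clamp_to_byte` is a hypothetical placeholder for the quantization $f$.

```python
def clamp_to_byte(value):
    """Hypothetical quantizer f: round and clamp to the byte range [0, 255]."""
    return max(0, min(255, round(value)))

def compress(y, tau, quantize=clamp_to_byte):
    """Keep (quantized value, row, column) for every coefficient with |y| >= tau."""
    return [
        (quantize(value), i, j)
        for i, row in enumerate(y)
        for j, value in enumerate(row)
        if abs(value) >= tau
    ]

def compressed_size(c):
    """Size in bytes, assuming 1 byte per value and two 4-byte indices."""
    return 9 * len(c)
```

A real quantizer would have to treat negative coefficients more carefully than this clamp does; it is only here to make the sketch runnable.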

42
wavelet_report/intro.tex

@ -4,20 +4,33 @@
We start this paper by motivating the need for wavelets. As a starting point of signal processing we first consider the well known Fourier transform. As an example we will be using a 1-dimensional signal of length $128$. As this section is mainly for the motivations we will not be very precise or give concrete algorithms.
\subsection{Recalling the Fourier transform}
Recall the Fourier transform; given an input signal $x = \sum_{i=1}^{128} x_i e_i$ (written on the standard basis $\{e_i\}_i$) we can compute Fourier coefficients $x'_i$ such that $x = \sum_{i=1}^{128} x'_i f_i$. As we're not interested in the mathematics behind this transform, we will not specify the basis $\{f_i\}_i$. Conceptually the Fourier transform is a basis transformation:
Recall the Fourier transform; given an input signal $x = \sum_{i=0}^{127} x_i e_i$ (written on the standard basis $\{e_i\}_i$) we can compute Fourier coefficients $x'_i$ such that $x = \sum_{i=0}^{127} x'_i f_i$. As we're not interested in the mathematics behind this transform, we will not specify the basis $\{f_i\}_i$. Conceptually the Fourier transform is a basis transformation:
$$ SampleDomain \to FourierDomain. $$
Furthermore this transformation has an inverse. Real world applications of this transform often consists of going to the Fourier domain, applying some (easy to compute) function and go back to sample domain. This happens often as measurements often happen at intervals and thus generate samples, but in research people are often interested in the global signal represented by the signals.
Furthermore this transformation has an inverse. Real world applications of this transform often consist of going to the Fourier domain, applying some (easy to compute) function and going back to the sample domain.
In figure~\ref{fig:fourier_concepts} an input signal of length $128$ is expressed on the standard basis, and on the Fourier basis (simplified, for illustrative purposes). We see that this signal is better expressed in the Fourier domain, as we only need three coefficients instead of all $128$.
\todo{
fig:fourier\_concepts
spelling out a sum of basis elements in both domains
}
We see that we might even do compression based on these Fourier coefficients. Instead of sending all samples, we just send only a few coefficients from which we are able to approximate the original input. However there is a shortcoming to this. Consider the following scenario. A sensor on Mars detects a signal, transforms it and sends the coefficients to earth. During the transmission one of the coefficients is corrupted. This results in a wave across the whole signal. The error is \emph{non-local}. If, however, we decided to send the original samples, a corrupted sample would only affect a small part of the signal, i.e. the error is \emph{local}. This is illustrated in figure~\ref{fig:fourier_error}.
\tikzstyle{plain_line}=[]
\begin{figure}
\begin{tabular}{c|c}
\begin{subfigure}[b]{0.5\textwidth}
\centering
\includegraphics[scale=0.9]{fourier_concept1}
\caption{Representing a signal on the standard basis.}
\end{subfigure}&
\begin{subfigure}[b]{0.5\textwidth}
\centering
\includegraphics[scale=0.9]{fourier_concept2}
\caption{Representing a signal on the Fourier basis.}
\end{subfigure}
\end{tabular}
\caption{We can represent the same signal in different bases. Note that the Fourier representation is smaller in this case.}
\label{fig:fourier_concepts}
\end{figure}
The figure also shows us that we might do compression based on these Fourier coefficients. Instead of storing all samples, we store only a few coefficients from which we are able to approximate the original input. However, there is a shortcoming to this. Consider the following scenario. A sensor far away detects a signal, transforms it and sends the Fourier coefficients to Earth. During the transmission one of the coefficients is corrupted. This results in a wave across the whole signal. The error is \emph{non-local}. If, however, we decided to send the original samples, a corrupted sample would only affect a small part of the signal, i.e. the error is \emph{local}. This is illustrated in figure~\ref{fig:fourier_error}.
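The difference between a local and a non-local error can be checked numerically. The sketch below is a plain-Python illustration, assuming a naive $O(n^2)$ discrete Fourier transform and the same three-sine test signal of length $128$ used in the plots; the corrupted positions (sample 40, coefficient 7) and the helper names are arbitrary choices made for this example.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, O(n^2), fine for n = 128."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n))
            for m in range(n)]

def idft(c):
    """Naive inverse transform; the result is kept complex on purpose."""
    n = len(c)
    return [sum(c[m] * cmath.exp(2j * math.pi * m * k / n) for m in range(n)) / n
            for k in range(n)]

def count_changed(a, b, tol=1e-6):
    """Number of positions where the two signals differ by more than tol."""
    return sum(1 for u, v in zip(a, b) if abs(u - v) > tol)

n = 128
signal = [0.8 * math.sin(2 * math.pi * k / n)
          + 0.2 * math.sin(2 * math.pi * 3 * k / n)
          + 0.08 * math.sin(2 * math.pi * 5 * k / n) for k in range(n)]

# Corrupt a single sample: only that sample changes.
corrupted_samples = list(signal)
corrupted_samples[40] += 1.0
local_hits = count_changed(signal, corrupted_samples)

# Corrupt a single Fourier coefficient: after the inverse, every sample changes.
coeffs = dft(signal)
coeffs[7] += 16.0
nonlocal_hits = count_changed(signal, idft(coeffs))
```

With these choices `local_hits` is 1, while `nonlocal_hits` is 128: the corrupted coefficient adds a wave of amplitude $16/128$ to every sample.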
\todo{
fig:fourier\_error
@ -28,11 +41,14 @@ We see that we might even do compression based on these Fourier coefficients. In
\subsection{The simplest wavelet transform}
At the heart of the Fourier transform is the choice of the basis elements $f_i$. With a bit of creativity we can cook up different basis elements with different properties. To illustrate this we will have a quick look at the so-called \emph{Haar wavelets}. In our case where $n=128$ we can define the following $128$ elements:
$$ h_0 = \sum_{i=1}^{128} e_i,
h_1 = \sum_{i=1}^{64} e_i - \sum_{i=65}^{128} e_i,
h_2 = \sum_{i=1}^{32} e_i - \sum_{i=33}^{64} e_i,
h_2 = \sum_{i=65}^{96} e_i - \sum_{i=97}^{128} e_i, \ldots,
h_{2^n + j} = \sum_{i=2^{6-n}j+1}^{2^{6-n}(j+1)} e_i - \sum_{i=2^{6-n}(j+1)+1}^{2^{6-n}(j+2)} e_i (j < 2^n) $$
\begin{align}
h_0 &= \sum_{i=1}^{128} e_i, \\
h_1 &= \sum_{i=1}^{64} e_i - \sum_{i=65}^{128} e_i, \\
h_2 &= \sum_{i=1}^{32} e_i - \sum_{i=33}^{64} e_i, \\
h_3 &= \sum_{i=65}^{96} e_i - \sum_{i=97}^{128} e_i, \\
&\;\,\vdots \\
h_{2^n + j} &= \sum_{i=2^{7-n}j+1}^{2^{7-n}j+2^{6-n}} e_i - \sum_{i=2^{7-n}j+2^{6-n}+1}^{2^{7-n}(j+1)} e_i \quad (j < 2^n).
\end{align}
We will refer to these elements as \emph{Haar wavelets}. To give a better feeling for these wavelets, some of them are plotted in figure~\ref{fig:haar_waveleta} on the standard basis. There is also an efficient way to convert a signal from the standard basis to this new basis. Again our example can be written on this new basis, and again we see that the first coefficient already approximates the signal and that the other coefficients refine it.
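The pattern behind these elements (each level halves the support, and each wavelet is $+1$ on the first half of its support and $-1$ on the second) can be spelled out in Python. This is a minimal sketch, assuming the signal length is a power of two and using 0-based list indices; `haar_basis` and `dot` are hypothetical helper names.

```python
def haar_basis(size=128):
    """Return the Haar vectors h_0, ..., h_{size-1} as lists of +1, -1 and 0."""
    basis = [[1] * size]              # h_0: the constant vector
    width = size                      # support length of the wavelets at this level
    while width >= 2:
        half = width // 2
        for start in range(0, size, width):
            h = [0] * size
            h[start:start + half] = [1] * half            # +1 on the first half
            h[start + half:start + width] = [-1] * half   # -1 on the second half
            basis.append(h)
        width = half
    return basis

def dot(u, v):
    """Standard inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))
```

For `size = 128` this yields exactly 128 vectors that are pairwise orthogonal, which is what makes them usable as a basis.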

16
wavelet_report/par.tex

@ -11,25 +11,25 @@ The BSP cost model as defined in \cite{biss} depends on three variables $r, g, l
\subsection{Many communications steps}
The data $\vec{x} = x_0, \ldots, x_{n-1}$ is distributed among the processors with a block distribution, so processor $\proc{s}$ has the elements $\vec{x'} = x_{sb}, \ldots, x_{sb+b-1}$. The first step of the algorithm consists of computing $\vec{x}^{(1)} = S_n W_n P_n \vec{x}$. We can already locally compute the first $b-2$ elements $x^{(1)}_{sb}, \ldots, x^{(1)}_{sb+b-3}$. For the remaining two elements $x^{(1)}_{sb+b-2}$ and $x^{(1)}_{sb+b-1}$ we need the first two elements on processor $s+1$. In the subsequent steps a similar reasoning holds, so we derive a stub for the algorithm:
\begin{lstlistings}
\begin{lstlisting}
for i=1 to b/2
y_0 <- get x_{(s+1)b} from processor s+1
y_1 <- get x_{(s+1)b+2^i} from processor s+1
apply_wn(x, y_0, y_1, b/i, i)
i = i*2
\end{lstlistings}
\end{lstlisting}
We stop after $i=\frac{b}{2}=\frac{n}{2p}$ because for $i=b$ we would need three elements from three different processors. To continue, each processor has to send two elements to some dedicated processor (say processor zero). This processor then ends the algorithm by applying the wavelet transform of size $p$ (here $p$ needs to be a power of two). Note that we only have to do this when $p \geq 4$. The last part of the algorithm is given by:
\begin{lstlistings}
\begin{lstlisting}
put x_{sb} in processor 0
if s = 0
wavelet(y, p)
x_{sb} <- get y_{2s} from processor 0
\end{lstlistings}
\end{lstlisting}
Let us analyse the cost of the first part of the algorithm. There are $\logt{b}$ steps, and in each step two elements are sent and two elements are received, which amounts to a $2$-relation. Furthermore $b/i$ elements are computed. So the communication part costs $\logt{b}\times(2g+l)$ flops and the computational part $14 \times b$ flops (see section~\ref{sec:dau}). The final part consists of two $(p-1)$-relations and a computation of $14p$ flops. So in total we have:
\[ 14\frac{n}{p} + 14p + \logt(\frac{n}{p})(2g + l) + 2(p-1)g + 2l \text{ flops}.\]
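To get a feeling for the magnitudes, the total above can be evaluated for concrete machine parameters. The following Python sketch simply transcribes the formula; the function name and the sample values of `n`, `p`, `g` and `l` are assumptions made for illustration, not measurements.

```python
import math

def cost_many_steps(n, p, g, l):
    """BSP cost in flops of the variant with log2(n/p) communication steps."""
    b = n // p                                    # block size per processor
    compute = 14 * b + 14 * p                     # local stages plus final size-p wavelet
    communicate = math.log2(b) * (2 * g + l)      # one 2-relation per stage
    finish = 2 * (p - 1) * g + 2 * l              # two (p-1)-relations at the end
    return compute + communicate + finish
```

For example, with `n = 1024`, `p = 4`, `g = 1` and `l = 10` the terms are 3640, 96 and 26, for a total of 3762 flops.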
@ -38,12 +38,12 @@ Let us analyse the cost of the first part of the algorithm. There are $\logt{b}$
\subsection{One communication step}
Depending on $l$ it might be very costly to have so many supersteps. We can reduce this by requesting all the data of the next processor and calculating all values redundantly. Let us investigate when this is fruitful. The algorithm is as follows:
\begin{lstlistings}
\begin{lstlisting}
for j=1 to b-1
get x_{(s+1)b + j} from processor s+1
for i=1 to b/4
apply_wn(x, 0, 0, 2b/i, i)
\end{lstlistings}
\end{lstlisting}
Of course we are not able to fully compute the wavelet on the second half of our array (i.e. on the elements we received), because we provide zeroes. But that is okay, as we only need the first few elements, which are computed correctly. However, we must stop one stage earlier because of this. So in this case we let the dedicated processor calculate a wavelet of length $2p$. The costs are given by:
\[ 28\frac{n}{p} + 28p + \frac{n}{p}g + l + 4(p-1)g + 2l. \]
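Whether the single communication step pays off depends mainly on the latency $l$. The Python sketch below transcribes this cost and the cost of the many-step variant, $14\frac{n}{p} + 14p + \log_2(\frac{n}{p})(2g + l) + 2(p-1)g + 2l$, and compares them; the function names and sample parameters are assumptions for illustration.

```python
import math

def cost_many_steps(n, p, g, l):
    """Cost in flops of the variant with log2(n/p) communication steps."""
    b = n // p
    return 14 * b + 14 * p + math.log2(b) * (2 * g + l) + 2 * (p - 1) * g + 2 * l

def cost_one_step(n, p, g, l):
    """Cost in flops of the variant with a single, larger communication step."""
    b = n // p
    return 28 * b + 28 * p + b * g + l + 4 * (p - 1) * g + 2 * l

def one_step_is_cheaper(n, p, g, l):
    """True when the redundant computation is outweighed by the saved supersteps."""
    return cost_one_step(n, p, g, l) < cost_many_steps(n, p, g, l)
```

With `n = 1024`, `p = 4` and `g = 1`, a small latency such as `l = 10` favours the many-step variant, while a very large latency such as `l = 100000` favours the single step.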
@ -56,7 +56,7 @@ We can give an algorithm which combines the two methods. Instead of calculating
To calculate $x^{(m)}_0$ we need $N_m = 3 \times 2^m-2$ elements (and we get $x^{(m)}_{2^{m-1}}$ for free). So in order to calculate all elements $x^{(m)}_{sb}, \ldots, x^{(m)}_{sb+b-1}$ on one processor we need $D_m = b - 2^m + N_m = b + 2^{m+1} - 2$ elements to start with ($b-2^m$ is the last index we are computing at stage $m$ and the index $b - 2^{m-1}$ comes for free). So in order to compute $m$ steps with a single communication, we need to get $C_m = D_m - b = 2^{m+1} - 2$ elements from the next processor. The algorithm for a fixed $m$ becomes (given in pseudocode):
\begin{lstlistings}
\begin{lstlisting}
steps = log_2(b)
big_steps = steps/m
r = steps - m*big_steps
@ -68,7 +68,7 @@ if r > 0
get 2^{r+1}-2 elements from processor s+1
for i=1 to r
apply_wn on x and the data we received
\end{lstlistings}
\end{lstlisting}
Again we end in the same style as above by letting a dedicated processor do a wavelet transform of length $p$. Note that for $m=1$ we get our first algorithm back and for $m = \logt(b)-1$ we get more or less our second algorithm back. The costs for this algorithm are given by:
\[ 14D_m + 14p + \frac{1}{m}\logt(\frac{n}{p})(2C_mg+l) + 2(p-1)g + 2l. \]
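The trade-off in $m$ can be explored by evaluating this cost for each admissible $m$. The Python sketch below transcribes the formula together with $C_m = 2^{m+1} - 2$ and $D_m = b + 2^{m+1} - 2$; `cost_hybrid` and `best_m` are hypothetical helper names, and the search range $1 \leq m \leq \log_2(b) - 1$ is an assumption based on the two limiting cases mentioned above.

```python
import math

def cost_hybrid(n, p, g, l, m):
    """Cost in flops when m stages are computed per communication step."""
    b = n // p
    c_m = 2 ** (m + 1) - 2            # elements fetched from the next processor
    d_m = b + c_m                     # elements each processor works on
    communicate = math.log2(b) / m * (2 * c_m * g + l)
    return 14 * d_m + 14 * p + communicate + 2 * (p - 1) * g + 2 * l

def best_m(n, p, g, l):
    """The m in 1 .. log2(b) - 1 with the lowest modelled cost."""
    steps = int(math.log2(n // p))
    return min(range(1, steps), key=lambda m: cost_hybrid(n, p, g, l, m))
```

For `n = 1024`, `p = 4`, `g = 1` and `l = 10` the cheapest choice under this model is `m = 1`; larger latencies push the optimum towards larger `m`.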

6
wavelet_report/preamble.tex

@ -5,11 +5,15 @@
% floating figures
\usepackage{float}
\usepackage{listings}
\usepackage{tikz}
\usepackage{pgfplots}
\usepgfplotslibrary{groupplots}
\pgfplotsset{compat=newest}
\usepackage{graphicx}
\graphicspath{ {./images/} }
\usepackage{caption}
\usepackage{subcaption}
@ -24,6 +28,7 @@
% \newcommand{\vec}[1]{\mathbf{#1}}
\newcommand{\BigO}[1]{\mathcal{O}(#1)}
\newcommand{\proc}[1]{#1}
\newcommand{\R}{\mathbb{R}}
\newcommand{\todo}[1]{
\addcontentsline{tdo}{todo}{\protect{#1}}
@ -33,5 +38,6 @@
\theoremstyle{plain}
\newtheorem{theorem}{Theorem}[section]
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{notation}[theorem]{Notation}
\newcommand*{\thead}[1]{\multicolumn{1}{c}{\bfseries #1}}