
Application to image compression

This is a picture of a famous mathematician, Emmy Noether, compressed in different ways.

Introduction

When retrieved from the Internet, digital images take a considerable amount of time to download and use a large amount of computer memory. The Haar wavelet transform that we will discuss in this application is one way of compressing digital images so that they take up less space when stored and transmitted. As we will see later, the word "wavelet" refers to an orthogonal basis of a certain vector space.

The basic idea behind this method of compression is to treat a digital image as an array of numbers, i.e., a matrix. Each image consists of a fairly large number of little squares called pixels (picture elements). The matrix corresponding to a digital image assigns a whole number to each pixel. For example, in the case of a 256x256 pixel grayscale image, the image is stored as a 256x256 matrix, with each element of the matrix being a whole number ranging from 0 (for black) to 255 (for white). The JPEG compression technique divides an image into 8x8 blocks and assigns a matrix to each block. One can use some linear algebra techniques to maximize compression of the image while maintaining a suitable level of detail.

Vector transform using Haar wavelets

Before we explain the transform of a matrix, let us see how the wavelets transform vectors (rows of a matrix). Suppose

r = (420, 680, 448, 708, 1260, 1410, 1600, 600)

is one row of an 8x8 image matrix. In general, if the data string has length equal to 2^k, then the transformation process will consist of k steps. In the above case, there will be 3 steps since 8 = 2^3.
We perform the following operations on the entries of the vector r:

1. Divide the entries of r into four pairs: (420, 680), (448, 708), (1260, 1410), (1600, 600).

2. Form the average of each of these pairs; for example, the first two averages are (420 + 680)/2 = 550 and (448 + 708)/2 = 578.

These averages will form the first four entries of the next-step vector r1.

3. Subtract each average from the first entry of its pair; for example, the first two differences are 420 - 550 = -130 and 448 - 578 = -130.

These differences will form the last four entries of the next-step vector r1.

4. Form the new vector r1, whose first four entries are the averages from step 2 and whose last four entries are the differences from step 3.
Note that the vector r1 can be obtained from r by multiplying r on the right by the matrix

        [ 1/2   0    0    0   1/2   0    0    0  ]
        [ 1/2   0    0    0  -1/2   0    0    0  ]
        [  0   1/2   0    0    0   1/2   0    0  ]
W1  =   [  0   1/2   0    0    0  -1/2   0    0  ]
        [  0    0   1/2   0    0    0   1/2   0  ]
        [  0    0   1/2   0    0    0  -1/2   0  ]
        [  0    0    0   1/2   0    0    0   1/2 ]
        [  0    0    0   1/2   0    0    0  -1/2 ]
The first four coefficients of r1 are called the approximation coefficients and the last
four entries are called the detail coefficients.
For our next step, we look at the first four entries of r1 as two pairs and take their averages as in step 1 above. This gives the first two entries, 564 and 1470, of the new vector r2. These are our new approximation coefficients. The third and fourth entries of r2 are obtained by subtracting these averages from the first element of each pair. This gives the new detail coefficients: -14 and -130. The last four entries of r2 are the same as the detail coefficients of r1.

Here the vector r2 can be obtained from r1 by multiplying r1 on the right by the matrix

        [ 1/2   0   1/2   0   0  0  0  0 ]
        [ 1/2   0  -1/2   0   0  0  0  0 ]
        [  0   1/2   0   1/2  0  0  0  0 ]
W2  =   [  0   1/2   0  -1/2  0  0  0  0 ]
        [  0    0    0    0   1  0  0  0 ]
        [  0    0    0    0   0  1  0  0 ]
        [  0    0    0    0   0  0  1  0 ]
        [  0    0    0    0   0  0  0  1 ]

For the last step, average the first two entries of r2 and, as before, subtract the answer from the first entry. This results in the vector r3, whose first entry is the overall approximation coefficient (564 + 1470)/2 = 1017, whose second entry is the detail coefficient 564 - 1017 = -453, and whose remaining entries are the detail coefficients carried over from r2.
As before, r3 can be obtained from r2 by multiplying r2 on the right by the matrix

        [ 1/2   1/2  0  0  0  0  0  0 ]
        [ 1/2  -1/2  0  0  0  0  0  0 ]
        [  0     0   1  0  0  0  0  0 ]
W3  =   [  0     0   0  1  0  0  0  0 ]
        [  0     0   0  0  1  0  0  0 ]
        [  0     0   0  0  0  1  0  0 ]
        [  0     0   0  0  0  0  1  0 ]
        [  0     0   0  0  0  0  0  1 ]
As a consequence, one gets r3 immediately from r using the equation

r3 = r W1 W2 W3
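To make the three steps concrete, here is a small illustrative sketch (my own addition, not part of the original article) that carries out the averaging-and-differencing passes directly on a row; the function name haarStep and the printing are purely illustrative.

#include <iostream>
#include <vector>

// One averaging/differencing pass on the first `len` entries of the row:
// pair averages go to positions 0 .. len/2 - 1 (approximation coefficients),
// "first entry minus average" goes to positions len/2 .. len - 1 (detail coefficients).
static void haarStep(std::vector<double>& row, int len)
{
    std::vector<double> tmp(row.begin(), row.begin() + len);
    for (int i = 0; i < len / 2; ++i) {
        double avg = (tmp[2 * i] + tmp[2 * i + 1]) / 2.0;
        row[i] = avg;
        row[len / 2 + i] = tmp[2 * i] - avg;
    }
}

int main()
{
    // The row r built from the four pairs listed in step 1.
    std::vector<double> r = {420, 680, 448, 708, 1260, 1410, 1600, 600};

    // Three passes since 8 = 2^3: r -> r1 -> r2 -> r3.
    for (int len = 8; len > 1; len /= 2) {
        haarStep(r, len);
        for (double x : r) std::cout << x << ' ';
        std::cout << '\n';
    }
}

Running the sketch prints the vector after each pass, i.e., r1, r2 and r3 in turn.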
Let

W = W1 W2 W3
Note the following:

The columns of the matrix W1 form an orthogonal subset of R8 (the vector space of dimension 8 over R); that is, these columns are pairwise orthogonal (try their dot products). Therefore, they form a basis of R8. As a consequence, W1 is invertible. The same is true for W2 and W3.
As a product of invertible matrices, W is also invertible and its columns form an orthogonal basis of R8. The inverse of W is given by:

W^(-1) = W3^(-1) W2^(-1) W1^(-1)

The fact that W is invertible allows us to retrieve our image from the compressed form using the relation

r = r3 W^(-1)

applied to each row of the stored, transformed data.
Suppose that A is the matrix corresponding to a certain image. The Haar transform is
carried out by performing the above operations on each row of the matrix A and
then by repeating the same operations on the columns of the resulting matrix. The
row-transformed matrix is AW. Transforming the columns of AW is obtained by multiplying AW on the left by the matrix W^T (the transpose of W). Thus, the Haar transform takes the matrix A and stores it as W^T A W. Let S denote the transformed matrix:

S = W^T A W

Using the properties of the inverse matrix, we can retrieve our original matrix:

A = (W^T)^(-1) S W^(-1) = (W^(-1))^T S W^(-1)
This allows us to see the original image (decompressing the compressed image).
Let us try an example.

Example Suppose we have an 8x8 image represented by a matrix A. The row-transformed matrix is L = AW. Transforming the columns of L is then obtained by multiplying on the left by W^T, which gives the fully transformed matrix W^T L = W^T A W = S.
The point of doing the Haar wavelet transform is that areas of the original matrix that contain little variation will end up as zero or near-zero elements in the transformed matrix. A matrix is considered sparse if it has a high proportion of zero entries. Sparse matrices take much less memory to store. Since we cannot expect the transformed matrices always to be sparse, we decide on a non-negative threshold value known as ε, and then we let any entry in the transformed matrix whose absolute value is less than ε be reset to zero. This will leave us with a kind of sparse matrix. If ε is zero, we will not modify any of the elements.
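As a minimal sketch of this thresholding step (my own illustration; the function name and the row-by-row storage layout are assumptions, not from the text), one simply zeroes every entry whose absolute value falls below ε:

#include <cmath>

// Reset to zero every entry of an n x n transformed matrix S (stored row by row)
// whose absolute value is below the chosen threshold eps. With eps = 0 nothing changes.
void applyThreshold(double* S, int n, double eps)
{
    for (int i = 0; i < n * n; ++i)
        if (std::fabs(S[i]) < eps)
            S[i] = 0.0;
}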
Every time you click on an image to download it from the Internet, the source
computer recalls the Haar transformed matrix from its memory. It first sends the
overall approximation coefficients and larger detail coefficients and a bit later the
smaller detail coefficients. As your computer receives the information, it begins
reconstructing in progressively greater detail until the original image is fully
reconstructed.
Linear algebra can make the compression process faster and more efficient

Let us first recall that an nxn square matrix A is called orthogonal if its columns form an orthonormal basis of Rn, that is, the columns of A are pairwise orthogonal and the length of each column vector is 1. Equivalently, A is orthogonal if its inverse is equal to its transpose. The latter property makes retrieving the original image from the transformed one via the equation

A = W S W^T

much faster.

Another powerful property of orthogonal matrices is that they preserve magnitude. In other words, if v is a vector of Rn and A is an orthogonal matrix, then ||Av|| = ||v||. Here is how it works:

||Av||^2 = (Av)·(Av) = (Av)^T (Av) = v^T A^T A v = v^T (A^T A) v = v^T v = ||v||^2

This in turn shows that ||Av|| = ||v||. The angle is also preserved when the transformation is by an orthogonal matrix: recall that the cosine of the angle θ between two vectors u and v is given by

cos θ = (u·v) / (||u|| ||v||)

so, if A is an orthogonal matrix and θ' is the angle between the two vectors Au and Av, then

cos θ' = (Au·Av) / (||Au|| ||Av||) = (u^T A^T A v) / (||u|| ||v||) = (u·v) / (||u|| ||v||) = cos θ
Since both magnitude and angle are preserved, there is significantly less distortion produced in the rebuilt image when an orthogonal matrix is used. Since the transformation matrix W is the product of three other matrices, one can normalize W by normalizing each of the three matrices W1, W2 and W3 (dividing each column by its length). The normalized version of W is

        [ 1/√8   1/√8   1/2    0    1/√2    0      0      0    ]
        [ 1/√8   1/√8   1/2    0   -1/√2    0      0      0    ]
        [ 1/√8   1/√8  -1/2    0     0     1/√2    0      0    ]
        [ 1/√8   1/√8  -1/2    0     0    -1/√2    0      0    ]
        [ 1/√8  -1/√8    0    1/2    0      0     1/√2    0    ]
        [ 1/√8  -1/√8    0    1/2    0      0    -1/√2    0    ]
        [ 1/√8  -1/√8    0   -1/2    0      0      0     1/√2  ]
        [ 1/√8  -1/√8    0   -1/2    0      0      0    -1/√2  ]
Remark If you look closely at the process we described above, you will notice that the matrix W is nothing but a change of basis for R8. In other words, the columns of W form a new basis (a very nice one) of R8. So when you multiply a vector v of R8 (written in the standard basis) by W, what you get is the coordinates of v in this new basis. Some of these coordinates can be neglected using our threshold, and this is what allows the transformed matrix to be stored more easily and transmitted more quickly.
Compression ratio If we choose our threshold value ε to be positive (i.e., greater than zero), then some entries of the transformed matrix will be reset to zero and therefore some detail will be lost when the image is decompressed. The key issue is then to choose ε wisely so that the compression is done effectively with minimum damage to the picture. Note that the compression ratio is defined as the ratio of nonzero entries in the transformed matrix (S = W^T A W) to the number of nonzero entries in the compressed matrix obtained from S by applying the threshold ε.
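Under this definition, the ratio can be computed by counting nonzero entries before and after thresholding. The sketch below is my own illustration; the helper names are hypothetical.

// Count the nonzero entries of an n x n matrix stored row by row.
int countNonzero(const double* M, int n)
{
    int count = 0;
    for (int i = 0; i < n * n; ++i)
        if (M[i] != 0.0)
            ++count;
    return count;
}

// Compression ratio: nonzero entries of the transformed matrix S divided by
// nonzero entries of the thresholded (compressed) matrix obtained from S.
double compressionRatio(const double* S, const double* thresholded, int n)
{
    return static_cast<double>(countNonzero(S, n)) / countNonzero(thresholded, n);
}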
http://aix1.uottawa.ca/~jkhoury/haar.htm

http://www.whydomath.org/node/wavlets/hwt.html
Image Compression:
How Math Led to the JPEG2000 Standard

Haar Wavelet Transformation


Introduction
The easiest of all discrete wavelet transformations is the Discrete Haar Wavelet Transformation (HWT). Let's motivate its construction with the following example:
Suppose you had the eight numbers 100, 200, 44, 50, 20, 20, 4, 2 (these could be grayscale intensities) and you wanted to send
an approximation of the list to a friend. Due to bandwidth constraints (this is a really old system!), you are only allowed to send your
friend four values. What four values would you send your friend that might represent an approximation of the eight given values?
There are obviously many possible answers to this question, but one of the most common solutions is to take the eight numbers,
two at a time, and average them. This computation would produce the four values 150, 47, 20, and 3. This list would represent an
approximation to the original eight values.
Unfortunately, if your friend receives the four values 150, 47, 20, and 3, she has no chance of producing the original eight values
from them - more information is needed. Suppose you are allowed to send an additional four values to your friend. With these
values and the first four values, she should be able to reconstruct your original list of eight values. What values would you send her?
Suppose we sent our friend the values 50, 3, 0, and -1. How did we arrive at these values? They are simply the directed distances
from the pairwise average to the second number in each pair: 150 + 50 = 200, 47 + 3 = 50, 20 + 0 = 20, and 3 + (-1) = 2. Note that if
we subtract the values in this list from the pairwise averages, we arrive at the first number in each pair: 150 - 50 = 100, 47 - 3 = 44,
20 - 0 = 20, and 3 - (-1) = 4. So with the lists (150,47,20,3) and (50,3,0,-1), we can completely reconstruct the original list
(100,200,44,50,20,20,4,2).
Given two numbers a and b, we have the following transformation:

(a, b)  →  ( (b + a)/2 , (b - a)/2 )

We will call the first output the average and the second output the difference.
So why would we consider sending (150,47,20,3 | 50, 3, 0, -1) instead of (100,200,44,50,20,20,4,2)? Two reasons quickly come to
mind. The differences in the transformed list tell us about the trends in the data - big differences indicate large jumps between
values while small values tell us that there is relatively little change in that portion of the input. Also, if we are interested in lossy
compression, then small differences can be converted to zero and in this way we can improve the efficiency of the coder. Suppose
we converted the last three values of the transformation to zero. Then we would transmit (150, 47, 20, 3 | 50, 0, 0, 0). The recipient
could invert the process and obtain the list
(150-50, 150+50, 47-3, 47+3, 20-0, 20+0, 3-0, 3+0) = (100,200,44,50,20,20,3,3)
The "compressed" list is very similar to the original list!
Matrix Formulation
For an even-length list (vector) of numbers, we can also form a matrix product that computes this transformation. For the sake of
illustration, let's assume our list (vector) is length 8. If we put the averages as the first half of the output and differences as the
second half of the output, then we have the following matrix product:

W8 v =

[ 1/2  1/2   0    0    0    0    0    0  ]   [ v1 ]     [ (v1 + v2)/2 ]
[  0    0   1/2  1/2   0    0    0    0  ]   [ v2 ]     [ (v3 + v4)/2 ]
[  0    0    0    0   1/2  1/2   0    0  ]   [ v3 ]     [ (v5 + v6)/2 ]
[  0    0    0    0    0    0   1/2  1/2 ]   [ v4 ]  =  [ (v7 + v8)/2 ]  =  y
[-1/2  1/2   0    0    0    0    0    0  ]   [ v5 ]     [ (v2 - v1)/2 ]
[  0    0  -1/2  1/2   0    0    0    0  ]   [ v6 ]     [ (v4 - v3)/2 ]
[  0    0    0    0  -1/2  1/2   0    0  ]   [ v7 ]     [ (v6 - v5)/2 ]
[  0    0    0    0    0    0  -1/2  1/2 ]   [ v8 ]     [ (v8 - v7)/2 ]

Inverting the Process


Inverting is easy - if we subtract y5 from y1, we obtain v1. If we add y5 and y1, we obtain v2. We can continue in a similar manner, adding and subtracting pairs to completely recover v. We can also write the inverse process as a matrix product. We have:

W8^(-1) y =

[ 1  0  0  0  -1   0   0   0 ]   [ (v1 + v2)/2 ]     [ v1 ]
[ 1  0  0  0   1   0   0   0 ]   [ (v3 + v4)/2 ]     [ v2 ]
[ 0  1  0  0   0  -1   0   0 ]   [ (v5 + v6)/2 ]     [ v3 ]
[ 0  1  0  0   0   1   0   0 ]   [ (v7 + v8)/2 ]  =  [ v4 ]  =  v
[ 0  0  1  0   0   0  -1   0 ]   [ (v2 - v1)/2 ]     [ v5 ]
[ 0  0  1  0   0   0   1   0 ]   [ (v4 - v3)/2 ]     [ v6 ]
[ 0  0  0  1   0   0   0  -1 ]   [ (v6 - v5)/2 ]     [ v7 ]
[ 0  0  0  1   0   0   0   1 ]   [ (v8 - v7)/2 ]     [ v8 ]

The matrix W8 satisfies another interesting property - we can compute the inverse by doubling the transpose! That is,

W8^(-1) = 2 W8^T

For those of you who have taken a linear algebra course, you may remember that orthogonal matrices U satisfy U^(-1) = U^T. We almost have that with our transformation. Indeed, if we rescale and construct the matrix √2 W8, we have

(√2 W8)^(-1) = (1/√2) W8^(-1) = (1/√2) (2 W8^T) = √2 W8^T = (√2 W8)^T

so √2 W8 is an orthogonal matrix.

Haar Wavelet Transform Defined


We will define the HWT as the orthogonal matrix described above. That is, for N even, the Discrete Haar Wavelet Transformation is
defined as

       [ √2/2  √2/2    0     0   . . .    0     0   ]
       [   0     0   √2/2  √2/2  . . .    0     0   ]
       [                   . . .                    ]
       [   0     0     0     0   . . .  √2/2  √2/2  ]
WN  =  [ -√2/2  √2/2    0     0   . . .    0     0   ]
       [   0     0  -√2/2  √2/2  . . .    0     0   ]
       [                   . . .                    ]
       [   0     0     0     0   . . . -√2/2  √2/2  ]

and the inverse HWT is WN^(-1) = WN^T.


Analysis of the HWT
The first N/2 rows of the HWT produce a weighted average of the input list taken two at a time; the weight factor is √2/2. The last N/2 rows of the HWT produce a weighted difference of the input list taken two at a time; the weight factor is also √2/2.

We define the Haar filter as the numbers used to form the first row of the transform matrix. That is, the Haar filter is h = (h0, h1) = (√2/2, √2/2). This filter is also called a lowpass filter - since it averages pairs of numbers, it tends to reproduce (up to the factor √2) two values that are similar and to send to 0 pairs of numbers that are (near) opposites of each other. Note also that the sum of the filter values is √2.

We call the filter that is used to build the bottom half of the HWT a highpass filter. In this case, we have g = (g0, g1) = (-√2/2, √2/2). Highpass filters process data exactly opposite of lowpass filters. If two numbers are near in value, the highpass filter will return a value near zero. If two numbers are (near) opposites of each other, then the highpass filter will return a weighted version of one of the two numbers.
Fourier Series From the Filters
An important tool for constructing filters for discrete wavelet transformations is Fourier series. To analyze a given filter h = (h0, h1, h2, ..., hL), engineers will use the coefficients to form a Fourier series

H(ω) = h0 + h1 e^(iω) + h2 e^(2iω) + ... + hL e^(Liω)

and then plot the absolute value of this series. It turns out that we can identify lowpass filters and highpass filters from these graphs. The plots for the HWT filters H(ω) = √2/2 + (√2/2) e^(iω) and G(ω) = -√2/2 + (√2/2) e^(iω) appear below:

[Plots of |H(ω)| and |G(ω)|.]

Note that the first graph has value √2 at ω = 0 and that H(π) = 0. The graph for the highpass filter is just the opposite - G(0) = 0 and |G(π)| = √2. This is typical of lowpass and highpass filters. We can also put other conditions on these graphs, and that is often how more sophisticated lowpass/highpass filter pairs for the DWT are defined.
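These values can be checked numerically. The sketch below is my own illustration; it evaluates |H(ω)| and |G(ω)| at ω = 0 and ω = π using the Haar filter coefficients.

#include <cmath>
#include <complex>
#include <iostream>

int main()
{
    const double c = std::sqrt(2.0) / 2.0;   // sqrt(2)/2
    const double h[2] = {  c, c };           // lowpass Haar filter
    const double g[2] = { -c, c };           // highpass Haar filter
    const double pi = 3.14159265358979323846;

    for (double w : { 0.0, pi }) {
        std::complex<double> e(std::cos(w), std::sin(w));   // e^{i w}
        std::complex<double> H = h[0] + h[1] * e;
        std::complex<double> G = g[0] + g[1] * e;
        std::cout << "w = " << w << "  |H(w)| = " << std::abs(H)
                  << "  |G(w)| = " << std::abs(G) << '\n';
    }
}

The output shows |H| = √2 and |G| = 0 at ω = 0, and |H| = 0 and |G| = √2 at ω = π, matching the description above.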
HWT and Digital Images
How do we apply the HWT to a digital grayscale image? If the image is stored in a matrix A with even dimensions M x N, then the natural thing to try is to compute WM A. We can view this matrix multiplication as WM applied to each column of A, so the output should be an M x N matrix where each column consists of M/2 weighted averages followed by M/2 weighted differences. The plots below illustrate the process:

[A digital image.]

[W160 A.]

We have used the Haar matrix to process the columns of the image matrix A. It is desirable to process the rows of the image as well. We proceed by multiplying WM A on the right by WN^T. Transposing the wavelet matrix puts the filter coefficients in the columns, and multiplication on the right by WN^T means that we will be dotting the rows of WM A with the columns of WN^T (the rows of WN). So the two-dimensional HWT is defined as:

B = WM A WN^T

The process is illustrated below.

[The two-dimensional HWT.]

Analysis of the Two-Dimensional HWT


You can see why the wavelet transformation is well-suited for image compression. The two-dimensional HWT of the image has most
of the energy conserved in the upper left-hand corner of the transform - the remaining three-quarters of the HWT consists primarily
of values that are zero or near zero. The transformation is local as well - it turns out any element of the HWT is constructed from
only four elements of the original input image. If we look at the HWT as a block matrix product, we can gain further insight about the
transformation.
Suppose that the input image is square so we will drop the subscripts that indicate the dimension of the HWT matrix. If we use H to
denote the top block of the HWT matrix and G to denote the bottom block of the HWT, we can express the transformation as:

B = W A W^T
  = [ H ]   [ H ]^T
    [ G ] A [ G ]
  = [ HA ] [ H^T  G^T ]
    [ GA ]
  = [ HAH^T  HAG^T ]
    [ GAH^T  GAG^T ]
We now see why there are four blocks in the wavelet transform. Let's look at each block individually. Note that the matrix H is constructed from the lowpass Haar filter and computes weighted averages, while G computes weighted differences.
The upper left-hand block is HAHT - HA averages columns of A and the rows of this product are averaged by multiplication with HT.
Thus the upper left-hand corner is an approximation of the entire image. In fact, it can be shown that elements in the upper left-hand
corner of the HWT can be constructed by computing weighted averages of each 2 x 2 block of the input matrix. Mathematically, the
mapping is

[ a  b ]
[ c  d ]   →   2(a + b + c + d)/4
The upper right-hand block is HAG^T - HA averages columns of A, and the rows of this product are differenced by multiplication with G^T. Thus the upper right-hand block holds information about vertical changes in the image - large values indicate a large vertical change as we move across the image, and small values indicate little vertical change. Mathematically, the mapping is

[ a  b ]
[ c  d ]   →   2(b + d - a - c)/4
The lower left-hand block is GAH^T - GA differences columns of A, and the rows of this product are averaged by multiplication with H^T. Thus the lower left-hand block holds information about horizontal changes in the image - large values indicate a large horizontal change as we move down the image, and small values indicate little horizontal change. Mathematically, the mapping is

[ a  b ]
[ c  d ]   →   2(c + d - a - b)/4
The lower right-hand block is GAG^T - it differences across both columns and rows, and the result is a bit harder to see. It turns out that this product measures changes along 45-degree lines: these are diagonal differences. Mathematically, the mapping is

[ a  b ]
[ c  d ]   →   2(a + d - b - c)/4
To summarize, the HWT of a digital image produces four blocks. The upper-left hand corner is an approximation or blur of the
original image. The upper-right, lower-left, and lower-right blocks measure the differences in the vertical, horizontal, and diagonal
directions, respectively.
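Since each transform value comes from a single 2 x 2 block, the four mappings above can be computed directly. The following sketch is my own illustration (the struct and function names are hypothetical):

#include <iostream>

// The four HWT values produced by one 2 x 2 block [a b; c d], using the
// mappings listed above.
struct BlockHWT { double blur, vertical, horizontal, diagonal; };

BlockHWT haarBlock(double a, double b, double c, double d)
{
    BlockHWT r;
    r.blur       = 2.0 * (a + b + c + d) / 4.0;   // upper-left: approximation
    r.vertical   = 2.0 * (b + d - a - c) / 4.0;   // upper-right: vertical changes
    r.horizontal = 2.0 * (c + d - a - b) / 4.0;   // lower-left: horizontal changes
    r.diagonal   = 2.0 * (a + d - b - c) / 4.0;   // lower-right: diagonal changes
    return r;
}

int main()
{
    BlockHWT r = haarBlock(100, 102, 96, 98);     // a nearly uniform block
    std::cout << r.blur << ' ' << r.vertical << ' '
              << r.horizontal << ' ' << r.diagonal << '\n';
    // The blur value is large while the three detail values stay small.
}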

Iterating the Process


If there is not much change in the image, the difference blocks are comprised of (near) zero values. If we apply quantization and
convert near-zero values to zero, then the HWT of the image can be effectively coded and the storage space for the image can be
drastically reduced. We can iterate the HWT and produce an even better result to pass to the coder. Suppose we compute the HWT
of a digital image. Most of the high intensities are contained in the blur portion of the transformation. We can iterate and apply the
HWT to the blur portion of the transform. So in the composite transformation, we replace the blur by its transformation! The process
is completely invertible - we apply the inverse HWT to the transform of the blur to obtain the blur. Then we apply the inverse HWT to
obtain the original image. We can continue this process as often as we desire (and provided the dimensions of the data are divisible
by suitable powers of two). The illustrations below show two iterations and three iterations of the HWT.

[Energy distribution for the image and HWTs.]

[Two iterations of the HWT.]

[Three iterations of the HWT.]

The iterated HWT is an effective tool for conserving the energy of a digital image. The plot below shows the energy distribution for
the original image (green), one iteration of the HWT (brown), and three iterations of the HWT (orange). The horizontal scale is pixels
(there are 38,400 pixels in the thumbnail of the image). For a given pixel value p, the height represents the percentage of energy
stored in the largest p pixels of the image. Note that the HWT gets to 1 (100% of the energy) much faster than the original image
and the iterated HWT is much better than either the HWT or the original image.
Summary
The HWT is a wonderful tool for understanding how a discrete wavelet transformation works. It is not desirable in practice because the filters are too short - since each filter is length two, the HWT decouples the data to create values of the transform. In particular, each value of the transform is created from a 2 x 2 block of the original input. If there is a large change between, say, row 6 and row 7, the HWT will not detect it. The HWT also sends integers to irrational numbers, and for lossless image compression it is crucial that the transform send integers to integers. For these reasons, researchers developed more sophisticated filters. Be sure to check out the other subsections to learn about other types of wavelet filters.

http://www.cs.ucf.edu/~mali/haar/
An Introduction to Wavelets and the Haar Transform
by Musawir Ali

In this article, I will present an introduction to wavelets and the 1D Haar Transform. Then I will show how the 1D Haar Transform can easily be extended to 2D. This article assumes that the reader has general knowledge about basis functions. If not, please look at my introductory article on basis functions before reading this one.

What are Wavelets?

Wavelets are a set of non-linear bases. When projecting (or approximating) a function in terms of wavelets, the wavelet basis functions are chosen according to the function being approximated. Hence, unlike families of linear bases, where the same static set of basis functions is used for every input function, wavelets employ a dynamic set of basis functions that represents the input function in the most efficient way. Thus wavelets are able to provide a great deal of compression and are therefore very popular in the fields of image and signal processing.

Let's get into the details of how this dynamic set of basis functions is chosen and how the input function is transformed into wavelets.

Let's suppose that we have a string of numbers, (2, 2, 2, 2), and we want to transmit this over a network. Of course, we would like to do this in the fastest possible way, which implies that we want to send the least amount of data possible. So let's consider our options. Trivially, one of our options is to just send all four numbers, i.e., send the first '2', then the second '2', then the third, and lastly the fourth. In doing so, we are implicitly choosing the following basis:

<1, 0, 0, 0>

<0, 1, 0, 0>
<0, 0, 1, 0>
<0, 0, 0, 1>

But as you would suspect, this is not the best way of doing things. Can we do
better? The trick is to choose a basis that represents our data efficiently and
in a very compact fashion. Notice that our data is pretty uniform; in fact it is
just a constant signal of 2. We would like to exploit this uniformity. If we
choose the basis vector <1, 1, 1, 1>, we can represent our data by just one
number! We would only have to send the number 2 over the network, and our
entire data string could be reconstructed by just multiplying (or weighting)
with the basis vector <1, 1, 1, 1>. This is great, but we still need three more
basis vectors to complete our basis since the space in our example is 4-dimensional. Remember that all basis vectors have to be orthogonal (or
perpendicular). This means that if you take the dot (or scalar) product of any
two basis vectors, the result should be zero. So our task is to find a vector
that is orthogonal to <1, 1, 1, 1>. One such vector is <1, 1, -1, -1>. If you
take the dot product of these two vectors, the result is indeed zero.
Graphically, these two vectors look like this:

<1,1,1,1>

<1,1,-1,-1>

Notice that graphically these basis vectors look like waves, hence the name
wavelets. Now that we have two basis vectors, we need two more. Haar
constructed the remaining basis vectors by a process of dilation and shifting.
Dilation basically means squeezing; therefore the remaining basis vectors
were constructed by squeezing and shifting. If we squeeze the vector <1, 1, -1, -1>, we get <1, -1, 0, 0>. The 1, 1 pair gets squeezed into a single 1, and
similarly the -1, -1 pair becomes a single -1. Next, we perform a shift on the
resultant basis vector and get: <0, 0, 1, -1> which is our final basis vector.
Graphically, these two vectors look like this:

<1,-1,0,0>

<0,0,1,-1>

We now have a complete basis for our four dimensional space, comprised of
the following basis vectors or wavelets.

<1, 1, 1, 1>
<1, 1, -1, -1>
<1, -1, 0, 0>
<0, 0, 1, -1>

Take time to convince yourself that all four of these vectors are perpendicular
to each other (take the dot product and see if it is zero). Even though these
basis vectors are orthogonal, they are not orthonormal. However, we can
easily normalize them by calculating the magnitude of each of these vectors
and then dividing their components by that magnitude.

<1, 1, 1, 1>    -->  magnitude = sqrt(1^2 + 1^2 + 1^2 + 1^2)       = 2   -->  <1/2, 1/2, 1/2, 1/2>
<1, 1, -1, -1>  -->  magnitude = sqrt(1^2 + 1^2 + (-1)^2 + (-1)^2) = 2   -->  <1/2, 1/2, -1/2, -1/2>
<1, -1, 0, 0>   -->  magnitude = sqrt(1^2 + (-1)^2 + 0^2 + 0^2)    = √2  -->  <1/√2, -1/√2, 0, 0>
<0, 0, 1, -1>   -->  magnitude = sqrt(0^2 + 0^2 + 1^2 + (-1)^2)    = √2  -->  <0, 0, 1/√2, -1/√2>

Now that we have our basis, let us look at an example of how we can project an input vector into wavelets. This is also known as the 1D Haar Transform.
an input vector in to wavelets. This is also known as the 1D Haar Transform.

1D Haar Transform

Suppose our input vector is <4, 2, 5, 5>. To project this into wavelets, we simply take a dot product of the input vector with each of the basis vectors.

dot( <4, 2, 5, 5> , <1/2, 1/2, 1/2, 1/2> )    =  8
dot( <4, 2, 5, 5> , <1/2, 1/2, -1/2, -1/2> )  = -2
dot( <4, 2, 5, 5> , <1/√2, -1/√2, 0, 0> )     =  √2
dot( <4, 2, 5, 5> , <0, 0, 1/√2, -1/√2> )     =  0

Thus the input vector got transformed into <8, -2, √2, 0>. Notice the 4th component is 0! This means that we do not need the 4th basis vector; we can reconstruct our original input vector with just the first three basis vectors. In other words, we dynamically chose 3 basis vectors from a possible 4 according to our input.
 8 * <1/2, 1/2, 1/2, 1/2>    = <4, 4, 4, 4>
-2 * <1/2, 1/2, -1/2, -1/2>  = <-1, -1, 1, 1>
√2 * <1/√2, -1/√2, 0, 0>     = <1, -1, 0, 0>

Adding these vectors gives <4, 4, 4, 4> + <-1, -1, 1, 1> + <1, -1, 0, 0> = <4, 2, 5, 5>.
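The projection and reconstruction just traced amount to a few dot products. Here is a small C++ sketch of the same computation (my own addition, not from the article):

#include <cmath>
#include <iostream>

int main()
{
    const double s = 1.0 / std::sqrt(2.0);
    const double basis[4][4] = {
        { 0.5,  0.5,  0.5,  0.5 },
        { 0.5,  0.5, -0.5, -0.5 },
        {   s,   -s,  0.0,  0.0 },
        { 0.0,  0.0,    s,   -s }
    };
    const double input[4] = { 4, 2, 5, 5 };

    // Project: one dot product per basis vector gives the coefficients 8, -2, sqrt(2), 0.
    double coeff[4] = {};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            coeff[i] += input[j] * basis[i][j];

    // Reconstruct: weight each basis vector by its coefficient and add the results.
    double rebuilt[4] = {};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            rebuilt[j] += coeff[i] * basis[i][j];

    for (double x : rebuilt) std::cout << x << ' ';   // prints 4 2 5 5
    std::cout << '\n';
}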

Until now, we had a 4-component input vector and a corresponding set of 4-component basis vectors. But what if we have a larger input vector, say with 8 components? Would we need to find new basis vectors? The answer is no. We can use the smaller basis that we already have. In fact, we can use the simplest wavelet basis, which consists of <1/√2, 1/√2> and <1/√2, -1/√2>. These are the smallest wavelets; notice that you cannot squeeze them any further. However, in choosing these smaller basis vectors for larger input, we can no longer do the Haar wavelet transform in one pass as we did earlier. We will have to recursively transform the input vector until we get to our final result. As an example, let us use the simple, 2-component basis to transform the 4-component input vector that we had in our previous example. The algorithm is outlined below, and our example is traced alongside.

Example

Input vector: <4, 2, 5, 5>
Transformed vector (basis coefficients): <?, ?, ?, ?> (we are calculating this)
Basis vectors (wavelets): <1/√2, 1/√2> and <1/√2, -1/√2>

1. If the size of the input vector is less than the size of the basis vector, place the input in
   the transformed vector and exit.
   [Size of input vector = 4, size of basis vector = 2; 4 > 2, so continue.]

2. Break the input vector into parts the size of the wavelet:
   <4, 2> , <5, 5>

3. Take the dot product of the first basis vector with every part:
   dot( <4, 2> , <1/√2, 1/√2> ) = 6/√2
   dot( <5, 5> , <1/√2, 1/√2> ) = 10/√2

4. Take the dot product of the second basis vector with every part:
   dot( <4, 2> , <1/√2, -1/√2> ) = 2/√2 = √2
   dot( <5, 5> , <1/√2, -1/√2> ) = 0

   The result of step 3 is the new input vector: <6/√2, 10/√2>.
   The result of step 4 fills part of the transformed vector: <?, ?, √2, 0>.

5. Go to 1.
   [Size of input vector = 2, size of basis vector = 2; 2 = 2, so continue.]

   Second pass:
   dot( <6/√2, 10/√2> , <1/√2, 1/√2> )  = 8
   dot( <6/√2, 10/√2> , <1/√2, -1/√2> ) = -2

   New input vector: <8>.
   Updated transformed vector: <?, -2, √2, 0>.

   [Size of input vector = 1, size of basis vector = 2; 1 < 2, so place the input in the
   transformed vector and exit.]

Transformed vector (final result): <8, -2, √2, 0>

This algorithm is very simple. If you think about it, all it does is take the sums and differences of every pair of numbers in the input vector and divide them by the square root of 2. Then the process is repeated on the resultant vector of summed terms. Following is an implementation of the 1D Haar Transform in C++:

/* Inputs: vec = input vector, n = size of input vector (a power of two) */
#include <cmath>

void haar1d(float *vec, int n)
{
    float *vecp = new float[n];
    for (int i = 0; i < n; i++)
        vecp[i] = 0;

    int w = n;
    while (w > 1)
    {
        w /= 2;
        // Pairwise sums go in the first half, differences in the second half,
        // each scaled by 1/sqrt(2).
        for (int i = 0; i < w; i++)
        {
            vecp[i]     = (vec[2*i] + vec[2*i+1]) / sqrt(2.0);
            vecp[i+w]   = (vec[2*i] - vec[2*i+1]) / sqrt(2.0);
        }
        // Copy the partially transformed values back and repeat on the first half.
        for (int i = 0; i < (w*2); i++)
            vec[i] = vecp[i];
    }
    delete [] vecp;
}
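For example, calling the function on the input vector from the earlier example should reproduce the coefficients <8, -2, √2, 0>. This small driver is my own addition, not part of the original listing:

#include <iostream>

void haar1d(float *vec, int n);   // defined in the listing above

int main()
{
    float v[4] = { 4, 2, 5, 5 };
    haar1d(v, 4);
    for (float x : v)
        std::cout << x << ' ';    // prints 8 -2 1.41421 0
    std::cout << '\n';
}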

2D Haar Transform
The 1D Haar Transform can be easily extended to 2D. In the 2D case, we
operate on an input matrix instead of an input vector. To transform the input
matrix, we first apply the 1D Haar transform on each row. We take the
resultant matrix, and then apply the 1D Haar transform on each column. This
gives us the final transformed matrix. The source code for both the 1D and 2D
Haar transform can be downloaded here. The 2D Haar transform is used
extensively in image compression.
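As a rough sketch of that row-then-column procedure (my own, reusing the haar1d routine from the listing above; the function name haar2d is hypothetical):

// Apply the 1D Haar transform to each row, then to each column, of an n x n
// matrix stored row by row. Reuses haar1d from the listing above.
void haar2d(float *mat, int n)
{
    // Transform every row in place.
    for (int r = 0; r < n; ++r)
        haar1d(mat + r * n, n);

    // Transform every column through a temporary buffer.
    float *col = new float[n];
    for (int c = 0; c < n; ++c) {
        for (int r = 0; r < n; ++r)
            col[r] = mat[r * n + c];
        haar1d(col, n);
        for (int r = 0; r < n; ++r)
            mat[r * n + c] = col[r];
    }
    delete [] col;
}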
