Project
Submitted in partial fulfillment of the requirements
For the degree of
BACHELOR OF ENGINEERING
By
RAJESH PRASANNAKUMAR
PRIYANK SAXENA
RAJESH SWAMINATHAN
UNIVERSITY OF MUMBAI
2004-05
INTRODUCTION
Often signals we wish to process are in the time-domain, but in order to process them
more easily other information, such as frequency, is required. Mathematical transforms
translate the information of signals into different representations. For example, the
Fourier transform converts a signal between the time and frequency domains, such that
the frequencies of a signal can be seen. However, the Fourier transform cannot provide
information on which frequencies occur at specific times in the signal, as time and
frequency are viewed independently. To solve this problem, the Short Term Fourier
Transform (STFT) introduced the idea of windows through which different parts of a
signal are viewed. For a given window in time the frequencies can be viewed. However,
Heisenberg's Uncertainty Principle states that as the resolution of the signal improves in
the time domain, by zooming in on different sections, the frequency resolution gets worse.
Ideally, a method of multiresolution is needed, which allows certain parts of the signal to
be resolved well in time, and other parts to be resolved well in frequency.
The power and magic of wavelet analysis is exactly this multiresolution. Images contain
large amounts of information that requires much storage space, large transmission
bandwidths and long transmission times. Therefore it is advantageous to compress the
image by storing only the essential information needed to reconstruct the image. An
image can be thought of as a matrix of pixel (or intensity) values. In order to compress
the image, redundancies must be exploited, for example, areas where there is little or no
change between pixel values. Therefore images having large areas of uniform colour will
have large redundancies, and conversely images that have frequent and large changes in
colour will be less redundant and harder to compress.
Wavelet analysis can be used to divide the information of an image into approximation
and detail subsignals. The approximation subsignal shows the general trend of pixel
values, and three detail subsignals show the vertical, horizontal and diagonal details or
changes in the image. If these details are very small then they can be set to zero without
significantly changing the image. The value below which details are considered small
enough to be set to zero is known as the threshold. The greater the number of zeros the
greater the compression that can be achieved. The amount of information retained by an
image after compression and decompression is known as the energy retained and this is
proportional to the sum of the squares of the pixel values. If the energy retained is 100%
then the compression is known as lossless, as the image can be reconstructed exactly.
This occurs when the threshold value is set to zero, meaning that the detail has not been
changed. If any values are changed then energy will be lost and this is known as lossy
compression. Ideally, during compression the number of zeros and the energy retention
will be as high as possible. However, as more zeros are obtained more energy is lost, so a
balance between the two needs to be found.
The first part of the report introduces the background of wavelets and compression in
more detail. This is followed by a review of a practical investigation into how
compression can be achieved with wavelets and the results obtained. The purpose of the
investigation was to find the effect of the decomposition level, wavelet and image on the
number of zeros and energy retention that could be achieved. For reasons of time, the set
of images, wavelets and levels investigated was kept small: only one family of wavelets,
the Daubechies family (of which the Haar wavelet is the simplest member), was used. The final part of the report
discusses image properties and thresholding, two issues which have been found to be of
great importance in compression.
CHAPTER 2
REVIEW OF LITERATURE
2.1 THE NEED FOR WAVELETS
Often signals we wish to process are in the time-domain, but in order to process them
more easily other information, such as frequency, is required. A good analogy for this
idea is given by Hubbard[4]: multiplying two Roman numerals. In order to do this
calculation we would find it easier to first translate the numerals into our number system,
and then translate the answer back into a Roman numeral. The result is the same, but
taking the detour into an alternative number system made the process easier and quicker.
Similarly we can take a detour into frequency space to analyse or process a signal.
Figure 2.1 The different transforms provide different resolutions of time and frequency.
In Fourier analysis a signal is broken up into sine and cosine waves of different
frequencies; it effectively re-writes a signal in terms of different sine and cosine
waves. Wavelet analysis does a similar thing: it takes a mother wavelet, and the signal is
translated into shifted and scaled versions of this mother wavelet.
2.1.2 The Continuous Wavelet Transform (CWT)
The continuous wavelet transform is the sum over all time of scaled and shifted versions
of the mother wavelet ψ. Calculating the CWT results in many coefficients C, which are
functions of scale and translation:

C(s, τ) = ∫ f(t) ψ(s, τ, t) dt
The translation, τ, is proportional to time information and the scale, s, is proportional to
the inverse of the frequency information. To find the constituent wavelets of the signal,
the coefficients should be multiplied by the relevant version of the mother wavelet. The
scale of a wavelet simply means how stretched it is along the x-axis: larger scales are
more stretched. The translation is how far it has been shifted along the x-axis. Figure 2.3
shows a wavelet; figure 2.4 shows the same mother wavelet translated by k.
Figure 2.4 The translated wavelet.
The coefficients produced form a matrix over the different scale and translation values;
higher coefficients indicate a high correlation between that part of the signal and that
version of the wavelet. Figure 2.5 shows a signal and a plot of the corresponding CWT
coefficient matrix.
The colours in the coefficient matrix plot show the relative sizes of the coefficients. The
signal is very similar to the wavelet in light areas; dark areas show that the corresponding
time and scale versions of the wavelet were dissimilar to the signal.
Figure 2.5 Screen print from Matlab Wavelet Toolbox GUI. The top graph shows the
signal to be analysed with the CWT. The bottom plot shows the coefficients at
corresponding scale and times. The horizontal axis is time, the vertical axis is scale.
If the signal is put through two filters:
(i) a high-pass filter, where high frequency information is kept and low frequency
information is lost;
(ii) a low-pass filter, where low frequency information is kept and high frequency
information is lost;
then the signal is effectively decomposed into two parts: a detail part (high frequency)
and an approximation part (low frequency). The subsignal produced from the low-pass
filter will have a highest frequency equal to half that of the original. According to the
Nyquist sampling theorem, this change in frequency range means that only half of the
original samples need to be kept in order to perfectly reconstruct the signal. More
specifically, this means that downsampling can be used to remove every second sample.
The scale has now been doubled. The resolution has also been changed: the filtering made
the frequency resolution better, but reduced the time resolution.
The approximation subsignal can then be put through a filter bank, and this is repeated
until the required level of decomposition has been reached. The ideas are shown in figure
below:
Figure 2.6
The DWT is obtained by collecting together the coefficients of the final approximation
subsignal and all the detail subsignals.
Overall the filters have the effect of separating out finer and finer detail; if all the details
are 'added' together then the original signal should be reproduced. Using a further
analogy from Hubbard[4], this decomposition is like decomposing the ratio 87/7 into parts
of increasing detail, such that:
87 / 7 = 10 + 2 + 0.4 + 0.02 + 0.008 + 0.0005
The detailed parts can then be re-constructed to form 12.4285 which is an approximation
of the original number 87/7.
Images require much storage space, large transmission bandwidth and long transmission
time. The only way currently to improve on these resource requirements is to compress
images, such that they can be transmitted quicker and then decompressed by the receiver.
In image processing there are 256 intensity levels (scales) of grey. 0 is black and 255 is
white. Each level is represented by an 8-bit binary number so black is 00000000 and
white is 11111111. An image can therefore be thought of as grid of pixels, where each
pixel can be represented by the 8-bit binary value for grey-scale.
The resolution of an image is the number of pixels per inch, so 500 dpi means that a pixel
is 1/500th of an inch across. To digitise a one-inch square image at 500 dpi therefore
requires 500 x 500 x 8 = 2 million storage bits. Using this representation it is clear that image data compression
is a great advantage if many images are to be stored, transmitted or processed.
According to [6] "Image compression algorithms aim to remove redundancy in data in a
way which makes image reconstruction possible." This basically means that image
compression algorithms try to exploit redundancies in the data; they calculate which data
needs to be kept in order to reconstruct the original image and therefore which data can
be 'thrown away'. By removing the redundant data, the image can be represented in a
smaller number of bits, and hence can be compressed.
A typical approach is to:
1. Find image data properties: grey-level histogram, image entropy, correlation functions,
etc.
2. Find an appropriate compression technique for an image with those properties.
In MATLAB, an indexed image is stored in two parts:
(i) The colormap is a matrix of values representing all the colours in the image.
(ii) The image matrix contains indices into the colormap.
A colormap matrix is of size N x 3, where N is the number of different colours in the
image. Each row represents the red, green and blue components of one colour.
In 2D, the images are considered to be matrices with N rows and M columns. At every
level of decomposition the horizontal data is filtered, then the approximation and details
produced from this are filtered on columns.
Figure 2.8 2D wavelet analysis
At every level, four sub-images are obtained; the approximation, the vertical detail, the
horizontal detail and the diagonal detail. Below the Saturn image has been decomposed to
one level. The wavelet analysis has found how the image changes vertically, horizontally
and diagonally.
MATLAB has two interfaces which can be used for compressing images: the command
line and the GUI. Both interfaces take an image decomposed to a specified level with a
specified wavelet and calculate the amount of energy loss and the number of zeros. When
compressing with orthogonal wavelets the energy retained is [7]:

energy retained (%) = 100 x (norm of the compressed coefficients)^2 / (norm of the original signal)^2
CHAPTER 3
WAVELET TRANSFORMS
The low frequency components (smooth variations) constitute the base of an image, and
the high frequency components (the edges which give the detail) add upon them to refine
the image, thereby giving a detailed image. Hence, the smooth variations demand more
importance than the details.
Separating the smooth variations and details of the image can be done in many ways. One
such way is the decomposition of the image using a Discrete Wavelet Transform (DWT).
Digital image compression is based on the ideas of sub band decomposition or discrete
wavelet transform (DWT). In fact, wavelets refer to a set of basis functions, which is
defined recursively from a set of scaling coefficients and scaling functions. The DWT is
defined using these scaling functions and can be used to analyze digital images with
superior performance than classical short-time Fourier-based techniques, such as the
DCT. The basic difference between wavelet-based and Fourier-based techniques is that
short-time Fourier-based techniques use a fixed analysis window, while wavelet-based
techniques can be considered as using a short window for high spatial frequency data and
a long window for low spatial frequency data. This makes the DWT more accurate in
analyzing image signals at different spatial frequencies, and thus able to represent both
smooth and dynamic regions in an image more precisely. The compressor includes a
forward wavelet transform, a quantizer, and a lossless entropy encoder. The corresponding
decompressor is formed by a lossless entropy decoder, a de-quantizer, and an inverse
wavelet transform. Wavelet-based image compression gives good compression results in
both the rate and distortion sense.
Image Compression is different from data compression (binary data). When we apply
techniques used for binary data compression to the images, the results are not optimal.
In lossless compression the data (binary data such as executables, documents etc.) are
compressed such that, when decompressed, they give an exact replica of the original data.
They need to be exactly reproduced when decompressed. On the other hand, images need
not be reproduced 'exactly'. An approximation of the original image is enough for most
purposes, as long as the error between the original and the compressed image is tolerable.
Lossy compression techniques can be used in this area. This is because images have
certain statistical properties, which can be exploited by encoders specifically designed for
them. Also, some of the finer details in the image can be sacrificed for the sake of saving
a little more bandwidth or storage space.
In images the neighboring pixels are correlated and therefore contain redundant
information. Before we compress an image, we first find the pixels which are
correlated. The fundamental components of compression are redundancy and irrelevancy
reduction. Redundancy means duplication, and irrelevancy means the parts of the signal
that will not be noticed by the signal receiver, which is the Human Visual System (HVS).
Three types of redundancy can be identified:
Spatial Redundancy i.e. correlation between neighboring pixel values.
Spectral Redundancy i.e. correlation between different color planes or spectral bands.
Temporal Redundancy i.e. correlation between adjacent frames in a sequence of images
(in video applications).
Image compression focuses on reducing the number of bits needed to represent an image
by removing the spatial and spectral redundancies. Since this project is about still image
compression, temporal redundancy is not relevant.
Decoding consists of:
(i) upsampling;
(ii) recomposition of the signal with a filter bank.
Before going into the details of wavelet theory and its application to image compression, I
will describe a few of the terms most commonly used in this report.
A block diagram of a wavelet based image compression system is shown in Figure 3.1.
At the heart of the analysis (or compression) stage of the system is the forward discrete
wavelet transform (DWT). Here, the input image is mapped from a spatial domain, to a
scale-shift domain. This transform separates the image information into octave frequency
subbands. The expectation is that certain frequency bands will have zero or negligible
energy content; thus, information in these bands can be thrown away or reduced so that
the image is compressed without much loss of information.
The DWT coefficients are then quantized to achieve compression. Information lost
during the quantization process cannot be recovered and this impacts the quality of the
reconstructed image. Due to the nature of the transform, DWT coefficients exhibit spatial
correlations that are exploited by quantization algorithms like the embedded zero-tree
wavelet (EZW) and set partitioning in hierarchical trees (SPIHT) algorithms for efficient
quantization. The quantized coefficients may then be entropy coded; this is a reversible
process that eliminates any redundancy at the output of the quantizer.
Figure 3.1: Block diagram of a wavelet-based image compression system.
In the synthesis (or decompression) stage, the inverse discrete wavelet transform recovers
the original image from the DWT coefficients. In the absence of any quantization the
reconstructed image will be identical to the input image. However, if any information
was discarded during the quantization process, the reconstructed image will only be an
approximation of the original image. Hence this is called lossy compression. The more an
image is compressed, the more information is discarded by the quantizer; the result is a
reconstructed image that exhibits increasingly more artifacts. Certain integer wavelet
transforms exist that result in DWT coefficients that can be quantized without any loss of
information. These result in lossless compression, where the reconstructed image is an
exact replica of the input image. However, compression ratios achieved by these
transforms are small compared to lossy transforms (e.g. 4:1 compared to 40:1).
3.5.1 Multiresolution
The discrete wavelet transform decomposes the L2(R) space into a set of subspaces Vj,
where

Vj ⊂ Vj+1 for all j

and

∪j Vj is dense in L2(R).

Figure 3.2 illustrates these nested subspaces. Subspace Vj is spanned by the set of basis
functions given by φj,k(t) = 2^(j/2) φ(2^j t − k), which are derived from dyadic shifts and
scales of a unit norm function φ(t). Every function x(t), when mapped onto
these subspaces, can be written as a linear combination of these basis functions as:

xj(t) = Σk aj,k φj,k(t)

where φ(t) is called the scaling function, and aj,k are the scaling coefficients. This,
however, is not an orthogonal expansion of x(t), since the subspaces Vj are nested. Also,
since V0 ⊂ V1, the scaling function satisfies the two-scale relation

φ(t) = Σk c(k) √2 φ(2t − k)

where c(k) is the projection of φ(t) on the basis functions of V1, and is given by

c(k) = ⟨φ(t), √2 φ(2t − k)⟩.
The 2-D DWT (analysis) decomposes an image x[m, n] by filtering and downsampling
along its rows and then its columns. The synthesis bank performs the 2-D IDWT to
reconstruct x[m, n] from the subband coefficients.
A single stage of a 2-D filter bank is shown in Figure 3.3. First, the rows of the input
image are filtered by the highpass and lowpass filters. The outputs from these filters are
downsampled by two, and then the columns of the outputs are filtered and downsampled
to decompose the image into four subbands. The synthesis stage performs upsampling
and filtering to reconstruct the original image.
Figure 3.3: A single stage of a 2-D filter bank.
Multiple levels of decomposition are achieved by iterating the analysis stage on only the
LL band. For l levels of decomposition, the image is decomposed into 3l + 1 subbands.
Figure 3.4 shows the filter bank structure for three levels of decomposition of an input
image, and Figure 3.5 shows the subbands in the transformed image.
Figure 3.6 shows the three-level decomposition of the image "Lighthouse" using the
biorthogonal 9/7 wavelet. Subband decomposition separates high frequency details in the
image (edges of the fence, grass, telescope) from the low frequency information (sky,
wall of the lighthouse). Vertical details appear in the LH* subbands, horizontal details
appear in the HL* subbands and diagonal details can be found in the HH* subbands.
Figure 3.5: Three level wavelet decomposition of an image.
The histograms of the coefficients in all subbands illustrate the advantage of
transform coding. The original 512 x 512 lighthouse image had few zero pixel values. The
three level DWT of the image separated the energy content of the image into ten
nonoverlapping frequency subbands. There are still 512 x 512 DWT coefficients, but most
of these coefficients (especially in the HL*, LH* and HH* subbands) are zero, or have
values very close to zero. These coefficients can be efficiently coded in the quantizer and
entropy coder, resulting in significant compression. The coefficient distribution for the
LLLLLL subband resembles that of the original image. This is indicative of the fact that
the LLLLLL band is just an approximation of the original image at a coarser resolution.
Coefficients in the LLLLLL band have the highest variance (energy), suggesting that in
this image most of the energy is contained at low frequencies. This indeed is the case for
most natural images.
Figure 3.6: Three level decomposition of 'lighthouse' using the biorthogonal 9/7 wavelet.
CHAPTER 4
THE HAAR WAVELET TRANSFORM
The wavelet coefficients are calculated along with the new average time series
values. The coefficients represent the average change over the window. If the
window's width is two this would be:

for (i = 0; i < n; i = i + 2)
    c[i/2] = (v[i] - v[i+1]) / 2;
The graph below shows the coefficient spectrums. As before, the z-axis represents the
log2 of the window width. The y-axis represents the time series change over the
window width. Somewhat counterintuitively, negative values mean that the time
series is moving upward, and positive values mean that the time series is going down,
since v[i] is greater than v[i+1]. Note
that the high frequency coefficient spectrum (log2(windowWidth) = 1) reflects the
noisiest part of the time series. Here the change between values fluctuates around zero.
The Haar wavelet transform has a number of advantages:
- It is conceptually simple.
- It is fast.
- It is memory efficient, since it can be calculated in place without a temporary
array.
- It is exactly reversible without the edge effects that are a problem with other
wavelet transforms.
The Haar transform also has limitations, which can be a problem for some
applications.
In generating each set of averages for the next level and each set of coefficients, the
Haar transform performs an average and difference on a pair of values. Then the
algorithm shifts over by two values and calculates another average and difference on
the next pair.
The high frequency coefficient spectrum should reflect all high frequency changes, but
the Haar window is only two elements wide. If a big change takes place from an
even element to an odd element, the change will not be reflected in the high frequency
coefficients.
For example, in the 64 element time series graphed below, there is a large drop
between elements 16 and 17, and between elements 44 and 45.
Figure 4.3: High frequency coefficient spectrum.
Since these are high frequency changes, we might expect to see them reflected in the
high frequency coefficients. However, in the case of the Haar wavelet transform the
high frequency coefficients miss these changes, since they are on even to odd
elements.
The surface below shows three coefficient spectrums: 32, 16 and 8 (where the 32
element coefficient spectrum is the highest frequency). The high frequency spectrum
is plotted on the leading edge of the surface; the lowest frequency spectrum (8) is on the
far edge of the surface.
Figure 4.4: The low frequency spectrum (8).
Note that both large magnitude changes are missing from the high frequency
spectrum (32). The first change is picked up in the next spectrum (16) and the second
change is picked up in the last spectrum in the graph (8).
Many other wavelet algorithms, like the Daubechies wavelet algorithm, use
overlapping windows, so the high frequency spectrum reflects all changes in the time
series. Like the Haar algorithm, Daubechies shifts by two elements at each step.
However, the average and difference are calculated over four elements, so there are
no "holes".
The graph below shows the high frequency coefficient spectrum calculated from the
same 64 element time series, but with the Daubechies D4 wavelet algorithm. Because
of the overlapping averages and differences the change is reflected in this spectrum.
The 32, 16 and 8 coefficient spectrums, calculated with the Daubechies D4 wavelet
algorithm, are shown below as a surface. Note that the change in the time series is
reflected in all three coefficient spectrums.
Figure 4.5: The high frequency spectrum (32).
CHAPTER 5
THE DAUBECHIES D4 WAVELET TRANSFORM
The Daubechies wavelet transform is named after its inventor (or would it be
discoverer?), the mathematician Ingrid Daubechies. The Daubechies D4 transform
has four wavelet and scaling function coefficients. The scaling function coefficients
are:

h0 = (1 + √3) / (4√2)
h1 = (3 + √3) / (4√2)
h2 = (3 − √3) / (4√2)
h3 = (1 − √3) / (4√2)
Each step of the wavelet transform applies the scaling function to the data input. If the
original data set has N values, the scaling function will be applied in the wavelet
transform step to calculate N/2 smoothed values. In the ordered wavelet transform the
smoothed values are stored in the lower half of the N element input vector.
The wavelet function coefficient values are:
g0 = h3
g1 = -h2
g2 = h1
g3 = -h0
Each step of the wavelet transform applies the wavelet function to the input data. If
the original data set has N values, the wavelet function will be applied to calculate
N/2 differences (reflecting change in the data). In the ordered wavelet transform the
wavelet values are stored in the upper half of the N element input vector. The scaling
and wavelet functions are calculated by taking the inner product of the coefficients
and four data values. The equations are shown below:

scaling[i] = h0·s[2i] + h1·s[2i+1] + h2·s[2i+2] + h3·s[2i+3]
wavelet[i] = g0·s[2i] + g1·s[2i+1] + g2·s[2i+2] + g3·s[2i+3]
Each iteration in the wavelet transform step calculates a scaling function value and a
wavelet function value. The index is incremented by two with each iteration, and
new scaling and wavelet function values are calculated. In the case of the forward
transform, with a finite data set (as opposed to the mathematician's imaginary infinite
data set), the index will be incremented until it is equal to N-2. In the last iteration the
inner product will be calculated from s[N-2], s[N-1], s[N] and s[N+1].
Since s[N] and s[N+1] don't exist (they are beyond the end of the array), this presents a
problem. This is shown in the transform matrix below.
Note that this problem does not exist for the Haar wavelet, since it is calculated on
only two elements, s[i] and s[i+1]. A similar problem exists in the case of the inverse
transform. Here the inverse transform coefficients extend beyond the beginning of the
data, where the first two inverse values are calculated from s[-2], s[-1], s[0] and s[1].
This is shown in the inverse transform matrix below.
There are several ways of handling this edge problem:
1. Treat the data set as if it is periodic. The beginning of the data sequence repeats
following the end of the sequence (in the case of the forward transform) and the
end of the data wraps around to the beginning (in the case of the inverse
transform).
2. Treat the data set as if it is mirrored at the ends. This means that the data is
reflected from each end, as if a mirror were held up to each end of the data
sequence.
3. Gram-Schmidt orthogonalization. Gram-Schmidt orthogonalization calculates
special scaling and wavelet functions that are applied at the start and end of the
data set.
Zeros can also be used to fill in for the missing elements, but this can introduce
significant error.
i = 0;
for (j = 0; j < n - 3; j = j + 2) {
    // smoothed (scaling function) value, stored in the lower half
    tmp[i] = a[j]*h0 + a[j+1]*h1 + a[j+2]*h2 + a[j+3]*h3;
    // wavelet (difference) value, stored in the upper half
    tmp[i+half] = a[j]*g0 + a[j+1]*g1 + a[j+2]*g2 + a[j+3]*g3;
    i++;
}
The inverse transform works on N data elements, where the first N/2 elements are
smoothed values and the second N/2 elements are wavelet function values. The inner
product that is calculated to reconstruct a signal value is calculated from two
smoothed values and two wavelet values. Logically, the data from the end is wrapped
around from the end to the start. In the comments the "coef. val" refers to a wavelet
function value and a smooth value refers to a scaling function value.
Applying the forward transform to a sample data set, we get the following ordered
wavelet transform result (printed to show the scaling value, followed by the bands of
wavelet coefficients).
The final scaling value in the Daubechies D4 transform is not the average of the data
set (the average of the data set is 25.9375), as it is in the case of the Haar transform.
This suggests to me that the low pass filter part of the Daubechies transform (i.e., the
scaling function) does not produce as smooth a result as the Haar transform.
CHAPTER 6
IMPLEMENTATION
/* A DIB contains a two-dimensional array of elements called pixels. A pixel consists of
1, 4, 8, 16, 24, or 32 contiguous bits, depending on the color resolution of the DIB. For
16-bpp, 24-bpp, and 32-bpp DIBs, each pixel represents an RGB color; for 1-bpp, 4-bpp,
and 8-bpp DIBs, each pixel is an index into the color table. The BITMAPFILEHEADER
structure contains the offset to the image bits. The BITMAPINFOHEADER structure
contains the bitmap dimensions, the bits per pixel, compression information for both
4-bpp and 8-bpp bitmaps, and the number of color table entries. */
#ifndef _INSIDE_VISUAL_CPP_CDIB
#define _INSIDE_VISUAL_CPP_CDIB
class CDib : public CObject
{
    // (other data members of the class are omitted from this listing)
    HANDLE m_hFile;
    HANDLE m_hMap;
    LPVOID m_lpvFile;
    HPALETTE m_hPalette;
public:
CDib();
CDib(CSize size, int nBitCount); // builds BITMAPINFOHEADER
~CDib();
int GetSizeImage() {return m_dwSizeImage;}
int GetSizeHeader()
    {return sizeof(BITMAPINFOHEADER) + sizeof(RGBQUAD) * m_nColorTableEntries;}
CSize GetDimensions();
BOOL AttachMapFile(const char* strPathname, BOOL bShare = FALSE);
BOOL CopyToMapFile(const char* strPathname);
BOOL AttachMemory(LPVOID lpvMem, BOOL bMustDelete = FALSE,
HGLOBAL hGlobal = NULL);
BOOL Draw(CDC* pDC, CPoint origin, CSize size); // until we implement CreateDibSection
HBITMAP CreateSection(CDC* pDC = NULL);
UINT UsePalette(CDC* pDC, BOOL bBackground = FALSE);
BOOL MakePalette();
BOOL SetSystemPalette(CDC* pDC);
BOOL Compress(CDC* pDC, BOOL bCompress = TRUE); // FALSE means decompress
HBITMAP CreateBitmap(CDC* pDC);
BOOL Read(CFile* pFile);
BOOL ReadSection(CFile* pFile, CDC* pDC = NULL);
BOOL Write(CFile* pFile);
void Serialize(CArchive& ar);
void Empty();
private:
void DetachMapFile();
void ComputePaletteSize(int nBitCount);
void ComputeMetrics();
};
#endif // _INSIDE_VISUAL_CPP_CDIB
// CDIB.CPP
#include "stdafx.h"
#include "cdib.h"
#ifdef _DEBUG
#define new DEBUG_NEW
#undef THIS_FILE
static char THIS_FILE[] = __FILE__;
#endif
CDib::CDib()
{
m_hFile = NULL;
m_hBitmap = NULL;
m_hPalette = NULL;
m_nBmihAlloc = m_nImageAlloc = noAlloc;
Empty();
}
CDib::~CDib()
{
Empty();
}
CSize CDib::GetDimensions()
{
if(m_lpBMIH == NULL) return CSize(0, 0);
return CSize((int) m_lpBMIH->biWidth, (int) m_lpBMIH->biHeight);
}
BOOL CDib::MakePalette()
{
// makes a logical palette (m_hPalette) from the DIB's color table
// this palette will be selected and realized prior to drawing the DIB
if(m_nColorTableEntries == 0) return FALSE;
if(m_hPalette != NULL) ::DeleteObject(m_hPalette);
TRACE("CDib::MakePalette -- m_nColorTableEntries = %d\n",
m_nColorTableEntries);
LPLOGPALETTE pLogPal = (LPLOGPALETTE) new char[2 * sizeof(WORD) +
    m_nColorTableEntries * sizeof(PALETTEENTRY)];
pLogPal->palVersion = 0x300;
pLogPal->palNumEntries = m_nColorTableEntries;
LPRGBQUAD pDibQuad = (LPRGBQUAD) m_lpvColorTable;
for(int i = 0; i < m_nColorTableEntries; i++) {
pLogPal->palPalEntry[i].peRed = pDibQuad->rgbRed;
pLogPal->palPalEntry[i].peGreen = pDibQuad->rgbGreen;
pLogPal->palPalEntry[i].peBlue = pDibQuad->rgbBlue;
pLogPal->palPalEntry[i].peFlags = 0;
pDibQuad++;
}
m_hPalette = ::CreatePalette(pLogPal);
delete [] (char*) pLogPal; // allocated with new char[], so delete [] it
return TRUE;
}
// helper functions
void CDib::ComputePaletteSize(int nBitCount)
{
if((m_lpBMIH == NULL) || (m_lpBMIH->biClrUsed == 0)) {
switch(nBitCount) {
case 1:
m_nColorTableEntries = 2;
break;
case 4:
m_nColorTableEntries = 16;
break;
case 8:
m_nColorTableEntries = 256;
break;
case 16:
case 24:
case 32:
m_nColorTableEntries = 0;
break;
default:
ASSERT(FALSE);
}
}
else {
m_nColorTableEntries = m_lpBMIH->biClrUsed;
}
ASSERT((m_nColorTableEntries >= 0) && (m_nColorTableEntries <= 256));
}
void CDib::ComputeMetrics()
{
if(m_lpBMIH->biSize != sizeof(BITMAPINFOHEADER)) {
TRACE("Not a valid Windows bitmap -- probably an OS/2 bitmap\n");
throw new CException;
}
m_dwSizeImage = m_lpBMIH->biSizeImage;
if(m_dwSizeImage == 0) {
DWORD dwBytes = ((DWORD) m_lpBMIH->biWidth * m_lpBMIH->biBitCount) / 32;
if(((DWORD) m_lpBMIH->biWidth * m_lpBMIH->biBitCount) % 32) {
dwBytes++;
}
dwBytes *= 4;
m_dwSizeImage = dwBytes * m_lpBMIH->biHeight; // no compression
}
m_lpvColorTable = (LPBYTE) m_lpBMIH + sizeof(BITMAPINFOHEADER);
}
void CDib::Empty()
{
// this is supposed to clean up whatever is in the DIB
DetachMapFile();
if(m_nBmihAlloc == crtAlloc) {
delete [] m_lpBMIH;
}
else if(m_nBmihAlloc == heapAlloc) {
::GlobalUnlock(m_hGlobal);
::GlobalFree(m_hGlobal);
}
if(m_nImageAlloc == crtAlloc) delete [] m_lpImage;
if(m_hPalette != NULL) ::DeleteObject(m_hPalette);
if(m_hBitmap != NULL) ::DeleteObject(m_hBitmap);
m_nBmihAlloc = m_nImageAlloc = noAlloc;
m_hGlobal = NULL;
m_lpBMIH = NULL;
m_lpImage = NULL;
m_lpvColorTable = NULL;
m_nColorTableEntries = 0;
m_dwSizeImage = 0;
m_lpvFile = NULL;
m_hMap = NULL;
m_hFile = NULL;
m_hBitmap = NULL;
m_hPalette = NULL;
}
void CDib::DetachMapFile()
{
if(m_hFile == NULL) return;
::UnmapViewOfFile(m_lpvFile);
::CloseHandle(m_hMap);
::CloseHandle(m_hFile);
m_hFile = NULL;
}
/* The Lifting Scheme version of the Haar transform uses a wavelet function (predict
stage) that "predicts" that an odd element will have the same value as its preceding even
element. The difference between this "prediction" and the actual odd value replaces the
odd element. The wavelet transform calculates a set of detail or difference coefficients in
the predict step. These are stored in the upper half of the array. The update step
calculates an average from each even-odd element pair. These averages replace the
even elements in the lower half of the array. */
protected:
/**
Haar predict step
*/
void predict( T& vec, int N, transDirection direction )
{
    int half = N >> 1;
    for (int i = 0; i < half; i++) {
        double predictVal = vec[i];
        int j = i + half;
        if (direction == forward) {
            vec[j] = vec[j] - predictVal;
        }
        else if (direction == inverse) {
            vec[j] = vec[j] + predictVal;
        }
        else {
            printf("haar::predict: bad direction value\n");
        }
    }
}
/**
Haar update step: each even element is replaced by the average of the
even-odd pair (the odd element already holds the difference)
*/
void update( T& vec, int N, transDirection direction )
{
    int half = N >> 1;
    for (int i = 0; i < half; i++) {
        int j = i + half;
        double updateVal = vec[j] / 2.0;
        if (direction == forward) {
            vec[i] = vec[i] + updateVal;
        }
        else if (direction == inverse) {
            vec[i] = vec[i] - updateVal;
        }
        else {
            printf("update: bad direction value\n");
        }
    }
}
/* The normalization step assures that each step of the wavelet transform keeps the
"energy" of the signal constant */
void normalize( T& vec, int N, transDirection direction )
{
    const double sqrt2 = sqrt( 2.0 );
    int half = N >> 1;
    for (int i = 0; i < half; i++) {
        int j = i + half;
        if (direction == forward) {
            vec[i] = sqrt2 * vec[i];
            vec[j] = vec[j] / sqrt2;
        }
        else if (direction == inverse) {
            vec[i] = vec[i] / sqrt2;
            vec[j] = sqrt2 * vec[j];
        }
        else {
            printf("normalize: bad direction value\n");
        }
    } // for
} // normalize
/*
One inverse wavelet transform step, with normalization
*/
void inverseStep( T& vec, const int n )
{
normalize( vec, n, inverse );
update( vec, n, inverse );
predict( vec, n, inverse );
merge( vec, n );
} // inverseStep
/*
One step in the forward wavelet transform, with normalization
*/
void forwardStep( T& vec, const int n )
{
split( vec, n );
predict( vec, n, forward );
update( vec, n, forward );
normalize( vec, n, forward );
} // forwardStep
}; // haar
#endif
class Daubechies
{
private:
/** forward transform scaling coefficients */
double h0, h1, h2, h3;
/** forward transform wave coefficients */
double g0, g1, g2, g3;
double Ih0, Ih1, Ih2, Ih3;
double Ig0, Ig1, Ig2, Ig3;
/*
Forward Daubechies D4 transform (one step). Periodic boundary handling:
the last smooth/detail pair wraps around to a[0] and a[1].
*/
void transform( double* a, const int n )
{
    if (n >= 4) {
        int i, j;
        const int half = n >> 1;
        double* tmp = new double[n];
        for (i = 0, j = 0; j < n - 3; j += 2, i++) {
            tmp[i]      = a[j]*h0 + a[j+1]*h1 + a[j+2]*h2 + a[j+3]*h3;
            tmp[i+half] = a[j]*g0 + a[j+1]*g1 + a[j+2]*g2 + a[j+3]*g3;
        }
        tmp[i]      = a[n-2]*h0 + a[n-1]*h1 + a[0]*h2 + a[1]*h3;
        tmp[i+half] = a[n-2]*g0 + a[n-1]*g1 + a[0]*g2 + a[1]*g3;
        for (i = 0; i < n; i++) a[i] = tmp[i];
        delete [] tmp;
    }
}
/**
Inverse Daubechies D4 transform (one step)
*/
void invTransform( double* a, const int n )
{
    if (n >= 4) {
        int i, j;
        const int half = n >> 1;
        const int halfPls1 = half + 1;
        double* tmp = new double[n];
        // wrap-around: last smooth value and last detail value
        tmp[0] = a[half-1]*Ih0 + a[n-1]*Ih1 + a[0]*Ih2 + a[half]*Ih3;
        tmp[1] = a[half-1]*Ig0 + a[n-1]*Ig1 + a[0]*Ig2 + a[half]*Ig3;
        for (i = 0, j = 2; i < half - 1; i++) {
            tmp[j++] = a[i]*Ih0 + a[i+half]*Ih1 + a[i+1]*Ih2 + a[i+halfPls1]*Ih3;
            tmp[j++] = a[i]*Ig0 + a[i+half]*Ig1 + a[i+1]*Ig2 + a[i+halfPls1]*Ig3;
        }
        for (i = 0; i < n; i++) a[i] = tmp[i];
        delete [] tmp;
    }
}
public:
Daubechies()
{
const double sqrt_3 = sqrt( 3.0 );
const double denom = 4 * sqrt( 2.0 );
//
// forward transform scaling (smoothing) coefficients
//
h0 = (1 + sqrt_3)/denom;
h1 = (3 + sqrt_3)/denom;
h2 = (3 - sqrt_3)/denom;
h3 = (1 - sqrt_3)/denom;
//
// forward transform wavelet coefficients
//
g0 = h3;
g1 = -h2;
g2 = h1;
g3 = -h0;
Ih0 = h2;
Ih1 = g2; // h1
Ih2 = h0;
Ih3 = g0; // h3
Ig0 = h3;
Ig1 = g3; // -h0
Ig2 = h1;
Ig3 = g1; // -h2
}
}; // Daubechies
CONCLUSION
The general aim of the project was to investigate how wavelets could be used for image
compression. We believe that we have achieved this. We have gained an understanding
of what wavelets are, why they are required and how they can be used to compress
images. We understand the problems involved with choosing threshold values. While the
idea of thresholding is simple and effective, finding a good threshold is not an easy task.
We have also gained a general understanding of how decomposition levels, wavelets and
images change the division of energy between the approximation and detail subsignals.
Overall, we achieved our aim of investigating compression with wavelets.
The effect of the threshold value on the energy level was something we did not
appreciate before collecting the results. To be more specific, we understood that
thresholding had an effect, but did not realise the extent to which it could change the
energy retained and the compression rates. Therefore, when the
investigation was carried out more attention was paid to choosing the wavelets, images
and decomposition levels than to the thresholding strategy. Using global thresholding is
not incorrect; it is a perfectly valid way to threshold. The problem is that it masked the
true effect of the decomposition levels in particular on the results, so the true potential
of a wavelet to compress an image whilst retaining energy was not shown. The
investigation then moved to local thresholding, which was better because each detail
subsignal had its own threshold based on the coefficients it contained, making it easier
to retain energy during compression. Even better local thresholding techniques could be
used: techniques based on the energy contained within each subsignal rather than the
range of coefficient values, using cumulative energy profiles to find the required
threshold values. If the actual energy retained is not important, and only a near-optimal
trade-off is required, a method called BayesShrink [12,13,14] could be used. This
method denoises the image by thresholding the insignificant details, producing zeros
while retaining the significant energy.
We were perhaps too keen to start collecting results in order to analyse them when we
should have spent more time considering the best way to go about the investigation.
Having analysed the results it is clear that the number of thresholds analysed (only 10 for
each combination of wavelet, image and level) was not adequate to conclude which is the
best wavelet and decomposition level to use for an image. There is likely to be an optimal
value that the investigation did not find. So it was difficult to make quantitative
predictions for the behaviour of wavelets with images; only the general trends could be
investigated. We feel, however, that the reason behind our problems with thresholding was that
thresholding is a complex problem in general. Perhaps it would have been better to do
more research into thresholding strategies and images prior to collecting results. This
would not have removed the problem of thresholding but allowed us to make more
informed choices and obtain more conclusive results.
At the start of the project our aim was to discover the effect of decomposition levels,
wavelets and images on the compression. We believe that we have discovered the effect
that each of these has.
There are many extensions to the project, each of which would be a project by itself. The
first area would be finding the best thresholding strategy. How should the best thresholds
be decided? There are many different strategies that can be compared, such as the Birge-
Massart method, "equal balance sparsity norm" and "remove near 0". Perhaps certain areas
of the image could be thresholded differently based on edge detection rather than each
detail subsignal. Wavelet packets could be investigated. These work in a similar way
to the wavelet analysis used in this investigation but the detail subsignals are decomposed
as well as the approximation subsignals. This would be advantageous if there tends to be
a lot of energy in the detail subsignals for an image.
How well a wavelet can compact the energy of a subsignal into the approximation
subsignal depends on the spread of energy in the image. An attempt was made to study
image properties, but it is still unclear how to link image properties to a best wavelet
basis function. Therefore, an investigation into the best basis to use for a given image
could be another extension. Only one family of wavelets was used in the investigation,
the Daubechies wavelets. However there are many other wavelets that could be used such
as Meyer, Morlet and Coiflet. Wavelets can also be used for more than just images; they
can be used for other signals, such as audio, and for processing signals rather than just
compressing them. Compression and denoising are done in similar ways, so compressing
a signal also performs a denoising of it. Overall we
feel that we have achieved quite a lot given the time constraints considering that before
we could start investigating wavelet compression we had to first learn about wavelets and
how to use VC++. We feel that we have learned a great deal about wavelets, compression
and how to analyse images.
REFERENCES
1. D. Donoho and I. Johnstone, "Adapting to Unknown Smoothness via Wavelet
Shrinkage", Journal of the American Statistical Association, vol. 90, pp. 1200-1224, 1995.
4. V. Cherkassky and F. Mulier, Learning from Data, Wiley, New York, 1998.
6. Amir Said and William A. Pearlman, "A New, Fast, and Efficient Image Codec
Based on Set Partitioning in Hierarchical Trees", IEEE Transactions on Circuits and
Systems for Video Technology, vol. 6, pp. 243-250, June 1996.
ACKNOWLEDGEMENT
We take this opportunity to express our gratitude towards our guide Prof. RAJESH
KADU of Department of Computer Engineering, DATTA MEGHE COLLEGE OF
ENGINEERING, Navi Mumbai, for his constant encouragement and guidance.
RAJESH PRASANNAKUMAR
PRIYANK SAXENA
RAJESH SWAMINATHAN
ABSTRACT
Wavelet analysis is very powerful and extremely useful for compressing data such as
images. Its power comes from its multiresolution nature. Although other transforms have
been used, for example the DCT used in the JPEG format, wavelet analysis can be seen
to be far superior in that it does not create blocking artifacts. This is because the wavelet
analysis is done on the entire image rather than on sections at a time. A well-known
application of wavelet analysis is the compression of fingerprint images by the FBI.
The project coding was done in VC++, which could calculate a great number of results
for a range of images, Daubechies wavelets and decomposition levels. The set of results
calculated using global thresholding proved useful in understanding the effects of
decomposition levels, wavelets and images. However, this was still not the optimal
thresholding, in that it is possible to get higher energy retention for a given percentage
of zeros by thresholding each detail subsignal in a different way. Changing
the decomposition level changes the amount of detail in the decomposition. Thus, at
higher decomposition levels, higher compression rates can be gained. However, more
energy of the signal is vulnerable to loss. The wavelet divides the energy of an image into
an approximation subsignal, and detail subsignals. Wavelets that can compact the
majority of energy into the approximation subsignal provide the best compression. This is
because a large number of coefficients contained within the detail subsignals can be safely
set to zero, thus compressing the image. However, little energy should be lost. Wavelets
attempt to approximate how an image is changing, thus the best wavelet to use for an
image would be one that approximates the image well. However, although this report
discusses some relevant image properties, there was not time to research or investigate
how to find the best wavelet to use for a particular image.
The image itself has a dramatic effect on compression. This is because it is the image's
pixel values that determine the size of the coefficients, and hence how much energy is
contained within each subsignal. Furthermore, it is the changes between pixel values that
determine the percentage of energy contained within the detail subsignals, and hence the
percentage of energy vulnerable to thresholding. Therefore, different images will have
different compressibility.
THE FOURIER TRANSFORM
The one-dimensional Fourier Transform (FT) and its inverse are specified by the
following equations:

X(f) = ∫ x(t) e^(-j2πft) dt        (integral over all t)
x(t) = ∫ X(f) e^(+j2πft) df        (integral over all f)

In view of the above definitions, the FT can be thought of as a generalized version of
the Fourier series: the summation is replaced by the integral, and a 'countable' set of
basis functions is replaced by an uncountable one, this time consisting of cosines for the
real part and sines for the imaginary part of the analyzed signal.
Because each of the basis functions spans the total length of the analyzed signal, the
FT suffers from what is known as the time-frequency resolution problem.
The important conclusion is that the FT does give information about which frequencies
are present in the signal, but it lacks the ability to correlate those frequencies with the
times at which they occur. This is of no consequence when only stationary signals are
processed; however, as soon as non-stationary signals are to be analyzed, it becomes a
serious handicap.