Introduction
Hardware for image processing - Basics
The eye - the human vision sensor.
The Mars Orbiter image depicts the Valles Marineris canyon on Mars.
This image was obtained by combining several images taken with a so-called stereo camera.
Course information
Lecturer: Dr Igor Đurović
Number of lectures: 3+1
Type of examination: depending on number of students
Literature: book
Other resources:
www.obradaslike.cg.ac.yu
PPT presentations
examples
test images
English textbooks available
etc
Covered topics
Introduction and image acquisition
Human vision, color models and image formation
Color and point transforms - histogram
Geometrical transforms
Interpolation
Image in spectral domains
Filtering
Basics of image reconstruction
Edge detection
Basics of image recognition
Other topics
Compression
Digital image protection
Motion picture processing
Stereo images
Superresolution
Computer graphics
etc.
(These topics belong to the course Multimedia Systems and are out of the scope of the basic digital image processing course.)
History
Photography appeared in the XIX century.
Ideas for developing fax machines and for using telegraphic lines for image transmission were born during World War I.
The idea of TV development was born around 1930.
The key event in digital image processing was the development of the first electronic computing machines, which enabled simple and fast image processing.
The second important event was astronomical exploration and the space race.
JPL from California had the first task related to digital image processing within the NASA space program.
Image acquisition
Usually we will assume that the source of radiation is within the visible-light frequency band, but it could also be other electromagnetic bands (X-rays, gamma rays, radio waves), ultrasound, vibrations, etc.
Sensors receive signals reflected from the surface.
(Figure: source of radiation, normal line, surface.)
Image Acquisition
The image reflected from the surface, f(x,y), is the input to the optical system.
The optical system is the non-electronic part consisting of lenses and similar elements; it can be modeled as a 2D linear space-invariant system with impulse response h(x,y) that is approximately known in advance.
The sensor transforms the optical signal into an electrical one.
The continuous-time electrical signal is then transformed into digital form.
(Block diagram: Optical system -> Sensor -> Digitizer.)
Image Acquisition
f(x,y) can be considered as the power of the light (or of some other signal that is the subject of visualization).
h(x,y) is the impulse response of the optical system.
The output of the optical system is an optical signal.
For linear space-invariant systems the output is given as:

b(x,y) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(\alpha,\beta)\, h(x-\alpha, y-\beta)\, d\alpha\, d\beta

This is the 2D convolution (very similar to the 1D convolution for linear time-invariant systems).
Since b(x,y) and f(x,y) represent optical power, they are non-negative.
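The 2D convolution above can be sketched numerically. A minimal Python/NumPy sketch (illustrative, not part of the lecture material; the toy image f and averaging kernel h are assumptions):

```python
import numpy as np

def conv2d(f, h):
    """Direct 2D convolution b(x,y) = sum_a,b f(a,b) h(x-a, y-b), 'full' output size."""
    M, N = f.shape
    P, Q = h.shape
    b = np.zeros((M + P - 1, N + Q - 1))
    for i in range(P):
        for j in range(Q):
            # each kernel tap adds a shifted, scaled copy of the input image
            b[i:i + M, j:j + N] += h[i, j] * f
    return b

# non-negative "optical power" image and a simple averaging impulse response
f = np.array([[0.0, 1.0], [2.0, 3.0]])
h = np.full((2, 2), 0.25)
b = conv2d(f, h)
```

Because both f and h are non-negative here, the output b is non-negative as well, matching the optical-power interpretation.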
Image Acquisition
The optical system can deform the image.
Companies producing such systems can estimate the distortion caused by h(x,y) and develop techniques to reduce its effects.
The second important element is the sensor, which transforms the optical signal into its electrical equivalent.
We will not study sensors within this course since they are subject to very fast development.
Some details related to sensors are given in our textbook, but development in this area is so fast that some of them may already be outdated.
Digitizer
The analog electrical equivalent i(x,y) is transformed into the digital version through two procedures:
Sampling
Quantization
(Figure: sampling grid with coordinates 1-7 on both axes.)
Based on the image content, the luminance value of image pixel i(1,1) is determined at a point or over an area.
A pixel (picture element) is an elementary point of the image. The fact that the eye perceives numerous close dots as a continuous image is exploited.
Digitizer
The sampling phase is followed by digitalization (performed by quantizing to the closest multiple of the quantization step).
This integer can easily be represented by a binary number.
The number of quantization levels is commonly 2^k, where k is an integer.
(Figure: i(n1) for fixed n2, quantized with a threshold into a binary image versus a grayscale image.)
A binary image is used in industrial applications, edge detection, etc. Threshold selection will be discussed later.
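The quantization to 2^k levels and the thresholding to a binary image can be sketched as follows (a Python/NumPy illustration; the toy luminance values and k=3 are assumptions, not from the lecture):

```python
import numpy as np

k = 3                    # bits per pixel -> 2**k quantization levels
levels = 2 ** k

# i(x,y): analog luminance in [0, 1], here a toy 2x2 "image"
i_analog = np.array([[0.00, 0.49], [0.51, 1.00]])

# quantize to the nearest of the 2**k levels, stored as integers 0 .. 2**k - 1
i_digital = np.clip(np.round(i_analog * (levels - 1)), 0, levels - 1).astype(np.uint8)

# binary image: compare against a threshold instead of keeping all levels
threshold = levels // 2
i_binary = (i_digital >= threshold).astype(np.uint8)
```

Each quantized integer fits in k bits, which is exactly the binary representation mentioned above.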
Three channel model
In Cartesian coordinates any color can be represented by a vector with three coordinates (R,G,B).
T. Young (1802) concluded that a color can be determined by three independent quantities (three independent coordinates), which is quite similar to the RGB model.
However, instead of Cartesian coordinates we can also use polar or spherical coordinates and develop corresponding color spaces.
Some other color models are also used in practice.
An overview of color models different from RGB and CMYK follows.
Three-channel models
Assume that we have three independent basic colors (c1, c2, c3).
All other colors can be represented as a vector (C1, C2, C3), where Ci corresponds to the amount of the basic color ci.
Chroma is defined as:

h_i = \frac{C_i}{C_1 + C_2 + C_3}, \quad i = 1, 2, 3

An alternative for storing colors is the color space (h1, h2, Y), where Y = C1 + C2 + C3 is the total luminance. This procedure is used in the development of numerous color models.
CIE color models
The International Commission on Illumination (CIE) developed the RGB model in 1931 (we call it RGB CIE since it is different from the computer model).
R_CIE corresponds to the wavelength 700 nm, G_CIE to 546.1 nm, and B_CIE to 435.8 nm.
The reference white for this model is R_CIE = G_CIE = B_CIE = 1.
The RGB CIE model does not contain all colors that can be reproduced, and for this reason a linear transformation of it, called the XYZ model, is defined that can represent all visible colors.
RGB CIE and XYZ CIE relationships
XYZ and RGB are linearly dependent, and their transform can be given as:

\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} =
\begin{bmatrix} 0.490 & 0.310 & 0.200 \\ 0.177 & 0.812 & 0.011 \\ 0.000 & 0.010 & 0.990 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}

X, Y, Z should represent the wavelengths at which the cone and rod cells are most sensitive.
Y corresponds to the most sensitive wavelength for rod cells.
The chromaticity coordinates can be defined as:
x = X/(X+Y+Z)
y = Y/(X+Y+Z)
X = Y = Z = 1 is the reference white.
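The matrix transform and the chromaticity coordinates above can be sketched as follows (a Python/NumPy illustration using the slide's matrix; the function names are my own):

```python
import numpy as np

# RGB CIE -> XYZ matrix from the slide
M = np.array([[0.490, 0.310, 0.200],
              [0.177, 0.812, 0.011],
              [0.000, 0.010, 0.990]])

def rgb_cie_to_xyz(rgb):
    """Apply the linear RGB CIE -> XYZ transform to one color vector."""
    return M @ np.asarray(rgb, dtype=float)

def chromaticity(xyz):
    """(x, y) chromaticity coordinates: x = X/(X+Y+Z), y = Y/(X+Y+Z)."""
    X, Y, Z = xyz
    s = X + Y + Z
    return X / s, Y / s

# the reference white R=G=B=1 maps to X=Y=Z=1 (each matrix row sums to 1)
xyz_white = rgb_cie_to_xyz([1.0, 1.0, 1.0])
x_w, y_w = chromaticity(xyz_white)
```

Note that every row of the matrix sums to 1, which is exactly why the reference white maps to X = Y = Z = 1 and to the chromaticity point (1/3, 1/3).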
CIE chromatic diagram
The illuminance of numerous light sources is still specified with the (x,y) chromatic diagram, and the diagram is widely used for illumination system design.
A problem with this diagram is that it contains elliptically shaped color areas whose colors cannot be distinguished by humans.
Computer and CIE RGB models
The current monitor RGB model was developed as a recommendation of the NTSC (National Television Systems Committee).
The relationship between RGB CIE and RGB NTSC is linear and can be given by the transformation matrix:

\begin{bmatrix} R_{CIE} \\ G_{CIE} \\ B_{CIE} \end{bmatrix} =
\begin{bmatrix} 1.167 & -0.146 & -0.151 \\ 0.114 & 0.753 & 0.159 \\ -0.001 & 0.059 & 1.128 \end{bmatrix}
\begin{bmatrix} R \\ G \\ B \end{bmatrix}
Modification of the XYZ model
Three modifications are used to overcome drawbacks of the XYZ model:
UCS model (model with a uniform chromatic scale):

u = \frac{4X}{X + 15Y + 3Z}, \quad v = \frac{6Y}{X + 15Y + 3Z}

UVW model:
U = 2X/3, V = Y, W = (-X + 3Y + Z)/2
The luminance is the same as in the XYZ model.
Modification of the XYZ model
The U*V*W* model is formed in such a manner that the reference white is at the origin:

W^* = 25(100Y)^{1/3} - 17, \quad 0.01 \le Y \le 1
U^* = 13W^*(u - u_0)
V^* = 13W^*(v - v_0)

(u_0, v_0) are the coordinates of the reference white color.
Colorimetry
Colorimetry is the scientific area specialized in color comparison.
For example, in industry we can have a process that is done when the color of some object is the same as, or close to, some color known in advance.
Assume that the current color in the RGB model is (R1,G1,B1) while the target color is (R2,G2,B2). The distance between these two colors can be described as:

\sqrt{(R_1 - R_2)^2 + (G_1 - G_2)^2 + (B_1 - B_2)^2}

This is the Euclidean distance, but some alternative distances are also used.
Unfortunately, distance defined in this manner for the RGB model does not produce reliable results, since similar colors can produce a large distance and quite different colors a relatively small one.
Colorimetry
All models with a linear dependency on RGB suffer from the same problem as RGB.
Therefore, for colorimetry applications the Lab color space is defined.
The Lab model can be defined in various manners, but here we adopt a definition based on the XYZ model:

L^* = 116\,(Y/Y_0)^{1/3} - 16
a^* = 500\,[(X/X_0)^{1/3} - (Y/Y_0)^{1/3}]
b^* = 200\,[(Y/Y_0)^{1/3} - (Z/Z_0)^{1/3}]

L* is the luminance component.
a* > 0 means red while a* < 0 means green.
b* > 0 means yellow while b* < 0 means blue.
Colorimetry
(X0, Y0, Z0) is the reference white (almost always (1,1,1)).
The Euclidean distance in Lab coordinates is assumed to be a good measure of color difference.
However, there are alternative approaches for defining color-difference measures.
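The Lab conversion and the Euclidean color-difference measure can be sketched as follows (a Python/NumPy illustration of the slide's cube-root formulas; function names are my own):

```python
import numpy as np

def xyz_to_lab(xyz, white=(1.0, 1.0, 1.0)):
    """XYZ -> L*a*b* using the cube-root form from the slide, white ~ (1,1,1)."""
    X, Y, Z = xyz
    X0, Y0, Z0 = white
    fx, fy, fz = (X / X0) ** (1/3), (Y / Y0) ** (1/3), (Z / Z0) ** (1/3)
    L = 116 * fy - 16
    a = 500 * (fx - fy)   # a > 0: red, a < 0: green
    b = 200 * (fy - fz)   # b > 0: yellow, b < 0: blue
    return np.array([L, a, b])

def delta_e(xyz1, xyz2):
    """Euclidean distance in Lab coordinates, as assumed in colorimetry."""
    return float(np.linalg.norm(xyz_to_lab(xyz1) - xyz_to_lab(xyz2)))

# the reference white maps to L* = 100, a* = b* = 0
lab_white = xyz_to_lab((1.0, 1.0, 1.0))
```

This distance behaves much more uniformly across colors than the raw RGB Euclidean distance criticized above.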
HSL and related models
As we have already seen:
All colors can be represented using three independent colors;
Humans have two types of vision: night vision based on luminance and daylight vision based on colors;
Colors can be represented using Cartesian coordinates (the RGB cube), but also using spherical and polar coordinates;
Before the development of color TV there existed black-and-white (grayscale) TV. It was important to develop the new system so that people with old TV sets could follow the new signal with the same functionality as the old one.
Numerous color models were developed for these purposes; we will describe only the probably most popular one, the HSL.
HSL color model
H - hue
S - saturation
L - luminosity
H is represented by an angle. In the HSL model, angles between 0 and 240 degrees represent colors that can be seen by humans, while angles between 240 and 360 degrees are UV colors.
The procedure for transforming the RGB model into the HSL model is:
Step 1. Coordinate transform:

x_{HS} = \frac{1}{6}[2R - G - B]
y_{HS} = \frac{1}{2}[G - B]
L = \frac{1}{3}[R + G + B]
RGB -> HSL
Step 2. From Cartesian coordinates (x_HS, y_HS) to polar coordinates (the radius is a measure of saturation while the angle is a measure of hue):

\rho = \sqrt{x_{HS}^2 + y_{HS}^2}, \quad \varphi = \angle(x_{HS}, y_{HS})

The obtained coordinate system (\rho, \varphi, L) corresponds to the HSL, but commonly several additional operations are performed.
Step 3. Normalized saturation:

S = 1 - \frac{3\min(R,G,B)}{R + G + B} = 1 - \frac{\min(R,G,B)}{L}
+ +
RGB -> HSL
Step 4. Additional processing of the angle (hue).
Step 5. Final relationship for H:

H = \arccos\left[\frac{0.5[(R-G) + (R-B)]}{\sqrt{(R-G)^2 + (R-B)(G-B)}}\right], \quad G \ge B

H = 2\pi - \arccos\left[\frac{0.5[(R-G) + (R-B)]}{\sqrt{(R-G)^2 + (R-B)(G-B)}}\right], \quad G < B
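The RGB -> HSL steps above can be sketched as follows (a Python illustration rather than the lecture's MATLAB; the scalar-pixel interface is my own simplification):

```python
import numpy as np

def rgb_to_hsl(R, G, B):
    """Follow the slide's steps for one pixel; returns (H in radians, S, L)."""
    L = (R + G + B) / 3.0                          # Step 1: luminosity
    num = 0.5 * ((R - G) + (R - B))                # Step 5: hue via arccos
    den = np.sqrt((R - G) ** 2 + (R - B) * (G - B))
    H = float(np.arccos(num / den)) if den > 0 else 0.0
    if B > G:                                      # Step 4: mirror for G < B
        H = 2 * np.pi - H
    S = 1.0 - min(R, G, B) / L if L > 0 else 0.0   # Step 3: normalized saturation
    return H, S, L

H, S, L = rgb_to_hsl(1.0, 0.0, 0.0)   # pure red: hue angle 0, full saturation
```

For pure red the hue is 0 radians and S = 1, while pure green lands at 2*pi/3, matching the usual placement of primaries on the hue circle.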
HSL model
Similarly, the HSL -> RGB transformation can be performed. Determine this transform as a self-exercise, with usage of the textbook.
(Figure: popular presentation of the HSL color model, with White, Red, Yellow, Blue and Black marked.)
Color models for video signal
They are similar to the HSL and related models.
Here we have a single component that corresponds to the grayscale (so-called black-and-white) image, due to backward compatibility with older models of the TV signal.
NTSC uses the YIQ color model, where intensity is given as Y = 0.299R + 0.587G + 0.114B, with chrominance components given as:

I = 0.877(R-Y)\cos 33^\circ - 0.493(B-Y)\sin 33^\circ = 0.596R - 0.274G - 0.322B
Q = 0.877(R-Y)\sin 33^\circ + 0.493(B-Y)\cos 33^\circ = 0.211R - 0.522G + 0.311B
Color models for video signals
PAL color model (YUV; Y is the same as in the NTSC):

U = 0.493(B-Y) = -0.147R - 0.289G + 0.436B
V = 0.877(R-Y) = 0.615R - 0.515G - 0.100B

SECAM color model (YDbDr; Y is the same as in the NTSC):

D_b = 1.505(B-Y) = -0.450R - 0.883G + 1.333B
D_r = -1.902(R-Y) = -1.333R + 1.116G + 0.217B

These models are quite similar to the HSL.
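The PAL YUV formulas above can be sketched directly (a Python illustration; the function name is my own):

```python
def rgb_to_yuv(R, G, B):
    """PAL YUV from the slide: Y as in NTSC, U/V are scaled color differences."""
    Y = 0.299 * R + 0.587 * G + 0.114 * B
    U = 0.493 * (B - Y)
    V = 0.877 * (R - Y)
    return Y, U, V

# any gray value (R = G = B) has zero chrominance, which is what
# guarantees backward compatibility with black-and-white receivers
Y, U, V = rgb_to_yuv(0.5, 0.5, 0.5)
```

Because the Y coefficients sum to 1, a gray input gives U = V = 0, so an old grayscale TV can display Y alone and ignore the chrominance.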
Exercise No.1
Realize the relationship between the RGB CIE and the standard RGB model.
Here, we will consider several aspects of the problem:
We will create the matrix that produces the inverse transform from the RGB CIE to the RGB model.
We will determine the limits within which we should perform discretization of the RGB CIE model.
We will visualize the channels of an image for the RGB and RGB CIE models.

A = [ 1.167 -0.146 -0.151;
      0.114  0.753  0.159;
     -0.001  0.059  1.128];
B = inv(A)

B =
    0.8417    0.1561    0.0907
   -0.1290    1.3189   -0.2032
    0.0075   -0.0688    0.8972
Exercise No.1
The limits of the RGB model are usually 0 and 1 along all coordinates, but they are different for the RGB CIE model.
The minimum of the R component in the CIE model is obtained for R=0, G=1, B=1 and is equal to -0.297, while the maximum is produced by R=1, G=0, B=0 and equals 1.167.
The minimum of G in the CIE model follows for R=G=B=0 and equals 0, while the maximum follows for R=G=B=1 and equals 1.026.
The B component achieves its maximum for R=0, G=B=1, where it equals 1.187, while the minimum is produced for R=1, G=B=0 and equals -0.001.
For visualization of the color channels we can use the following commands:

a = double(imread('spep.jpg'));
b(:,:,1) =  1.167*a(:,:,1) - 0.146*a(:,:,2) - 0.151*a(:,:,3);
b(:,:,2) =  0.114*a(:,:,1) + 0.753*a(:,:,2) + 0.159*a(:,:,3);
b(:,:,3) = -0.001*a(:,:,1) + 0.059*a(:,:,2) + 1.128*a(:,:,3);

The channels can be represented with commands such as:

pcolor(flipud(b(:,:,1))); shading interp;
For self-exercise
List of mini-projects and tasks for self-exercise:
1. Solve problems from the textbook.
2. Realize all color models given on these slides and in the textbook and create transformations between them. Visualize channels for the considered models.
3. Consider the following experiment. Assume that colors that cannot be printed in the CMYK model are those having any channel except the black channel represented with more than 90% of its maximal value. Assume that in addition we have a color space with three alternative colors (for example rose, green and orange). Colors can be printed in CMYK or in the alternative model with an appropriate amount of black. The rules for printing in the corresponding model are the same as for the CMY (it is possible to print up to 90% of any color). How many colors from the RGB model can be printed in the CMYK model, and how many in the model with three additional alternative colors?
For self-exercise
List of mini-projects and tasks for self-exercise:
4. Create the introduced color models and transformations between these color models and RGB. Make images of cakes in the process of baking, or of some other similar kitchen experiment. The main results of the first set of experiments should be: the average color of the cake several minutes before we assume that it is done, and the average color when the cake is done. The second set of experiments is performed after that. Try to determine an algorithm for automatically turning off the baking appliance based on the first set of experiments and check whether it performs well on the second set of experiments. Determine the number of correct and wrong results.
Digital Image Processing
Histogram
Point operation
Geometrical transforms
Interpolation
Histogram
The histogram is a simple (but very useful) image statistic.
H(X) = number of pixels with luminance X.
The sum of histogram values of an image over all luminances (here a grayscale image with 8 bits/pixel is considered) is equal to the number of pixels in the image:

\sum_{X=0}^{255} H(X) = N \cdot M
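The definition H(X) = number of pixels with luminance X translates directly into code (a Python/NumPy sketch; the toy image is an assumption):

```python
import numpy as np

def histogram(img, levels=256):
    """H(X) = number of pixels with luminance X, for X = 0 .. levels-1."""
    H = np.zeros(levels, dtype=int)
    for v in img.ravel():
        H[v] += 1
    return H

img = np.array([[0, 0, 255],
                [7, 7, 7]], dtype=np.uint8)
H = histogram(img)
# summing H over all luminances recovers the pixel count N*M
```

The check H.sum() == N*M is exactly the summation property stated above.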
Example of histogram
The histogram has numerous applications. It is very useful in techniques that use a probabilistic model of the image with a probability density function of the image luminance (or at least an estimate of the pdf). How can the histogram and the probability density function be connected?
Histogram Common shapes
(Figure: three typical histogram shapes H(P) over luminance P.)
Unipolar histograms appear for dark and for bright images.
A bipolar histogram can be used for threshold determination and for obtaining binary images (how?).
Histogram extension
Optical sensors very often concentrate the image in a very narrow region of luminance.
Software systems are then usually employed to solve this problem.
The problem is solved using information obtained from the histogram.
Let the image be contained in the luminance domain [A,B] (estimates of A and B can be obtained from the histogram).
Assume that we want to extend the histogram over the entire 8-bit grayscale domain [0,255]. The luminance X of the original image maps to the luminance f(X) of the image with extended histogram:

f(X) = \frac{255}{B - A} X - \frac{255 A}{B - A} = \frac{255(X - A)}{B - A}
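The linear stretching formula above can be sketched as follows (a Python/NumPy illustration; the narrow luminance range [100, 120] is an assumption):

```python
import numpy as np

def stretch(img, A, B):
    """Map the luminance range [A, B] linearly onto [0, 255]:
    f(X) = 255*(X - A)/(B - A), clipped and rounded to 8-bit values."""
    out = (img.astype(float) - A) * 255.0 / (B - A)
    return np.clip(np.round(out), 0, 255).astype(np.uint8)

# an image concentrated in the narrow range [100, 120]
img = np.array([[100, 110, 120]], dtype=np.uint8)
ext = stretch(img, 100, 120)
```

The endpoints A and B map to 0 and 255, so the full 8-bit range is used after the operation.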
Histogram Extension - Example
(Figure: original image and its histogram versus the image and histogram after histogram extension.)
Histogram equalization
Histogram equalization is also one of the most common histogram operations.
In an equalized image the histogram is approximately flat, i.e., the goal is to have an image with approximately the same number of pixels at all luminances.
Thus, we want to transform the histogram to be approximately uniform.
Images with an equalized histogram have good contrast, and this is the main reason for performing this operation.
Histogram Equalization
(Figure: original histogram H(P) versus the goal - an equalized, flat histogram.)
This can be considered as the following problem: there is a random variable with probability density function f_x(x) (it can be estimated using the histogram of the original image). We are looking for a transform y = g(x) producing a probability density function f_y(y) proportional to the equalized histogram.
Histogram equalization
From probability theory it follows that:

f_y(y) = \frac{f_x(x_1)}{|g'(x_1)|} + \frac{f_x(x_2)}{|g'(x_2)|} + ... + \frac{f_x(x_N)}{|g'(x_N)|}

where (x_1, x_2, ..., x_N) are the real roots of the equation y = g(x).
Assume that the solution is unique (this is possible for monotone functions g(x)):

f_y(y) = \left.\frac{f_x(x)}{|g'(x)|}\right|_{x = x_1}
Histogram Equalization
Since f_y(y) is constant, it means that |g'(x_1)| is proportional to f_x(x_1).
Assuming that g(x) is a monotonically increasing function, this means g'(x) = c f_x(x).
Select c = 1 (this means that the output image has the same luminance domain as the input one):

g(x_1) = \int_{-\infty}^{x_1} f_x(x)\, dx

This is the integral of the probability density function (the cumulative distribution function). It is a monotone function on its domain.
Histogram Equalization
Since the image is not a continuous but a discrete function, here we have no continuous probability density function but its discrete version (the histogram).
The MATLAB realization is quite simple:

I = imread('pout.tif');           % reading the original image
a = imhist(I);
g = cumsum(a)/sum(a);             % function g
J = uint8(255*g(double(I)+1));    % output image (+1: luminance 0 maps to index 1)
Histogram Equalization - Example
(Figure: original image and the equalized image with significantly improved contrast, together with the corresponding histograms; the obtained density is not uniform due to the discrete nature of images.)
Histogram matching
Histogram equalization is an operation that produces a uniform probability density function.
Similarly, the histogram can be matched to any desired probability density function.
The procedure is the same as in the equalization case for any monotone (increasing or decreasing) function g(x).
Otherwise, we need a more complicated operation involving segmentation of g(x) into monotone regions.
Applications of Histogram
All methods that use an estimate of the probability density function are histogram-based.
Improvement of image contrast (equalization).
Histogram matching.
Modification of colors.
The histogram can be applied locally to image parts. For example, when we have a very bright object on a dark background, we can perform histogram-based operations on a selected region: the object or the background, depending on our task.
Also, the histogram can be calculated for parts of an image or for the channels of color images.
Image Negative
There are numerous operations that can be applied to the image luminance. Some of them are applied to each pixel independently of the other pixels. These operations are called point operations.
One of the simplest is determination of the image negative (or positive, if we have the image negative).
This operation is performed in different manners depending on the image format.
Image negative
Binary image (~ is the logical negation):
Negative(n,m) = ~Original(n,m)
Grayscale (k is the number of bits used for the memory representation of image pixels):
Negative(n,m) = 2^k - 1 - Original(n,m)
RGB:
The operation for grayscale images is performed on each image channel.
Color Clipping
Color clipping is an operation performed on colors, but it is not related to geometrical clipping. We keep some image colors as in the original image, while other colors are limited to selected limits:

b(i,j) = \begin{cases} c_{max}, & a(i,j) > c_{max} \\ a(i,j), & c_{min} \le a(i,j) \le c_{max} \\ c_{min}, & a(i,j) < c_{min} \end{cases}
Brightening (Darkening)
There are several methods to perform these operations.
For example, f(n,m) = g(n,m) + r increases the luminance for r > 0, while for r < 0 the image becomes darker.
The second technique: f(n,m) = g(n,m) x r, brightening for r > 1, darkening for 0 < r < 1.
These techniques are not of high quality since they have several drawbacks. The most common procedure is the power-law (gamma) mapping:

f(n,m) = [2^k - 1]\, \{ g(n,m) / [2^k - 1] \}^{\gamma}
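The power-law mapping above can be sketched as follows (a Python/NumPy illustration; the exponent gamma and the toy image are assumptions used for demonstration):

```python
import numpy as np

def gamma_correct(img, gamma, k=8):
    """f(n,m) = (2^k - 1) * (g(n,m) / (2^k - 1))^gamma for a k-bit image."""
    top = 2 ** k - 1
    out = top * (img.astype(float) / top) ** gamma
    return np.clip(np.round(out), 0, top).astype(np.uint8)

img = np.array([[0, 64, 255]], dtype=np.uint8)
brighter = gamma_correct(img, 0.5)   # gamma < 1 brightens mid-tones
darker = gamma_correct(img, 2.0)     # gamma > 1 darkens mid-tones
```

Unlike the additive and multiplicative methods, this mapping keeps black at 0 and white at 2^k - 1 while reshaping only the mid-tones.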
Translation
The translated image is given as:

g(x,y) = f(x - x_0,\ y - y_0)

(Figure: image f(x,y) translated by (x_0, y_0) into g(x,y).)
In this case we keep the dimensions of the target image the same as those of the original image. In the region appearing due to translation we put white or some other default color. This strategy is called cropping. An alternative strategy (when we want to change the image dimensions) is enlarging the image so that the entire original image is kept in the target image.
Also, it is possible to have a cyclical translation, with the parts that are removed from the image cyclically shifted to the beginning.
Cropping
Cropping is an operation where a part of the original image is used as a new image. Of course, this image has smaller dimensions than the original one.
For example, let the original image f(x,y) have dimensions (M,N) and let us crop the region between (M_1,N_1) and (M_2,N_2), where 0 < M_1 < M_2 < M and 0 < N_1 < N_2 < N:

g(x - M_1 + 1,\ y - N_1 + 1) = f(x,y)

for x in [M_1, M_2] and y in [N_1, N_2]. Determine the dimensions of the target image.
Rotation
The coordinate transform in the case of rotation is defined as:

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}

The obtained image is given as:

g(x,y) = f(x\cos\theta + y\sin\theta,\ -x\sin\theta + y\cos\theta)

We assumed that the coordinate transform is performed around the origin. This is a rare situation in digital images. Develop the coordinate transform for rotation around the pixel (x_0, y_0).
The positive direction of rotation is counterclockwise.
Distortion
We will demonstrate distortion along one of the coordinate axes.
Coordinate transform:

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} 1 & -\cot\theta \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}

g(x,y) = f(x - y\cot\theta,\ y)

Consider the distortion that would be performed parallel to the line y = ax + b.
Scaling
Coordinate scaling can be described as:

\begin{bmatrix} x' \\ y' \end{bmatrix} = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}

Determine the function of the output image based on the input image.
Determine the dimensions of the output image as a function of a and b. For which parameter values is the image enlarged?
This is scaling along the x and y axes. Is it possible to define scaling along alternative directions?
Can reflection with respect to the coordinate axes or the origin be described using scaling?
Nonlinear transforms
There are numerous nonlinear transforms used in digital image processing.
The number of nonlinear transforms is significantly greater than that of linear ones.
Here we give a simple example:

g(x,y) = f(x + A\sin(by),\ y + A\sin(bx))

An important example is the fish-eye nonlinearity.
Fish-eye transform
The fish-eye effect is caused by the shape and limited (relatively small) dimensions of the camera lens. It causes objects in the middle of the scene to appear larger than objects at the borders of the scene.
Sometimes this effect is desired in photography, and photographers simulate it or produce it using special forms of lenses.
Try to simulate the fish-eye transform and propose a method for removing this effect.
Geometrical transforms - Problem
(Figure: original image with pixels on the grid; after rotation by 45 degrees the pixels are dislocated from the grid.)
Need for interpolation
Commonly, after geometrical transforms only a small number of pixels lie on the grid, while the others are displaced.
We then have the problem of determining the grid pixels of the target image.
Techniques for determination of grid values are called interpolation.
Here we will describe several strategies for interpolation.
For other techniques look at the textbook, the Internet, and the additional material available at the lecturer's office.
Nearest neighbor
The nearest-neighbor technique is the simplest interpolation strategy.
For a pixel on the grid we take the value of the nearest pixel of the interpolated image.
This technique suffers from low quality.
(Figure: original rectangle versus the result after rotation by 5 degrees with this interpolation technique.)
The human eye is very sensitive to the broken edges and disturbed small details caused by this form of interpolation.
Bilinear Interpolation
The strategy of bilinear interpolation is slightly better with respect to image quality than the nearest neighbor, but a little slower. However, the calculation burden of this strategy is still reasonable.
Let a pixel of the original image be surrounded by four pixels of the transformed image.
(Figure: the pixel whose luminance g(x,y) we want to determine, at distances x, 1-x, y, 1-y from the four transformed pixels; we assume that the dimensions of the square in which we perform the interpolation are unchanged.)
Bilinear interpolation
For simpler determination we will rotate the coordinate system.
The four surrounding pixels are f(m,n), f(m+1,n), f(m,n+1) and f(m+1,n+1), and we seek f(m+x,n+y).
Bilinear interpolation determines the luminance at the point (m+x, n+y) as:

f(m+x, n+y) = axy + bx + cy + d

where the constants a, b, c and d should be determined.
Bilinear interpolation
The constants can be determined from the following conditions:
f(m,n) = a·0·0 + b·0 + c·0 + d  =>  d = f(m,n)
f(m+1,n) = a·1·0 + b·1 + c·0 + d  =>  b = f(m+1,n) - f(m,n)
f(m,n+1) = a·0·1 + b·0 + c·1 + d  =>  c = f(m,n+1) - f(m,n)
f(m+1,n+1) = a + b + c + d  =>
a = f(m+1,n+1) + f(m,n) - f(m+1,n) - f(m,n+1)
Consider the following case. We are not performing a geometrical transform, but we want to change the number of pixels in the image (for example, instead of NxM we want to get kNxlM, where k and l are integers and k>1, l>1). Determine the relationship that connects the original and target images with bilinear interpolation.
This operation is called image resize.
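The four conditions above can be sketched as a single interpolation function (a Python illustration; the sample corner values are assumptions):

```python
def bilinear(f00, f10, f01, f11, x, y):
    """Value at (m+x, n+y), 0 <= x, y <= 1, from the four surrounding pixels.
    f10 = f(m+1,n), f01 = f(m,n+1), f11 = f(m+1,n+1), as in the derivation."""
    d = f00                      # f(m,n)
    b = f10 - f00                # f(m+1,n) - f(m,n)
    c = f01 - f00                # f(m,n+1) - f(m,n)
    a = f11 + f00 - f10 - f01    # f(m+1,n+1) + f(m,n) - f(m+1,n) - f(m,n+1)
    return a * x * y + b * x + c * y + d

# the corners are reproduced exactly; the center is the average of all four
center = bilinear(0.0, 2.0, 4.0, 6.0, 0.5, 0.5)
```

Reproducing the four corner values exactly is what makes the formula an interpolation rather than an approximation.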
Rotation MATLAB program
function b = rotacija_i_interpolacija(a, theta, x0, y0)
%realization is performed within a function where a is the source image,
%theta is the rotation angle, x0 and y0 are the rotation center
[M, N] = size(a); %size of the grayscale image
b = zeros(M, N);  %target image
%we assume that the target image has the same dimensions as the source image
%we will perform cropping of the remaining parts
for xp = 1 : M
  for yp = 1 : N %performing the operation for all image pixels
    x = (xp - x0) * cos(theta) - (yp - y0) * sin(theta) + x0;
    y = (xp - x0) * sin(theta) + (yp - y0) * cos(theta) + y0;
    %determination of the origin of the pixel mapped to (xp,yp), i.e.,
    %where it is in the original image (inverse transform)
Rotation MATLAB program
    if ((x >= 1) && (x <= M) && (y >= 1) && (y <= N))
      %is the pixel within the proper domain?
      xd = floor(x); xg = ceil(x);
      yd = floor(y); yg = ceil(y); %(xd,yd) bottom left corner of the
      %rectangle for interpolation; (xg,yg) upper right corner
      D = double(a(xd, yd));
      B = double(a(xg, yd)) - double(a(xd, yd));
      C = double(a(xd, yg)) - double(a(xd, yd));
      A = double(a(xg, yg)) + double(a(xd, yd)) ...
        - double(a(xd, yg)) - double(a(xg, yd));
      %determination of the coefficients
      b(xp, yp) = A*(x-xd)*(y-yd) + B*(x-xd) + C*(y-yd) + D;
      %values of the target image
Rotation MATLAB program
    end
  end
end
b = uint8(b);
%%%end of the program (end of the if selection and the two for cycles)
%%%and return of the image to the proper format

Write a program for distortion.
Write a program for rotation with nearest-neighbor interpolation.
Rotate an image by 5 degrees twice using the nearest neighbor, then perform rotation by -5 degrees twice. Perform the same operations with bilinear interpolation and compare the results.
Polar to rectangular raster
In medicine, numerous images are created by axial recording of objects under various angles. This imaging technique is common for various types of medical scanners.
The obtained image has a circular shape.
Similar images are obtained in radars, sonars, and some other acquisition instruments.
These images have a polar raster (pixel distribution).
Since monitors have a rectangular raster, we have to perform a corresponding interpolation.
Polar to rectangular raster
(Figure: p1, p2, p3, p4 are pixels of the polar raster; c is a pixel of the rectangular raster.)
The form of bilinear interpolation commonly applied here is:

f(c) = \frac{f(p_1)/|p_1| + f(p_2)/|p_2| + f(p_3)/|p_3| + f(p_4)/|p_4|}{1/|p_1| + 1/|p_2| + 1/|p_3| + 1/|p_4|}

where |p_i| are the distances between p_i and c, while f() is the luminance in the corresponding pixel.
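The inverse-distance weighting above can be sketched as follows (a Python illustration; the sample luminances and distances are assumptions):

```python
def idw4(samples):
    """Weighted form from the slide: samples is a list of four
    (f(p_i), |p_i|) pairs, where |p_i| is the distance from p_i to c."""
    num = sum(f / d for f, d in samples)
    den = sum(1.0 / d for f, d in samples)
    return num / den

# equidistant neighbors reduce to a plain average of the four luminances
val = idw4([(10.0, 1.0), (20.0, 1.0), (30.0, 1.0), (40.0, 1.0)])
```

A neighbor much closer to c gets a much larger weight, so the result is pulled toward the nearest polar-raster sample.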
Other interpolation methods
Earlier, bicubic interpolation was not used due to its computational demands. It is based on a third-order polynomial function. Today its calculation demands are considered reasonable and it is one of the most common strategies.
Some sensors deform the image in the acquisition process (for example scanners). Fortunately, the distortion can be known in advance, and we can select an appropriate interpolation strategy, commonly using a grid.
The procedure is as follows: a scan of a rectangular grid is performed; then we observe the positions to which the nodes of the rectangles have moved. Based on this we create the inverse transform that returns the image to its proper shape by software means.
Other interpolation forms
The group of polynomial interpolation algorithms is quite common, and among them the Lagrange interpolation technique is quite popular.
There are numerous well-established interpolation techniques able to preserve important image features such as, for example, edges.
A quite common interpolation technique today is based on splines. This technique is related to both polynomial interpolation and wavelets.
The Fourier transform can also be used for interpolation purposes.
For self-exercise
Write your own program for evaluation of the image histogram.
Write a program for histogram adjustment where the upper and lower bounds are set so as to reject the 5% darkest and 5% brightest pixels. Pixels outside of this range should be set to the maximal, i.e., minimal luminance.
How can the negative of an image written using a colormap be determined?
How is the image negative calculated for color models different from the RGB?
Write programs for calculation of the image negative.
Create a target image based on the original image over a hexagonally shaped range of size 2-4-2, where pixels of the destination image are equal to the mean of the corresponding pixels of the original image.
Realize your own functions for all variants of translation.
For self-exercise
Write the coordinate transform where rotation is performed by an arbitrary angle.
Write the coordinate transform that performs distortion parallel to an arbitrary line y = ax + b.
Determine the functional relationship between the output and input images for all transforms defined within the lectures.
Is the original image enlarged or shrunk with respect to a and b in the case of scaling?
Can scaling be defined for directions other than along the x and y axes?
Can reflection along the coordinate axes and with respect to the origin be described using scaling?
Realize all introduced linear geometric transforms.
Realize the coordinate transform g(x,y) = f(x + Asin(by), y + Asin(bx)). Perform experiments with A and b.
For self-exercise
Create a program for image resize based on the bilinear transform. This program should be able to handle non-integer values of the scales k and l, as well as the possibility that k and l are smaller than 1.
Write a program for distortion.
Write a program for rotation with nearest-neighbor interpolation.
Perform rotation by 5 degrees twice using the nearest neighbor and after that by -5 degrees twice. Also repeat these operations with bilinear interpolation. Compare the results.
Check whether the bilinear interpolation used for the transformation of a polar to a rectangular raster is the same as the standard bilinear interpolation introduced previously.
Project
Write a program that allows users to select colors and adjust them interactively, including usage of the curves presented on slide 18. When the user defines several different points, the curve should be interpolated using Lagrange interpolation.
Write a program that performs the fish-eye transform, as well as a program able to return an image distorted with the fish-eye effect to its normal (or close to normal) shape.
Project
Review the Lagrange interpolation formula and use it for polynomial interpolation on the grid.
Find Internet resources related to interpolation and write a seminar paper on the found facts.
Digital Image Processing
Image and Fourier transform
FT of multidimensional signals
Images are 2D signals.
The Fourier transform of a 2D continuous-time signal x(t₁,t₂) is:
$$X(\Omega_1,\Omega_2)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}x(t_1,t_2)\,e^{-j(\Omega_1 t_1+\Omega_2 t_2)}\,dt_1\,dt_2$$
The inverse Fourier transform gives the original signal:
$$x(t_1,t_2)=\frac{1}{(2\pi)^2}\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}X(\Omega_1,\Omega_2)\,e^{j(\Omega_1 t_1+\Omega_2 t_2)}\,d\Omega_1\,d\Omega_2$$
FT of multidimensional signals
A signal and its FT form a Fourier transform pair.
This pair can be written in compact form by introducing vectors (allowing more than two coordinates):
$$\mathbf{t}=(t_1,t_2,\dots,t_Q),\qquad \boldsymbol{\Omega}=(\Omega_1,\Omega_2,\dots,\Omega_Q)$$
The Fourier transform can then be written as:
$$X(\boldsymbol{\Omega})=\int_{\mathbf{t}}x(\mathbf{t})\,e^{-j\boldsymbol{\Omega}\cdot\mathbf{t}}\,d\mathbf{t}$$
where $\int_{\mathbf{t}}d\mathbf{t}=\int\cdots\int dt_1\,dt_2\cdots dt_Q$ and $\boldsymbol{\Omega}\cdot\mathbf{t}=\Omega_1 t_1+\cdots+\Omega_Q t_Q$.
FT of multidimensional signals
The inverse multidimensional Fourier transform is:
$$x(\mathbf{t})=\frac{1}{(2\pi)^Q}\int_{\boldsymbol{\Omega}}X(\boldsymbol{\Omega})\,e^{j\boldsymbol{\Omega}\cdot\mathbf{t}}\,d\boldsymbol{\Omega}$$
We will consider 2D signals only.
Since we work with discretized signals, we will not consider the FT of continuous-time signals in detail.
Before proceeding to discretized signals, we give several general comments about the FT.
Fourier transform
The FT establishes a one-to-one mapping with the signal.
Roughly speaking, the signal in the time domain and in the spectral domain (its Fourier transform) are different representations of the same signal.
In addition, the FT and its inverse have quite similar definitions (they differ only in a constant and in the sign of the complex exponential).
Why, then, do we use the FT at all?
Motivation for introducing the FT
(Figure: signal in time domain; signal in frequency domain.)
Consider the sinusoid represented by the red line. In the FT domain it is represented by two clearly visible spectral peaks. When a large amount of Gaussian noise is added to the sinusoid, it can no longer be recognized in the time domain (blue line), while in the spectral domain the spectrum still achieves its maximum at the frequency of the considered sinusoid.
Motivation for introducing the FT
Roughly speaking: some signals are represented better in the frequency domain than in the time domain.
In addition, filter design is much simpler in the spectral domain than in the space domain. In the previous example we would detect the spectral maximum and design a filter that keeps just a narrow region of the frequency domain around it. In this way we would significantly reduce the influence of the noise on the sinusoid.
A similar motivation is used for introducing additional transforms in digital image processing, which we will consider later in this course.
The same motivation applies in the case of 2D signals and images.
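The motivation above can be sketched numerically. In the following sketch all signal parameters (sampling frequency, sinusoid frequency, noise level) are assumptions chosen for illustration, not values from the lecture: a sinusoid buried in strong Gaussian noise is still located by the position of its spectral peak.

```python
import numpy as np

np.random.seed(0)                          # fixed seed so the sketch is reproducible
N = 1024
fs = 1024.0                                # sampling frequency in Hz (assumption)
t = np.arange(N) / fs
f0 = 50.0                                  # sinusoid frequency in Hz (assumption)
noisy = np.sin(2*np.pi*f0*t) + 3.0*np.random.randn(N)   # noise std 3x the amplitude

spectrum = np.abs(np.fft.rfft(noisy))      # magnitude spectrum of the noisy signal
freqs = np.fft.rfftfreq(N, d=1/fs)
peak = freqs[np.argmax(spectrum[1:]) + 1]  # strongest non-DC spectral component
print(peak)                                # expected to land on f0
```

The sinusoid is invisible among the samples of `noisy`, yet the detected peak lands on f0; keeping only a narrow band around this peak is the denoising idea described above.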
2D FT of discrete signals
Here we consider discrete signals.
We skip the properties of the 2D FT of continuous-time signals, but I propose that you check them in the textbook.
Discrete-time signals are obtained from their continuous-time counterparts by sampling.
Sampling of 1D signals is simple: we take equidistantly separated samples,
$$x(n)=c\,x_a(nT)$$
where x(n) is the discrete-time signal, x_a is the continuous-time signal, T is the sampling period, and c is a constant (commonly c=T).
2D FT of discrete signals
In order to be able to reconstruct the continuous-time signal from its discrete-time counterpart, the sampling theorem should be satisfied.
The theorem is satisfied if the sampling period satisfies T ≤ 1/(2f_m), where f_m is the maximal signal frequency.
If the sampling theorem is not satisfied, we make a smaller or larger error (aliasing).
How do we perform sampling in the case of digital images?
Sampling of 2D signals
The simplest sampling of digital images is:
$$x(n,m)=c\,x_a(nT_1,mT_2)$$
The constant c is commonly selected as c=T₁T₂.
The sampling period is usually equal for both coordinates, T₁=T₂.
The sampling theorem is satisfied when T₁ ≤ 1/(2f_{m1}) and T₂ ≤ 1/(2f_{m2}).
Here, f_{m1} and f_{m2} are the maximal frequencies along the corresponding coordinates (note that the 2D FT of a continuous-time signal has two coordinates Ω₁ and Ω₂; f_{m1} and f_{m2} are the maximal frequencies along these coordinates, f_{mi}=Ω_{mi}/(2π)).
Sampling of 2D signals
The previously described rectangular sampling is not the only sampling scheme.
In rectangular sampling we replace each rectangle of the image with a single sample.
This is a practical sampling scheme, but alternative sampling patterns can also be applied.
2D signal samplings
Some alternative sampling schemes are given below.
The sampling cell can also be a rhombus, but the hexagon is the best pattern with respect to some well-established criteria.
However, we will continue with rectangular sampling due to its simplicity and for practical reasons!!!
Quantization
The discretized signal is not used directly; it is quantized.
Instead of exact values we take values rounded (or truncated) to the closest value from the set of possible values (quanta).
The error caused by rounding is smaller than the error caused by truncation, but truncation is used more often in practice.
Quantization
Quantization can be performed with equidistantly separated levels, but some systems and sensors use non-uniform quantization.
The number of quantization levels is commonly 2^k, and these levels are commonly represented as integers in the range [0, 2^k−1].
We will almost always assume signals that are discretized and quantized (such signals are called digital).
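A minimal sketch of this uniform quantization by rounding (the input value range [0, 1] is an assumption made for the sketch):

```python
import numpy as np

def quantize(x, k):
    """Round samples in [0.0, 1.0] to the 2**k integer levels 0 .. 2**k - 1."""
    levels = 2**k - 1
    return np.round(np.clip(x, 0.0, 1.0) * levels).astype(int)

x = np.array([0.0, 0.2, 0.999, 1.0])
q = quantize(x, 8)        # 8-bit quantization, as in ordinary grayscale images
print(q)                  # [  0  51 255 255]
```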
2D FT of discrete signals
The Fourier transform pair between a discrete-time signal and the corresponding FT can be represented by the following relationships:
$$X(\omega_1,\omega_2)=\sum_{n=-\infty}^{\infty}\sum_{m=-\infty}^{\infty}x(n,m)\,e^{-j(\omega_1 n+\omega_2 m)}$$
$$x(n,m)=\frac{1}{(2\pi)^2}\int_{-\pi}^{\pi}\int_{-\pi}^{\pi}X(\omega_1,\omega_2)\,e^{j(\omega_1 n+\omega_2 m)}\,d\omega_1\,d\omega_2$$
The 2D FT of a discrete-time signal is a function of continuous variables, so in this form it is not suitable for computation on a computer.
The 2D FT is periodic along both coordinates with period 2π.
2D DFT
We will not explain the properties of the 2D FT of discrete-time signals in detail, since we will not use it in what follows.
Our goal is a discretized transform that is suitable for processing on computers.
In order to achieve this, we use the periodic extension property.
Namely, assume that the signal x(n,m) is defined within a limited domain (this is always the case for digital images).
Let the signal (image) size be N×M.
2D DFT
Perform periodic extension of the original signal with period N₁×M₁ (N₁ ≥ N and M₁ ≥ M should be satisfied, but for brevity we assume N₁=N and M₁=M):
$$x_p(n,m)=\sum_{r=-\infty}^{\infty}\sum_{p=-\infty}^{\infty}x(n+rN,m+pM)$$
(Figure: original signal x(n,m) on the (n,m) grid and its periodic extension.)
2D DFT
The FT of the periodically extended signal is:
$$X_p(\omega_1,\omega_2)=\sum_{n}\sum_{m}\sum_{r}\sum_{p}x(n+rN,m+pM)\,e^{-j(\omega_1 n+\omega_2 m)}$$
$$=\sum_{r}\sum_{p}e^{j\omega_1 rN}e^{j\omega_2 pM}\sum_{n}\sum_{m}x(n,m)\,e^{-j(\omega_1 n+\omega_2 m)}$$
$$=X(\omega_1,\omega_2)\sum_{k_1}\sum_{k_2}\delta(\omega_1 N-2\pi k_1)\,\delta(\omega_2 M-2\pi k_2)$$
Here we changed the order of the sums and used the property of the FT of a translated (shifted) signal, possibly neglecting some multiplicative constants.
The analog Dirac pulses (generalized functions) are produced by the summation over an infinite number of terms.
2D DFT
Finally we obtain:
$$X_p(k_1,k_2)=X\!\left(\frac{2\pi k_1}{N},\frac{2\pi k_2}{M}\right)$$
Thus, the FT of the periodically extended signal equals samples of the 2D FT taken on the discrete grid k₁∈[0,N) and k₂∈[0,M).
Periodic extension produces a discretized FT (the DFT).
The periodic extension is commonly not performed in practice, due to the infinite number of terms in the sums and the use of generalized functions.
However, we should keep in mind that we made this assumption, and that the smallest period for the extension is equal to the image dimensions N×M.
2D DFT
A 2D discrete signal and its 2D DFT are a transformation pair:
$$X(k_1,k_2)=\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}x(n,m)\,e^{-j\frac{2\pi n k_1}{N}}e^{-j\frac{2\pi m k_2}{M}}$$
This is the FT of the discrete signal, calculated over a limited interval and for discretized frequencies.
$$x(n,m)=\frac{1}{NM}\sum_{k_1=0}^{N-1}\sum_{k_2=0}^{M-1}X(k_1,k_2)\,e^{j\left(\frac{2\pi k_1 n}{N}+\frac{2\pi k_2 m}{M}\right)}$$
Important fact: the inverse DFT is calculated in almost the same way as the direct one, using finite sums. The differences are very small (the sign of the complex exponential and the normalization constant 1/NM).
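The pair can be checked numerically on a small signal. The sketch below evaluates the forward sum directly, term by term, and compares it with numpy's fft2/ifft2, which use the same definitions:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))          # small test signal, N=4, M=3
N, M = x.shape

# direct evaluation of the 2D DFT sum
X = np.zeros((N, M), dtype=complex)
for k1 in range(N):
    for k2 in range(M):
        for n in range(N):
            for m in range(M):
                X[k1, k2] += x[n, m] * np.exp(-2j*np.pi*(n*k1/N + m*k2/M))

assert np.allclose(X, np.fft.fft2(x))         # matches the library 2D DFT
assert np.allclose(np.fft.ifft2(X).real, x)   # inverse with the 1/(NM) factor
```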
Domains of the 2D DFT and FT of discrete signals
The domain of the 2D FT of a discrete-time signal is the Cartesian product (ω₁,ω₂) ∈ [−π,π) × [−π,π) ("[" means the endpoint belongs to the interval, "(" that it does not).
The domain of the 2D DFT is the discrete set of points (k₁,k₂) ∈ [0,N) × [0,M).
We have to determine the relationship between the frequencies in these two domains!!!
The relationships ω₁=2πk₁/N and ω₂=2πk₂/M can hold only for 0 ≤ k₁ < N/2 and 0 ≤ k₂ < M/2, while for larger k₁ and k₂ these relationships would produce frequencies outside the ω₁ and ω₂ domain.
Domains of the 2D DFT and FT of discrete signals
For larger k₁ and k₂ we can use the fact that the 2D FT of discrete-time signals is periodic along both coordinates ω₁ and ω₂ with period 2π, and establish the relationships:
ω₁ = −2π(N−k₁)/N for N/2 ≤ k₁ < N, and ω₂ = −2π(M−k₂)/M for M/2 ≤ k₂ < M.
(Figure: the (k₁,k₂) plane with corners (0,0), (N,0), (0,M), (N,M); arrows depict how the quadrants of the 2D DFT are swapped in this operation to obtain properly ordered frequencies.)
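The index-to-frequency mapping above can be illustrated for N=8; numpy's fftshift performs exactly the reordering into ascending frequencies described here (for 2D signals it swaps the quadrants):

```python
import numpy as np

N = 8
k = np.arange(N)
# k < N/2 maps to omega = 2*pi*k/N, k >= N/2 to the negative frequency -2*pi*(N-k)/N
omega = 2*np.pi*np.where(k < N/2, k, k - N)/N
print(omega/np.pi)                   # [ 0.    0.25  0.5   0.75 -1.   -0.75 -0.5  -0.25]
print(np.fft.fftshift(omega)/np.pi)  # ascending: [-1.   -0.75 -0.5  -0.25  0.    0.25  0.5   0.75]
```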
2D DFT, convolution and LSIS
A digital image can be subject to numerous forms of processing.
Assume that the image is the input of a linear space-invariant system (LSIS). Such systems can serve various purposes, but we will assume their application to image filtering and denoising.
A system is linear when a linear combination of inputs produces the same linear combination of outputs: if x(n,m) is the input and T{x(n,m)} is the transformation of the input produced by the system, then it holds that:
T{a x₁(n,m) + b x₂(n,m)} = a T{x₁(n,m)} + b T{x₂(n,m)}
2D DFT, convolution and LSIS
A system is space invariant (an extension of the concept of time invariance) when the transform of a shifted image is equal to the shifted transform of the original image: if y(n,m)=T{x(n,m)}, it should hold that y(n−n₀,m−m₀)=T{x(n−n₀,m−m₀)}.
An LSIS has the important property that its output can be given as the convolution of the input signal with the 2D impulse response of the system:
y(n,m) = x(n,m) *ₙ*ₘ h(n,m)   (2D convolution)
Due to limits in image size, rounding, and the discrete nature of images, systems that process digital images are rarely exactly LSIS, but most of them can be approximated by an LSIS.
2D DFT, convolution and LSIS
h(n,m)=T{δ(n,m)}, where:
$$\delta(n,m)=\begin{cases}1, & n=0 \text{ and } m=0\\ 0, & \text{elsewhere}\end{cases}$$
Assume that the impulse response is finite, of size N₁×M₁. Then the linear convolution (here we consider only this type of convolution; other types will not be considered) can be defined as:
$$y(n,m)=T\{x(n,m)\}=x(n,m)*_n*_m h(n,m)=\sum_{n_1=0}^{N_1-1}\sum_{m_1=0}^{M_1-1}h(n_1,m_1)\,x(n-n_1,m-m_1)$$
Zero-pad the impulse response (and, analogously, the image) to size (N+N₁)×(M+M₁):
$$h'(n,m)=\begin{cases}h(n,m), & (n,m)\in[0,N_1)\times[0,M_1)\\ 0, & n\in[N_1,N+N_1),\ m\in[M_1,M+M_1)\end{cases}$$
and take the DFTs of the padded signals:
$$X'(k_1,k_2)=\sum_{n=0}^{N+N_1-1}\sum_{m=0}^{M+M_1-1}x'(n,m)\,e^{-j\frac{2\pi k_1 n}{N+N_1}}e^{-j\frac{2\pi k_2 m}{M+M_1}}$$
$$H'(k_1,k_2)=\sum_{n=0}^{N+N_1-1}\sum_{m=0}^{M+M_1-1}h'(n,m)\,e^{-j\frac{2\pi k_1 n}{N+N_1}}e^{-j\frac{2\pi k_2 m}{M+M_1}}$$
2D DFT, convolution and LSIS
$$y(n,m)=\frac{1}{(N+N_1)(M+M_1)}\sum_{k_1=0}^{N+N_1-1}\sum_{k_2=0}^{M+M_1-1}X'(k_1,k_2)\,H'(k_1,k_2)\,e^{j\left(\frac{2\pi k_1 n}{N+N_1}+\frac{2\pi k_2 m}{M+M_1}\right)}$$
There are cases when calculating the convolution with three 2D DFTs is faster than direct computation.
This is possible when we use fast algorithms for evaluation of the 2D DFTs.
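The three-DFT scheme above can be sketched as follows (the image and kernel sizes are arbitrary assumptions): linear 2D convolution computed through zero-padded DFTs agrees with direct summation.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal((6, 5))      # "image", N x M
h = rng.standard_normal((3, 3))      # impulse response, N1 x M1

# direct linear convolution: y(n,m) = sum_{n1,m1} h(n1,m1) x(n-n1, m-m1)
Py, Px = x.shape[0] + h.shape[0] - 1, x.shape[1] + h.shape[1] - 1
y = np.zeros((Py, Px))
for n1 in range(h.shape[0]):
    for m1 in range(h.shape[1]):
        y[n1:n1+x.shape[0], m1:m1+x.shape[1]] += h[n1, m1] * x

# the same result with three DFTs of zero-padded signals
Xp = np.fft.fft2(x, (Py, Px))        # fft2 with a shape argument zero-pads
Hp = np.fft.fft2(h, (Py, Px))
y_fft = np.fft.ifft2(Xp * Hp).real
assert np.allclose(y, y_fft)
```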
Need for fast algorithms
In the digital signal processing course we learned the two simplest fast algorithms for 1D DFT evaluation:
decimation in time
decimation in frequency
Since digital images have many more samples than 1D signals, these algorithms are even more important here than in the 1D case.
We can freely claim that modern digital image processing would not have developed without FFT algorithms (the FFT is the same transform as the DFT; the name only indicates a fast evaluation algorithm!!!).
Need for fast algorithms
Direct evaluation for a single frequency requires N×M complex multiplications (a complex multiplication is 4 real multiplications and 2 real additions) and N×M complex additions (2 real additions each). For each (k₁,k₂) these operations are repeated, N×M times in total. The computational complexity is therefore of the order of:
N²M² complex additions (2N²M² real)
N²M² complex multiplications (4N²M² real multiplications + 2N²M² real additions)
For example, N=M=1024 on a PC that can perform 1×10⁹ operations per second requires 8N²M² > 8×10¹² real operations, i.e., 8×10³ s, which is more than 2 h.
The 2D DFT can be written as:
$$X(k_1,k_2)=\sum_{n=0}^{N-1}e^{-j\frac{2\pi n k_1}{N}}\underbrace{\sum_{m=0}^{M-1}x(n,m)\,e^{-j\frac{2\pi m k_2}{M}}}_{\text{1D DFT of image row}}$$
Step-by-Step algorithm
For simplicity assume that N=M.
Each of the FFTs over rows and columns needs N log₂N complex additions and multiplications, and for the entire frequency-frequency plane 2N×N log₂N complex additions and multiplications are required.
This is equal to 8N² log₂N real additions and multiplications each, i.e., 16N² log₂N operations in total.
For N=M=1024 the required number of operations is about 160×10⁶, equal to about 0.16 s on the machine considered above.
Step-by-step algorithm
For 1D signals two FFT algorithms are in use: decimation in time and decimation in frequency. The complexity of both algorithms is similar.
The step-by-step algorithm is not optimal for 2D signals, but it is quite popular due to its simplicity.
Earlier PCs and today's mobile devices have problems with the memory demands of 2D FFT algorithms, since some machines still have only moderate amounts of memory. In the step-by-step algorithm we need 3 matrices, for:
the original image
the FT of columns or rows (a complex matrix stored in memory as two real-valued matrices)
the 2D DFT (again a complex-valued matrix)
Step-by-step algorithm - memory
In fact we then need memory for 5 real-valued matrices.
Fortunately, an image is a real-valued signal, for which the following rule holds:
$$X^{*}(k_1,k_2)=\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}x(n,m)\,e^{j\left(\frac{2\pi k_1 n}{N}+\frac{2\pi k_2 m}{M}\right)}=\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}x(n,m)\,e^{-j\left(\frac{2\pi (N-k_1) n}{N}+\frac{2\pi (M-k_2) m}{M}\right)}=X(N-k_1,M-k_2)$$
since x*(n,m)=x(n,m).
You can find in the textbook how this relationship can be used to save memory space!!!
Advanced 2D FFT algorithms
Advanced FFT algorithms for images perform decimation directly along both coordinates. We assume N=M and write W_N=exp(−j2π/N):
$$X(k_1,k_2)=\sum_{n=0}^{N-1}\sum_{m=0}^{N-1}x(n,m)\,W_N^{nk_1}W_N^{mk_2}$$
This 2D DFT can be given as 4 subsums over even and odd coefficients:
$$X(k_1,k_2)=\sum_{n\,\mathrm{even}}\sum_{m\,\mathrm{even}}x(n,m)W_N^{nk_1}W_N^{mk_2}+\sum_{n\,\mathrm{even}}\sum_{m\,\mathrm{odd}}x(n,m)W_N^{nk_1}W_N^{mk_2}+\sum_{n\,\mathrm{odd}}\sum_{m\,\mathrm{even}}x(n,m)W_N^{nk_1}W_N^{mk_2}+\sum_{n\,\mathrm{odd}}\sum_{m\,\mathrm{odd}}x(n,m)W_N^{nk_1}W_N^{mk_2}$$
Advanced 2D FFT algorithms
After simple manipulations all the sums are brought within the limits [0,N/2)×[0,N/2):
$$X(k_1,k_2)=S_{00}(k_1,k_2)+S_{01}(k_1,k_2)W_N^{k_2}+S_{10}(k_1,k_2)W_N^{k_1}+S_{11}(k_1,k_2)W_N^{k_1+k_2}$$
where
$$S_{00}(k_1,k_2)=\sum_{m_1=0}^{N/2-1}\sum_{m_2=0}^{N/2-1}x(2m_1,2m_2)\,W_{N/2}^{m_1k_1+m_2k_2}$$
$$S_{01}(k_1,k_2)=\sum_{m_1=0}^{N/2-1}\sum_{m_2=0}^{N/2-1}x(2m_1,2m_2+1)\,W_{N/2}^{m_1k_1+m_2k_2}$$
$$S_{10}(k_1,k_2)=\sum_{m_1=0}^{N/2-1}\sum_{m_2=0}^{N/2-1}x(2m_1+1,2m_2)\,W_{N/2}^{m_1k_1+m_2k_2}$$
$$S_{11}(k_1,k_2)=\sum_{m_1=0}^{N/2-1}\sum_{m_2=0}^{N/2-1}x(2m_1+1,2m_2+1)\,W_{N/2}^{m_1k_1+m_2k_2}$$
Advanced 2D FFT algorithms
The 2D DFT can now be represented, depending on (k₁,k₂), as:
$$X(k_1,k_2)=S_{00}+S_{01}W_N^{k_2}+S_{10}W_N^{k_1}+S_{11}W_N^{k_1+k_2}$$
for 0 ≤ k₁ < N/2, 0 ≤ k₂ < N/2;
$$X(k_1+N/2,k_2)=S_{00}+S_{01}W_N^{k_2}-S_{10}W_N^{k_1}-S_{11}W_N^{k_1+k_2}$$
for 0 ≤ k₁ < N/2, 0 ≤ k₂ < N/2;
$$X(k_1,k_2+N/2)=S_{00}-S_{01}W_N^{k_2}+S_{10}W_N^{k_1}-S_{11}W_N^{k_1+k_2}$$
for 0 ≤ k₁ < N/2, 0 ≤ k₂ < N/2. Finally:
$$X(k_1+N/2,k_2+N/2)=S_{00}-S_{01}W_N^{k_2}-S_{10}W_N^{k_1}+S_{11}W_N^{k_1+k_2}$$
for 0 ≤ k₁ < N/2, 0 ≤ k₂ < N/2 (all S_ij are evaluated at (k₁,k₂); the signs follow from W_N^{N/2}=−1).
The decomposition can be presented using the following butterfly lattice:
(Figure: lattice combining S₀₀(k₁,k₂), S₀₁(k₁,k₂), S₁₀(k₁,k₂), S₁₁(k₁,k₂) with twiddle factors 1, W_N^{k₂}, W_N^{k₁}, W_N^{k₁+k₂} and signs ±1 into the outputs X(k₁,k₂), X(k₁+N/2,k₂), X(k₁,k₂+N/2), X(k₁+N/2,k₂+N/2).)
The decomposition can be continued in the next stage on each S_ij(k₁,k₂) block. The next figure shows the full decomposition for an image with 4×4 pixels.
(Figure: full decomposition lattice for a 4×4 image, with inputs x(0,0), ..., x(3,3), twiddle factors 1, −j, −1, and outputs X(0,0), ..., X(3,3) in bit-reversed order.)
Advanced 2D FFT algorithms
The number of complex multiplications for this type of FFT algorithm is 3N² log₂N / 4, while the number of complex additions is 2N² log₂N.
This corresponds to 3N² log₂N real multiplications and 5.5N² log₂N real additions.
The total number of operations is about half the number of operations in the step-by-step algorithm.
There are myriad FFT algorithm types!!!
Features of the DFT of real image
The 2D DFT of an image I can be calculated in MATLAB using F = fftshift(fft2(I)), where fft2 calculates the 2D DFT using an FFT algorithm and fftshift reorders the FFT coefficients into natural frequency order (see slide 22).
On the next slide the image Lena is shown together with the logarithm of its 2D DFT (white positions represent larger values, dark positions smaller ones).
Features of 2D DFT
(Figure: test image Lena and its 2D FFT. Values around the origin, corresponding to (ω_x,ω_y)=(0,0), are white; they are up to 10¹⁰ times larger than the values at the dark positions.)
Features of 2D DFT
In the considered image Lena, of dimension 256×256 pixels, fewer than 10 samples of the 2D DFT carry more than 99% of the energy.
Can we then store the image with just 10 2D DFT samples (compared to 256×256 pixels)?
The answer is NO, NO and NO!!!
Namely, the main part of the energy is related to the image luminance, while the image details, which correspond to features very important for human vision, are at higher frequencies.
This is a very important feature of the human eye: low-energy components at higher frequencies contain the main part of the information that humans receive.
High-frequency components have small energy, and they are therefore susceptible to the influence of noise.
For self-exercise
Determine the properties of the 2D DFT of real-valued signals.
Prove the properties of the 2D FT of continuous-time signals given in the textbook. Do these properties hold for the 2D FT of discrete signals and for the 2D DFT?
Assume that a 2D signal is discretized in some alternative manner (diamond or hexagonal instead of rectangular sampling). Reconstruct the original signal from these samples.
Consider the convolution of 2D signals. Can evaluation of the convolution be made more efficient using the 2D DFT?
In the slides we demonstrated one 2D FFT algorithm based on decimating the signal into 4 subsignals. Is this decimation in time or decimation in frequency?
For self-exercise
Realize the 2D FFT using the alternative decimation algorithm.
Is it possible to combine decimations, for example decimation in frequency along rows and decimation in time along columns? If it is possible, perform this decimation for a 4×4 image and present the full decomposition; if it is not possible, explain why.
Interpolate an image using zero-padding of the 2D DFT of the original image!!!
Solve the problems given at the end of the corresponding chapter in the textbook.
Digital Image Processing
Radon transform
DCT
Unitary and orthogonal transforms
Radon transform
The Radon transform was developed at the beginning of the XX century (J. Radon, 1917).
The aim was to reconstruct the interior of objects using projections taken along different angles.
The entire scientific area called computed tomography is based on this transform.
In practice, signals that can penetrate through the object (X-rays or some other signals) are used for this transform, and we record the attenuation of these signals on their way through the object.
One of the first applications of this transform was the study of the Sun's interior based on recordings from the Earth (the Sun was the source of light, i.e., of the projections, in this experiment).
Radon transform
Medicine is the main consumer of the Radon transform, but it is also used in other fields such as astronomy.
Recently it has been used intensively in geology: the Earth's subsurface is searched for oil and other mineral resources, and sound signals are used for these recordings.
(Figure: propagation of a sound wave below the Earth's surface.)
Radon transform
Assume that we have a signal (wave, ray) that can penetrate through objects.
This signal is attenuated locally at each point (x,y) by some attenuation function f(x,y).
This function can tell us important information about the material through which the ray penetrates (ultrasound is able to propagate through liquids, X-rays attenuate significantly on bones, etc.).
Therefore, our goal is the visualization of the attenuation function f(x,y) at each point of the object.
However, we know only the total attenuation of a beam passing through the object (accumulated along the considered path), and based on this information we want to reconstruct f(x,y).
Radon transform
Consider the attenuation along a direction s, where A and B are the points where the beam enters and exits the object:
$$P_\theta(t)=\int_{AB}f(x,y)\,ds$$
Under relatively mild assumptions the beam propagates linearly through the object, and s can be parameterized as x cosθ + y sinθ = t. The projection can then be written as:
$$P_\theta(t)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x,y)\,\delta(x\cos\theta+y\sin\theta-t)\,dx\,dy$$
This relationship holds since (x,y) along s satisfies the parameterization above.
The problem is now reduced to determining f(x,y) from P_θ(t), where θ and t are determined by the beams we send toward the object along different angles and for various values of t.
Usage of the 2D FT
Our goal is to reconstruct f(x,y) from the known P_θ(t) for various angles θ and various t.
Consider the 2D FT of f(x,y):
$$F(u,v)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x,y)\,e^{-j(ux+vy)}\,dx\,dy$$
(Figure: beams sent toward the object at angle θ₁ for various t, and at angle θ₂ for various t.)
Relationship between P_θ(t) and f(x,y) in the Fourier domain
The FT of the signal P_θ(t) is:
$$S_\theta(\omega)=\int_{-\infty}^{\infty}P_\theta(t)\,e^{-j\omega t}\,dt$$
Introduce:
$$P_\theta(t)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x,y)\,\delta(x\cos\theta+y\sin\theta-t)\,dx\,dy$$
Now we get:
$$S_\theta(\omega)=\int\!\!\int\!\!\int f(x,y)\,\delta(x\cos\theta+y\sin\theta-t)\,e^{-j\omega t}\,dx\,dy\,dt$$
Relationship between P_θ(t) and f(x,y) in the Fourier domain
Calculating the integral over t we obtain:
$$S_\theta(\omega)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x,y)\,e^{-j\omega(x\cos\theta+y\sin\theta)}\,dx\,dy=F(\omega\cos\theta,\omega\sin\theta)$$
Now we can claim that there is a direct relationship between the FT of the projections and the FT of f(x,y).
Our problem can be solved in 4 steps:
1. Calculate the projections.
2. Determine the 1D FT of the projections.
3. From these, determine the 2D FT of f(x,y).
4. Evaluate f(x,y) using the 2D inverse FT.
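A discrete analogue of steps 2 and 3 can be checked numerically at the single angle θ=0 (restricting to one angle is an assumption made to keep the sketch short): the 1D DFT of the projection obtained by summing the image along one coordinate equals the corresponding central slice of its 2D DFT.

```python
import numpy as np

rng = np.random.default_rng(3)
f = rng.standard_normal((16, 16))   # attenuation function sampled on a grid
projection = f.sum(axis=1)          # projection at theta = 0: sum along y
S = np.fft.fft(projection)          # 1D FT of the projection (step 2)
F = np.fft.fft2(f)                  # 2D FT of f(x,y)
assert np.allclose(S, F[:, 0])      # central slice through the origin (step 3)
```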
Problems in Radon Transform
By the Radon transform one commonly means the determination of the projections P_θ(t).
Summing pixels along vertical lines, without any rotation, calculates P_{π/2}(t); by first rotating the image by θ, the same summation yields P_{π/2−θ}(t).
This procedure should be performed for all angles.
Note that this is not the most efficient technique for calculating the Radon transform, but it is simple to understand, which is why it is quite common in practice.
Radon transf. - Example
Assume that we have a white line y=ax+b on a black background. Calculate the Radon transform:
$$P_\theta(t)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}f(x,y)\,\delta(x\cos\theta+y\sin\theta-t)\,dx\,dy$$
Our image can be written as:
$$f(x,y)=\delta(y-ax-b)$$
Then the projection is given as:
$$P_\theta(t)=\int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\delta(y-ax-b)\,\delta(x\cos\theta+y\sin\theta-t)\,dx\,dy$$
Radon transf. - Example
After some calculation (integrating over y) we obtain:
$$P_\theta(t)=\int_{-\infty}^{\infty}\delta\big(x\cos\theta+(ax+b)\sin\theta-t\big)\,dx$$
where the argument of the delta function vanishes for
$$x=\frac{t-b\sin\theta}{\cos\theta+a\sin\theta}$$
The DCT is defined, for k=1,...,N−1, as:
$$C(k)=\sqrt{\frac{2}{N}}\sum_{n=0}^{N-1}x(n)\cos\frac{(2n+1)k\pi}{2N}$$
(for k=0, C(0)=\frac{1}{\sqrt{N}}\sum_{n=0}^{N-1}x(n), so that the pair below is consistent).
Inverse DCT
The inverse DCT is defined as:
$$x(n)=\frac{1}{\sqrt{N}}C(0)+\sqrt{\frac{2}{N}}\sum_{k=1}^{N-1}C(k)\cos\frac{(2n+1)k\pi}{2N}$$
For self-exercise, try to prove that the DCT and the inverse DCT are mutually inverse.
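A direct numerical sketch of this self-exercise, implementing the definitions above term by term and checking that the inverse recovers the signal:

```python
import numpy as np

def dct(x):
    """Forward DCT: C(0) = sum(x)/sqrt(N), C(k) = sqrt(2/N) sum x(n) cos(pi(2n+1)k/2N)."""
    N = len(x); n = np.arange(N)
    C = np.empty(N)
    C[0] = x.sum() / np.sqrt(N)
    for k in range(1, N):
        C[k] = np.sqrt(2/N) * np.sum(x * np.cos(np.pi*(2*n+1)*k/(2*N)))
    return C

def idct(C):
    """Inverse DCT as defined above."""
    N = len(C); k = np.arange(1, N)
    return np.array([C[0]/np.sqrt(N)
                     + np.sqrt(2/N)*np.sum(C[1:]*np.cos(np.pi*(2*n+1)*k/(2*N)))
                     for n in range(N)])

x = np.random.default_rng(4).standard_normal(8)
assert np.allclose(idct(dct(x)), x)   # mutually inverse
```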
Fast DCT
Since the DCT is very useful in digital image processing, it is important to have fast algorithms for its evaluation.
There are several approaches to this problem. One is to use the property:
$$\cos\frac{(2n+1)k\pi}{2N}=\frac{1}{2}\left[\exp\!\left(j\frac{(2n+1)k\pi}{2N}\right)+\exp\!\left(-j\frac{(2n+1)k\pi}{2N}\right)\right]$$
and, with several simple relationships, to reduce the 1D DCT evaluation to fast evaluation of the 1D DFT.
Try this for homework!!!
Fast DCT
The second technique for fast DCT evaluation is based on a specific methodology of signal extension; check this methodology in the textbook. Using it, the fast DCT can again be reduced to fast DFT evaluation.
Finally, it is possible to decompose a DCT into two DCTs with N/2 samples each. Do this for homework.
In MATLAB, the dct function is used for 1D DCT evaluation, while idct is used for its inverse.
2D DCT
There is no unique form of the 2D DCT. The simplest realization technique is to calculate the 1D DCT along the columns and after that along the rows of the newly obtained matrix.
However, there are alternative techniques for direct evaluation of the 2D DCT. Again, several definitions of the 2D DCT can be used in practice, but here we adopt:
$$C(k_1,k_2)=4\sum_{n_1=0}^{N_1-1}\sum_{n_2=0}^{N_2-1}x(n_1,n_2)\cos\frac{(2n_1+1)k_1\pi}{2N_1}\cos\frac{(2n_2+1)k_2\pi}{2N_2}$$
Inverse 2D DCT
The inverse 2D DCT (for our form of the 2D DCT) is:
$$x(n_1,n_2)=\frac{1}{N_1N_2}\sum_{k_1=0}^{N_1-1}\sum_{k_2=0}^{N_2-1}w(k_1)\,w(k_2)\,C(k_1,k_2)\cos\frac{(2n_1+1)k_1\pi}{2N_1}\cos\frac{(2n_2+1)k_2\pi}{2N_2}$$
$$w(k_i)=\begin{cases}1/2, & k_i=0\\ 1, & 1\le k_i\le N_i-1\end{cases}$$
For homework, prove that the 2D DCT and its inverse defined on this slide form a transformation pair, i.e., that they are mutually inverse. If this is not satisfied, propose a modification of one or both of them!!!
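A numerical sketch of this homework check. The exact constants are a reconstruction from the slide (an assumption): forward transform with the factor 4, inverse with the factor 1/(N₁N₂) and the weights w(0)=1/2, w(k)=1. The check below confirms that this particular pair is mutually inverse.

```python
import numpy as np

def dct2d(x):
    """Forward 2D DCT with the constant 4, as adopted above (assumed reconstruction)."""
    N1, N2 = x.shape
    n1, n2 = np.arange(N1), np.arange(N2)
    C = np.empty((N1, N2))
    for k1 in range(N1):
        for k2 in range(N2):
            c1 = np.cos(np.pi*(2*n1+1)*k1/(2*N1))
            c2 = np.cos(np.pi*(2*n2+1)*k2/(2*N2))
            C[k1, k2] = 4 * (c1 @ x @ c2)
    return C

def idct2d(C):
    """Inverse 2D DCT with weights w(0)=1/2, w(k)=1 and factor 1/(N1*N2)."""
    N1, N2 = C.shape
    w1 = np.where(np.arange(N1) == 0, 0.5, 1.0)
    w2 = np.where(np.arange(N2) == 0, 0.5, 1.0)
    Cw = w1[:, None] * C * w2[None, :]
    k1, k2 = np.arange(N1), np.arange(N2)
    x = np.empty((N1, N2))
    for n1 in range(N1):
        for n2 in range(N2):
            c1 = np.cos(np.pi*(2*n1+1)*k1/(2*N1))
            c2 = np.cos(np.pi*(2*n2+1)*k2/(2*N2))
            x[n1, n2] = (c1 @ Cw @ c2) / (N1 * N2)
    return x

x = np.random.default_rng(5).standard_normal((4, 4))
assert np.allclose(idct2d(dct2d(x)), x)   # the pair is mutually inverse
```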
Fast 2D DCT
The same three methodologies used for the 1D DCT realization can be applied here for fast realization of the 2D DCT.
However, an additional problem is related to the problem dimensions, since we should decide between a step-by-step realization and direct 2D evaluation.
The step-by-step approach reduces the problem to 1D DCTs; with the help of the textbook, try to apply these variants to the realization of the 2D DCT.
I share your excitement about this task!
2D DCT MATLAB - Example
The function for 2D DCT realization in MATLAB is dct2, while its inverse is idct2. In the case of the 2D DCT (as with the 1D DCT), no coefficient shifting is required.
The logarithm of the DCT coefficients for the image Baboon is shown below.
(Figure: low-frequency coefficients with very high values, corresponding to the image luminance; high-frequency coefficients with extremely small values, corresponding to the image details.)
Unitary and orthogonal
transforms
The DFT and DCT are obviously quite similar: they have similar properties, and even their structure is quite similar.
Namely, both transforms can be written (for 1D signals) as:
$$X(k)=\sum_{n=0}^{N-1}x(n)\,w(n,k),\qquad \mathbf{X}=\mathbf{W}\mathbf{x}$$
Unitary and orthogonal transf.
Two important classes of transforms are:
Unitary transforms, which have a unitary transform matrix: W⁻¹=Wᴴ, where Wᴴ is the Hermitian (conjugate) transpose, Wᴴ=(Wᵀ)*.
Orthogonal transforms, which have an orthogonal transform matrix: W⁻¹=Wᵀ.
Obviously, all orthogonal transforms with real-valued W are at the same time unitary.
At first glance the DFT doesn't belong to either of these two important classes. The DFT is therefore sometimes defined as:
$$X(k)=\frac{1}{\sqrt{N}}\sum_{n=0}^{N-1}x(n)\,W_N^{nk}$$
The multiplicative constant 1/√N has no impact on the transform properties.
DFT as unitary transform
The inverse transform is now defined as:
$$x(n)=\frac{1}{\sqrt{N}}\sum_{k=0}^{N-1}X(k)\,W_N^{-nk}$$
It is easy to prove that this form of the DFT is a unitary transform.
To avoid complications, we consider as orthogonal and unitary all transforms that can be reduced to such transforms simply by introducing multiplicative constants.
The basic reason for using these two groups of transforms is that inverse matrix calculation is a very demanding operation, and it is avoided for these transforms.
Basis signals
Now we want to introduce the concept of basis signals.
Consider the inverse transform:
$$x(n)=\sum_{k=0}^{N-1}X(k)\,g(n,k)$$
The signals g(n,k), k=0,...,N−1, are the basis signals of the transform.
Basis signals Example
Consider the DFT, where g(n,k)=exp(j2πnk/N).
Take N=8 and visualize the real part of the basis signals. To simplify the visualization further, we use a continuous time variable t instead of the discrete instants n.
The obtained basis signals are:
1, cos(πt/4), cos(πt/2), cos(3πt/4), cos(πt), cos(5πt/4), cos(3πt/2), cos(7πt/4)
Basis signals describe the nature of a transform: a weighted sum of the basis functions reproduces the signal. When the weights of the low-frequency components (small k) are larger, the signal is predominantly low-frequency, while in the opposite case it is concentrated at high frequencies.
Generalized transforms for
2D signals
Before proceeding to some other important transforms (in addition to the DFT and DCT), we consider the generalization of transforms to 2D signals.
Let the 2D signal x(n,m) have dimensions N×M. The generalized transform can be written as:
$$X(k_1,k_2)=\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}x(n,m)\,J(n,m;k_1,k_2)$$
where J(n,m;k₁,k₂) is a 4D transformation matrix.
Fortunately, the 4D transform matrix is not in common use; instead, we perform the transform step by step, along the columns (or rows) and then along the rows (or columns). The rationale is the same as in the step-by-step algorithm for 2D DFT calculation.
For the common, separable form of the transformation matrix, the 2D transform can be written as:
$$X(k_1,k_2)=\sum_{n=0}^{N-1}\sum_{m=0}^{M-1}x(n,m)\,H_c(m,k_2)\,H_r(n,k_1)$$
2D transforms
Matrix form
The matrix representation of separable transforms (where the 4D transform matrix can be written as a product of two 2D transform matrices) is:
X = H_c x H_r
The inverse transform can be written as (recall basic matrix algebra):
x = H_c⁻¹ X H_r⁻¹
Commonly the transforms applied to rows and columns are the same, H_c = H_r = T, and it follows that:
X = T x T,  x = T⁻¹ X T⁻¹
For T a unitary or orthogonal matrix, the inverse 2D transform can be performed as:
x = Tᴴ X Tᴴ  or  x = Tᵀ X Tᵀ
Basis images
In analogy to the basis signals we can introduce basis images. For an image of dimensions N×M, N×M basis images can be defined, equal to the inverse transform of the signal δ(i−p, j−q). These N×M basis images are obtained as p and q range over (p,q) ∈ [0,N)×[0,M).
If P=T⁻¹, we have a separable transform where the basis image (p,q) is equal to:
f_{(p,q)}(n,m) = P(n,p) P(m,q)
Then any image can be represented as a sum of the basis images:
$$x(n,m)=\sum_{p=0}^{N-1}\sum_{q=0}^{M-1}X(p,q)\,f_{(p,q)}(n,m)$$
Sinusoidal transforms
A signal can be expanded over functions of various types, but sinusoidal (or cosinusoidal) functions are the most common.
They are a quite natural concept. In electrical engineering they correspond to the electrical current produced in generators, which gives the current in our power lines its shape. Numerous phenomena in communication systems are also associated with sinusoidal functions, and in mechanics this function type appears in the case of oscillations.
In addition, sinusoidal functions offer an elegant mathematical apparatus useful in the analysis of numerous practical phenomena.
Sinusoidal transforms
The DFT has transformation matrix coefficients:
w(n,k) = exp(−j2πnk/N)
In addition, we introduced the DCT with coefficients:
$$c(n,k)=\alpha(k)\cos\frac{(2n+1)k\pi}{2N},\qquad \alpha(k)=\begin{cases}\sqrt{1/N}, & k=0\\ \sqrt{2/N}, & k\ne 0\end{cases}$$
(These coefficients can also be multiplied by √N.)