
ROTATION INVARIANT CURVELET FEATURES FOR TEXTURE IMAGE RETRIEVAL

Md Monirul Islam, Dengsheng Zhang, and Guojun Lu



Gippsland School of Information Technology, Monash University, VIC 3842, Australia
E-mail: {md.monirul.islam, dengsheng.zhang, guojun.lu}@infotech.monash.edu.au

ABSTRACT

An effective texture feature is an essential component in any content
based image retrieval system. In the past, spectral features, like
Gabor and wavelet, have shown better retrieval performance than
many statistical and structural features. Recent research on
multi-resolution analysis has found that the curvelet transform
captures texture properties, like curves, lines, and edges, more
accurately than Gabor filters. However, the texture feature
extracted using the curvelet transform is not rotation invariant.
This can degrade its retrieval performance significantly, especially
when there are many similar images with different orientations.
This paper analyses the curvelet transform and derives a useful
approach to extract rotation invariant curvelet features.
Experimental results show that the new rotation invariant curvelet
feature outperforms the curvelet feature without rotation
invariance.

Index Terms: Curvelet transform, CBIR

1. INTRODUCTION

Texture is one of the most important features used in content based
image retrieval (CBIR) [1, 2]. A significant number of techniques
have been proposed in the literature to extract texture features
which can be broadly divided into spatial and spectral techniques.
Spatial techniques measure image texture using low-order
statistics of image grey levels and are sensitive to noise.
Furthermore, spatial features are not robust and the number of
useful features is very small. So far, spectral features, like Gabor
[3] and wavelet [4], have been shown to have better retrieval
performance than features calculated using spatial methods,
like statistical and structural techniques. Recently, research on
multi-resolution analysis has shown that the curvelet transform has
significant advantages over the Gabor transform because curvelets
are more effective in capturing curvilinear properties, like lines
and edges [5, 6]. The curvelet transform was originally proposed for
image de-noising [6] and has shown promising results
in character recognition [7] and image retrieval [8]. Recently,
Sumana et al. [9] showed that the curvelet transform significantly
outperforms the widely used Gabor transform on the standard
Brodatz texture database. However, the curvelet features extracted
in that work are not rotation invariant. Therefore, these features
cannot retrieve images with different orientations. For an
image retrieval application, this means expensive online shifting of
the feature vectors in all directions to find the best match between
the query and example image. Instead of doing shift matching, the
feature vector can be normalized before indexing, so that the online
matching is simple [10].
In this paper, we propose an effective and efficient technique
to extract rotation invariant curvelet features. The method
normalizes each curvelet descriptor offline, avoiding expensive
online matching. We show that the rotation invariant curvelet
feature performs better than the rotation variant curvelet feature
in retrieving both man-made and natural textures.
The rest of this paper is organized as follows. Section 2
briefly introduces the curvelet transform, while Section 3 describes
the technique of rotation invariant curvelet feature extraction. The
experimental results and comparison are presented in Section 4.
Section 5 concludes the paper.

2. THE CURVELET TRANSFORM AND FEATURE
EXTRACTION

This section briefly describes the curvelet transform and texture
feature extraction using the curvelet transform.

2.1. Curvelet transform

The concept of the curvelet transform has been extended from the
two-dimensional ridgelet transform. Therefore, the continuous 2D
ridgelet transform is defined first. Given an image f(x, y), its
continuous ridgelet transform at scale a, translation b, and
orientation θ is defined as

CRT_f(a, b, θ) = ∫∫ ψ_{a,b,θ}(x, y) f(x, y) dx dy          (1)

where the 2D ridgelet function ψ_{a,b,θ}(x, y) is generated from a
univariate function ψ(x) which has vanishing mean and sufficient
decay. The ridgelet ψ_{a,b,θ}(x, y) is given as

ψ_{a,b,θ}(x, y) = a^(−1/2) ψ((x cos θ + y sin θ − b) / a)   (2)



A ridgelet is a wavelet-type function and is constant along the
lines x cos θ + y sin θ = const. Fig. 1(a) shows a typical ridgelet
[5]. A ridgelet is much sharper than a sinusoidal wavelet. Fig. 1(b)
shows the 3D view of a wavelet function. In contrast to a wavelet,
which is efficient in detecting points, a ridgelet is efficient in
detecting lines. The sharp peak in Fig. 1(b) indicates that a
wavelet can localize a point singularity, whereas the sharp edge in
Fig. 1(a) means that a ridgelet can localize a line singularity. A
ridgelet has also been compared mathematically with a wavelet, and
it has been shown that the point parameters of a wavelet are
replaced by the line parameters of a ridgelet [5]. This similarity
means that, like Gabor, a ridgelet can be tuned at different scales
and orientations to create curvelets. However, unlike Gabor, which
has an oval shape in its waveform as shown in Fig. 1(c), curvelets
are linear in the edge direction (Fig. 1(a)). Therefore, a ridgelet
can capture lines and edges more accurately than Gabor.
Furthermore, because of the oval shapes of Gabor filters, the
frequency spectrum covered by a set of Gabor filters is not
complete. In contrast, curvelets cover the entire frequency
spectrum. Fig. 2(a) shows that there are many holes between the
ovals in the frequency plane of the Gabor filters [3]. Fig. 2(b)
shows the frequency tiling by the curvelet transform with 4-scale
decomposition [11]. In Fig. 2(b), s_i means scale i, and 1, 2, 3,
etc. are the subband or orientation numbers. Fig. 2(b) clearly
shows that curvelets cover the entire frequency spectrum.

Fig. 1. Visualization of waveforms of (a) ridgelet, (b) wavelet,
and (c) Gabor functions.

562  978-1-4244-4291-1/09/$25.00 ©2009 IEEE  ICME 2009
Authorized licensed use limited to: Monash University. Downloaded on November 10, 2009 at 20:32 from IEEE Xplore. Restrictions apply.



2.2. Curvelet feature extraction

Given a digital image f[m, n] of dimension M by N, the digital
curvelet transform CT^D(a, b, θ) is obtained as

CT^D(a, b, θ) = Σ_{0≤m<M} Σ_{0≤n<N} f[m, n] ψ^D_{a,b,θ}[m, n]   (3)

Equation (3) is implemented in the frequency domain and can be
expressed as

CT^D(a, b, θ) = IFFT( FFT(f[m, n]) · FFT(ψ^D_{a,b,θ}[m, n]) )   (4)

A detailed description of the implementation of Equation (4)
can be found in [10]. After obtaining the coefficients in
CT^D(a, b, θ), the mean and standard deviation are calculated from
each set of curvelet coefficients. Therefore, if n curvelets are
used, a feature vector of dimension 2n is used to represent an
image.
This feature extraction is applied to each database image.
Each image is decomposed into 4 or 5 levels of scale using the
curvelet transform. The numbers of subbands at different scales are
different. For 4 levels of decomposition, there are 1, 16, 32, and 1
subbands at decomposition levels 1, 2, 3, and 4, respectively.
Therefore, a 4-level decomposition creates 50 (= 1 + 16 + 32 + 1)
subbands of curvelet coefficients. However, because a curvelet
oriented at an angle θ produces the same coefficients as a curvelet
oriented at the angle θ + π, only half of the subbands at levels 2
and 3 are used. Therefore, for a 4-level decomposition, a total of
26 (= 1 + 8 + 16 + 1) subbands of curvelet coefficients are used.
Thus a feature vector of 52 (= 2 × 26) dimensions is created for
each image.
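As an illustration of this step, the following sketch assembles the
mean/standard-deviation feature vector, assuming the 26 kept
subbands are already available as a list of 2-D coefficient arrays
(e.g. from a curvelet toolbox); the function name is ours, not from
the paper:

```python
import numpy as np

def curvelet_features(subbands):
    """Build a 2n-dimensional feature vector from n curvelet subbands.

    `subbands` is a list of 2-D coefficient arrays (one per kept
    subband, e.g. 26 subbands for a 4-level decomposition). For each
    subband, the mean and standard deviation of the coefficient
    magnitudes are stored in a fixed order, so the vectors of
    different images are directly comparable.
    """
    features = []
    for coeffs in subbands:
        mag = np.abs(coeffs)          # curvelet coefficients may be complex
        features.append(mag.mean())   # mean energy of the subband
        features.append(mag.std())    # spread of the energy
    return np.array(features)

# Toy example: 26 random "subbands" give a 52-dimensional vector.
fake_subbands = [np.random.rand(8, 8) for _ in range(26)]
print(curvelet_features(fake_subbands).shape)  # (52,)
```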
To make the features of different images compatible with each
other, feature elements from different subbands are organized in
the same order in the feature vector, as shown in Fig. 2. According
to Fig. 2, the feature vector f for the 0° oriented image of Fig.
3(a) is organized as

f = {μ_s1b1, σ_s1b1, μ_s2b1, σ_s2b1, μ_s2b2, σ_s2b2, ..., μ_s2b8,
     σ_s2b8, μ_s3b1, σ_s3b1, μ_s3b2, σ_s3b2, ..., μ_s3b16, σ_s3b16,
     μ_s4b1, σ_s4b1}                                            (5)

where μ_sibj and σ_sibj are the mean and standard deviation
calculated from the subband b_j at scale s_i. Each image in the
database is represented and indexed using this feature vector.
During retrieval, an image is given as a query. The feature
vector of the query image is compared with the feature vectors of
all images in the database using the L2 distance measure. The
distance D between a query feature vector Q and a target feature
vector T is given by

D = ( Σ_{i=1}^{2n} (Q_i − T_i)^2 )^(1/2)                        (6)

Finally, database images are ranked based on the distance
measures and displayed to the users.
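This ranking step can be sketched as follows; a minimal
illustration of the L2 matching, with names of our own choosing:

```python
import numpy as np

def rank_images(query_vec, database_vecs):
    """Rank database feature vectors by L2 distance to the query.

    Returns database indices sorted from most to least similar
    (smallest distance first).
    """
    query_vec = np.asarray(query_vec, dtype=float)
    dists = [np.sqrt(np.sum((query_vec - np.asarray(t, float)) ** 2))
             for t in database_vecs]
    return np.argsort(dists)

# Toy example: the second database vector matches the query exactly,
# so index 1 is ranked first.
query = [1.0, 2.0, 3.0]
db = [[9.0, 9.0, 9.0], [1.0, 2.0, 3.0], [2.0, 2.0, 2.0]]
print(rank_images(query, db)[0])  # 1
```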

3. ROTATION INVARIANT CURVELET FEATURE

The curvelet features described in the previous section are not
rotation invariant. The feature vector changes significantly when
an image is rotated. Consider the three images in Fig. 3. Though
these images have different orientations, their textures are
similar. However, Fig. 4(a) shows that their feature vectors are
quite different. Fig. 4(a) shows a portion of the curvelet feature
values of the three images for a 4-level analysis. The mean
energies of different subbands are shown in the figure. The
maximum and the second maximum mean energies, which identify the
dominant directions of these images, appear in different positions
in the vectors. Therefore, the feature distances between these
images will be large. Consequently, these images will not be
treated as similar during retrieval even though they actually are.
Thus, rotation invariant curvelet features are needed so that
similar images with different orientations have similar feature
vectors.


Fig. 3. Three similar images with orientation (a) 0°,
(b) 30°, and (c) 60°.

In the following, we propose an efficient technique to solve
the rotation variance issue. The idea is to rearrange the feature
values based on their dominant orientation. The feature elements
which show the dominant direction are kept at the first position in
the feature vector and the other elements are shifted circularly
relative to the maximum element. This is done for each scale
separately because the numbers of subbands at different scales are
different.

Fig. 2. Frequency spectrum coverage by (a) Gabor and (b) curvelet.

By analysing the energy distribution among the feature
elements, it is found in Fig. 4(a) that the maximum and the second
maximum elements always appear together. This is because the
energy of the dominant orientation of an image usually spreads
between two neighboring subbands. Therefore, when reorganizing
the feature elements, these two maximum elements are kept together
in the reorganized feature vector, preserving their original
relative order. For example, consider the feature elements of the
0° oriented image of Fig. 3(a) at scale 2. The maximum and the
second maximum mean energies are found at subbands 3 and 2,
respectively. Therefore, the mean energies at scale 2 are
rearranged as,

{μ_s2b2, μ_s2b3, ..., μ_s2b8, μ_s2b1}   new arrangement of means
{μ_s2b1, μ_s2b2, ..., μ_s2b7, μ_s2b8}   previous arrangement of means


Fig. 4. Energy distribution of different images at different
subbands. (a) Before rotation invariance (b) After rotation
invariance.

Note that the two maximum means of scale 2 appear together at the
first two positions in the new organization, and μ_s2b2 appears
before μ_s2b3 to maintain their original order. The mean energies
at the other scales are reorganized in a similar way. As the mean
energies determine the dominant orientation of an image, the
standard deviations of the different subbands are rearranged in
the same order as the means. After everything is restructured, the
final rotation invariant curvelet feature vector is given as,

f = {μ_s1b1, σ_s1b1, μ_s2b2, σ_s2b2, ..., μ_s2b8, σ_s2b8, μ_s2b1,
     σ_s2b1, μ_s3b4, σ_s3b4, ..., μ_s3b16, σ_s3b16, μ_s3b1, σ_s3b1,
     ..., μ_s3b3, σ_s3b3, μ_s4b1, σ_s4b1}                       (7)

Fig. 4(b) shows the rearranged energies of the subbands at
scales 2 and 3. It is clear that the feature values of the
different images are more similar in Fig. 4(b) than in Fig. 4(a).
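The per-scale normalization described above can be sketched as
follows. This is our own illustration, and it assumes (as observed
in Fig. 4(a)) that the two largest mean energies sit in circularly
adjacent subbands:

```python
import numpy as np

def normalize_scale(means, stds):
    """Circularly shift one scale's subband features so that the two
    largest mean energies come first, preserving their relative order.

    If the second-largest mean is the circular predecessor of the
    largest one, the shift starts from the second-largest, keeping
    both together at the front of the vector.
    """
    means = np.asarray(means, float)
    n = len(means)
    order = np.argsort(means)[::-1]   # indices by descending mean energy
    first, second = order[0], order[1]
    # Start from the second maximum when it immediately precedes the
    # maximum (circularly); otherwise start from the maximum itself.
    start = second if (first - second) % n == 1 else first
    idx = [(start + k) % n for k in range(n)]
    return means[idx], np.asarray(stds, float)[idx]

# Example mirroring the paper's scale-2 case: maximum at subband 3,
# second maximum at subband 2 (1-based), so the shift starts at
# subband 2 and subband 1 wraps to the end.
m = [0.5, 3.0, 4.0, 0.2, 0.1, 0.3, 0.4, 0.6]
s = [1, 2, 3, 4, 5, 6, 7, 8]
new_m, new_s = normalize_scale(m, s)
print(list(new_m))  # [3.0, 4.0, 0.2, 0.1, 0.3, 0.4, 0.6, 0.5]
```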
4. EXPERIMENTAL RESULTS

This section compares the texture retrieval performance of curvelet
features with and without rotation invariance. In this experiment,
we use the widely used Brodatz texture database which consists of
112 images of size 640 by 640 pixels. As the performance of the
rotation invariant feature is tested, a database is needed which
consists of a sufficient number of similar images rotated at
different orientations. Therefore, each of the original 112 images
is rotated to 0°, 30°, 60°, 90°, ..., and 330°. Each rotated image
is then cut into a number of sub-images of size 128 × 128. Thus
each original image produces 188 sub-images oriented at different
angles, and this image is regarded as the ground truth for all of
its sub-images. In total, 21,056 images of size 128 × 128 are
created from the 112 original Brodatz texture images. This
generated database is used in the experiment.
We apply both the rotation variant and the rotation invariant
curvelet feature extraction processes to each database image.
Therefore, each image is represented and indexed by two sets of
features.
The conventional precision-recall curve is used to evaluate
the retrieval performance. Precision is the ratio of the number of
relevant images retrieved and the total number of retrieved images.
Recall is calculated as the ratio of the number of relevant images
retrieved and the number of total relevant images.
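These two measures can be sketched for a single query as follows;
an illustrative helper of our own, not from the paper:

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for one query.

    `retrieved` is the ranked list of returned image ids; `relevant`
    is the set of ground-truth ids (here, the sub-images cut from
    the same original texture).
    """
    hits = sum(1 for r in retrieved if r in relevant)
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    return precision, recall

# Toy example: 3 of the 4 retrieved images are relevant, out of 6
# relevant images in total.
p, r = precision_recall([1, 2, 3, 9], {1, 2, 3, 4, 5, 6})
print(p, r)  # 0.75 0.5
```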
As the ground truth of the database images is known, each of
them is used as a query. For each query, precision percentages are
measured at 10 recall levels. The average precisions are
calculated over all 21,056 queries at each recall level. Fig. 5
shows the comparison between the retrieval performance of the
rotation variant and invariant curvelet features.
Fig. 5. Average retrieval performance of rotation invariant and
rotation variant curvelet features (average precision vs. recall).

Fig. 5 clearly shows that the curvelet feature with rotation
normalization significantly outperforms the curvelet feature
without rotation normalization. The reason is that the rotation
variant curvelet feature fails to identify similar images with
orientations different from the query, whereas the rotation
invariant curvelet feature can. This is verified by all the
examples in Fig. 6.
Fig. 6 shows a few retrieval results with both the rotation
invariant (top snapshots) and the rotation variant curvelet
feature (bottom snapshots). In each case, the top left image is
the query, and the first 30 retrieved images are shown. In Fig.
6(a), the query image is a man-made texture. All of the first 30
images retrieved by the rotation invariant feature are similar but
oriented in different directions. In contrast, the rotation
variant feature retrieves images oriented only in the same
direction as the query and fails to
retrieve other similar images with different orientations. The
difference between the performances of the two features is even
more significant in retrieving natural textures. Figs. 6(b-d) use
natural texture images as queries, which are difficult to
retrieve. The results show that the rotation invariant feature
gives significantly better retrieval results in Figs. 6(b-d),
while the rotation variant feature retrieves only a few similar
images in all cases. While the rotation variant curvelet feature
fails to retrieve natural texture images, the rotation invariant
curvelet feature successfully retrieves them. All these examples
clearly demonstrate that the rotation invariant curvelet feature
has better retrieval performance than the rotation variant
curvelet feature.

5. CONCLUSION

Rotation invariance is one of the key issues for any texture
descriptor. Texture features extracted from spectral transforms
are usually not rotation invariant due to the scale and subband
distribution. This paper has proposed an efficient and effective
way of normalizing curvelet features to extract rotation invariant
texture features. The method can also be used to normalize other
multi-resolution spectral features. It has a twofold advantage.
Firstly, it avoids the expensive online matching used in MPEG.
Secondly, the retrieval performance is significantly improved. The
experimental results show that the rotation invariant feature
considerably outperforms the rotation variant feature; in
particular, the rotation invariant curvelet feature is especially
promising in retrieving natural texture images. Therefore, this
feature has very good potential for the retrieval of real world
images, as they consist of natural textures. Currently, we are
investigating the application of the rotation invariant curvelet
feature in region based retrieval and semantic learning of natural
images. The scale invariance issue remains unsolved and will be
addressed in our future work.
6. REFERENCES

[1] F. Long et al., "Fundamentals of Content-based Image
Retrieval," in Multimedia Information Retrieval and Management,
D. Feng et al., Eds., Springer, 2003.
[2] M. Tuceryan and A. K. Jain, "Texture Analysis," in The
Handbook of Pattern Recognition and Computer Vision, 2nd Ed.,
World Scientific Publishing Co., 1998.
[3] B. S. Manjunath et al., Introduction to MPEG-7, John Wiley
& Sons Ltd., 2002.
[4] S. Bhagavathy and K. Chhabra, "A Wavelet-based Image
Retrieval System," Technical Report ECE278A, Vision Research
Laboratory, University of California, Santa Barbara, 2007.
[5] M. N. Do, "Directional Multiresolution Image
Representations," PhD Thesis, EPFL, 2001.
[6] J. Starck et al., "The Curvelet Transform for Image
Denoising," IEEE Trans. on Image Processing, 11(6), 670-684,
2002.
[7] A. Majumdar, "Bangla Basic Character Recognition Using
Digital Curvelet Transform," Journal of Pattern Recognition
Research, 1: 17-26, 2007.
[8] L. Ni and H. C. Leng, "Curvelet Transform and Its
Application in Image Retrieval," 3rd Int. Symp. on Multispectral
Image Processing and Pattern Recognition, Proceedings of SPIE,
vol. 5286, 2003.
[9] I. J. Sumana et al., "Content based image retrieval using
curvelet transform," in Proc. of Int. Workshop on MMSP, Oct.
2008.
[10] D. Zhang et al., "Content-based image retrieval using Gabor
texture features," in Proc. of First IEEE PCM, pp. 392-395,
Sydney, Australia, Dec. 2000.
[11] E. Candes et al., "Fast Discrete Curvelet Transforms,"
Multiscale Modeling and Simulation, 5(3), 861-899, 2006.
Fig. 6. First 30 retrieved images for different queries: (a) D47
(Woven brass), (b) D15 (Straw), (c) D22 (Reptile skin), and
(d) D37 (Water) as the query. Top and bottom rows are the results
from the rotation invariant and rotation variant curvelet
features, respectively. Images are organized from left to right
and top to bottom in increasing distance from the query.
