
Image Retrieval Based on Similarity Score Fusion from Feature Similarity Ranking Lists

Mladen Jović, Yutaka Hatakeyama, Fangyan Dong, and Kaoru Hirota


Dept. of Computational Intelligence and Systems Science, Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, G3-49, 4259 Nagatsuta, Midori-ward, Yokohama 226-8502, Japan
{jovic, hatake, tou, hirota}@hrt.dis.titech.ac.jp

Abstract. An image similarity method based on the fusion of similarity scores from feature similarity ranking lists is proposed. It takes advantage of combining the similarity scores of all feature types that represent the image content by means of different integration algorithms when computing the image similarity. Three fusion algorithms for fusing image feature similarity scores from the feature similarity ranking lists are proposed. Image retrieval experiments on four general purpose image databases with 4,444 images classified into 150 semantic categories reveal that the proposed method gives the best overall retrieval performance in comparison to methods employing single feature similarity lists when determining image similarity, with an average retrieval precision about 15% higher. Compared to two well-known image retrieval systems, SIMPLIcity and WBIIS, the proposed method brings an increase of 4% and 27%, respectively, in average retrieval precision. The proposed method based on multiple criteria thus provides a better approximation of the user's similarity criteria when modeling image similarity.

1 Introduction: Image Similarity Computation

One of the most important issues in present-day image retrieval is modeling image similarity [1], [12]. An image is typically modeled as a collection of low-level image features [11], [1]. Image similarity computation involves the application of different feature similarity measures to the extracted image features. Based on the overall image similarity to the (user's) query image, the database images are ranked in a single similarity ranking list. Finally, the most similar images from the similarity ranking list are presented to the end user [14]. In a survey of 56 Content Based Image Retrieval (CBIR) systems [13], most of the systems employ a single similarity ranking list. As an alternative, three different feature similarity rank score fusion algorithms (cf. Section 2) are applied to rank the overall image similarity in terms of partial feature similarities, without using human relevance judgments. The focus is on how the final similarity ranking list between a query image q and the database images is created from the low-level image feature similarity ranking lists. Different feature similarity score rankings are combined using data fusion methods based on multiple criteria, in a rank aggregation [15] manner. For two of the algorithms, the fusion of feature similarity scores is derived in a non-heuristic manner; the third algorithm is heuristic. The empirical evaluation is done on four general purpose image databases containing 4,444 images in 150 semantic categories. In total, more than 66,000 queries are executed, based on which several performance measures are computed. In Section 2, the proposed feature similarity ranking list fusion algorithms are described. An empirical evaluation and comparison of the proposed algorithms is presented in Section 3.

L. Wang et al. (Eds.): FSKD 2006, LNAI 4223, pp. 461–470, 2006. © Springer-Verlag Berlin Heidelberg 2006

2 Feature Similarity Ranking Score Fusion Algorithms

Most of the CBIR systems surveyed in [13] employ a single similarity ranking list (Fig. 1(a)). As an alternative, the overall image similarity between the query image q and all database images is calculated from multiple feature similarity ranking lists. Image similarity computation involves the application of different feature similarity measures to the extracted image features. Based on the feature similarity values (feature similarities) between the query image q and all database images, the feature similarity ranking lists are determined with the help of image retrieval techniques. When the feature similarity ranking lists are calculated, the ranking position (feature similarity score) of each database image i with respect to the query image q is uniquely determined. The number of features used for the image representation determines the number of feature similarity lists. The integrated ranking of the multiple feature similarity ranking lists is then determined by the fusion algorithms. This framework is illustrated in Fig. 1(b). The fusion is done in such a way as to optimize retrieval performance. Three feature similarity score fusion algorithms are proposed.

Fig. 1. (a) Computing the overall image similarity based on low-level image features in most of the CBIR systems surveyed in [13]; (b) A new approach of computing image similarity based on low-level image features by employing multi feature similarity ranking lists

As for the image feature representation, color, shape and texture features are chosen. The color feature is represented by the color moments [3]. The shape feature is represented by the edge-direction histogram [4]. The texture feature is represented by the texture neighborhood [9]. Color feature similarity is measured by the weighted Euclidean distance [2], while shape and texture feature similarity are measured by the city-block distance. For a given query image q, with respect to all database images, let us define three feature similarity ranking lists: the color feature similarity ranking list (CFSRL), the shape feature similarity ranking list (SFSRL) and the texture feature similarity ranking list (TFSRL). Next, let us assume that at the top five positions of CFSRL, SFSRL and TFSRL, the images with identifiers {a, b, c, d, e} are ordered as follows:

CFSRL = (a, b, c, e, d); SFSRL = (d, a, c, e, b); TFSRL = (b, a, c, e, d). (1)
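As a minimal sketch of how a single feature similarity ranking list could be produced, the following Python code sorts database images by their distance to the query. The function names, the generic form of the weighted Euclidean distance, and the toy two-dimensional feature vectors are illustrative assumptions, not the authors' implementation; the toy vectors are chosen only so that the result coincides with SFSRL in Eq. (1).

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def city_block(u: Vector, v: Vector) -> float:
    """City-block (L1) distance, used in the text for shape and texture features."""
    return sum(abs(a - b) for a, b in zip(u, v))


def weighted_euclidean(u: Vector, v: Vector, weights: Vector) -> float:
    """Weighted Euclidean distance (generic form, assumed for illustration)."""
    return sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(u, v, weights)))


def ranking_list(query: Vector,
                 database: Dict[str, Vector],
                 distance: Callable[[Vector, Vector], float]) -> List[str]:
    """Return image ids ordered from most to least similar to the query."""
    return sorted(database, key=lambda image_id: distance(query, database[image_id]))


# Hypothetical shape feature vectors for images a..e and for the query image.
shape_features = {"a": [1, 1], "b": [3, 2], "c": [2, 1], "d": [1, 0], "e": [2, 2]}
query_shape = [0, 0]

sfsrl = ranking_list(query_shape, shape_features, city_block)
print(sfsrl)  # ['d', 'a', 'c', 'e', 'b']
```

A smaller distance means a higher similarity, so the first element of the returned list holds rank position 1.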

The Inverse Rank Position (IRP) algorithm is the first algorithm for merging the multiple feature similarity lists into a single overall similarity ranking list. For a given image, the score is the inverse of the sum of the inverses of its rank positions in the individual feature similarity ranking lists (Eqs. (2) and (3)):

$$\mathrm{IRP}(q, i) = \frac{1}{\sum_{\text{feature similarity}=1}^{n} \dfrac{1}{\text{rank position}_{\text{feature similarity}}}} \qquad (2)$$

$$\text{feature similarity} \in \{\mathrm{CFSRL}, \mathrm{SFSRL}, \mathrm{TFSRL}\}; \quad i \in \{a, b, c, d, e\}; \quad n = 3. \qquad (3)$$
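The inverse rank position score of Eq. (2) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names are hypothetical, rank positions are taken as 1-based list indices, and images are ordered by increasing IRP value. Exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction
from typing import Dict, List

# Sample ranking lists from Eq. (1); list index 0 corresponds to rank position 1.
RANKING_LISTS: Dict[str, List[str]] = {
    "CFSRL": ["a", "b", "c", "e", "d"],
    "SFSRL": ["d", "a", "c", "e", "b"],
    "TFSRL": ["b", "a", "c", "e", "d"],
}


def irp_score(image: str, lists: Dict[str, List[str]]) -> Fraction:
    """Inverse of the sum of inverse rank positions of `image` over all lists."""
    inverse_sum = sum(Fraction(1, ranks.index(image) + 1) for ranks in lists.values())
    return 1 / inverse_sum


def irp_ranking(lists: Dict[str, List[str]]) -> List[str]:
    """Order images by increasing IRP score (smallest score = most relevant)."""
    images = next(iter(lists.values()))
    return sorted(images, key=lambda image: irp_score(image, lists))


print(irp_ranking(RANKING_LISTS))  # ['a', 'b', 'd', 'c', 'e']
```

The resulting order agrees with the final ranking list shown in Fig. 2.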

Example. According to the sample feature similarity ranking lists given in Section 2 (Eq. (1)), the overall similarity ranking of the images {a, b, c, d, e} with respect to the query image q is calculated as follows:

$$\mathrm{IRP}(a) = \frac{1}{2}; \quad \mathrm{IRP}(b) = \frac{10}{17}; \quad \mathrm{IRP}(c) = 1; \quad \mathrm{IRP}(d) = \frac{5}{7}; \quad \mathrm{IRP}(e) = \frac{4}{3}. \qquad (4)$$

⇒ e > c > d > b > a, meaning that image a, which has the smallest IRP value, is the most relevant image to the query q, image b is the next most relevant, etc. (Fig. 2).

Fig. 2. Ordering of the first five retrieved images based on the color-shape-texture features merged by the Inverse Rank Position Algorithm (final image similarity ranking list: a, b, d, c, e)

The Borda Count (BC) algorithm, taken from social theory in voting [16], is the second algorithm for merging the multiple feature similarity lists into a final overall similarity ranking list. An image with the highest rank on each of the feature similarity ranking lists (in an n-way vote) gets n votes. Each subsequent image gets one vote less (so that number two gets n-1 votes, number three n-2 votes, etc.). Finally, for each database image, all the votes from all three feature similarity ranking lists are summed up, and the image with the highest number of votes is ranked as the most relevant to the query image, winning the election. Eq. (5) expresses the same ordering through the sum of rank positions: the image with the most votes is the one with the smallest sum.

$$\mathrm{BC}(q, i) = \sum_{\text{feature similarity}=1}^{n} \text{rank position}_{\text{feature similarity}}. \qquad (5)$$

$$\text{feature similarity} \in \{\mathrm{CFSRL}, \mathrm{SFSRL}, \mathrm{TFSRL}\}; \quad i \in \{a, b, c, d, e\}; \quad n = 3. \qquad (6)$$
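A minimal Python sketch of the Borda Count fusion follows. It computes both the rank-position sum of Eq. (5) and the vote count described in the text, to illustrate that the two give the same ordering. The function names are hypothetical, and the vote scheme (top image in a list of five gets five votes) is an assumption made for this illustration.

```python
from typing import Dict, List

# Sample ranking lists from Eq. (1); list index 0 corresponds to rank position 1.
RANKING_LISTS: Dict[str, List[str]] = {
    "CFSRL": ["a", "b", "c", "e", "d"],
    "SFSRL": ["d", "a", "c", "e", "b"],
    "TFSRL": ["b", "a", "c", "e", "d"],
}


def borda_rank_sum(image: str, lists: Dict[str, List[str]]) -> int:
    """Sum of the 1-based rank positions of `image`, as in Eq. (5)."""
    return sum(ranks.index(image) + 1 for ranks in lists.values())


def borda_votes(image: str, lists: Dict[str, List[str]]) -> int:
    """Votes for `image`: the top image of each list gets len(list) votes."""
    return sum(len(ranks) - ranks.index(image) for ranks in lists.values())


def borda_ranking(lists: Dict[str, List[str]]) -> List[str]:
    """Order images by increasing rank sum (equivalently, decreasing votes)."""
    images = next(iter(lists.values()))
    return sorted(images, key=lambda image: borda_rank_sum(image, lists))


print({image: borda_rank_sum(image, RANKING_LISTS) for image in "abcde"})
# {'a': 5, 'b': 8, 'c': 9, 'd': 11, 'e': 12}
print(borda_ranking(RANKING_LISTS))  # ['a', 'b', 'c', 'd', 'e']
```

For every image in this example the votes and the rank sum add up to the same constant (18), which is why sorting by increasing rank sum and sorting by decreasing votes coincide.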

Example. According to the sample feature similarity ranking lists given in Section 2 (Eq. (1)), the overall similarity ranking of the images {a, b, c, d, e} with respect to the query image q is calculated as follows:

$$\mathrm{BC}(a) = 5; \quad \mathrm{BC}(b) = 8; \quad \mathrm{BC}(c) = 9; \quad \mathrm{BC}(d) = 11; \quad \mathrm{BC}(e) = 12. \qquad (7)$$

⇒ e > d > c > b > a, meaning that image a is the most relevant image to the query q, image b is the next most relevant, etc. (Fig. 3).

Fig. 3. Ordering of the first five retrieved images based on the color-shape-texture features merged by the Borda Count Algorithm (final image similarity ranking list: a, b, c, d, e)

The Leave Out (LO) algorithm is the third algorithm for merging the multiple feature similarity lists into a single overall similarity ranking list. The elements are inserted into the final similarity ranking list circularly from the three feature similarity ranking lists (see Algorithm 1). An element that already appears in the final similarity ranking list is not inserted again. The order in which the feature similarity ranking lists are visited can be arbitrary and therefore influences the retrieval precision. In the experimental part, the order CFSRL, SFSRL, TFSRL is chosen because, compared to the other permutations of the feature similarity ranking lists, it provides the best retrieval precision for the Leave Out algorithm. Therefore, this ranking score fusion algorithm is rather heuristic compared to the previous two. In each similarity score merging iteration, only one element is inserted from each feature similarity list, as illustrated in Fig. 4.

Algorithm 1. Leave Out Algorithm

Input: Q, I: images (e.g., a query image and a database image); {Q[q] : 1 ≤ q ≤ N}, {I[i] : 1 ≤ i ≤ N}, respectively; feature similarity ranking lists: feature similarity ∈ {CFSRL, SFSRL, TFSRL};
Output: overall image similarity ranking list for {(Q[q], I[i]) : 1 ≤ q, i ≤ N};

for (q ← 1 to N) do
    for (i ← 1 to N) do {3 feature similarity values computed for a pair of images}
        compute Feature Similarities(Q[q], I[i]);
    end for
end for
for (iteration ← 1 to N) do
    get the image[I] from CFSRL with highest rank ∉ {final similarity list};
    insert image[I] into the final image similarity list;
    get the image[I] from SFSRL with highest rank ∉ {final similarity list};
    insert image[I] into the final image similarity list;
    get the image[I] from TFSRL with highest rank ∉ {final similarity list};
    insert image[I] into the final image similarity list;
end for

Fig. 4. An example of the ordering of the first five retrieved images based on the color-shape-texture features merged by the Leave Out Algorithm (final image similarity ranking list: a, d, b, c, e)

Example. According to the sample feature similarity ranking lists given in Section 2 (Eq. (1)), the overall similarity ranking of the images {a, b, c, d, e} with respect to the query image q is calculated as follows:

LO(iter#1) → a; LO(iter#2) → d; LO(iter#3) → b; (8)

LO(iter#4) → c; LO(iter#5) → e; (9)

⇒ a < d < b < c < e, meaning that image a is the most relevant image to the query q, image d is the next most relevant, etc. (Fig. 4).
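The merging loop of Algorithm 1 can be sketched in Python as a round-robin over the feature similarity ranking lists. This is a minimal illustration, assuming the lists are already computed; the function name is hypothetical, and the loop simply stops once every image has been placed.

```python
from typing import List


def leave_out_ranking(lists: List[List[str]]) -> List[str]:
    """Round-robin merge: each pass takes, from every list in turn,
    the highest-ranked image that is not yet in the final list."""
    final: List[str] = []
    placed = set()
    total = len({image for ranks in lists for image in ranks})
    while len(final) < total:
        for ranks in lists:
            candidate = next((img for img in ranks if img not in placed), None)
            if candidate is not None:
                final.append(candidate)
                placed.add(candidate)
    return final


# Sample lists from Eq. (1), visited in the order CFSRL, SFSRL, TFSRL.
cfsrl = ["a", "b", "c", "e", "d"]
sfsrl = ["d", "a", "c", "e", "b"]
tfsrl = ["b", "a", "c", "e", "d"]

print(leave_out_ranking([cfsrl, sfsrl, tfsrl]))  # ['a', 'd', 'b', 'c', 'e']
print(leave_out_ranking([sfsrl, cfsrl, tfsrl]))  # ['d', 'a', 'b', 'c', 'e']
```

The first call reproduces the final list of Fig. 4. The second call visits the lists in a different order and yields a different result, which illustrates why the visiting order affects retrieval precision.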

3 Experimental Evaluation

All the experiments are performed on a machine with a 64-bit AMD Athlon processor and 1 GB of RAM. Four standard test databases are used in the experiments, containing 4,444 images divided into 150 semantic categories: the C-1000-A database [5], the C-1000-B database [5], the V-668 database [8] and the B-1776 database [10]. All four test databases originate from well-known image collections used for the evaluation of image retrieval systems [6], [7]. The partitioning of each database into semantic categories is determined by the creators of the database and reflects the human perception of image similarity. The semantic categories define the ground truth. For a given query image, only the images belonging to the same semantic category as the query image are considered relevant. This implies that the number of relevant images for a given query image equals the number of images in the category to which that image belongs. The performance measures are: (1) precision [P.], (2) weighted precision [W.P.] and (3) average rank [A.R.].
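The three measures, which the remainder of this section describes in words, can be sketched for a single query as follows. This is an illustration under explicit assumptions rather than the evaluation code used in the experiments: the cutoff k is assumed to equal the number of relevant images, the weighted precision is assumed to average the running precision over the first k positions (in the spirit of [7]), and the ranked list and the relevant set in the example are hypothetical.

```python
from typing import List, Set


def precision(ranked: List[str], relevant: Set[str]) -> float:
    """Fraction of relevant images among the top k results, k = len(relevant) (assumed cutoff)."""
    k = len(relevant)
    return sum(1 for image in ranked[:k] if image in relevant) / k


def weighted_precision(ranked: List[str], relevant: Set[str]) -> float:
    """Average of the running precision over the top k positions (assumed form),
    so that relevant images ranked higher contribute more."""
    k = len(relevant)
    hits = 0
    total = 0.0
    for position, image in enumerate(ranked[:k], start=1):
        if image in relevant:
            hits += 1
        total += hits / position
    return total / k


def average_rank(ranked: List[str], relevant: Set[str]) -> float:
    """Mean 1-based rank position of the relevant images."""
    positions = [pos for pos, image in enumerate(ranked, start=1) if image in relevant]
    return sum(positions) / len(positions)


# Hypothetical query result and ground truth.
ranked = ["a", "b", "d", "c", "e"]
relevant = {"a", "d"}

print(precision(ranked, relevant))           # 0.5
print(weighted_precision(ranked, relevant))  # 0.75
print(average_rank(ranked, relevant))        # 2.0
```

Higher precision and weighted precision indicate better retrieval, whereas a lower average rank is better.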


These three measures are the most frequently used measures of image retrieval performance [1]. All the performance measures are computed for each query image, based on the given ground truth. Since each image in each test database is used as a query, all the performance measures are averaged for each test database. For each algorithm, the average values of retrieval precision (P.), weighted precision (W.P.) and average rank (A.R.) are provided in Table 1 and Table 2. For a given query image, precision is computed as the fraction of the relevant images that are retrieved. Weighted precision is computed in a similar way; however, the higher a relevant image is ranked, the more it contributes to the overall weighted precision value [7]. This means that, unlike precision, weighted precision also takes into account the rank of the retrieved relevant images. Average rank is simply the average of the rank values of the relevant images. In addition to the global image representation, five region-based image similarity representations are also experimentally evaluated. The image is initially divided into N × N (N ∈ {1, 2, 3, 4, 5}) non-overlapping image regions. From each region, color, shape and texture features are extracted, as described in Section 2. Therefore, each resolution is uniquely determined by N, which determines the number of image regions.

3.1 Experiment Results and Discussion

Initially, the three proposed fusion algorithms are compared to each other. Next, they are compared to six conventional image similarity models employing a single overall similarity ranking list of color, shape, texture, color-shape, color-texture and shape-texture image features, respectively. Since, among these image similarity models, the color-texture feature based model performed best, only the results for this model are reported. Next, the proposed fusion algorithms are compared to two representative well-known image retrieval systems, WBIIS and SIMPLIcity, which also employ a single similarity ranking list. Finally, conclusions are drawn.

Table 1. Comparison of the IRP, BC and LO algorithms on the C-1000-A, C-1000-B, B-1776 and V-668 test databases in the resolutions providing optimal retrieval performance (5 × 5 resolution, except for the LO algorithm on the B-1776 and V-668 test databases, where 1 × 1 resolution provides the optimal performance)

Algorithm   C-1000-A                  C-1000-B                  B-1776                    V-668
            P.[%]  W.P.[%]  A.R.      P.[%]  W.P.[%]  A.R.      P.[%]  W.P.[%]  A.R.      P.[%]  W.P.[%]  A.R.
IRP         49.04  61.72    203.8     39.13  52.88    246.0     73.89  85.69    40.06     41.23  63.4     162.6
BC          44.34  56.5     217.9     36.63  48.42    252.5     69.12  82.00    46.0      38.04  58.28    170.2
LO          20.18  31.57    435.0     18.09  26.57    449.7     30.30  45.56    158.0     20.98  38.02    256.6

Table 2. Comparison of the average retrieval precision [%] with the system employing a single overall similarity ranking list (color-texture image feature combination, Resolution 5) and with two advanced well-known image retrieval systems, SIMPLIcity and WBIIS (a dash marks values that are not reported)

Image Database  Number of Images  Color-Texture  SIMPLIcity  WBIIS  IRP    BC     LO
C-1000-A        1000              43.00          45.3        22.6   49.04  44.34  20.18
C-1000-B        1000              22.05          -           -      39.13  36.63  18.09
B-1776          1776              50.09          -           -      73.89  69.12  30.3
V-668           668               25.01          -           -      41.23  38.04  20.98

Evaluation of the Fusion Algorithms. As shown in Tables 1 and 2, with respect to retrieval effectiveness measured by average retrieval precision, the Inverse Rank Position algorithm performs best on all databases. The difference between the Inverse Rank Position and Borda Count algorithms is smaller than the difference between the Borda Count and Leave Out algorithms. As seen in Tables 1 and 2, the retrieval precision of the Borda Count algorithm is about twice that of the Leave Out algorithm, independently of the database. With respect to region-based modeling, in most cases partitioning the image into more regions (Resolution 5) improves the retrieval performance of all fusion algorithms on all databases, as expected. However, this is not the case for the Brodatz-1776 and Vistex-668 databases, where the global image representation provides the best retrieval performance for the Leave Out algorithm. Comparing the Inverse Rank Position and Leave Out algorithms, the largest difference in average precision occurs on the Brodatz-1776 database, where it reaches about 43%. On the other three databases, the difference in average precision is about 25%. Next, as shown in Table 2, when compared to the best of the six conventional image similarity methods, which employs color-texture image features, the Inverse Rank Position and Borda Count algorithms have higher average retrieval precision on all databases, whereas the Leave Out algorithm does not. This might be explained by the heuristic nature of the fusion of feature similarity ranking lists into the final similarity ranking list.
Finally, as shown in Table 2, on the C-1000-A test database, when compared to two state-of-the-art CBIR systems, the Inverse Rank Position algorithm reaches the values of the SIMPLIcity system, while WBIIS lags behind both the Inverse Rank Position and Borda Count algorithms. Among the three proposed algorithms, Inverse Rank Position provides the best retrieval performance. A possible explanation is that it puts equal emphasis on each of the three image features as important elements of image similarity modeling, compared to the other two algorithms, which are more heuristic. This holds in particular for the Leave Out algorithm, which in some cases might be based on only the color, shape or texture image feature.
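The headline differences quoted in the text can be checked against Table 2 with a few lines of arithmetic. The sketch below only re-uses the average retrieval precision values reported in Table 2; the variable names are illustrative.

```python
# Average retrieval precision [%] copied from Table 2.
COLOR_TEXTURE = {"C-1000-A": 43.00, "C-1000-B": 22.05, "B-1776": 50.09, "V-668": 25.01}
IRP = {"C-1000-A": 49.04, "C-1000-B": 39.13, "B-1776": 73.89, "V-668": 41.23}
SIMPLICITY_C1000A = 45.3
WBIIS_C1000A = 22.6

# Gain of IRP over the single-list color-texture model, per database.
gains = {db: round(IRP[db] - COLOR_TEXTURE[db], 2) for db in IRP}
mean_gain = sum(gains.values()) / len(gains)

print(gains)
print(round(mean_gain, 1))                            # 15.8
print(round(IRP["C-1000-A"] - SIMPLICITY_C1000A, 2))  # 3.74
print(round(IRP["C-1000-A"] - WBIIS_C1000A, 2))       # 26.44
```

These differences, expressed in percentage points, are in line with the rounded figures of about 15%, 4% and 27% quoted in the abstract.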

4 Conclusion

An image similarity model based on the fusion of feature similarity ranking list scores is proposed. It takes advantage of combining these scores by means of data fusion algorithms when computing the overall image similarity. Three fusion algorithms are proposed for this purpose. The evaluation is done on four test databases containing 4,444 images in 150 semantic categories, based on which the (weighted) precision and average rank are computed.


The effectiveness of the three fusion algorithms, measured by retrieval performance, is compared to conventional systems using a single similarity ranking list. In addition, it is compared to the effectiveness of two advanced CBIR systems (SIMPLIcity and WBIIS). As shown in the experiments, data fusion methods based on multiple feature similarity lists provide a better approximation of the user's similarity criteria than a single feature similarity list, with an average retrieval precision about 15% higher. This is also the case in comparison to SIMPLIcity and WBIIS. A possible explanation is that combining different rankings using data fusion methods based on multiple criteria (feature similarity lists) in general approximates the user's similarity criteria better than a single feature similarity list. Thus, such an approach to ranking image similarity in terms of partial feature similarities, without using human relevance judgments, should be carefully considered when modeling image similarity. The reported improvements might be of significance for various application domains covering different image domains. The experimental results are reported only for images covering (1) color homogeneous structures and (2) unconstrained color photographs. The large variance among the visual characteristics of the images in all test databases allows general conclusions to be drawn about the performance of the proposed algorithms. However, the applicability is not strictly confined to the two above-mentioned domains. The results are also valid for any database containing images with a large variance among their visual characteristics, for example medical image or fingerprint databases. Additionally, an investigation of extending the test data sets from image to video data is already under way.

Acknowledgments
The authors would like to thank Dr. Zoran Stejić, from Ricoh Co., Ltd., Japan, and Thomas Seidl, from RWTH Aachen University, for their comments, constructive suggestions and valuable research discussions on content based image retrieval, as well as for the source codes of the image features.

References
1. A. W. M. Smeulders, M. Worring, S. Santini, A. Gupta, R. Jain, Content-based image retrieval at the end of the early years, in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(12), (2000) 1349-1380.
2. M. Stricker, M. Orengo, Similarity of color images, in: Storage and Retrieval for Image and Video Databases, Proc. SPIE 2420, (1995) 381-392.
3. M. Stricker, M. Orengo, Similarity of color images, in: Proc. of IS&T and SPIE Storage and Retrieval of Image and Video Databases III, San Jose, CA, USA, (1995) 381-392.
4. S. Brandt, J. Laaksonen, E. Oja, Statistical shape features in content-based image retrieval, in: Proc. of 15th Int. Conf. on Pattern Recognition (ICPR-2000), Vol. 2, Barcelona, Spain, (2000) 1066-1069.
5. Corel Corporation, Corel Gallery 3.0, 2000. Available: http://www3.corel.com/
6. Z. Stejić, Y. Takama, K. Hirota, Genetic algorithm-based relevance feedback for image retrieval using Local Similarity Patterns, in: Information Processing and Management, 39(1), (2003) 1-23.
7. J. Z. Wang, J. Li, G. Wiederhold, SIMPLIcity: Semantics-sensitive Integrated Matching for Picture LIbraries, in: IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(9), (2001) 947-963.
8. Massachusetts Institute of Technology, Media Lab, Vision Texture Database, 2001. Available: ftp://whitechapel.media.mit.edu/pub/VisTex/
9. J. Laaksonen, E. Oja, M. Koskela, S. Brandt, Analyzing low-level visual features using content-based image retrieval, in: Proc. 7th Int. Conf. on Neural Information Processing (ICONIP'00), Taejon, Korea, (2000) 1333-1338.
10. P. Brodatz, Textures: a photographic album for artists and designers, New York: Dover Publications, (1966) 40-46.
11. J.M. Jolion, Feature Similarity, in: M.S. Lew (Ed.), Principles of Visual Information Retrieval, Springer, London, (2001) 121-143.
12. V. Castelli, L.D. Bergman, Digital imagery: fundamentals, in: V. Castelli, L.D. Bergman (Eds.), Image Databases: Search and Retrieval of Digital Imagery, Wiley, New York, USA, (2002) 1-10.
13. R.C. Veltkamp, M. Tanase, Content-Based Image Retrieval Systems: A Survey, Technical Report UU-CS-2000-34, (2000).
14. Y. Rui, T.S. Huang, M. Ortega, S. Mehrotra, Relevance feedback: a power tool for interactive content-based image retrieval, in: IEEE Trans. on Circuits Syst. Video Technol., 8(5), (1998) 644-655.
15. C. Dwork, R. Kumar, M. Naor, D. Sivakumar, Rank aggregation methods for the Web, in: Proceedings of 10th International World Wide Web Conference, Hong Kong, (2001) 613-622.
16. F. S. Roberts, Discrete mathematical models with applications to social, biological, and environmental problems, Englewood Cliffs, NJ: Prentice Hall
