
Fast multiresolution image querying

CS474/674 Prof. Bebis

Paper
C. Jacobs, A. Finkelstein, and D. Salesin, "Fast multiresolution image querying", Proceedings of SIGGRAPH, pp. 277-286, 1995.

Problem
Search an image database to retrieve images that are similar to a query image.
This is known as "query by content" or "query by example".
Typically, the K best matches are reported.

Challenges
What features to use?
How to tolerate image distortions?
How to organize the data?
How to search fast?
How to reduce storage requirements?

Image Distortions
This study considers two types of image distortion:
A low-resolution image from a scanner or video camera.
A rough sketch of the image painted by the user.
[Figure: painted query, low-resolution query, and target image]

Tolerating Image Distortions


Need to design an effective image query metric that can accommodate image distortions as well as distinguish the target image from the rest of the database.
The metric should be tunable to better account for the types of image distortions anticipated in the query image.

Tolerating Image Distortions (contd)


Traditional metrics based on the L1 and L2 norms cannot handle inexact matching and are time consuming.
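L1 norm: $\|Q - T\|_1 = \sum_{i,j} |Q[i,j] - T[i,j]|$
L2 norm: $\|Q - T\|_2 = \left( \sum_{i,j} (Q[i,j] - T[i,j])^2 \right)^{1/2}$

Q: query image, T: target image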

Experiments reported in this paper show that, with these metrics, the target image ranks in the top 1% of the retrieved images only 3% of the time.
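As a point of reference for the baseline metrics, the following is a minimal sketch of pixel-wise L1 and L2 distances. It assumes both images have the same size and are stored as nested lists of numbers; the function names are illustrative and do not come from the paper.

```python
import math

def l1_distance(q, t):
    """Sum of absolute differences between two equal-sized images."""
    return sum(abs(a - b) for row_q, row_t in zip(q, t) for a, b in zip(row_q, row_t))

def l2_distance(q, t):
    """Square root of the sum of squared differences."""
    return math.sqrt(sum((a - b) ** 2 for row_q, row_t in zip(q, t) for a, b in zip(row_q, row_t)))

query = [[0, 2], [4, 6]]
target = [[1, 2], [4, 2]]
print(l1_distance(query, target))  # 5
print(l2_distance(query, target))  # sqrt(17)
```

Both functions visit every pixel of every candidate image, which is one reason these metrics are slow on large databases.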

Fast Retrieval
Retrieval should be fast enough to handle tens of thousands of images at interactive rates.

Fast metric computation

Efficient image representation

Efficient database organization

Proposed Method: Key Ideas


Multi-resolution image decomposition using Haar wavelets.
Compute a signature for each image, based on (truncated and quantized) Haar wavelet coefficients.
Signature has low storage requirements.

Proposed Method: Key Ideas (contd)


Compute image similarity using a metric that compares how many significant wavelet coefficients the query has in common with potential targets.
The metric can be tuned (i.e., using statistical analysis) to accommodate specific image distortions.
Organize the data properly to facilitate fast computation of the metric and to speed up the search.

User Interface
Returns the 20 highest-ranked targets at interactive rates.
Can process a 128 x 128 image query on a database of 20,000 images in under 0.5 seconds*.

*Faster processing times should be possible using current technology.

Why use wavelets?


The use of wavelets allows the resolutions of the query and target images to be different.
Wavelet decompositions are fast to compute and yield a small number of coefficients.
The signature can be extracted directly from a wavelet-compressed version of the image.

Components of the metric


Color space:
Experimented with RGB, HSV, and YIQ color spaces.
Wavelet transform was applied on each color channel separately.
YIQ gave the best performance (i.e., for their data).
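A minimal sketch of splitting an RGB image into separate Y, I, and Q planes, so that each plane can be decomposed on its own. It assumes pixels are (r, g, b) tuples with components in [0, 1] and uses Python's standard `colorsys` module, whose conversion constants may differ slightly from those used in the paper; `to_yiq_planes` is a hypothetical helper name.

```python
import colorsys

def to_yiq_planes(rgb_image):
    """Split an image of (r, g, b) tuples into three planes: Y, I, Q."""
    y_plane, i_plane, q_plane = [], [], []
    for row in rgb_image:
        converted = [colorsys.rgb_to_yiq(r, g, b) for (r, g, b) in row]
        y_plane.append([c[0] for c in converted])
        i_plane.append([c[1] for c in converted])
        q_plane.append([c[2] for c in converted])
    return y_plane, i_plane, q_plane

# A 1 x 2 image: one white pixel and one black pixel.
y, i, q = to_yiq_planes([[(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)]])
print(y, i, q)
```

Gray pixels carry all of their information in the Y plane, so the I and Q planes are (numerically close to) zero for the example above.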

Components of the metric


Wavelet type:
Haar wavelets are the fastest to compute and simplest to implement.
Other types of wavelets might give better results but at a higher cost.

Components of the metric (contd)


Decomposition type:
Experimented with both standard and non-standard decompositions for all three color spaces.
Standard decomposition worked best (i.e., for both scanned and painted queries).
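A minimal sketch of a standard 2D Haar decomposition: a full 1D decomposition of every row, followed by a full 1D decomposition of every column. It assumes a square image whose side is a power of two, stored as nested lists, and uses the normalized (orthonormal) Haar basis; the function names are illustrative.

```python
import math

def haar_1d(values):
    """Normalized 1D Haar decomposition; len(values) must be a power of two."""
    n = len(values)
    out = [v / math.sqrt(n) for v in values]
    h = n
    while h > 1:
        h //= 2
        step = out[:]  # detail coefficients from earlier levels stay in place
        for k in range(h):
            a, b = out[2 * k], out[2 * k + 1]
            step[k] = (a + b) / math.sqrt(2)      # averages
            step[h + k] = (a - b) / math.sqrt(2)  # details
        out = step
    return out

def haar_2d_standard(image):
    """Decompose all rows fully, then all columns fully."""
    rows = [haar_1d(row) for row in image]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

coeffs = haar_2d_standard([[1, 3], [5, 7]])
print(coeffs)  # approximately [[4, -1], [-2, 0]]
```

With this normalization the [0][0] entry equals the average intensity of the image (4 in the example), which is the value stored separately in the signature.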

Components of the metric (contd)


Truncation:
A 128 x 128 image yields 128^2 = 16,384 wavelet coefficients for each color channel.
Keep only the coefficients with the largest magnitude.
Accelerates the search for a query.
Reduces storage requirements.

Improves discriminatory power of metric!


The 60 largest coefficients in each channel worked best for painted queries.
The 40 largest coefficients in each channel worked best for scanned queries.
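A minimal sketch of the truncation step, assuming the coefficients of one channel are stored as nested lists and that the overall average at position [0][0] is stored separately by the caller. The name `truncate` and the tie-breaking order are illustrative choices, not taken from the paper.

```python
def truncate(coeffs, m):
    """Keep the m largest-magnitude coefficients and zero everything else.

    The [0][0] entry (overall average) is excluded from the ranking and
    zeroed here, on the assumption that it is stored separately.
    """
    ranked = sorted(
        ((abs(v), i, j)
         for i, row in enumerate(coeffs)
         for j, v in enumerate(row)
         if (i, j) != (0, 0)),
        reverse=True,
    )
    keep = {(i, j) for _, i, j in ranked[:m]}
    return [[v if (i, j) in keep else 0.0 for j, v in enumerate(row)]
            for i, row in enumerate(coeffs)]

coeffs = [[4.0, -1.0], [-2.0, 0.5]]
print(truncate(coeffs, 1))  # [[0.0, 0.0], [-2.0, 0.0]]
```

For a 128 x 128 channel, m = 40 or m = 60 would keep well under 1% of the 16,384 coefficients.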

Components of the metric (contd)


[Figure: wavelet decomposition and truncated coefficients]

Components of the metric (contd)


Quantization:
Quantize each coefficient into one of three levels: +1, 0, and -1.
Large positive coefficients are quantized to +1.
Large negative coefficients are quantized to -1.
Coefficients discarded by truncation are 0.

Improves discriminatory power of metric!


The mere presence or absence of these coefficients appears to be more important than their precise magnitudes.

Improves speed and reduces storage requirements.
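A minimal sketch of the quantization step, assuming truncation has already set the discarded coefficients to zero so that only the sign of each surviving coefficient needs to be kept. The name `quantize` is illustrative.

```python
def quantize(truncated):
    """Map each coefficient to +1, 0, or -1 according to its sign."""
    return [[(v > 0) - (v < 0) for v in row] for row in truncated]

print(quantize([[0.0, 0.7], [-2.0, 0.0]]))  # [[0, 1], [-1, 0]]
```

Because only a position and a sign survive, each retained coefficient can be stored very compactly.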

Components of the metric (contd)


[Figure: truncated coefficients, and truncated and quantized coefficients]

Components of the metric (contd)


Normalization:
Basis functions are normalized so they become orthonormal to each other (see lecture slides on wavelets).

Wavelet-based metric
Suppose Q and T represent a single channel of the wavelet decomposition of the query and target images.
Let Q[0, 0] and T[0, 0] be the scaling function coefficients (i.e., average intensity of that channel).
Let $\tilde{Q}[i, j]$ and $\tilde{T}[i, j]$ represent the truncated, quantized wavelet coefficients of Q and T (i.e., -1, 0, or +1).

$\|Q, T\| = w_{0,0} \, |Q[0,0] - T[0,0]| + \sum_{i,j} w_{i,j} \, |\tilde{Q}[i,j] - \tilde{T}[i,j]|$

(assume $\tilde{Q}[0,0] = \tilde{T}[0,0] = 0$)

$w_{i,j}$: weights (to be determined)
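A minimal sketch of evaluating this weighted metric directly for one color channel: a weighted difference of the average intensities plus a weighted sum of absolute differences between the quantized coefficients. The weights used below are placeholders chosen for illustration, not values from the paper.

```python
def weighted_metric(q_avg, t_avg, q_sig, t_sig, w00, weights):
    """w00 * |q_avg - t_avg| + sum over (i, j) of weights[i][j] * |q_sig - t_sig|."""
    total = w00 * abs(q_avg - t_avg)
    for i, row in enumerate(q_sig):
        for j, q in enumerate(row):
            total += weights[i][j] * abs(q - t_sig[i][j])
    return total

q_sig = [[0, 1], [-1, 0]]        # quantized query coefficients
t_sig = [[0, 1], [1, 0]]         # quantized target coefficients
ones = [[1.0, 1.0], [1.0, 1.0]]  # placeholder weights (assumption)
print(weighted_metric(0.5, 0.25, q_sig, t_sig, 2.0, ones))  # 2.5
```

In the example the only disagreement is at position [1][0], where the signs are opposite and the absolute difference is 2.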

Simplifying the metric (contd)


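Replace

$|\tilde{Q}[i, j] - \tilde{T}[i, j]|$

with

$(\tilde{Q}[i, j] \neq \tilde{T}[i, j])$

where the expression $(a \neq b)$ is 1 if $a \neq b$ and 0 otherwise.

(The new metric was found to be as effective as the previous one.)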

Simplifying the metric (contd)


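Group terms together into "buckets" so that only a small number of weights $w_{i,j}$ needs to be determined experimentally.

$\|Q, T\| = w_0 \, |Q[0,0] - T[0,0]| + \sum_{i,j} w_{\mathrm{bin}(i,j)} \, (\tilde{Q}[i,j] \neq \tilde{T}[i,j])$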

Simplifying the metric (contd)


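Consider only the terms for which $\tilde{Q}[i, j] \neq 0$.

Even faster computation.
Allows for a query without much detail to match a very detailed target image.

$\|Q, T\| = w_0 \, |Q[0,0] - T[0,0]| + \sum_{i,j:\ \tilde{Q}[i,j] \neq 0} w_{\mathrm{bin}(i,j)} \, (\tilde{Q}[i,j] \neq \tilde{T}[i,j])$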

Fast metric implementation (depends on data organization)


The majority of database images will not match the query. It is therefore quicker to count the matching coefficients than the mismatching ones.

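Fast metric implementation (contd)

$\sum_{i,j:\ \tilde{Q}[i,j] \neq 0} w_{\mathrm{bin}(i,j)} \, (\tilde{Q}[i,j] \neq \tilde{T}[i,j]) = \sum_{i,j:\ \tilde{Q}[i,j] \neq 0} w_{\mathrm{bin}(i,j)} - \sum_{i,j:\ \tilde{Q}[i,j] \neq 0} w_{\mathrm{bin}(i,j)} \, (\tilde{Q}[i,j] = \tilde{T}[i,j])$

Fast metric implementation (contd)


The term $\sum_{i,j:\ \tilde{Q}[i,j] \neq 0} w_{\mathrm{bin}(i,j)}$ does not depend on the target image. Ignore it for the purpose of ranking the different target images:

$\mathrm{score}(Q, T) = w_0 \, |Q[0,0] - T[0,0]| - \sum_{i,j:\ \tilde{Q}[i,j] \neq 0} w_{\mathrm{bin}(i,j)} \, (\tilde{Q}[i,j] = \tilde{T}[i,j])$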
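A minimal sketch that checks this argument numerically: counting mismatches and subtracting matches differ only by a constant that depends on the query alone, so both forms rank the targets identically. Signatures are assumed to be sparse dictionaries mapping (i, j) to +1 or -1, and the weight values are placeholders, not the ones determined in the paper.

```python
def bin_index(i, j):
    return min(max(i, j), 5)

def mismatch_metric(q_avg, t_avg, q_sig, t_sig, w):
    """Add a weight for every nonzero query coefficient the target does not share."""
    total = w[0] * abs(q_avg - t_avg)
    for (i, j), q in q_sig.items():
        if t_sig.get((i, j), 0) != q:
            total += w[bin_index(i, j)]
    return total

def match_score(q_avg, t_avg, q_sig, t_sig, w):
    """Subtract a weight for every nonzero query coefficient the target shares."""
    total = w[0] * abs(q_avg - t_avg)
    for (i, j), q in q_sig.items():
        if t_sig.get((i, j), 0) == q:
            total -= w[bin_index(i, j)]
    return total

w = [4.0, 2.0, 1.0, 0.5, 0.25, 0.125]     # placeholder weights (assumption)
q_sig = {(0, 1): 1, (2, 3): -1, (7, 7): 1}
targets = [(0.25, {(0, 1): 1, (2, 3): 1}), (0.5, {(7, 7): 1, (5, 5): -1})]
constant = sum(w[bin_index(i, j)] for (i, j) in q_sig)  # query-only term
for t_avg, t_sig in targets:
    diff = mismatch_metric(0.5, t_avg, q_sig, t_sig, w) - match_score(0.5, t_avg, q_sig, t_sig, w)
    print(diff == constant)  # True for every target
```

Lower scores indicate better matches in both forms; the second form only touches targets that actually share a coefficient with the query.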

Example

Algorithm
Preprocessing
(1) Perform a standard 2D Haar wavelet decomposition of every image in the database.
(2) Store T[0,0] for each color channel and the indices and signs of the m wavelet coefficients of largest magnitude.
(3) Organize the indices for all the images into a single data structure to optimize searching.

Algorithm (contd)
Querying
(1) Perform the same wavelet decomposition on the query image.
(2) Throw away all but the average color and the largest m coefficients.
(3) Compute the score of each target image using the above equation.

Data Organization: Search Arrays


To optimize the search process, the m coefficients from every image are organized into a set of six 2D arrays (i.e. search arrays). There is an array for every combination of sign (+ or -) and color channel (Y, I, and Q):

$D_c^+[i, j]$ contains a list of all images T having a large positive wavelet coefficient T[i, j] in color channel c.
$D_c^-[i, j]$ contains the corresponding list for large negative coefficients.
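A minimal sketch of building the search arrays as an inverted index. Instead of six literal 2D arrays, it assumes a single dictionary keyed by (channel, sign, i, j), which holds the same lists; the input layout and the name `build_search_arrays` are assumptions made for illustration.

```python
from collections import defaultdict

def build_search_arrays(signatures):
    """signatures: {image_id: {channel: {(i, j): +1 or -1}}}.

    Returns a mapping (channel, sign, i, j) -> list of image ids that have
    a retained coefficient of that sign at position (i, j) in that channel.
    """
    arrays = defaultdict(list)
    for image_id, channels in signatures.items():
        for channel, coeffs in channels.items():
            for (i, j), sign in coeffs.items():
                arrays[(channel, sign, i, j)].append(image_id)
    return arrays

signatures = {
    "img_a": {"Y": {(0, 1): 1, (1, 0): -1}},
    "img_b": {"Y": {(1, 0): -1, (1, 1): 1}},
}
arrays = build_search_arrays(signatures)
print(arrays[("Y", -1, 1, 0)])  # ['img_a', 'img_b']
```

Each retained coefficient of each image appears in exactly one list, so the index size grows with m times the number of images rather than with the image resolution.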

Querying Using Search Arrays


Compute a score for each target image by looping through each color channel c.

Return top 20 matches

Querying Using Search Arrays (contd)


Steps
(1) Compute the difference between the query's average intensity in that channel, Qc[0, 0], and those in the database.
(2) For each of the m nonzero, truncated wavelet coefficients Qc[i, j], go through the list corresponding to Dc+[i, j] or Dc-[i, j] (i.e., depending on the sign of Qc[i, j]).
(3) Update the score of each image found in those lists.
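The steps above can be sketched as follows. The sketch assumes the search arrays are a dictionary keyed by (channel, sign, i, j), that the averages are stored per image and channel, and that lower scores mean better matches; the weights and all names are placeholders for illustration rather than the paper's implementation.

```python
import heapq

def bin_index(i, j):
    return min(max(i, j), 5)

def run_query(q_avgs, q_coeffs, db_avgs, arrays, weights, k=20):
    """Return the ids of the k best-scoring images (lowest score first)."""
    scores = {}
    # Step 1: weighted difference of average intensities, per channel.
    for image_id, t_avgs in db_avgs.items():
        scores[image_id] = sum(weights[c][0] * abs(q_avgs[c] - t_avgs[c]) for c in q_avgs)
    # Steps 2 and 3: walk the list for each query coefficient and reward matches.
    for c, coeffs in q_coeffs.items():
        for (i, j), sign in coeffs.items():
            for image_id in arrays.get((c, sign, i, j), []):
                scores[image_id] -= weights[c][bin_index(i, j)]
    return heapq.nsmallest(k, scores, key=scores.get)

db_avgs = {"img_a": {"Y": 0.5}, "img_b": {"Y": 0.5}}
arrays = {("Y", 1, 0, 1): ["img_a"], ("Y", -1, 1, 0): ["img_a", "img_b"]}
weights = {"Y": [1.0] * 6}  # placeholder weights (assumption)
result = run_query({"Y": 0.5}, {"Y": {(0, 1): 1, (1, 0): -1}}, db_avgs, arrays, weights, k=2)
print(result)  # ['img_a', 'img_b']
```

Only images that appear in the visited lists have their scores updated, which is why the query cost depends mainly on m and on the list lengths.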

Weights wij
The function bin(i, j) groups different coefficients into a small number of bins (i.e., 6 bins per color channel):

bin(i, j) = min(max(i, j), 5)


Each bin is weighted by some constant w[b]
Weights were determined using a statistical test (see paper).
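A minimal sketch of how the bin function groups coefficient positions, using an 8 x 8 grid of positions chosen only to keep the example small. It counts how many positions fall into each of the six bins.

```python
from collections import Counter

def bin_index(i, j):
    """bin(i, j) = min(max(i, j), 5)."""
    return min(max(i, j), 5)

counts = Counter(bin_index(i, j) for i in range(8) for j in range(8))
print([counts[b] for b in range(6)])  # [1, 3, 5, 7, 9, 39]
```

Bin 0 contains only position (0, 0), the overall average, while bin 5 collects every position whose larger index is 5 or more, so all fine-detail coefficients share one weight.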

Examples
Query examples using painted/scanned queries

(ranks for database sizes: 1093 | 20,558)

Examples (contd)
Interactive query examples using painted queries:

(ranks for database sizes: 1093 | 20,558)

Some Results
Success rate:
Lq: proposed metric.
Percentage of queries whose correct target was ranked among the top 1% of images in a database of 1093 images.

Some Results (contd)


Time requirements:

Lq: proposed metric.
Average times to match a single query in a database of 1093/20,558 images.

Extension
V. Nikulin and G. Bebis, "Multiresolution Image Retrieval Through Fusion", SPIE Electronic Imaging (Storage and Retrieval Methods and Applications for Multimedia), San Jose, January 2004.
