
Tackling Curse of Dimensionality for Efficient Content Based Image Retrieval
Presented By: Dr. Minakshi Banerjee
RCC Institute of Information Technology
Canal South Road, Beliaghata, Kolkata - 700015, West Bengal, India

Wednesday, 01.07.2015

CBIR: What is it?

- Retrieval of images based not on keywords or annotations but on features extracted directly from the image data.
- To aid image retrieval, techniques from statistics, pattern recognition, signal processing, and computer vision are commonly deployed.
- The feature extraction process may produce a high-dimensional feature space.
- Although a high-dimensional feature space reduces the semantic gap, high-dimensional features are difficult to handle when classification and similarity-based image retrieval are involved.

Objectives of the paper

- Tackling the curse of dimensionality by a non-linear mapping, since most real-world data requires nonlinear methods to successfully analyse and discover patterns.
- Search space reduction by clustering, considering the optimum number of clusters.
- Outlier detection for performance improvement.
- A one-class support vector machine is proposed for classification, as this classifier is biased towards the learned concept of a particular category.

Proposed Method

Figure 1: Proposed method. The pipeline consists of four stages:

Database preprocessing: extract visual features (CSD) from the image database; map the high-dimensional feature space to a lower-dimensional space using kernel PCA; cluster the mapped features using PAM, taking the number of clusters from the optimum silhouette width plot.

Test set preparation: compute the query image's features in the mapped space; measure similarity using the L1 norm and display the 36 nearest images; the user interacts once to mark relevant and non-relevant images; accumulate test samples from the cluster the query image belongs to, removing outliers by SVC (reduced database).

Training set preparation: accumulate training samples from the KPCA-mapped space corresponding to all relevant images.

Classification: a one-class SVM automatically selects all relevant images; select the original CSD feature vectors corresponding to all positive samples; display the 36 nearest images using the L1 norm.
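
Below is a minimal sketch of this pipeline in Python, assuming the CSD features are already extracted into a matrix. The PAM/silhouette clustering and the SVC outlier removal stages, covered on later slides, are omitted for brevity; all names (X, query, relevant_idx) and parameter values are illustrative assumptions, not the paper's settings.

```python
# A minimal sketch of the pipeline (illustrative, not the paper's code).
import numpy as np
from sklearn.decomposition import KernelPCA
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X = rng.random((1000, 256))        # stand-in for the database's CSD features
query = rng.random((1, 256))       # stand-in for the query's CSD features

# 1. Map the high-dimensional feature space to a lower-dimensional one (KPCA).
kpca = KernelPCA(n_components=32, kernel="rbf", gamma=1e-2)
Z = kpca.fit_transform(X)
zq = kpca.transform(query)

# 2. First round: display the 36 nearest images under the L1 norm.
nn = NearestNeighbors(n_neighbors=36, metric="manhattan").fit(Z)
_, first_round = nn.kneighbors(zq)

# 3. One-time user interaction: indices the user marked as relevant.
relevant_idx = first_round[0][:10]  # placeholder for the user's marks

# 4. Train a one-class SVM on the relevant samples in the mapped space;
#    points it predicts as in-class are taken as relevant.
ocsvm = OneClassSVM(kernel="rbf", nu=0.1, gamma=1e-2).fit(Z[relevant_idx])
positives = np.where(ocsvm.predict(Z) == 1)[0]

# 5. Final display: 36 nearest images among the positives, ranked by the
#    L1 norm on the original CSD feature vectors.
nn2 = NearestNeighbors(n_neighbors=min(36, len(positives)),
                       metric="manhattan").fit(X[positives])
_, final = nn2.kneighbors(query)
print("retrieved image indices:", positives[final[0]])
```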

Proposed Method: Algorithm

Figure 2: Proposed method

Tackling the curse of dimensionality: What is this and Why?

- It produces a new feature space:
  - with a dimension significantly smaller,
  - which comprises a large part of the original information.
- This also allows us to de-noise the data.
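
As a concrete illustration of the de-noising claim (a sketch, not from the paper): projecting data onto a few kernel principal components and mapping back to input space suppresses much of the noise. scikit-learn's KernelPCA learns an approximate pre-image map when fit_inverse_transform=True; the data and parameter values here are made up.

```python
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(1)
clean = np.column_stack([np.linspace(0, 1, 200)] * 10)   # low-dim structure
noisy = clean + 0.05 * rng.standard_normal(clean.shape)  # additive noise

kpca = KernelPCA(n_components=2, kernel="rbf", gamma=0.5,
                 fit_inverse_transform=True, alpha=0.1)
denoised = kpca.inverse_transform(kpca.fit_transform(noisy))
print("mean abs. error before:", np.abs(noisy - clean).mean(),
      "after:", np.abs(denoised - clean).mean())
```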

Kernel Principal Component Analysis (KPCA)

- The basic idea is to first map the input space into a feature space via a nonlinear map Φ(x) and then compute the principal components in that feature space.

Definition

A reproducing kernel k is a function k : X × X → R.

- The domain of k consists of the data patterns {x1, x2, ..., xl}.
- X is a compact set in which the data live; X is typically a subset of R^N.
- Computing k is equivalent to mapping data patterns into a higher-dimensional space F and then taking the dot product there.
- A feature map Φ : R^N → F is a function that maps the input data patterns into the higher-dimensional space F.

Illustration

Using a feature map Φ to map the data from the input space into a higher-dimensional feature space F:

[Figure: data in the input space mapped by Φ into the feature space F]

Kernel Trick

We would like to compute the dot product in the higher-dimensional space, Φ(x) · Φ(y). To do this we only need to compute k(x, y), since

    k(x, y) = Φ(x) · Φ(y).

Note that the feature map Φ is never explicitly computed. We avoid this, and therefore avoid a burdensome computational task.
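
A quick numerical check of this identity (an illustration, not from the slides): for the homogeneous polynomial kernel k(x, y) = (x · y)² on R², an explicit feature map is Φ(x) = (x1², x2², √2 x1 x2), and both routes give the same number.

```python
import numpy as np

def phi(x):
    # explicit map into F = R^3 -- shown only to verify the identity
    return np.array([x[0] ** 2, x[1] ** 2, np.sqrt(2) * x[0] * x[1]])

def k(x, y):
    # the same quantity, computed without ever forming phi
    return np.dot(x, y) ** 2

x, y = np.array([1.0, 2.0]), np.array([3.0, 0.5])
print(np.dot(phi(x), phi(y)), k(x, y))   # both print 16.0
```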

Example kernels

- Gaussian: k(x, y) = exp(−‖x − y‖² / (2σ²))
- Polynomial: k(x, y) = (x · y + c)^d, c ≥ 0
- Sigmoid: k(x, y) = tanh(κ ⟨x, y⟩ + θ)

Nonlinear separation can be achieved.
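
Direct transcriptions of the three kernels follow (a sketch; the default parameter values for σ, c, d, κ, θ are arbitrary assumptions):

```python
import numpy as np

def gaussian(x, y, sigma=1.0):
    return np.exp(-np.linalg.norm(x - y) ** 2 / (2 * sigma ** 2))

def polynomial(x, y, c=1.0, d=3):
    return (np.dot(x, y) + c) ** d          # requires c >= 0

def sigmoid(x, y, kappa=1.0, theta=0.0):
    return np.tanh(kappa * np.dot(x, y) + theta)
```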

Nonlinear Separation

Mercer Theory

Input space to feature space. Necessary condition for the kernel (Mercer) trick:

    k(x, y) = Σ_{i=1}^{N_F} λ_i φ_i(x) φ_i(y)    and    A = Σ_i λ_i u_i u_iᵀ

- N_F is equal to the rank of u_i u_iᵀ, the outer product.
- φ_i is the normalized eigenfunction, analogous to a normalized eigenvector u_i.

Mercer :: Linear Algebra
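
The finite-dimensional analogue (an illustrative sketch): a Gram matrix built from a valid kernel is symmetric positive semi-definite, so its eigendecomposition K = Σ_i λ_i u_i u_iᵀ with λ_i ≥ 0 mirrors Mercer's expansion of k itself.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((20, 5))
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # squared distances
K = np.exp(-sq / 2.0)                                 # Gaussian Gram matrix

lam, U = np.linalg.eigh(K)        # eigenvalues (ascending) and eigenvectors
K_rebuilt = (U * lam) @ U.T       # equals sum_i lam_i u_i u_i^T
print(np.allclose(K, K_rebuilt), lam.min() >= -1e-10)  # True True
```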

Kernel Principal Component Analysis....

KPCA and Dot Products
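
A from-scratch sketch of KPCA on the training set (illustrative, not the paper's code), showing that only kernel evaluations are needed: the Gram matrix is centered in feature space and eigendecomposed, and the projections are read off from kernel values alone, without ever forming Φ.

```python
import numpy as np

def kpca_train(K, n_components):
    """K: uncentered (n, n) Gram matrix. Returns (n, n_components) scores."""
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # centering in feature space
    lam, A = np.linalg.eigh(Kc)
    lam, A = lam[::-1], A[:, ::-1]               # sort descending
    # normalize the expansion coefficients (assumes the leading lam_i > 0)
    A = A[:, :n_components] / np.sqrt(lam[:n_components])
    return Kc @ A                                # projections of the data
```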

From Feature Space to Input Space

Projection Distance Illustration

Minimizing Projection Distance

Fixed-point iteration
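
For the Gaussian kernel, minimizing the projection distance leads to the standard fixed-point update of Mika et al. for approximate pre-images. The sketch below assumes the expansion coefficients γ_i of the feature-space point Σ_i γ_i Φ(x_i) are given (e.g., from a KPCA projection); all names are illustrative.

```python
import numpy as np

def preimage(X, gamma, sigma=1.0, n_iter=100, tol=1e-8):
    z = X.mean(axis=0)                            # starting point
    for _ in range(n_iter):
        w = gamma * np.exp(-((X - z) ** 2).sum(axis=1) / (2 * sigma ** 2))
        if abs(w.sum()) < 1e-12:                  # degenerate denominator
            break
        z_new = (w[:, None] * X).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < tol:
            return z_new
        z = z_new
    return z
```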

One Class Support Vector Machine (OCSVM)

- OCSVM maps input data into a high-dimensional feature space using a kernel and iteratively finds the maximal-margin hyperplane which best separates the training data from the origin.
- The quadratic programming minimization problem is

    min_{w, ξ, ρ}  (1/2) ‖w‖² + (1/(νl)) Σ_i ξ_i − ρ
    subject to  (w · Φ(x_i)) ≥ ρ − ξ_i,  ξ_i ≥ 0.
One Class Support Vector Machine (OCSVM)....

Here, (w, ρ) are a weight vector and an offset parameterizing a hyperplane in the feature space associated with the kernel.

Parameter ν:
- it sets an upper bound on the fraction of outliers (training examples regarded as out-of-class), and
- it is a lower bound on the fraction of training examples used as support vectors.

Parameter ξ_i:
- To prevent the SVM classifier from over-fitting to noisy data (i.e., to create a soft margin), slack variables ξ_i are introduced to allow some data points to lie within the margin. Using Lagrange techniques and a kernel function for the dot-product calculations, the decision function becomes:

    f(x) = sgn(Σ_i α_i k(x_i, x) − ρ)
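
The two bounds on ν can be observed directly with scikit-learn's OneClassSVM (a sketch, not the paper's configuration): as ν grows, the fraction of training points treated as outliers stays below ν while the fraction of support vectors stays above it. The data here are synthetic.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(3)
X = rng.standard_normal((500, 16))

for nu in (0.05, 0.2, 0.5):
    clf = OneClassSVM(kernel="rbf", nu=nu, gamma="scale").fit(X)
    outlier_frac = (clf.predict(X) == -1).mean()
    sv_frac = len(clf.support_) / len(X)
    print(f"nu={nu:.2f}  outliers={outlier_frac:.2f}  SVs={sv_frac:.2f}")
```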

One Class Support Vector Machine (OCSVM)....

This method thus creates a hyperplane, characterized by w and ρ, which has maximal distance from the origin in the feature space F and separates all the data points from the origin.

Partitioning Around Medoids (PAM)

- A medoid is the most central object (the best representative) of its cluster.
- This allows using only the dissimilarities d(r, s) of all pairs (r, s) of objects.
- The aim is to find the clusters C_1, C_2, ..., C_k that minimize the target function

    Σ_{i=1}^{k} Σ_{r ∈ C_i} d(r, m_i),

  where for each i the medoid m_i minimizes Σ_{r ∈ C_i} d(r, m_i).

Partitioning Around Medoids (PAM): Algorithm

- Randomly select k objects m_1, m_2, ..., m_k as initial medoids.
- Until the maximum number of iterations is reached or no improvement of the target function has been found, do:
  - Calculate the clustering based on m_1, m_2, ..., m_k by associating each point with the nearest medoid, and calculate the value of the target function.
  - For all pairs (m_i, x_s), where x_s is a non-medoid point, try to improve the target function by taking x_s to be a new medoid point and m_i to be a non-medoid point.
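
A compact sketch of this swap heuristic follows (illustrative; production PAM implementations add a greedy BUILD initialization and incremental cost updates instead of recomputing the target function for every swap):

```python
import numpy as np

def pam(D, k, max_iter=100, seed=0):
    """D: (n, n) dissimilarity matrix. Returns (medoid indices, labels)."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    cost = D[:, medoids].min(axis=1).sum()        # value of the target function
    for _ in range(max_iter):
        improved = False
        for i in range(k):                        # try every (medoid, point) swap
            for s in range(n):
                if s in medoids:
                    continue
                trial = medoids.copy()
                trial[i] = s
                trial_cost = D[:, trial].min(axis=1).sum()
                if trial_cost < cost:
                    medoids, cost, improved = trial, trial_cost, True
        if not improved:                          # no improving swap found
            break
    return medoids, D[:, medoids].argmin(axis=1)
```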

Number of clusters selection using PAM

- p(r): the average dissimilarity of the object r to the objects of the same cluster.
- q(r): the average dissimilarity of the object r to the objects of the neighboring cluster.
- Silhouette of the object r, the measure of how well r is clustered:

    silw(r) = (q(r) − p(r)) / max(p(r), q(r)) ∈ [−1, 1]

When silw(r) is:
- close to 1: the object r is well clustered;
- close to 0: the object r is at the boundary of clusters;
- less than 0: the object r is probably placed in a wrong cluster.

Number of clusters selection using PAM: Example

The following figure shows that, for each number of clusters k (k = 1, 2, 3, ..., 25), the silhouette width of every point is computed and averaged. Finally, the number of clusters which gives the maximum average silhouette width is selected. It is evident from the figure that the number of clusters is 3 when considering up to 25 clusters.

[Figure: average silhouette width versus number of clusters, with the maximum at k = 3]
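
A sketch of this selection procedure, reusing the pam() function from the previous slide together with scikit-learn's silhouette_score (which averages silw(r) over all points). The synthetic blobs are illustrative; note that the silhouette is defined only for at least two clusters, so the scan starts at k = 2.

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(4)
centers = np.array([[0.0, 0.0], [8.0, 0.0], [0.0, 8.0]])
X = np.vstack([rng.standard_normal((50, 2)) + c for c in centers])
D = squareform(pdist(X))                 # pairwise dissimilarities

scores = {}
for k in range(2, 26):
    _, labels = pam(D, k)                # pam() as defined earlier
    scores[k] = silhouette_score(X, labels)
best_k = max(scores, key=scores.get)
print("selected number of clusters:", best_k)   # expected: 3
```
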
Outlier detection criteria of Support Vector Clustering (SVC)

Let {x_i} be a dataset with dimensionality d. SVC computes a sphere of radius R and center a containing all these data. The computation of such a smallest sphere is obtained by solving a minimization problem under a Lagrangian formulation, which produces the following expression:

    ‖x − a‖² = (x · x) − 2 Σ_{i=1}^{N} β_i (x · x_i) + Σ_{i=1}^{N} Σ_{j=1}^{N} β_i β_j (x_i · x_j) ≥ R²,

where the β_i are the Lagrange multipliers. To test a data point x for outlierness, the necessary condition is β_i > 0.
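
A direct transcription of the expression above (illustrative): the slide writes plain dot products, and the same formula holds with any kernel in place of the dot product. The multipliers beta and the radius R are assumed to come from a solved SVC problem, which is not shown here.

```python
import numpy as np

def sphere_dist2(x, X, beta):
    """||x - a||^2 with centre a = sum_i beta_i x_i, via dot products only."""
    return (x @ x
            - 2.0 * beta @ (X @ x)
            + beta @ (X @ X.T) @ beta)

def is_outlier(x, X, beta, R):
    # a point outside the learned sphere is flagged as an outlier
    return sphere_dist2(x, X, beta) > R ** 2
```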

Outlier detection criteria of Support Vector Clustering (SVC): Example
