Support Vector Machines
Martin Law
Outline
History of support vector machines (SVM)
Two classes, linearly separable
08/11/05
History of SVM
SVM is a classifier derived from statistical learning theory by Vapnik and Chervonenkis
SVM was first introduced in COLT-92
SVM became famous when, using pixel maps as input, it gave accuracy comparable to sophisticated neural networks with elaborate features in a handwriting recognition task
Currently, SVM is closely related to:
Many decision boundaries can separate these two classes. Which one should we choose?
[Figure: points of Class 1 and Class 2 with several candidate decision boundaries]
[Figures: examples of decision boundaries separating Class 1 and Class 2]
[Figure: decision boundary with margin m between the two classes]
CSE 802. Prepared by Martin
Law
w can be recovered by $w = \sum_{i} \alpha_i y_i x_i$
Compute $f(z) = \sum_{i} \alpha_i y_i\, x_i^{\top} z + b$ and classify z as class 1 if the sum is positive, and class 2 otherwise
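The classification rule above can be sketched directly from the dual representation. This is a minimal illustration; the function and variable names are assumptions, not from the slides.

```python
# SVM decision rule in dual form:
# f(z) = sum_i alpha_i * y_i * <x_i, z> + b,
# classify z as class 1 if f(z) > 0, class 2 otherwise.

def classify(z, support_vectors, alphas, labels, b):
    """Classify point z given training points x_i, multipliers alpha_i,
    labels y_i in {+1, -1}, and bias b."""
    s = b
    for x, a, y in zip(support_vectors, alphas, labels):
        s += a * y * sum(xi * zi for xi, zi in zip(x, z))  # alpha_i * y_i * <x_i, z>
    return 1 if s > 0 else 2
```

Only points with nonzero $\alpha_i$ (the support vectors) contribute to the sum, so in practice the loop runs over the support vectors alone.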
A Geometrical Interpretation
[Figure: support vectors have nonzero multipliers (α1 = 0.8, α6 = 1.4, α8 = 0.6); the remaining points of Class 1 and Class 2 have αi = 0 (α2, α3, α4, α5, α7, α9, α10)]
Some Notes
We want to minimize $\frac{1}{2}\|w\|^2 + C \sum_{i} \xi_i$
w is also recovered as $w = \sum_{i} \alpha_i y_i x_i$
The only difference from the linearly separable case is that there is an upper bound C on $\alpha_i$
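The upper bound C mentioned here is the box constraint in the dual problem. For reference, the standard soft-margin dual (a textbook formulation, not reproduced from the slide) is:

```latex
\max_{\alpha}\; \sum_{i=1}^{n} \alpha_i
  - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}
    \alpha_i \alpha_j\, y_i y_j\, x_i^{\top} x_j
\quad \text{s.t.}\quad
0 \le \alpha_i \le C, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0 .
```

Setting $C \to \infty$ recovers the linearly separable (hard-margin) dual.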
Why transform?
A linear operation in the feature space is equivalent to a non-linear operation in the input space
The classification task can be easier with a proper transformation. Example: XOR
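The XOR example can be made concrete. XOR is not linearly separable in the input space $(x_1, x_2)$, but it becomes linearly separable after a transformation; the feature map and weights below are illustrative choices, not from the slides.

```python
# XOR becomes linearly separable after phi(x) = (x1, x2, x1*x2):
# the function f = x1 + x2 - 2*x1*x2 is linear in feature space and
# equals 1 exactly on the XOR-true points, 0 on the others.

def phi(x1, x2):
    return (x1, x2, x1 * x2)

def linear_in_feature_space(x1, x2):
    w = (1, 1, -2)  # weights of a linear classifier in feature space
    f = sum(wi * fi for wi, fi in zip(w, phi(x1, x2)))
    return 1 if f > 0.5 else 0

# The linear classifier in feature space reproduces XOR exactly:
for x1 in (0, 1):
    for x2 in (0, 1):
        assert linear_in_feature_space(x1, x2) == (x1 ^ x2)
```

No line in the original 2-D plane can separate {(0,0), (1,1)} from {(0,1), (1,0)}, which is why the extra coordinate $x_1 x_2$ is needed.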
[Figure: a feature map φ(·) carrying input-space points into feature space]
Example Transformation
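A classic example of such a transformation (shown here as an illustration; it may differ from the slide's own choice) is the degree-2 polynomial feature map for 2-D inputs, whose inner product collapses to a simple kernel:

```latex
\phi(x) = \bigl(1,\ \sqrt{2}\,x_1,\ \sqrt{2}\,x_2,\ x_1^2,\ x_2^2,\ \sqrt{2}\,x_1 x_2\bigr),
\qquad
\langle \phi(x), \phi(y) \rangle = \bigl(1 + x^{\top} y\bigr)^2 .
```

Expanding the square on the right reproduces the six products on the left term by term, which is what makes the kernel trick on the next slide possible.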
Kernel Trick
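A minimal numeric sketch of the kernel trick, assuming the degree-2 polynomial kernel: evaluating $K(x, y) = (1 + x^{\top}y)^2$ in the 2-D input space gives the same value as an explicit inner product in a 6-dimensional feature space, so the map never has to be computed. The feature map below is an illustrative choice.

```python
import math

# Degree-2 polynomial kernel on 2-D inputs.
def kernel(x, y):
    return (1 + x[0] * y[0] + x[1] * y[1]) ** 2

# Explicit 6-D feature map whose inner product equals the kernel.
def phi(x):
    x1, x2 = x
    r2 = math.sqrt(2)
    return (1, r2 * x1, r2 * x2, x1 * x1, x2 * x2, r2 * x1 * x2)

x, y = (1.0, 2.0), (3.0, -1.0)
lhs = kernel(x, y)                                   # cheap: 2-D arithmetic
rhs = sum(a * b for a, b in zip(phi(x), phi(y)))     # expensive: 6-D inner product
assert abs(lhs - rhs) < 1e-9
```

In the QP formulation, every occurrence of $x_i^{\top} x_j$ can therefore be replaced by $K(x_i, x_j)$, implicitly training in the higher-dimensional space at the input-space cost.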
[Equations: the original formulation, and the same formulation with the kernel function replacing the inner products]
Example
[Figure: value of the discriminant function over the input space, with regions assigned to class 1 and class 2]
Multi-class Classification
SVM is fundamentally a two-class classifier
One can change the QP formulation to allow multi-class classification
More commonly, the data set is divided into two parts in several different ways, and a separate SVM is trained for each way of dividing it
Multi-class classification is then done by combining the outputs of all the SVM classifiers
Majority rule
Error-correcting codes
Directed acyclic graph
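The majority-rule combination can be sketched with the one-vs-one scheme: train one binary classifier per pair of classes, then let each vote. As a stand-in for an actual SVM trainer, the sketch below uses a trivial nearest-mean binary classifier; all names are illustrative.

```python
from itertools import combinations

def train_binary(xs_a, xs_b):
    """Stand-in binary classifier: return a function that answers 'a' or 'b'
    by distance to each class mean (a real system would train an SVM here)."""
    mean = lambda xs: tuple(sum(c) / len(xs) for c in zip(*xs))
    ma, mb = mean(xs_a), mean(xs_b)
    dist = lambda p, q: sum((pi - qi) ** 2 for pi, qi in zip(p, q))
    return lambda z: 'a' if dist(z, ma) <= dist(z, mb) else 'b'

def train_one_vs_one(data):
    """data: {label: [points]} -> one binary model per pair of labels."""
    return {(la, lb): train_binary(data[la], data[lb])
            for la, lb in combinations(sorted(data), 2)}

def predict(models, z):
    """Majority rule: each pairwise classifier votes for one label."""
    votes = {}
    for (la, lb), m in models.items():
        winner = la if m(z) == 'a' else lb
        votes[winner] = votes.get(winner, 0) + 1
    return max(votes, key=votes.get)
```

With k classes this trains k(k-1)/2 classifiers; the one-vs-rest alternative trains only k, dividing the data as "this class vs. everything else" instead.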
Software
A list of SVM implementations can be found at http://www.kernelmachines.org/software.html
Some implementations (such as LIBSVM) can handle multi-class classification
SVMLight is among the earliest implementations of SVM
Several Matlab toolboxes for SVM are also available
Demonstration
Strengths
Weaknesses
[Figures: penalty plotted against the value off target]
Conclusion
SVM is a useful alternative to neural networks
Two key concepts of SVM: maximize the margin and the kernel trick
Much active research is taking place in areas related to SVM
Many SVM implementations are available on the web for you to try on your data set!
Resources
http://www.kernel-machines.org/
http://www.support-vector.net/
http://www.support-vector.net/icml-tutorial.pdf
http://www.kernel-machines.org/papers/tutorial-nips.ps.gz
http://www.clopinet.com/isabelle/Projects/SVM/applist.html