Juho Rousu
9. September, 2015
Course content
Topics (approximately):
- Introduction to kernel methods
- Supervised learning with kernels
- Support vector machines
- Ranking and preference learning
- Unsupervised learning with kernels
- Kernels for structured data
- Learning with multiple kernels and targets, structured output
Course logistics
Exercises
Course material
Questions
Kernel methods
- Algorithms are designed to work with arbitrary inner products (kernels) between inputs.
- The same algorithm works with any inner product (kernel).
- This allows the theoretical properties of a learning algorithm to be investigated once, with the results carrying over to all application domains.
- The kernel depends on the application domain; prior information is encoded into the kernel.
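To make the "same algorithm, any kernel" point concrete, here is a minimal Python sketch. The names `linear_kernel`, `quadratic_kernel` and `gram_matrix`, as well as the sample points, are illustrative assumptions and do not come from the slides.

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]
Kernel = Callable[[Vector, Vector], float]


def linear_kernel(x: Vector, z: Vector) -> float:
    """Standard inner product x^T z."""
    return sum(xi * zi for xi, zi in zip(x, z))


def quadratic_kernel(x: Vector, z: Vector) -> float:
    """Homogeneous degree-2 polynomial kernel (x^T z)^2 (illustrative choice)."""
    return linear_kernel(x, z) ** 2


def gram_matrix(kernel: Kernel, points: List[Vector]) -> List[List[float]]:
    """Kernel matrix with entries K[i][j] = kernel(points[i], points[j]).

    The routine only calls the kernel and never inspects the inputs
    directly, so it runs unchanged for any kernel that is passed in.
    """
    return [[kernel(x, z) for z in points] for x in points]


# Hypothetical sample inputs.
points = [(1.0, 0.0), (0.0, 2.0), (1.0, 1.0)]
K_lin = gram_matrix(linear_kernel, points)
K_quad = gram_matrix(quadratic_kernel, points)
```

Swapping the kernel changes the entries of the matrix, while `gram_matrix` itself stays the same; that is the division of labour between algorithm and kernel described in the list above.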
What is a kernel?
- Formally, a kernel function is an inner product (scalar product, dot product), denoted by $\langle \cdot, \cdot \rangle$.
- If $\phi(x) = (\phi_1(x), \phi_2(x), \dots, \phi_D(x))^T$ is a vector in feature space $F$, the standard inner product in $F$ is called the linear kernel:

\[
k(x, z) = \langle \phi(x), \phi(z) \rangle = \phi(x)^T \phi(z) = \sum_{j} \phi_j(x) \phi_j(z)
\]
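The definition can be checked numerically. The sketch below assumes a hypothetical feature map for two-dimensional inputs, $\phi(x) = (x_1^2, \sqrt{2} x_1 x_2, x_2^2)$, chosen only for illustration; for this map the inner product in feature space equals $(x^T z)^2$, so the kernel can be evaluated without forming $\phi$ at all.

```python
import math
from typing import List, Sequence


def phi(x: Sequence[float]) -> List[float]:
    """Hypothetical feature map for 2-D inputs (an assumption, not from the slides)."""
    x1, x2 = x
    return [x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2]


def kernel_via_features(x: Sequence[float], z: Sequence[float]) -> float:
    """k(x, z) = phi(x)^T phi(z) = sum_j phi_j(x) * phi_j(z)."""
    return sum(a * b for a, b in zip(phi(x), phi(z)))


def kernel_direct(x: Sequence[float], z: Sequence[float]) -> float:
    """The same value computed in input space as (x^T z)^2."""
    return sum(a * b for a, b in zip(x, z)) ** 2


x, z = (1.0, 2.0), (3.0, 1.0)
k_feat = kernel_via_features(x, z)
k_dir = kernel_direct(x, z)
```

Both routes give the same number up to floating-point rounding, which is why `math.isclose` rather than `==` is the appropriate comparison.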
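\[
\begin{aligned}
\tfrac{1}{2} d(x, x')^2 &= \tfrac{1}{2} \|\phi(x) - \phi(x')\|^2 \\
&= \tfrac{1}{2} (\phi(x) - \phi(x'))^T (\phi(x) - \phi(x')) \\
&= \tfrac{1}{2} \left( \|\phi(x)\|^2 - 2 \phi(x)^T \phi(x') + \|\phi(x')\|^2 \right) \\
&= 1 - \phi(x)^T \phi(x') = 1 - k(x, x')
\end{aligned}
\]

The last step assumes unit-norm feature vectors, $\|\phi(x)\| = \|\phi(x')\| = 1$.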
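The identity between half the squared distance and $1 - k(x, x')$ can be verified with a few lines of Python. This is a sketch under the assumption that the feature vectors are scaled to unit length, which the final step of the derivation requires; the helper names and the sample vectors are hypothetical.

```python
import math
from typing import List, Sequence


def normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Standard inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical feature vectors, normalised so that ||phi|| = 1.
phi_x = normalize([3.0, 4.0])
phi_z = normalize([1.0, 0.0])

# Left-hand side: half of the squared Euclidean distance in feature space.
half_sq_dist = 0.5 * sum((a - b) ** 2 for a, b in zip(phi_x, phi_z))

# Right-hand side: one minus the linear kernel value.
one_minus_k = 1.0 - dot(phi_x, phi_z)
```

For unit-norm features a large kernel value therefore means a small distance, so the kernel can be read as a similarity measure.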
Hilbert space*
Formally, the underlying space of a kernel is required to be a Hilbert space.
A Hilbert space is a real vector space $H$ with the following additional properties:
- Equipped with an inner product, a map $\langle \cdot, \cdot \rangle$, which satisfies for all objects $x, x', z \in H$:
  - linear: $\langle ax + bx', z \rangle = a \langle x, z \rangle + b \langle x', z \rangle$
  - symmetric: $\langle x, x' \rangle = \langle x', x \rangle$
  - positive definite: $\langle x, x \rangle \ge 0$, and $\langle x, x \rangle = 0$ if and only if $x = 0$
- Complete: every Cauchy sequence $\{h_n\}_{n \ge 1}$ of elements in $H$ converges to an element of $H$.
- Separable: there is a countable set of elements $\{h_1, h_2, \dots\}$ in $H$ such that for any $h \in H$ and every $\epsilon > 0$ there is an $h_i$ with $\|h_i - h\| < \epsilon$.
In this course, typically $H = \mathbb{R}^D$, where the dimension $D$ is finite or infinite. Both cases are Hilbert spaces.
\[
= \Big\langle \sum_{i=1}^{N} v_i \phi(x_i), \sum_{j=1}^{N} v_j \phi(x_j) \Big\rangle = \Big\| \sum_{i=1}^{N} v_i \phi(x_i) \Big\|^2 \ge 0
\]
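The left-hand side of this chain is not preserved in the text. Assuming it is the usual quadratic form $v^T K v = \sum_{i,j} v_i v_j k(x_i, x_j)$ with kernel matrix entries $K_{ij} = \phi(x_i)^T \phi(x_j)$, the following Python sketch checks numerically that the quadratic form equals the squared norm and is therefore non-negative. The identity feature map, the points and the weights are hypothetical choices for illustration.

```python
import math
from typing import List, Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Standard inner product u^T v."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical data: feature vectors phi(x_i) and arbitrary real weights v_i.
features: List[Sequence[float]] = [(1.0, 0.0), (0.0, 2.0), (1.0, 1.0)]
v = [1.0, -1.0, 0.5]
n = len(features)

# Assumed left-hand side: sum_i sum_j v_i v_j k(x_i, x_j).
quad_form = sum(
    v[i] * v[j] * dot(features[i], features[j])
    for i in range(n)
    for j in range(n)
)

# Right-hand side: squared norm of the weighted sum of feature vectors.
weighted_sum = [
    sum(v[i] * features[i][d] for i in range(n))
    for d in range(len(features[0]))
]
sq_norm = dot(weighted_sum, weighted_sum)
```

Because the right-hand side is a squared norm, the quadratic form cannot be negative for any choice of weights, which is the positive semi-definiteness property of kernel matrices.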
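\[
A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
C = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \quad
E = \begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}
\]
\[
B = \begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}, \quad
D = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
F = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}
\]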
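The question attached to these matrices is not preserved in the text. Assuming they are meant as candidates to be tested for being valid kernel matrices (symmetric and positive semi-definite), a minimal Python sketch of such a test for the 2x2 case follows. The function name `is_psd_2x2` is hypothetical, and the entries are taken as they appear in the text above.

```python
from typing import Dict, List

Matrix = List[List[float]]


def is_psd_2x2(m: Matrix) -> bool:
    """True if the 2x2 matrix is symmetric and positive semi-definite.

    For a symmetric 2x2 matrix, positive semi-definiteness is equivalent
    to non-negative diagonal entries and a non-negative determinant.
    """
    (a, b), (c, d) = m
    if b != c:  # not symmetric, so not a valid kernel matrix
        return False
    return a >= 0 and d >= 0 and a * d - b * c >= 0


candidates: Dict[str, Matrix] = {
    "A": [[1, 0], [0, 1]],
    "B": [[1, 2], [1, 1]],
    "C": [[1, 1], [1, -1]],
    "D": [[1, 0], [0, 1]],
    "E": [[1, -1], [-1, 1]],
    "F": [[0, 1], [1, 1]],
}

verdicts = {name: is_psd_2x2(m) for name, m in candidates.items()}
```

Under this reading, A, D and E pass the test; B fails because it is not symmetric, and C and F fail because each has a negative eigenvalue (C has a negative diagonal entry, F a negative determinant).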
MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAA
KSELDKAIGRNTNGVITKDEAEKLFNQDVDAAVRGILRNAKLK
PVYDSLDAVRRAALINMVFQMGETGVAGFTNSLRMLQQKRWDE
AAVNLAKSRWYNQTPNRAKRVITTFRTGTWDAYKNL
Operations on kernels
Polynomial kernel