
INTRODUCTION TO MACHINE LEARNING, 3RD EDITION

ETHEM ALPAYDIN

The MIT Press, 2014

alpaydin@boun.edu.tr

http://www.cmpe.boun.edu.tr/~ethem/i2ml3e

CHAPTER 13: KERNEL MACHINES

Kernel Machines

Discriminant-based: no need to estimate densities first

Define the discriminant in terms of support vectors

The use of kernel functions, application-specific measures of similarity

No need to represent instances as vectors

Convex optimization problems with a unique solution

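Optimal Separating Hyperplane

$$\mathcal{X} = \{\mathbf{x}^t, r^t\}_t \quad \text{where} \quad r^t = \begin{cases} +1 & \text{if } \mathbf{x}^t \in C_1 \\ -1 & \text{if } \mathbf{x}^t \in C_2 \end{cases}$$

find $\mathbf{w}$ and $w_0$ such that

$$\mathbf{w}^T \mathbf{x}^t + w_0 \ge +1 \quad \text{for } r^t = +1$$

$$\mathbf{w}^T \mathbf{x}^t + w_0 \le -1 \quad \text{for } r^t = -1$$

which can be rewritten as

$$r^t(\mathbf{w}^T \mathbf{x}^t + w_0) \ge +1$$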
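As a minimal sketch of the combined constraint $r^t(\mathbf{w}^T\mathbf{x}^t + w_0) \ge +1$, the check below tests a candidate hyperplane against a small data set. The data, the weights, and the function name `satisfies_margin_constraints` are hypothetical and chosen only for illustration.

```python
import numpy as np

def satisfies_margin_constraints(X, r, w, w0):
    """Return True if r^t (w^T x^t + w0) >= 1 holds for every instance."""
    scores = X @ w + w0  # g(x^t) = w^T x^t + w0, one value per row of X
    return bool(np.all(r * scores >= 1.0))

# Hypothetical toy data: two instances per class in two dimensions.
X = np.array([[2.0, 2.0], [1.0, 1.0], [-1.0, -1.0], [-2.0, -1.0]])
r = np.array([+1, +1, -1, -1])
w = np.array([0.5, 0.5])
w0 = 0.0

ok_full = satisfies_margin_constraints(X, r, w, w0)          # constraints hold
ok_shrunk = satisfies_margin_constraints(X, r, 0.1 * w, w0)  # same direction, too small a norm
print(ok_full, ok_shrunk)
```

Scaling $\mathbf{w}$ down keeps the same separating line but violates the constraint, which is why the constraint pins down the scale of $\mathbf{w}$.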

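Margin

Distance from the discriminant to the closest instances on either side

Distance of $\mathbf{x}^t$ to the hyperplane is

$$\frac{|\mathbf{w}^T \mathbf{x}^t + w_0|}{\|\mathbf{w}\|}$$

We require

$$\frac{r^t(\mathbf{w}^T \mathbf{x}^t + w_0)}{\|\mathbf{w}\|} \ge \rho, \quad \forall t$$

For a unique solution, fix $\rho\|\mathbf{w}\| = 1$; maximizing the margin then becomes

$$\min \frac{1}{2}\|\mathbf{w}\|^2 \quad \text{subject to} \quad r^t(\mathbf{w}^T \mathbf{x}^t + w_0) \ge +1, \quad \forall t$$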
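The distance formula $|\mathbf{w}^T\mathbf{x}^t + w_0| / \|\mathbf{w}\|$ can be sketched numerically. The example below assumes hypothetical data and a hyperplane that already satisfies the constraints with equality on the closest instances, so the smallest distance should match $1/\|\mathbf{w}\|$.

```python
import numpy as np

def distances_to_hyperplane(X, w, w0):
    """Distance of each row of X to the hyperplane w^T x + w0 = 0."""
    return np.abs(X @ w + w0) / np.linalg.norm(w)

# Hypothetical toy data and hyperplane (not taken from the slides).
X = np.array([[2.0, 2.0], [1.0, 1.0], [-1.0, -1.0], [-2.0, -1.0]])
w = np.array([0.5, 0.5])
w0 = 0.0

d = distances_to_hyperplane(X, w, w0)
margin = 1.0 / np.linalg.norm(w)  # distance of the closest instances when r^t g(x^t) = 1
print(d.min(), margin)
```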

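Margin

$$\min \frac{1}{2}\|\mathbf{w}\|^2 \quad \text{subject to} \quad r^t(\mathbf{w}^T \mathbf{x}^t + w_0) \ge +1, \quad \forall t$$

$$L_p = \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{t=1}^{N} \alpha^t \left[ r^t(\mathbf{w}^T \mathbf{x}^t + w_0) - 1 \right]$$

$$= \frac{1}{2}\|\mathbf{w}\|^2 - \sum_{t=1}^{N} \alpha^t r^t(\mathbf{w}^T \mathbf{x}^t + w_0) + \sum_{t=1}^{N} \alpha^t$$

$$\frac{\partial L_p}{\partial \mathbf{w}} = 0 \Rightarrow \mathbf{w} = \sum_{t=1}^{N} \alpha^t r^t \mathbf{x}^t$$

$$\frac{\partial L_p}{\partial w_0} = 0 \Rightarrow \sum_{t=1}^{N} \alpha^t r^t = 0$$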

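$$L_d = \frac{1}{2}\mathbf{w}^T\mathbf{w} - \mathbf{w}^T \sum_t \alpha^t r^t \mathbf{x}^t - w_0 \sum_t \alpha^t r^t + \sum_t \alpha^t$$

$$= -\frac{1}{2}\mathbf{w}^T\mathbf{w} + \sum_t \alpha^t$$

$$= -\frac{1}{2}\sum_t \sum_s \alpha^t \alpha^s r^t r^s (\mathbf{x}^t)^T \mathbf{x}^s + \sum_t \alpha^t$$

$$\text{subject to} \quad \sum_t \alpha^t r^t = 0 \quad \text{and} \quad \alpha^t \ge 0, \quad \forall t$$

Most $\alpha^t$ are 0 and only a small number have $\alpha^t > 0$; they are the support vectors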
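To make the dual concrete, here is a small sketch on a hypothetical two-instance problem. With one instance per class, the constraint $\sum_t \alpha^t r^t = 0$ forces both multipliers to be equal, so a one-dimensional grid scan stands in for a real quadratic programming solver (an assumption made for brevity, not how the dual is solved in practice).

```python
import numpy as np

# Hypothetical toy problem: one instance per class.
X = np.array([[1.0, 1.0], [-1.0, -1.0]])
r = np.array([+1.0, -1.0])

def dual_objective(alpha, X, r):
    """L_d = sum_t alpha^t - 1/2 sum_t sum_s alpha^t alpha^s r^t r^s (x^t)^T x^s."""
    Z = r[:, None] * X  # row t holds r^t x^t
    G = Z @ Z.T         # G[t, s] = r^t r^s (x^t)^T x^s
    return alpha.sum() - 0.5 * alpha @ G @ alpha

# sum_t alpha^t r^t = 0 gives alpha^1 = alpha^2 = a, so scan over a >= 0.
grid = np.linspace(0.0, 1.0, 1001)
values = [dual_objective(np.array([a, a]), X, r) for a in grid]
a_best = grid[int(np.argmax(values))]
alpha = np.array([a_best, a_best])

w = (alpha * r) @ X    # w = sum_t alpha^t r^t x^t
w0 = r[0] - w @ X[0]   # from r^t (w^T x^t + w0) = 1 on a support vector
print(alpha, w, w0)
```

Both instances end up with $\alpha^t > 0$, so both are support vectors, and the recovered $\mathbf{w}$ satisfies the primal constraints with equality.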

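Soft Margin Hyperplane

When the classes are not linearly separable, introduce slack variables $\xi^t \ge 0$:

$$r^t(\mathbf{w}^T \mathbf{x}^t + w_0) \ge 1 - \xi^t$$

Soft error

$$\sum_t \xi^t$$

New primal is

$$L_p = \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_t \xi^t - \sum_t \alpha^t \left[ r^t(\mathbf{w}^T \mathbf{x}^t + w_0) - 1 + \xi^t \right] - \sum_t \mu^t \xi^t$$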

Hinge Loss

$$L_{hinge}(y^t, r^t) = \begin{cases} 0 & \text{if } y^t r^t \ge 1 \\ 1 - y^t r^t & \text{otherwise} \end{cases}$$
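The two cases of the hinge loss collapse into a single expression, $\max(0, 1 - y^t r^t)$. A minimal sketch, with hypothetical outputs `y` and labels `r`:

```python
import numpy as np

def hinge_loss(y, r):
    """Hinge loss per instance: 0 if y*r >= 1, else 1 - y*r."""
    return np.maximum(0.0, 1.0 - y * r)

# Hypothetical discriminant outputs y^t and labels r^t.
y = np.array([2.0, 0.5, -0.3])
r = np.array([1.0, 1.0, 1.0])

losses = hinge_loss(y, r)
print(losses)
```

The first instance lies outside the margin and costs nothing; the second is correctly classified but inside the margin; the third is misclassified and costs more than 1.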

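ν-SVM

$$\min \frac{1}{2}\|\mathbf{w}\|^2 - \nu\rho + \frac{1}{N}\sum_t \xi^t$$

subject to

$$r^t(\mathbf{w}^T \mathbf{x}^t + w_0) \ge \rho - \xi^t, \quad \xi^t \ge 0, \quad \rho \ge 0$$

$$L_d = -\frac{1}{2}\sum_{t=1}^{N}\sum_s \alpha^t \alpha^s r^t r^s (\mathbf{x}^t)^T \mathbf{x}^s$$

subject to

$$\sum_t \alpha^t r^t = 0, \quad 0 \le \alpha^t \le \frac{1}{N}, \quad \sum_t \alpha^t \ge \nu$$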

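Kernel Trick

Preprocess input $\mathbf{x}$ by basis functions

$$\mathbf{z} = \boldsymbol{\varphi}(\mathbf{x}) \qquad g(\mathbf{z}) = \mathbf{w}^T\mathbf{z}$$

$$g(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\varphi}(\mathbf{x})$$

The SVM solution

$$\mathbf{w} = \sum_t \alpha^t r^t \mathbf{z}^t = \sum_t \alpha^t r^t \boldsymbol{\varphi}(\mathbf{x}^t)$$

$$g(\mathbf{x}) = \mathbf{w}^T \boldsymbol{\varphi}(\mathbf{x}) = \sum_t \alpha^t r^t \boldsymbol{\varphi}(\mathbf{x}^t)^T \boldsymbol{\varphi}(\mathbf{x})$$

$$g(\mathbf{x}) = \sum_t \alpha^t r^t K(\mathbf{x}^t, \mathbf{x})$$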
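The kernelized discriminant $g(\mathbf{x}) = \sum_t \alpha^t r^t K(\mathbf{x}^t, \mathbf{x})$ never forms $\mathbf{w}$ explicitly. The sketch below assumes the support vectors, labels, and multipliers are already known (here a hypothetical two-instance solution) and takes the kernel as any callable; the bias term is omitted, as in the formula.

```python
import numpy as np

def discriminant(x, X_sv, r_sv, alpha_sv, kernel):
    """g(x) = sum_t alpha^t r^t K(x^t, x), summed over support vectors only."""
    return sum(a * rt * kernel(xt, x) for a, rt, xt in zip(alpha_sv, r_sv, X_sv))

def linear_kernel(a, b):
    return float(a @ b)

# Hypothetical support vectors with their labels and multipliers.
X_sv = np.array([[1.0, 1.0], [-1.0, -1.0]])
r_sv = np.array([+1.0, -1.0])
alpha_sv = np.array([0.25, 0.25])

g_pos = discriminant(np.array([1.0, 1.0]), X_sv, r_sv, alpha_sv, linear_kernel)
g_neg = discriminant(np.array([-1.0, -1.0]), X_sv, r_sv, alpha_sv, linear_kernel)
print(g_pos, g_neg)
```

Swapping `linear_kernel` for another kernel function changes the decision surface without touching `discriminant`.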

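Vectorial Kernels

Polynomials of degree q:

$$K(\mathbf{x}^t, \mathbf{x}) = (\mathbf{x}^T \mathbf{x}^t + 1)^q$$

For example, with q = 2 and two-dimensional inputs:

$$K(\mathbf{x}, \mathbf{y}) = (\mathbf{x}^T \mathbf{y} + 1)^2$$

$$= (x_1 y_1 + x_2 y_2 + 1)^2$$

$$= 1 + 2x_1 y_1 + 2x_2 y_2 + 2x_1 x_2 y_1 y_2 + x_1^2 y_1^2 + x_2^2 y_2^2$$

$$\boldsymbol{\varphi}(\mathbf{x}) = \left[1, \sqrt{2}x_1, \sqrt{2}x_2, \sqrt{2}x_1 x_2, x_1^2, x_2^2\right]^T$$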
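The identity $K(\mathbf{x}, \mathbf{y}) = \boldsymbol{\varphi}(\mathbf{x})^T \boldsymbol{\varphi}(\mathbf{y})$ for the degree-2 polynomial kernel can be checked numerically. The function names and the test points below are illustrative choices.

```python
import numpy as np

def poly_kernel(x, y, q=2):
    """Polynomial kernel (x^T y + 1)^q."""
    return (x @ y + 1.0) ** q

def phi(x):
    """Explicit basis functions for q = 2 and two-dimensional input."""
    x1, x2 = x
    s = np.sqrt(2.0)
    return np.array([1.0, s * x1, s * x2, s * x1 * x2, x1 ** 2, x2 ** 2])

# Hypothetical test points.
x = np.array([1.0, 2.0])
y = np.array([3.0, -1.0])

k_implicit = poly_kernel(x, y)  # computed in the 2-D input space
k_explicit = phi(x) @ phi(y)    # computed in the 6-D feature space
print(k_implicit, k_explicit)
```

The kernel gives the same value with a single two-dimensional dot product instead of a six-dimensional one, which is the saving the kernel trick relies on.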

Vectorial Kernels

Radial-basis functions:

$$K(\mathbf{x}^t, \mathbf{x}) = \exp\left[-\frac{\|\mathbf{x}^t - \mathbf{x}\|^2}{2s^2}\right]$$
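A minimal sketch of the radial-basis kernel computed for all pairs of rows at once. The function name `rbf_kernel_matrix` and the sample points are assumptions for illustration; `s` is the spread from the formula.

```python
import numpy as np

def rbf_kernel_matrix(A, B, s=1.0):
    """K[i, j] = exp(-||A[i] - B[j]||^2 / (2 s^2))."""
    sq = (np.sum(A ** 2, axis=1)[:, None]
          + np.sum(B ** 2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    sq = np.maximum(sq, 0.0)  # guard against tiny negative values from round-off
    return np.exp(-sq / (2.0 * s ** 2))

# Hypothetical sample points.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]])
K = rbf_kernel_matrix(X, X, s=1.0)
print(K)
```

Every instance has similarity 1 with itself, and the similarity decays with squared distance at a rate set by `s`.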

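Defining kernels

Kernel "engineering"

Defining good measures of similarity

String kernels, graph kernels, image kernels, ...

Empirical kernel map: Define a set of templates $\mathbf{m}_i$ and score function $s(\mathbf{x}, \mathbf{m}_i)$

$$\boldsymbol{\varphi}(\mathbf{x}^t) = [s(\mathbf{x}^t, \mathbf{m}_1), s(\mathbf{x}^t, \mathbf{m}_2), \ldots, s(\mathbf{x}^t, \mathbf{m}_M)]$$

and

$$K(\mathbf{x}, \mathbf{x}^t) = \boldsymbol{\varphi}(\mathbf{x})^T \boldsymbol{\varphi}(\mathbf{x}^t)$$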
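The empirical kernel map works even when instances are not vectors. The sketch below uses strings as instances and a deliberately crude, hypothetical score function (the number of distinct characters shared with a template); the templates are likewise made up for illustration.

```python
import numpy as np

def shared_characters(x, m):
    """Hypothetical score s(x, m): count of distinct characters x and m share."""
    return len(set(x) & set(m))

def empirical_kernel_map(x, templates, score):
    """phi(x) = [s(x, m_1), ..., s(x, m_M)]."""
    return np.array([score(x, m) for m in templates], dtype=float)

def empirical_kernel(x, y, templates, score):
    """K(x, y) = phi(x)^T phi(y)."""
    return float(empirical_kernel_map(x, templates, score)
                 @ empirical_kernel_map(y, templates, score))

templates = ["kern", "line", "map"]  # hypothetical templates m_i
phi_a = empirical_kernel_map("kernel", templates, shared_characters)
phi_b = empirical_kernel_map("linear", templates, shared_characters)
k_ab = empirical_kernel("kernel", "linear", templates, shared_characters)
print(phi_a, phi_b, k_ab)
```

Because the kernel is defined as an inner product of the mapped vectors, it is a valid kernel whatever score function is used.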

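Multiple Kernel Learning

Fixed kernel combination

$$K(\mathbf{x}, \mathbf{y}) = K_1(\mathbf{x}, \mathbf{y}) + K_2(\mathbf{x}, \mathbf{y})$$

$$K(\mathbf{x}, \mathbf{y}) = K_1(\mathbf{x}, \mathbf{y}) \, K_2(\mathbf{x}, \mathbf{y})$$

Adaptive kernel combination

$$K(\mathbf{x}, \mathbf{y}) = \sum_{i=1}^{m} \eta_i K_i(\mathbf{x}, \mathbf{y})$$

$$L_d = \sum_t \alpha^t - \frac{1}{2}\sum_t \sum_s \alpha^t \alpha^s r^t r^s \sum_i \eta_i K_i(\mathbf{x}^t, \mathbf{x}^s)$$

$$g(\mathbf{x}) = \sum_t \alpha^t r^t \sum_i \eta_i K_i(\mathbf{x}^t, \mathbf{x})$$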
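A weighted sum $\sum_i \eta_i K_i$ with nonnegative weights is again a valid kernel. The sketch below combines a linear and a radial-basis kernel matrix with fixed, hypothetical weights `eta` (in multiple kernel learning the weights would be learned, which is not shown) and checks that the result stays positive semidefinite.

```python
import numpy as np

def linear_kernel_matrix(X):
    return X @ X.T

def rbf_kernel_matrix(X, s=1.0):
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    return np.exp(-sq / (2.0 * s ** 2))

def combine_kernels(kernel_matrices, eta):
    """K = sum_i eta_i K_i, with eta_i >= 0 so K remains a valid kernel."""
    eta = np.asarray(eta, dtype=float)
    if np.any(eta < 0):
        raise ValueError("kernel weights must be nonnegative")
    return sum(e * K for e, K in zip(eta, kernel_matrices))

# Hypothetical data and weights.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [1.5, 1.5]])
K1, K2 = linear_kernel_matrix(X), rbf_kernel_matrix(X)
K = combine_kernels([K1, K2], eta=[0.3, 0.7])

min_eig = np.linalg.eigvalsh(K).min()  # should be >= 0 up to round-off
print(min_eig)
```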

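Multiclass Kernel Machines

1-vs-all

Pairwise separation

Error-Correcting Output Codes (section 17.5)

Single multiclass optimization

$$\min \frac{1}{2}\sum_{i=1}^{K} \|\mathbf{w}_i\|^2 + C\sum_i \sum_t \xi_i^t$$

subject to

$$\mathbf{w}_{z^t}^T \mathbf{x}^t + w_{z^t 0} \ge \mathbf{w}_i^T \mathbf{x}^t + w_{i0} + 2 - \xi_i^t, \quad \forall i \ne z^t, \quad \xi_i^t \ge 0$$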
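In the single multiclass formulation, the slack $\xi_i^t$ measures by how much class $i$ fails to score at least 2 below the correct class $z^t$. A sketch that computes these slacks for one instance, with hypothetical weights `W` (one row per class) and biases `w0`:

```python
import numpy as np

def multiclass_slacks(x, z, W, w0):
    """xi_i = max(0, (w_i^T x + w_i0) + 2 - (w_z^T x + w_z0)) for i != z, 0 for i = z."""
    scores = W @ x + w0
    slacks = np.maximum(0.0, scores + 2.0 - scores[z])
    slacks[z] = 0.0  # the constraint is only imposed for i != z
    return slacks

# Hypothetical three-class linear model in two dimensions.
W = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
w0 = np.zeros(3)

slack_far = multiclass_slacks(np.array([3.0, 0.0]), 0, W, w0)   # well inside class 0
slack_near = multiclass_slacks(np.array([1.0, 0.5]), 0, W, w0)  # too close to class 1
print(slack_far, slack_near)
```

The second instance is classified correctly, yet it still incurs slack because class 1 scores within 2 of the correct class.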

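SVM for Regression

Use a linear model (possibly kernelized)

$$f(\mathbf{x}) = \mathbf{w}^T\mathbf{x} + w_0$$

Use the ε-sensitive error function

$$e_\epsilon(r^t, f(\mathbf{x}^t)) = \begin{cases} 0 & \text{if } |r^t - f(\mathbf{x}^t)| < \epsilon \\ |r^t - f(\mathbf{x}^t)| - \epsilon & \text{otherwise} \end{cases}$$

$$\min \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_t (\xi_+^t + \xi_-^t)$$

subject to

$$r^t - (\mathbf{w}^T\mathbf{x}^t + w_0) \le \epsilon + \xi_+^t$$

$$(\mathbf{w}^T\mathbf{x}^t + w_0) - r^t \le \epsilon + \xi_-^t$$

$$\xi_+^t, \xi_-^t \ge 0$$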
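The ε-sensitive error ignores deviations smaller than ε and grows linearly beyond that. A minimal sketch with hypothetical targets and predictions:

```python
import numpy as np

def epsilon_sensitive_error(r, f, eps):
    """0 inside the eps-tube around the target, |r - f| - eps outside it."""
    return np.maximum(0.0, np.abs(r - f) - eps)

# Hypothetical targets r^t and predictions f(x^t).
r = np.array([1.0, 2.0, 3.0])
f = np.array([1.05, 2.5, 2.0])

errors = epsilon_sensitive_error(r, f, eps=0.1)
print(errors)
```

The first prediction falls inside the tube and costs nothing. Unlike squared error, the penalty on the other two grows linearly, which makes the fit more tolerant of outliers.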

Kernel Regression

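Kernel Machines for Ranking

We require not only that the scores be in the correct order but that they be separated by at least a +1 unit margin.

Linear case:

$$\min \frac{1}{2}\|\mathbf{w}\|^2 + C\sum_t \xi^t$$

subject to

$$\mathbf{w}^T\mathbf{x}^u \ge \mathbf{w}^T\mathbf{x}^v + 1 - \xi^t, \quad \forall t: r^u \prec r^v, \quad \xi^t \ge 0$$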
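For each preference pair, the slack is the amount by which the preferred instance fails to outscore the other by a full unit. The sketch below assumes that row $u$ of each pair is the one that should score higher, and uses a hypothetical weight vector and pairs.

```python
import numpy as np

def ranking_slacks(X_u, X_v, w):
    """xi = max(0, 1 - (w^T x^u - w^T x^v)) for each pair, where u should outrank v."""
    return np.maximum(0.0, 1.0 - (X_u @ w - X_v @ w))

# Hypothetical linear scorer and two preference pairs (row i of X_u vs row i of X_v).
w = np.array([1.0, 0.0])
X_u = np.array([[2.0, 0.0], [1.0, 0.0]])
X_v = np.array([[0.5, 0.0], [0.5, 0.0]])

slacks = ranking_slacks(X_u, X_v, w)
print(slacks)
```

The second pair is ordered correctly but separated by only 0.5, so it still pays a slack of 0.5.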

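One-Class Kernel Machines

Consider a sphere with center $\mathbf{a}$ and radius $R$

$$\min R^2 + C\sum_t \xi^t$$

subject to

$$\|\mathbf{x}^t - \mathbf{a}\|^2 \le R^2 + \xi^t, \quad \xi^t \ge 0$$

$$L_d = \sum_{t=1}^{N} \alpha^t (\mathbf{x}^t)^T \mathbf{x}^t - \sum_{t=1}^{N}\sum_s \alpha^t \alpha^s (\mathbf{x}^t)^T \mathbf{x}^s$$

subject to

$$0 \le \alpha^t \le C, \quad \sum_t \alpha^t = 1$$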
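For a given center and radius, the primal objective trades the size of the sphere against the slack of instances left outside it. A sketch that evaluates both, with hypothetical data, center `a`, radius `R`, and penalty `C` (none of them optimized here):

```python
import numpy as np

def sphere_slacks(X, a, R):
    """xi^t = max(0, ||x^t - a||^2 - R^2): zero for instances inside the sphere."""
    sq_dist = np.sum((X - a) ** 2, axis=1)
    return np.maximum(0.0, sq_dist - R ** 2)

def one_class_objective(X, a, R, C):
    """Primal objective R^2 + C * sum_t xi^t."""
    return R ** 2 + C * sphere_slacks(X, a, R).sum()

# Hypothetical data with one outlying instance.
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [3.0, 3.0]])
a = np.array([0.0, 0.0])

slacks = sphere_slacks(X, a, R=1.5)
objective = one_class_objective(X, a, R=1.5, C=0.1)
print(slacks, objective)
```

Only the last instance lies outside the sphere, so it alone contributes slack and would be flagged as an outlier.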

Large Margin Nearest Neighbor

$$D(\mathbf{x}^i, \mathbf{x}^j) = (\mathbf{x}^i - \mathbf{x}^j)^T \mathbf{M} (\mathbf{x}^i - \mathbf{x}^j)$$

For three instances i, j, and l, where i and j are of the same class and l different, we require

$$D(\mathbf{x}^i, \mathbf{x}^l) > D(\mathbf{x}^i, \mathbf{x}^j) + 1$$

and if this is not satisfied, we have a slack for the difference and we learn $\mathbf{M}$ to minimize the sum of such slacks over all i, j, l triples (j and l being one of k neighbors of i, over all i)
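The slack for one (i, j, l) triple can be sketched directly from the requirement $D(\mathbf{x}^i, \mathbf{x}^l) > D(\mathbf{x}^i, \mathbf{x}^j) + 1$. The matrix `M` below is fixed to the identity purely for illustration; learning `M` and the neighbor search are not shown.

```python
import numpy as np

def metric_distance(x, y, M):
    """D(x, y) = (x - y)^T M (x - y)."""
    d = x - y
    return float(d @ M @ d)

def triple_slack(xi, xj, xl, M):
    """Slack when the impostor l is not at least 1 farther from i than the target j."""
    return max(0.0, metric_distance(xi, xj, M) + 1.0 - metric_distance(xi, xl, M))

M = np.eye(2)  # hypothetical metric; identity gives squared Euclidean distance
xi = np.array([0.0, 0.0])
xj = np.array([1.0, 0.0])  # same class as xi

slack_far = triple_slack(xi, xj, np.array([3.0, 0.0]), M)   # impostor far away
slack_near = triple_slack(xi, xj, np.array([1.0, 0.5]), M)  # impostor too close
print(slack_far, slack_near)
```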

Learning a Distance Measure

A similar approach factors the matrix as $\mathbf{M} = \mathbf{L}^T\mathbf{L}$ and learns $\mathbf{L}$
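With $\mathbf{M} = \mathbf{L}^T\mathbf{L}$, the learned distance equals the squared Euclidean distance after projecting with $\mathbf{L}$, so a rectangular $\mathbf{L}$ also reduces dimensionality. A small numerical check, with a hypothetical 1×2 matrix `L`:

```python
import numpy as np

# Hypothetical projection from two dimensions down to one.
L = np.array([[1.0, 2.0]])
M = L.T @ L  # positive semidefinite by construction

x = np.array([2.0, 1.0])
y = np.array([1.0, 2.0])

d_metric = float((x - y) @ M @ (x - y))           # (x - y)^T M (x - y)
d_projected = float(np.sum((L @ x - L @ y) ** 2))  # ||Lx - Ly||^2
print(d_metric, d_projected)
```

Parameterizing by `L` guarantees a valid distance without having to constrain `M` to be positive semidefinite during learning.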

Kernel Dimensionality Reduction

Kernel PCA does PCA on the kernel matrix (equal to canonical PCA with a linear kernel)

Kernel LDA, CCA
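The equivalence with canonical PCA under a linear kernel can be checked numerically. The sketch below centers the kernel matrix, takes its leading eigenvectors, and compares the resulting projections with those from ordinary PCA; the data and the function name `kernel_pca` are illustrative, and signs are compared in absolute value because eigenvectors are defined only up to sign.

```python
import numpy as np

def kernel_pca(K, n_components):
    """Project the training instances using the centered kernel matrix K."""
    N = K.shape[0]
    one = np.full((N, N), 1.0 / N)
    Kc = K - one @ K - K @ one + one @ K @ one  # center in feature space
    vals, vecs = np.linalg.eigh(Kc)              # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]  # keep the largest ones
    vals, vecs = vals[idx], vecs[:, idx]
    return vecs * np.sqrt(np.maximum(vals, 0.0))

# Hypothetical data.
X = np.array([[2.0, 0.5], [0.0, 1.0], [-2.0, -0.3], [0.5, -1.0], [1.0, 1.5]])

Z_kernel = kernel_pca(X @ X.T, n_components=2)  # linear kernel K = X X^T

Xc = X - X.mean(axis=0)                         # canonical PCA via SVD
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z_pca = Xc @ Vt[:2].T

print(np.abs(Z_kernel))
print(np.abs(Z_pca))
```

Replacing `X @ X.T` with a nonlinear kernel matrix gives a nonlinear projection with no other change to the code.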
