
Classification:

A machine learning perspective


Emily Fox & Carlos Guestrin
Machine Learning Specialization
University of Washington
2015-2016 Emily Fox & Carlos Guestrin Machine Learning Specialization
Part of a specialization



This course is a part of the
Machine Learning Specialization

1. Foundations
2. Regression
3. Classification
4. Clustering & Retrieval
5. Recommender Systems
6. Capstone



What is the course about?



What is classification?
From features to predictions

Data -> ML Method -> Classifier -> Intelligence

Input x: features derived from data
Learn x -> y relationship
Predict y: categorical output, class or label
Sentiment classifier
Input x: "Easily best sushi in Seattle."
Sentence Sentiment Classifier
Output y: Sentiment



Classifier

Input x: Sentence from review
Classifier MODEL
Output y: Predicted class



Example multiclass classifier
Output y has more than 2 categories

Input x: Webpage
Output y: Education, Finance, or Technology
Spam filtering
Input x: Text of email, sender, IP, ...
Output y: Not spam or Spam
Image classification

Input x: Image pixels
Output y: Predicted object
Personalized medical diagnosis
Input: x
Disease Classifier MODEL
Output: y (Healthy, Cold, Flu, or Pneumonia)



Reading your mind
Inputs x are brain region intensities
Output y: Hammer or House
Impact of classification






Course overview



Course philosophy: Always use case studies &

- Core concept
- Visual
- Algorithm
- Practical
- Implement
- Advanced topics (OPTIONAL)
Overview of content

Models               Algorithms           Core ML
Linear classifiers   Gradient             Alleviating overfitting
Logistic regression  Stochastic gradient  Handling missing data
Decision trees       Recursive greedy     Precision-recall
Ensembles            Boosting             Online learning



Course outline



Overview of modules

Models                                Algorithms                      Core ML
Linear classifiers (Module 1)         Gradient (Modules 2 & 3)        Alleviating overfitting (Modules 3 & 5)
Logistic regression (Modules 1, 2, 3) Stochastic gradient (Module 9)  Handling missing data (Module 6)
Decision trees (Modules 4 & 5)        Recursive greedy (Module 4)     Precision-recall (Module 8)
Ensembles (Module 7)                  Boosting (Module 8)             Online learning (Module 9)



Module 1: Linear classifiers
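Word      Coefficient
#awesome  1.0
#awful    -1.5

Score(x) = 1.0 #awesome - 1.5 #awful

[Figure: decision boundary in the #awesome (horizontal axis, 0 to 4) vs. #awful (vertical axis) plane, separating the Score(x) > 0 region from the Score(x) < 0 region]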
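
The scoring rule on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the course's implementation: the function names, the zero intercept, the tokenizer, and the choice to predict -1 when the score is exactly 0 are all assumptions; only the two coefficients come from the slide.

```python
import re

# Coefficients from the slide; the zero intercept is an assumption.
COEFFICIENTS = {"awesome": 1.0, "awful": -1.5}


def score(sentence, coefficients=COEFFICIENTS, intercept=0.0):
    """Return the weighted count of the known words in the sentence."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return intercept + sum(
        weight * words.count(word) for word, weight in coefficients.items()
    )


def predict(sentence):
    """Predict +1 when Score(x) > 0 and -1 otherwise (ties go to -1 here)."""
    return 1 if score(sentence) > 0 else -1


# Two "awesome" and one "awful": Score(x) = 2 * 1.0 - 1 * 1.5 = 0.5
print(predict("Awesome sushi, awesome service, awful parking."))
```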
Module 1: Logistic regression represents probabilities

$$P(y = +1 \mid x, w) = \frac{1}{1 + e^{-w^{T} h(x)}}$$
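
As a rough sketch of the formula above, the score w^T h(x) is passed through the logistic function to obtain a probability. The feature layout h(x) = [1, #awesome, #awful] and the value w0 = 0 are assumptions made for illustration; the function name is hypothetical.

```python
import math


def probability_positive(w, h):
    """Return P(y = +1 | x, w) = 1 / (1 + exp(-w^T h(x)))."""
    score = sum(w_j * h_j for w_j, h_j in zip(w, h))
    return 1.0 / (1.0 + math.exp(-score))


# Assumed layout: h(x) = [1, #awesome, #awful], with w0 = 0 as an assumption.
w = [0.0, 1.0, -1.5]
h = [1.0, 2.0, 1.0]  # two "awesome", one "awful" -> score 0.5
print(round(probability_positive(w, h), 3))  # 0.622
```

A score of 0 maps to probability 0.5, so the decision boundary Score(x) = 0 is exactly where the classifier is undecided.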



Module 2: Learning best classifier
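Maximize likelihood over all possible w0, w1, w2

Candidate coefficients and their likelihoods:
- $\ell(w_0=0, w_1=1, w_2=-1.5) = 10^{-6}$
- $\ell(w_0=1, w_1=1, w_2=-1.5) = 10^{-5}$
- $\ell(w_0=1, w_1=0.5, w_2=-1.5) = 10^{-4}$

Best model with gradient ascent: highest likelihood $\ell(w)$ at w = (w0=1, w1=0.5, w2=-1.5)

[Figure: candidate decision boundaries in the #awesome vs. #awful plane, both axes from 0 to 4]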
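
One way to picture "maximize likelihood with gradient ascent" is the sketch below, which works with the log of the likelihood for numerical convenience. The six data points, the step size, and the iteration count are made up for illustration, and the function names are hypothetical.

```python
import math

# Hypothetical data: (h(x) = [1, #awesome, #awful], label y in {+1, -1}).
DATA = [
    ([1, 3, 0], +1), ([1, 2, 1], +1), ([1, 1, 2], +1),
    ([1, 0, 2], -1), ([1, 1, 3], -1), ([1, 2, 1], -1),
]


def prob_positive(w, h):
    """P(y = +1 | x, w) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(w, h))))


def log_likelihood(w, data=DATA):
    """Sum of log-probabilities assigned to the observed labels."""
    total = 0.0
    for h, y in data:
        p = prob_positive(w, h)
        total += math.log(p if y == +1 else 1.0 - p)
    return total


def gradient_ascent(data=DATA, step_size=0.05, iterations=500):
    """Repeatedly step uphill along the gradient of the log-likelihood."""
    w = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        # Partial derivative j: sum_i h_j(x_i) * (1[y_i = +1] - P(y = +1 | x_i, w))
        errors = [(1.0 if y == +1 else 0.0) - prob_positive(w, h) for h, y in data]
        gradient = [
            sum(e * h[j] for e, (h, _) in zip(errors, data)) for j in range(len(w))
        ]
        w = [w_j + step_size * g_j for w_j, g_j in zip(w, gradient)]
    return w


w_hat = gradient_ascent()
print(log_likelihood([0.0, 0.0, 0.0]), log_likelihood(w_hat))
```

The second printed value should be larger than the first, since each small step moves toward coefficients that explain the labels better.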
Module 3: Overfitting & regularization
[Figure: classification error vs. model complexity, showing the true error and training error curves]

Use regularization penalty to mitigate overfitting:

$$\ell(w) \;\rightarrow\; \ell(w) - \lambda \|w\|_2^2$$
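
The penalized objective can be sketched as two small helpers. The names are hypothetical, the example log-likelihood value of -2.0 is invented, and penalizing every coefficient (including any intercept) is a simplifying assumption.

```python
def l2_penalized_quality(log_likelihood, w, l2_penalty):
    """Return l(w) - lambda * ||w||_2^2 for a given log-likelihood value."""
    return log_likelihood - l2_penalty * sum(w_j ** 2 for w_j in w)


def l2_penalized_gradient(gradient, w, l2_penalty):
    """Adjust each partial derivative: d/dw_j of -lambda * w_j^2 is -2 * lambda * w_j."""
    return [g_j - 2.0 * l2_penalty * w_j for g_j, w_j in zip(gradient, w)]


# Hypothetical numbers: l(w) = -2.0, lambda = 0.1, ||w||^2 = 1 + 0.25 + 2.25 = 3.5
print(l2_penalized_quality(-2.0, [1.0, 0.5, -1.5], 0.1))
```

Larger values of the penalty pull the coefficients toward zero, trading some fit on the training data for a simpler model.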
Module 4: Decision trees
Start
Credit?
  excellent: Safe
  fair: Term?
    3 years: Risky
    5 years: Safe
  poor: Income?
    high: Term?
      3 years: Risky
      5 years: Safe
    Low: Risky
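
Making a prediction with a tree like this amounts to following one path from the root to a leaf. The sketch below hard-codes one reading of the slide's tree; the function name and the string encodings of the feature values are assumptions.

```python
def predict_loan(credit, income, term):
    """Classify a loan as "Safe" or "Risky" by walking the tree from the root."""
    if credit == "excellent":
        return "Safe"
    if credit == "fair":
        return "Risky" if term == "3 years" else "Safe"
    # Remaining branch: credit == "poor", so split on income next.
    if income == "high":
        return "Risky" if term == "3 years" else "Safe"
    return "Risky"


print(predict_loan("poor", "high", "5 years"))  # Safe
```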



Module 5: Overfitting in decision trees
[Figure: decision boundaries of increasing complexity. Decision Tree panels: Depth 1, Depth 3, Depth 10. Logistic Regression panels: Degree 1 features, Degree 2 features, Degree 6 features]



Module 5: Alleviate overfitting by learning simpler trees
Occam's Razor: "Among competing hypotheses, the one with fewest assumptions should be selected." William of Occam, 13th Century

Complex Tree -> Simplify -> Simpler Tree



Module 6: Handling missing data
Credit     Term   Income  y
excellent  3 yrs  high    safe
fair       ?      low     risky
fair       3 yrs  high    safe
poor       5 yrs  high    risky
excellent  3 yrs  low     risky
fair       5 yrs  high    safe
poor       ?      high    risky
poor       5 yrs  low     safe
fair       ?      high    safe

Start
Credit?
  excellent: Safe
  fair or unknown: Term?
    3 years: Risky
    5 years or unknown: Safe
  poor: Income?
    high: Term?
      3 years or unknown: Risky
      5 years: Safe
    Low or unknown: Risky
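
One way to read the tree on this slide is that every split designates a branch that missing values follow. The sketch below encodes that reading, using None for an unknown value; which branch receives the unknown values at each node is an interpretation of the slide, and the function name and value encodings are assumptions.

```python
def predict_loan(credit=None, income=None, term=None):
    """Classify a loan, sending missing values (None) down a designated branch."""
    if credit == "excellent":
        return "Safe"
    if credit == "fair" or credit is None:  # "fair or unknown"
        # "5 years or unknown" leads to Safe at this node.
        return "Risky" if term == "3 years" else "Safe"
    # Remaining branch: credit == "poor".
    if income == "high":
        # "3 years or unknown" leads to Risky at this node.
        return "Safe" if term == "5 years" else "Risky"
    return "Risky"  # "Low or unknown"


print(predict_loan(credit="fair", income="low", term=None))  # Safe
```

With this scheme a prediction is always available, even when an input row has "?" entries like those in the table.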



Module 7: Boosting question
"Can a set of weak learners be combined to create a stronger learner?" Kearns and Valiant (1988)

Yes! Schapire (1990)

Boosting

Amazing impact:
- simple approach
- widely used in industry
- wins most Kaggle competitions
Module 7: Boosting using AdaBoost
Classifier  Question            Branches                 Output
f1          Income>$100K?       Yes: Safe, No: Risky     f1(xi) = +1
f2          Credit history?     Bad: Risky, Good: Safe   f2(xi) = -1
f3          Savings>$100K?      Yes: Safe, No: Risky     f3(xi) = -1
f4          Market conditions?  Bad: Risky, Good: Safe   f4(xi) = +1

Ensemble: Combine votes from many simple classifiers to learn complex classifiers
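
Combining the votes can be sketched as a weighted sum followed by a sign. The four votes below are the outputs shown on the slide; the weights are hypothetical (in AdaBoost they would be learned from data), and breaking a tie toward -1 is an arbitrary choice made here.

```python
def ensemble_predict(votes, weights):
    """Return the sign of the weighted sum of the individual classifier votes."""
    total = sum(w * f for w, f in zip(weights, votes))
    return 1 if total > 0 else -1


votes = [+1, -1, -1, +1]          # f1(xi), f2(xi), f3(xi), f4(xi) from the slide
weights = [2.0, 1.5, 1.5, 0.5]    # hypothetical weights, not learned here
print(ensemble_predict(votes, weights))  # 2.0 - 1.5 - 1.5 + 0.5 = -0.5 -> -1
```

Changing the weights can flip the outcome, which is why learning good weights is the core of the boosting algorithm.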



Module 8: Precision-recall
Goal: increase # guests by 30%

Need an automated, authentic marketing campaign:
- Reviews
- Great quotes
- Spokespeople

"Easily best sushi in Seattle."

Accuracy not most important metric

PRECISION: Did I (mistakenly) show a negative sentence?
RECALL: Did I not show a (great) positive sentence?
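
The two questions map onto the standard definitions: precision is the fraction of shown (predicted positive) sentences that are truly positive, and recall is the fraction of truly positive sentences that get shown. A minimal sketch, with hypothetical labels and a hypothetical function name:

```python
def precision_recall(y_true, y_pred):
    """Compute precision and recall for labels in {+1, -1}."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == -1 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == -1)
    # Return 0.0 when a denominator is empty to avoid dividing by zero.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


# Hypothetical labels: 2 true positives, 1 false positive, 2 false negatives.
y_true = [1, 1, 1, 1, -1, -1]
y_pred = [1, 1, -1, -1, 1, -1]
print(precision_recall(y_true, y_pred))
```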
Module 9: Scaling to huge datasets & online learning

4.8B webpages, 500M Tweets/day, 5B views/day

Stochastic gradient: tiny modification to gradient, a lot faster, but annoying in practice

[Figure: avg. log likelihood plot with a curve labeled "Gradient"; an arrow labeled "Better" marks the direction of improvement]
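
The "tiny modification" can be sketched as follows: rather than summing the gradient over the whole dataset before each step, the coefficients are updated after every single data point. The data, step size, number of passes, seed, and function names below are illustrative assumptions, not values from the course.

```python
import math
import random

# Hypothetical data: (h(x) = [1, #awesome, #awful], label y in {+1, -1}).
DATA = [
    ([1, 3, 0], +1), ([1, 2, 1], +1), ([1, 1, 2], +1),
    ([1, 0, 2], -1), ([1, 1, 3], -1), ([1, 2, 1], -1),
]


def prob_positive(w, h):
    """P(y = +1 | x, w) under the logistic model."""
    return 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(w, h))))


def avg_log_likelihood(w, data=DATA):
    """Average log-probability assigned to the observed labels."""
    total = 0.0
    for h, y in data:
        p = prob_positive(w, h)
        total += math.log(p if y == +1 else 1.0 - p)
    return total / len(data)


def stochastic_gradient_ascent(data=DATA, step_size=0.05, passes=100, seed=0):
    """Update w using the gradient contribution of one data point at a time."""
    rng = random.Random(seed)
    w = [0.0, 0.0, 0.0]
    for _ in range(passes):
        order = list(data)
        rng.shuffle(order)  # visit the points in a random order on each pass
        for h, y in order:
            error = (1.0 if y == +1 else 0.0) - prob_positive(w, h)
            w = [w_j + step_size * error * h_j for w_j, h_j in zip(w, h)]
    return w


w_sgd = stochastic_gradient_ascent()
print(avg_log_likelihood([0.0, 0.0, 0.0]), avg_log_likelihood(w_sgd))
```

Each update is cheap because it touches one point, but the path is noisy, so the step size and the number of passes usually need tuning.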



Assumed background



Courses 1 & 2 in this ML Specialization
Course 1: Foundations
- Overview of ML case studies
- Black-box view of ML tasks
- Programming & data manipulation skills

Course 2: Regression
- Data representation (input, output, features)
- Linear regression model
- Basic ML concepts:
  - ML algorithm
  - Gradient descent
  - Overfitting
  - Validation set and cross-validation
  - Bias-variance tradeoff
  - Regularization



Math background
Basic calculus
- Concept of derivatives
Basic vectors
Basic functions
- Exponentiation e^x
- Logarithm



Programming experience
Basic Python used
- Can pick up along the way if you know another programming language



Reliance on GraphLab Create
SFrames will be used, though not required
- open source project of Dato (creators of GraphLab Create)
- can use pandas and numpy instead
Assignments will:
1. Use GraphLab Create to explore high-level concepts
2. Ask you to implement all algorithms without GraphLab Create
Net result:
- learn how to code methods in Python
Computing needs
Basic 64-bit desktop or laptop
Access to internet
Ability to:
- Install and run Python (and GraphLab Create)
- Store a few GB of data



Let's get started!
