Spring 2010
Rong Jin
CSE847 Machine Learning
Instructor: Rong Jin
Office Hour:
Tuesday 4:00pm-5:00pm
Thursday 4:00pm-5:00pm
Textbook
Machine Learning
The Elements of Statistical Learning
Pattern Recognition and Machine Learning
Many subjects are from papers
Web site: http://www.cse.msu.edu/~cse847
Requirements
~10 homework assignments
Course project
Topic: visual object recognition
Data: over one million images with extracted
visual features
Objective: build a classifier that automatically
identifies the class of objects in images
Midterm exam & final exam
Goal
Familiarize you with the state of the art in
Machine Learning
Breadth: many different techniques
Depth: Project
Hands-on experience
Develop a machine learning way of thinking
Learn how to model real-world problems with
machine learning techniques
Learn how to deal with practical issues
Course Outline
Today’s Topics
Why machine learning?
Example: learning to play backgammon
General issues in machine learning
Why Machine Learning?
Past: most computer programs are
written by hand
Future: computers should be able to program
themselves by interacting with their
environment
Recent Trends
Recent progress in algorithms and theory
Growing flood of online data
Computational power is available
Growing industry
Three Niches for Machine Learning
Data mining: using historical data to improve
decisions
Medical records → medical knowledge
Software applications that are difficult to program by
hand
Autonomous driving
Image Classification
User modeling
Automatic recommender systems
Typical Data Mining Task
Given:
• 9147 patient records, each describing a pregnancy and birth
• Each record contains 215 features
Task:
• Predict classes of future patients at high risk for emergency Cesarean section
Data Mining Results
Learned Rules:
If Other-Delinquent-Account > 2 and
Number-Delinquent-Billing-Cycles > 1
Then Profitable-Customer? = no
If Other-Delinquent-Account = 0 and
(Income > $30K or Years-of-Credit > 3)
Then Profitable-Customer? = yes
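The two learned rules can be read as a small decision procedure. The sketch below is only an illustration: the `CreditRecord` type and its field names are hypothetical, the conditions inside each rule are assumed to be joined by "and", and records matched by neither rule are left undecided.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CreditRecord:
    # Hypothetical record type; fields mirror the attributes named in the rules.
    other_delinquent_accounts: int
    delinquent_billing_cycles: int
    income: float
    years_of_credit: float


def profitable_customer(r: CreditRecord) -> Optional[str]:
    """Apply the two learned rules; return None when neither rule fires."""
    # Rule 1: several delinquent accounts and repeated late billing cycles.
    if r.other_delinquent_accounts > 2 and r.delinquent_billing_cycles > 1:
        return "no"
    # Rule 2: no delinquent accounts plus sufficient income or credit history.
    if r.other_delinquent_accounts == 0 and (
        r.income > 30_000 or r.years_of_credit > 3
    ):
        return "yes"
    return None


risky = CreditRecord(3, 2, 50_000, 5)
solid = CreditRecord(0, 0, 20_000, 4)
unclear = CreditRecord(1, 0, 40_000, 2)
print(profitable_customer(risky), profitable_customer(solid), profitable_customer(unclear))
```

Rules in this form are easy to inspect, which is one reason rule learners are popular for data mining tasks.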
Programs too Difficult to Program By Hand
ALVINN drives 70mph on highways
Programs too Difficult to Program By Hand
Visual object recognition
[Figure: Classify Bird Images. Positive and negative examples; statistical model; train and test.]
Image Retrieval using Texts
Software that Models Users
History
Description: A homicide detective and a fire marshal must stop a pair of murderers
who commit videotaped crimes to become media darlings
Rating:
Description: A biography of sports legend, Muhammad Ali, from his early days to his
days in the ring
Rating:
Description: Benjamin Martin is drawn into the American revolutionary war against
his will when a brutal British commander kills his son.
Rating:
What to Recommend?
Description: A high-school boy is given the chance to write a story
about an up-and-coming rock band as he accompanies it on their
concert tour.
Recommend: ? No
Description: A young adventurer named Milo Thatch joins an intrepid group of
explorers to find the mysterious lost continent of Atlantis.
Recommend: ? Yes
Netflix Contest
Relevant Disciplines
Artificial Intelligence
Statistics (particularly Bayesian Stat.)
Computational complexity theory
Information theory
Optimization theory
Philosophy
Psychology
…
Today’s Topics
Why machine learning?
Example: learning to play backgammon
General issues in machine learning
What is the Learning Problem
Learning = Improving with experience at some task
Improve over task T
With respect to performance measure P
Based on experience E
Example: Learning to Play Backgammon
T: Play backgammon
P: % of games won in world tournament
E: opportunity to play against itself
Backgammon
Choose a Target Function
Goal:
Policy π: b → m
Choice of value function:
V: b, m → ℜ
V: b → ℜ
where b = board, m = move, ℜ = real values
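One common way to connect the two choices is to derive the policy from a board value function: pick the move whose resulting board has the highest value. The sketch below assumes this greedy construction, which the slide does not spell out. The helper names `legal_moves` and `apply_move` and the toy "game" are hypothetical placeholders, not part of any backgammon library.

```python
from typing import Callable, Iterable, TypeVar

Board = TypeVar("Board")
Move = TypeVar("Move")


def greedy_policy(
    board: Board,
    legal_moves: Callable[[Board], Iterable[Move]],
    apply_move: Callable[[Board, Move], Board],
    value: Callable[[Board], float],
) -> Move:
    """Policy derived from V: choose the move m that maximizes V(b')."""
    return max(legal_moves(board), key=lambda m: value(apply_move(board, m)))


# Toy stand-in for a game: a "board" is an integer, a "move" adds a step,
# and the value function prefers boards close to 5.
def toy_moves(board: int):
    return [1, 2, -1]


def toy_apply(board: int, move: int) -> int:
    return board + move


def toy_value(board: int) -> float:
    return -float((board - 5) ** 2)


print(greedy_policy(3, toy_moves, toy_apply, toy_value))
```

With this construction only V: b → ℜ has to be learned; the policy follows from it.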
Value Function V(b): Example Definition
Representation of Target Function V(b)
Example: Linear Feature
Representation
Features:
pb(b), pw(b) = number of black (white) pieces on board b
ub(b), uw(b) = number of unprotected black (white) pieces
tb(b), tw(b) = number of black (white) pieces threatened by opponent
Linear function:
V(b) = w0 pb(b) + w1 pw(b) + w2 ub(b) + w3 uw(b) + w4 tb(b) + w5 tw(b)
Learning:
Estimation of parameters w0, …, w5
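The linear representation is a weighted sum of the six board features. A minimal sketch follows; the feature values and weights are made up for illustration and are not learned or taken from the course.

```python
from typing import Sequence


def linear_value(weights: Sequence[float], features: Sequence[float]) -> float:
    """V(b) = w0*pb + w1*pw + w2*ub + w3*uw + w4*tb + w5*tw."""
    if len(weights) != len(features):
        raise ValueError("weights and features must have the same length")
    return sum(w * f for w, f in zip(weights, features))


# Features in the order (pb, pw, ub, uw, tb, tw) for one hypothetical board.
features = [15, 15, 2, 1, 3, 0]
# Illustrative weights from black's point of view.
weights = [1.0, -1.0, -0.5, 0.5, -0.25, 0.25]

print(linear_value(weights, features))
```

Learning then reduces to choosing the six numbers in `weights`.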
Tuning Weights
Given:
board b
Predicted value V(b)
Desired value V*(b)
Calculate
error(b) = V*(b) - V(b)
For each board feature fi
wi ← wi + c × error(b) × fi
Stochastically minimizes
∑b (V*(b) - V(b))²
Gradient Descent Optimization
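The weight-tuning rule can be sketched as a single stochastic gradient step on the squared error for one board. The sketch assumes the linear V(b) from the previous slide and uses the signed difference V*(b) - V(b) in the update, since that difference times fi is proportional to the negative gradient of (V*(b) - V(b))² with respect to wi. The step size c and the example numbers are arbitrary.

```python
from typing import List, Sequence


def predict(weights: Sequence[float], features: Sequence[float]) -> float:
    """Linear value estimate V(b) for a board described by its features."""
    return sum(w * f for w, f in zip(weights, features))


def lms_update(
    weights: Sequence[float],
    features: Sequence[float],
    target: float,
    c: float = 0.1,
) -> List[float]:
    """One step of wi <- wi + c * (V*(b) - V(b)) * fi for every feature fi."""
    error = target - predict(weights, features)
    return [w + c * error * f for w, f in zip(weights, features)]


weights = [0.0, 0.0]
features = [1.0, 2.0]  # a made-up board with two features
target = 1.0           # desired value V*(b)

for step in range(5):
    squared_error = (target - predict(weights, features)) ** 2
    print(step, round(squared_error, 6))
    weights = lms_update(weights, features, target)
```

On this single example the printed squared error shrinks at every step; with many boards visited in random order, the same update approximately minimizes the sum over all boards.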
Obtain Boards
Random boards
Beginner plays
Professional plays
Obtain Target Values
Person provides value V(b)
Play until termination. If outcome is
Win: V(b) ← 1 for all boards
Loss: V(b) ← -1 for all boards
Draw: V(b) ← 0 for all boards
Play one move: b → b’
V(b) ← V(b’)
Play n moves: b → b’ → … → b(n)
V(b) ← V(b(n))
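Two of the options above can be sketched as functions that turn the boards of one played game into (board, target) training pairs. The first labels every board with the final outcome; the second uses the current estimate n moves later as the target. Boards are represented by plain strings here, and reusing the final board near the end of the game is an assumption of this sketch, not something the slide specifies.

```python
from typing import Callable, List, Sequence, Tuple

OUTCOME_VALUE = {"win": 1.0, "loss": -1.0, "draw": 0.0}


def outcome_targets(boards: Sequence[str], outcome: str) -> List[Tuple[str, float]]:
    """Play until termination: every board gets the game outcome as target."""
    value = OUTCOME_VALUE[outcome]
    return [(b, value) for b in boards]


def lookahead_targets(
    boards: Sequence[str],
    value: Callable[[str], float],
    n: int = 1,
) -> List[Tuple[str, float]]:
    """Play n moves: the target for board b is the current estimate V(b(n))."""
    last = len(boards) - 1
    # Assumption: near the end of the game, fall back to the final board.
    return [(b, value(boards[min(t + n, last)])) for t, b in enumerate(boards)]


game = ["b0", "b1", "b2"]
current_estimate = {"b0": 0.1, "b1": 0.4, "b2": 1.0}

print(outcome_targets(game, "loss"))
print(lookahead_targets(game, current_estimate.__getitem__, n=1))
```

Either list of pairs can then be fed to a weight update that moves V(b) toward its target.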
A General Framework
Mathematical Modeling (Statistics) + Finding Optimal Parameters (Optimization)
= Machine Learning
Today’s Topics
Why machine learning?
Example: learning to play backgammon
General issues in machine learning
Important Issues in Machine Learning
Obtaining experience
How to obtain experience?
Supervised learning vs. Unsupervised learning
How many examples are enough?
PAC learning theory
Learning algorithms
Which algorithms can approximate the target function well, and when?
How does the complexity of a learning algorithm affect its accuracy?
Is the target function learnable?
Representing inputs
How to represent the inputs?
How to remove the irrelevant information from the input representation?
How to reduce the redundancy of the input representation?
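As one concrete instance of the "How many examples are enough?" question, PAC learning theory gives a standard bound for a learner that outputs a hypothesis consistent with the training data from a finite hypothesis space H: m ≥ (1/ε)(ln|H| + ln(1/δ)) examples suffice for error at most ε with probability at least 1 - δ. The sketch below just evaluates that bound; the example hypothesis space (conjunctions over 10 Boolean attributes, so |H| = 3^10) is an illustrative choice, not from the slides.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Sufficient number of examples for a consistent learner over a finite H.

    Implements m >= (1/epsilon) * (ln|H| + ln(1/delta)).
    """
    if hypothesis_count < 1 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need |H| >= 1 and 0 < epsilon, delta < 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


# Conjunctions over 10 Boolean attributes: each attribute is required true,
# required false, or ignored, giving 3**10 hypotheses.
print(pac_sample_bound(3 ** 10, epsilon=0.1, delta=0.05))
```

The bound grows only logarithmically in |H| and 1/δ but linearly in 1/ε, which is why richer hypothesis spaces need more data for the same guarantee.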