Spring 2017
David Parkes
Sasha Rush
What is machine learning?
Task, Past Data (or “experience”), Performance Measure
Data now → Prediction (action) now
The starting point...
http://ci.columbia.edu/ci/premba_test/c0331/images/s7/7176267017.gif
… where we are now...
Large
Complex Sources
Complex Structures
Example: Collaborative marketing
~70,000 examples, each has ~3000 features
Predict “conversion rate”
[85] "FEATURED_CATEGORY_EYEWEAR"
[96] "FEATURED_CATEGORY_HAIR_SALON_SPA"
[99] "FEATURED_MULTIPLE_CATEGORIES_BEAUTY_SPAS.SKIN_CARE"
[128] "FEATURED_MERCHANT_COOLIDGE_CORNER_YOGA"
[130] "FEATURED_MERCHANT_THE_CLAYROOM"
[133] "FEATURED_MERCHANT_MELLIE_HAIR_SALON"
… where we’re going
Robotics
Language/Vision
AI
Decision Making
Fundamentals
Loads of Comorbidities
Seizures (11-39%)
Tuberous Sclerosis (1-4%)
Intellectual Disability (40-69%)
Anxiety Disorders (7-84%)
Depression (4-58%)
ADHD (30-80%)
Data now → Prediction (action) now
How can Machine Learning help?
Discovering Subtypes
● Scientific Question: Do the patterns of co-occurring diagnoses
suggest a few subtypes in autism?
● Technical Question: Can we succinctly describe the patterns of
co-occurring diagnoses with a few hidden variables?
New patient → Summary of patient type
Modeling Effort, Part I
Each patient belongs to an Autism (ASD) type.
Each ASD type is characterized by a set of comorbidities.
Given data: 3800 patients
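The two modeling statements above can be read as a generative story: pick a hidden type for each patient, then draw that patient's comorbidities from the type's profile. The sketch below is one possible reading, assuming independent yes/no comorbidities within a type. The function name `sample_patient`, the type names, and all probabilities are hypothetical and are not estimates from the lecture's data.

```python
import random

def sample_patient(type_weights, type_profiles, rng):
    """Draw one patient: pick a hidden type, then draw each comorbidity independently."""
    types = list(type_weights)
    k = rng.choices(types, weights=[type_weights[t] for t in types])[0]
    diagnoses = {c for c, p in type_profiles[k].items() if rng.random() < p}
    return k, diagnoses

# Hypothetical numbers for illustration only.
type_weights = {"type_1": 0.6, "type_2": 0.4}
type_profiles = {
    "type_1": {"seizures": 0.8, "anxiety": 0.1},
    "type_2": {"seizures": 0.1, "anxiety": 0.7},
}

rng = random.Random(0)
patients = [sample_patient(type_weights, type_profiles, rng) for _ in range(5)]
for hidden_type, diagnoses in patients:
    print(hidden_type, sorted(diagnoses))
```

Learning runs this story in reverse: given only the observed diagnoses, infer the type weights, the type profiles, and each patient's hidden type.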
Modeling Effort, Part II
Each ASD feature is characterized by a set of comorbidities.
Each patient has a combination of ASD features.
X = Z × (global features)
X: Data, a patient-by-code matrix of counts
Z: Patient factors (how much of feature A does patient n have?)
Global features: what codes are in each feature?
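The factorization above, data equals patient factors times global features, can be illustrated with a tiny matrix product. This is only a sketch of the shapes involved. The variable name `features` for the global feature matrix is an assumption (the slide text does not name it), and every number is made up.

```python
def matmul(left, right):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
            for row in left]

# Hypothetical example: 3 patients, 2 hidden features, 4 diagnosis codes.
Z = [[1, 0],            # patient 0 has only feature 0
     [1, 1],            # patient 1 has both features
     [0, 2]]            # patient 2 has a double dose of feature 1
features = [[2, 1, 0, 0],   # feature 0 contributes to codes 0 and 1
            [0, 0, 1, 3]]   # feature 1 contributes to codes 2 and 3

X = matmul(Z, features)     # patient-by-code matrix of expected counts
for row in X:
    print(row)
```

In the modeling problem only X is observed; both Z and the feature matrix have to be inferred from it.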
Features Discovered
● ASD: Autism Spectrum Disorder
● More ASD: Specific Delays in Development, Autism Spectrum Disorder, Symptoms of Head and Neck, Speech Dysfunction
● Congenital/Multisystem: Conduct Disorder with Specific, Physiological Developmental Delays, Lack of expected normal physiological development in childhood, Physiological Delays with Genetic Causes, Congenital Anomalies, Physiological Delays, Muscular Dystrophy
● Infection/Ear/Autoimmune: Otitis Media, Hearing Loss, Congenital Anomalies of the Ear, Viral Infection, Diseases of the Ear, Asthma
● Epilepsy/Intellectual Disability: Seizures, Cerebral Palsy, Epilepsy, Intellectual Disability, Abnormal Movements
● Psychiatric: Depressive Disorder, Not Elsewhere Classified, Disturbance of Emotions Specific to Childhood, Specific Delays in Development, Disturbance of Conduct Not Elsewhere Classified, Acute Reaction to Stress
Machine Learning Taxonomy
ML
  Supervised
    Regression
    Classification
  Unsupervised
    Clustering
    Embeddings
  RL
Machine Learning Taxonomy
Supervised: Task, Data (x,y), Metric
New x → Predicted y
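The supervised setup above (labeled data (x,y), a metric, and a prediction for a new x) can be made concrete with the simplest case: fitting a line by least squares. This is a minimal sketch with made-up data. The helper names `fit_line`, `predict`, and `mean_squared_error` are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept to paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def predict(model, x):
    slope, intercept = model
    return slope * x + intercept

def mean_squared_error(model, xs, ys):
    """The metric: average squared gap between prediction and truth."""
    return sum((predict(model, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Made-up training data (x, y) lying exactly on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

model = fit_line(xs, ys)
print(predict(model, 5.0))               # prediction for a new x
print(mean_squared_error(model, xs, ys))
```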
Terminology: Regression
Example: Virtu
Predictions of travel
time, price, supply,
demand
(Keith Chen)
Terminology: Classification
Example: Digit Classification
• Data: Handwritten US zip codes
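To give a flavor of classification on image-like data, here is a minimal sketch of a 1-nearest-neighbor classifier. It is not the method used in the lecture's example. The 3x3 binary "images" and the helper names are hypothetical stand-ins for real handwritten digits.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict_1nn(train, image):
    """Return the label of the training image closest to `image`."""
    _, label = min(train, key=lambda example: squared_distance(example[0], image))
    return label

# Hypothetical 3x3 binary images, flattened row by row.
train = [
    ([0, 1, 0,
      0, 1, 0,
      0, 1, 0], 1),   # a vertical bar, labeled "1"
    ([1, 1, 1,
      1, 0, 1,
      1, 1, 1], 0),   # a ring, labeled "0"
]

# A noisy vertical bar with one extra pixel set.
new_image = [0, 1, 0,
             0, 1, 0,
             0, 1, 1]
print(predict_1nn(train, new_image))
```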
Example: Image Recognition
Visualization of learned
filters for a deep
convolutional neural
network trained for
recognition on ImageNet
Example: Swype
Predict words from keyboard trajectories
Novel Product:
An easier way
to input text on
mobile devices
Detecting Copyright Violations on YouTube
(~300 hours of new video per minute)
Machine Learning Taxonomy
Unsupervised: Task, Data (x), Metric
x → summary of x
Terminology: Clustering
Example: News Clustering
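As an illustration of clustering, the sketch below runs k-means (Lloyd's algorithm) on a handful of 2-D points. The points are hypothetical stand-ins for numeric representations of news articles, and the initial centers are supplied by hand to keep the example deterministic. Nothing here is claimed to match how any real news clustering system works.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda i: squared_distance(p, centers[i]))
            for p in points]

def kmeans(points, centers, n_iters=10):
    """Alternate between assigning points and recomputing centers."""
    for _ in range(n_iters):
        labels = assign(points, centers)
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(old)   # keep a center that lost all its points
        centers = new_centers
    return centers, assign(points, centers)

# Hypothetical 2-D points forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = kmeans(points, centers=[(0, 0), (10, 10)])
print(labels)
```

Note that the number of clusters and the starting centers are inputs here, which is one instance of the hidden assumptions discussed later in the slides.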
Terminology: Embedding
Example: Word Vectors
Example: Eigenfaces
https://www.cs.princeton.edu/~cdecoro/eigenfaces/
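Eigenfaces are built on principal component analysis: each image is summarized by its coordinates along a few directions of greatest variance. The sketch below finds only the top direction, using power iteration on tiny 2-D data instead of face images. The function names and data are hypothetical, and it assumes the data has nonzero variance.

```python
import math

def top_principal_component(rows, n_iters=100):
    """Power iteration for the leading eigenvector of the covariance matrix."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    v = [1.0] * d                        # arbitrary starting direction
    for _ in range(n_iters):
        # Multiply v by X^T X without forming the matrix explicitly.
        proj = [sum(x * vj for x, vj in zip(row, v)) for row in centered]
        w = [sum(proj[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(x * x for x in w))   # zero only for degenerate data
        v = [x / norm for x in w]
    return means, v

def embed(row, means, v):
    """One-number summary of a data point: its coordinate along v."""
    return sum((x - m) * vj for x, m, vj in zip(row, means, v))

# Made-up points lying on the line y = 2x.
rows = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
means, v = top_principal_component(rows)
print(v)
print([embed(r, means, v) for r in rows])
```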
Example: Scanner Data
(S. Ng 2016)
RL: Task, Histories (s,a,r,s,a,r), Metric
History s,a,r,s,.. → Next action a
Terminology: Reinforcement Learning
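The core loop in reinforcement learning maps a history of actions and rewards to the next action. The sketch below shows this with an epsilon-greedy rule on a two-action problem. As a simplifying assumption it drops the state s entirely (a bandit setting), and the action names, reward probabilities, and function names are all hypothetical.

```python
import random

def choose_action(history, actions, epsilon, rng):
    """Map a history of (action, reward) pairs to the next action."""
    if not history or rng.random() < epsilon:
        return rng.choice(actions)            # explore
    totals = {a: 0.0 for a in actions}
    counts = {a: 0 for a in actions}
    for action, reward in history:
        totals[action] += reward
        counts[action] += 1

    def average(a):
        # Untried actions score +inf so each one gets tried at least once.
        return totals[a] / counts[a] if counts[a] else float("inf")

    return max(actions, key=average)          # exploit the best average so far

def run(n_steps, reward_prob, epsilon=0.1, seed=0):
    """Simulate the agent against actions that pay 1 with a fixed probability."""
    rng = random.Random(seed)
    actions = list(reward_prob)
    history = []
    for _ in range(n_steps):
        action = choose_action(history, actions, epsilon, rng)
        reward = 1.0 if rng.random() < reward_prob[action] else 0.0
        history.append((action, reward))
    return history

history = run(500, {"left": 0.2, "right": 0.8})
print(sum(reward for _, reward in history) / len(history))
```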
Vision, Mission
Strategic Goals: Data Science
Specific Objectives: Data and Computing Ecosystem
Long-term Investments: Visualization, Analytics; Machine Learning; System Design; Ac
Day-to-day Operations: Data Exploration (Unsupervised); Data Exploitation (Supervised)
Technical Details:
  Unsupervised: Clustering, Feature Allocation, Factor Analysis, PCA, k-Means, Topic Modeling, Density Estimation, Mixture Model, HMM, ...
  Supervised: Classification, Regression, SVM, Logistic Regression, Decision Tree, Neural Net, kNN, Gaussian Process, Random Forest
Small changes can make a problem
a lot harder
Hidden Assumptions
● Parameters, hyperparameters
● Distributional properties
● Cluster sizes and proportions
● Number of iterations
● Presence of local optima
● Approximations to distributions
● Conditional independence, Markov property
Solution: Evaluate Carefully
● What data should be collected?
● How should it be processed?
● How should we separate a test set?
● What algorithms and parameters are used?
● How do we define success?
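Two of the questions above, how to separate a test set and how to define success, can be sketched in a few lines. The example holds out a random 20% of the data, tunes a threshold rule on the training portion only, and reports accuracy on the held-out portion. The data, the threshold "model", and the function names are all hypothetical.

```python
import random

def train_test_split(examples, test_fraction=0.2, seed=0):
    """Shuffle a copy of the data and hold out a fraction for testing."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(predict, examples):
    """Success metric: fraction of examples predicted correctly."""
    return sum(predict(x) == y for x, y in examples) / len(examples)

# Hypothetical labeled data: the label is True when x is at least 5.
examples = [(x, x >= 5) for x in range(10)]
train, test = train_test_split(examples)

# "Training": choose the threshold that does best on the training set only.
best_threshold = max(range(11),
                     key=lambda t: accuracy(lambda x: x >= t, train))

# Evaluation: measure success on data the model never saw.
test_accuracy = accuracy(lambda x: x >= best_threshold, test)
print(best_threshold, test_accuracy)
```

The key design choice is that the test set plays no part in choosing the threshold, so the reported accuracy is not inflated by tuning.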
● Carl Denton
Office Hours and Sections
Office Hours
● Tue 8-10pm: Quincy DH
● Wed 6-8pm: MD 119
● Thu 8-10pm: Currier DH
● Thu 8-10pm: Eliot DH
● Fri 10-Noon: MD First Floor Lounge