
Hands-On Machine Learning

for Finance Professionals

Sandeep Khurana
Introduction to Machine Learning

What is machine learning? A brief history

What are the different machine learning models?

What do they do? When to use them?

How to identify features and how to tune parameters?

Machine learning case study using Azure ML Studio



Agenda

1. Motivation for ML
2. ML algorithms
   1. Classification
   2. Regression
3. Introduction to Azure ML Studio
4. Environment setup
   1. Data Preparation
   2. Modeling of the Problem
5. Introduction to Kaggle
6. Hands-on case study
   • Credit Risk classification


Sec 1: Introduction to Machine Learning
Intelligence
Human Intelligence (Core Processes)* | Artificial Intelligence**
Sensing and Perception | Sensing and interpreting
Hearing | Speech2Text
Speech | Text2Speech
Vision | Computer Vision
Language | NLP
Temperature, Texture etc. (Touch) | Haptics
Knowledge Representation | Knowledge Representation
Memory | Memory
Learning | Machine Learning
Reasoning/Calculation/Probabilistic Assessment and Judgment | Calculator/Programming/Decision trees/Bayesian
Control of Action | Robotics
Motor nerves | Motion control

*Cognitive Neuroscience (Gazzaniga, Ivry, Mangun)   **Artificial Intelligence: A Modern Approach (Russell and Norvig)
DE-CONSTRUCTING INTELLIGENCE

NATURAL LANGUAGE PROCESSING: to enable it to communicate successfully in English
KNOWLEDGE REPRESENTATION: to store what it knows or hears
AUTOMATED REASONING: to use the stored information to answer questions and to draw new conclusions
MACHINE LEARNING: to adapt to new circumstances and to detect and extrapolate patterns
COMPUTER VISION: to perceive objects
ROBOTICS: to manipulate objects and move about

Source: Artificial Intelligence: A Modern Approach (Russell and Norvig)
MACHINE LEARNING
Learning

Learning | Humans | Machines
Training | What is right and what is wrong; what works or does not work | Classified (labeled) text/images
Training | The more "solved examples" the better | Larger learning dataset > better accuracy
Testing | Solve unseen problems to assess learning | Test algorithm on unseen data
Decompose and Aggregate Text/Image | Apply past associations between components of image/text to the current image/text | Apply past associations between components of image/text to the current image/text
Matching/Classification | Recall from memory the closest association to the instance at hand | Recall from memory the closest association to the instance at hand
Matching/Classification | More clues to memory increase recall ability | More features increase accuracy

Learning Dimensions in Machine Learning

• Intuition: Common sense
• Theory: Math
• Execution: Programming
• Application: Domain knowledge
• Interpretation: Algorithm
Machine Learning

• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Supervised Learning

• Classification
• Regression
• Dimensionality Reduction
• Association Rule
• Time Series Analysis
• Anomaly Detection
• Prediction
Classification Algorithms

• Naïve Bayes
• Logistic Regression
• Decision Trees
• Random Forest
• SVM
• K-NN
• Neural Networks
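As a quick illustration (not part of the original deck, which works in Azure ML Studio), the algorithms listed above can all be instantiated in a few lines of scikit-learn; the synthetic dataset below is a hypothetical stand-in for real credit data.

```python
# Illustrative sketch: the classification algorithms listed above, run on a
# synthetic dataset. The data is a made-up stand-in, not real credit data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

models = {
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(),
    "Random Forest": RandomForestClassifier(),
    "SVM": SVC(),
    "K-NN": KNeighborsClassifier(),
    "Neural Network": MLPClassifier(max_iter=1000),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: test accuracy = {model.score(X_test, y_test):.3f}")
```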
Unsupervised Learning

• Learning "what normally happens"
• No output
• Clustering: grouping similar instances
• Other applications: summarization, association analysis
• Example applications:
  • Customer segmentation in CRM
  • Image compression: color quantization
Reinforcement Learning

Topics:
• Policies: what actions should an agent take in a particular situation
• Utility estimation: how good is a state (used by the policy)
• No supervised output, but delayed reward
• Credit assignment problem (what was responsible for the outcome)
• Applications:
  • Game playing
  • Robot in a maze
  • Multiple agents, partial observability, ...
Classification

Example: Credit scoring

• Differentiating between low-risk and high-risk customers from their income and savings

Discriminant:
  IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
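A minimal sketch of the discriminant rule above; the threshold values for θ1 and θ2 used here are made-up placeholders that a classifier would normally learn from labeled data.

```python
# Minimal sketch of the rule-based discriminant above.
# The thresholds theta1 and theta2 are illustrative placeholders; in practice
# a classifier learns them from labeled credit data.
def credit_risk(income: float, savings: float,
                theta1: float = 50_000, theta2: float = 10_000) -> str:
    """IF income > theta1 AND savings > theta2 THEN low-risk ELSE high-risk."""
    if income > theta1 and savings > theta2:
        return "low-risk"
    return "high-risk"

print(credit_risk(income=80_000, savings=25_000))  # low-risk
print(credit_risk(income=30_000, savings=25_000))  # high-risk
```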
Classification: Applications

Binary/Categorical/Discrete classification: many problems can be structured as classification problems.

• Binary recursion
• Medical diagnosis: from symptoms to illnesses
• Predict voting: from variables on Facebook usage (Democrats vs Republicans)
• Web advertising: predict whether a user clicks on an ad on the Internet
• Finance/Accounting: Go/No-Go decision modeling (purchases, market entry, credit default)
Prediction: Regression

Example: Price of a used car

• x: car attributes
• y: price
• Model: y = g(x | θ), where g(·) is the model and θ are its parameters
• Linear model: y = wx + w0
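A minimal sketch (not from the deck) of fitting the linear model y = wx + w0 by ordinary least squares; the car-age and price figures below are invented for illustration.

```python
# Illustrative sketch: fit y = w*x + w0 for used-car price vs a single
# attribute (car age) with ordinary least squares. The data is made up.
import numpy as np

x = np.array([1, 2, 3, 5, 7, 9], dtype=float)          # car age in years (hypothetical)
y = np.array([18, 16, 14.5, 11, 8.5, 6], dtype=float)  # price in thousands (hypothetical)

# Solve for [w, w0] in y = w*x + w0 via least squares
A = np.column_stack([x, np.ones_like(x)])
(w, w0), *_ = np.linalg.lstsq(A, y, rcond=None)

print(f"y = {w:.2f} * x + {w0:.2f}")
print("predicted price of a 4-year-old car:", w * 4 + w0)
```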
Approaches

Terminology: Machine Learning

• Confusion Matrix
• ROC curve
• N-fold cross-validation
• Monte Carlo simulation
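The evaluation terms above can be made concrete with a short scikit-learn sketch (an illustration added here; the deck itself works in Azure ML Studio).

```python
# Illustrative sketch: confusion matrix, ROC AUC and N-fold cross-validation
# computed with scikit-learn on a synthetic binary classification problem.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.metrics import confusion_matrix, roc_auc_score

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Confusion matrix: rows = actual class, columns = predicted class
print(confusion_matrix(y_test, clf.predict(X_test)))

# Area under the ROC curve, from predicted probabilities
print("ROC AUC:", roc_auc_score(y_test, clf.predict_proba(X_test)[:, 1]))

# N-fold (here 5-fold) cross-validation accuracy
print("5-fold CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```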
Sec 2: Azure Machine Learning Studio

Azure ML modeling

Steps:
1. Create account
2. Azure ML Studio setup
3. Build model
4. Deploy
Azure Machine Learning Studio
Common sense principles in ML problem formulation

Goal
• Often not clear or clearly articulated
• The "y"

Law of Minimum Force
• Occam's Razor
• Tools are means to an end: fancy algorithms don't impress, accurate results do
• Accuracy-simplicity tradeoff

Haste makes Waste
• Don't rush in on 'any' data
• Seek the data you need, not just the data you have
• Study underlying variable-target correlations
• Study summary statistics

Representativeness of Training Data
Formulating ML Problems: Data

Need for pre-processing
• Data format

Proxy variables
• e.g. clickstream data
• e.g. rental values for socio-economic classification

Calculated variables
• e.g. difference vs ratio

Features
• Use features that generalize across contexts
• e.g. industry-standard financial ratios (see the sketch below)
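A small pandas sketch of the calculated-variable and ratio-feature idea above; the column names and numbers are hypothetical.

```python
# Illustrative sketch: building calculated/ratio features that generalize
# across contexts, e.g. industry-standard financial ratios.
# Column names and values below are hypothetical.
import pandas as pd

df = pd.DataFrame({
    "total_debt":   [120.0, 80.0, 300.0],
    "total_equity": [200.0, 50.0, 150.0],
    "net_income":   [30.0, 5.0, -10.0],
    "total_assets": [500.0, 140.0, 600.0],
})

# Ratios are often more comparable across firms of different sizes
# than raw differences or absolute levels.
df["debt_to_equity"]   = df["total_debt"] / df["total_equity"]
df["return_on_assets"] = df["net_income"] / df["total_assets"]

print(df[["debt_to_equity", "return_on_assets"]])
```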
Formulating ML Problems: Data (contd.)

How about forgetting some data?
• Unlearn rows
• Unlearn columns

Feature processing (see the sketch below)
• Missing data
• Interaction terms
• Transformations
• Domain-specific features
• Variable-specific features

Updating data sources and the model

Labeled data
• Train using data for all labels
• …and labels must be accurate. GIGO
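A small sketch of the feature-processing steps named above (missing-data imputation, an interaction term, a transformation); the columns and the median-imputation choice are illustrative assumptions, not a prescription.

```python
# Illustrative sketch of common feature-processing steps: imputing missing
# data, adding an interaction term, and a log transformation.
# Column names and values are hypothetical.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "income":  [50_000, np.nan, 80_000, 30_000],
    "savings": [10_000, 5_000, np.nan, 2_000],
})

# Missing data: fill with the column median (one simple strategy of many)
for col in ["income", "savings"]:
    df[col] = df[col].fillna(df[col].median())

# Interaction term: product of two variables
df["income_x_savings"] = df["income"] * df["savings"]

# Transformation: log scale for a skewed monetary variable
df["log_income"] = np.log1p(df["income"])

print(df)
```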
Creating value from data

Experiment
• The first model may not be the best
• Threshold decisions
• Multiple small modeling decisions
• Iterations

ML solves not just the same problem but similar problems
• Extracting the essence of a decision vs filtering

Model choice

Model integration
• Ensemble (see the sketch below)
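A minimal sketch of model integration via an ensemble, here a majority-vote combination in scikit-learn; the choice of base models and the synthetic data are arbitrary illustrations.

```python
# Illustrative sketch: integrating several models into a simple ensemble
# with a majority-vote classifier in scikit-learn.
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=600, n_features=12, random_state=1)

ensemble = VotingClassifier(estimators=[
    ("lr", LogisticRegression(max_iter=1000)),
    ("rf", RandomForestClassifier(random_state=1)),
    ("nb", GaussianNB()),
])

print("ensemble 5-fold CV accuracy:",
      cross_val_score(ensemble, X, y, cv=5).mean())
```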
Sec 3: Kaggle
Separate presentation
Sec 4: Support Vector Machines
Classification Tasks

• Learning Task
  - Given: Expression profiles of leukemia patients and healthy persons.
  - Compute: A model distinguishing if a person has leukemia from expression data.

• Classification Task
  - Given: Expression profile of a new patient + a learned model.
  - Determine: If the patient has leukemia or not.

Tennis Example

[Figure: scatter plot of Temperature vs Humidity, with points labeled "play tennis" and "do not play tennis"]
Introduction: Linear Separators

Binary classification can be viewed as the task of separating classes in feature space:

  wᵀx + b = 0
  wᵀx + b > 0
  wᵀx + b < 0

  f(x) = sign(wᵀx + b)
Linear Separators

Which of the linear separators is optimal?

• All hyperplanes in Rᵈ are parameterized by a vector w and a constant b.
• They can be expressed as w·x + b = 0 (remember the equation for a hyperplane from algebra!).
• Our aim is to find such a hyperplane, f(x) = sign(w·x + b), that correctly classifies our data.
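A minimal sketch of the decision rule f(x) = sign(w·x + b); the weight vector and bias below are arbitrary illustrative values rather than quantities learned from data.

```python
# Minimal sketch of the linear decision rule f(x) = sign(w.x + b).
# The weights and bias are arbitrary illustrative values, not learned ones.
import numpy as np

w = np.array([2.0, -1.0])   # hypothetical weight vector
b = -0.5                    # hypothetical bias

def f(x: np.ndarray) -> int:
    """Return +1 or -1 depending on which side of the hyperplane x lies."""
    return int(np.sign(np.dot(w, x) + b))

print(f(np.array([1.0, 0.5])))   # +1 side
print(f(np.array([0.0, 2.0])))   # -1 side
```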
Selection of a Good Hyper-Plane

Objective: select a 'good' hyperplane using only the data.

Intuition (assuming linear separability):
(i) Separate the data
(ii) Place the hyperplane 'far' from the data
Maximizing the margin

Recall: the distance from a point (x₀, y₀) to the line Ax + By + c = 0 is

  |Ax₀ + By₀ + c| / √(A² + B²)

The distance from an example xᵢ to the separator is

  r = |wᵀxᵢ + b| / ‖w‖
Classification Margin

• Examples closest to the hyperplane are support vectors.
• The margin ρ of the separator is the distance between support vectors.
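For reference, the standard result behind these slides (not spelled out above): scaling w and b so that the support vectors satisfy yᵢ(wᵀxᵢ + b) = 1 gives the margin and the max-margin optimization problem

```latex
\rho = \frac{2}{\lVert w \rVert},
\qquad
\min_{w,\,b}\ \tfrac{1}{2}\lVert w \rVert^{2}
\quad\text{subject to}\quad
y_i\,(w^{\top} x_i + b) \ge 1 \quad \forall i .
```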
Maximum Margin Classification

• Maximizing the margin is good according to intuition.
• It implies that only support vectors matter; other training examples are ignorable.
Soft Margin Classification

• What if the training set is not linearly separable?
• Slack variables ξᵢ can be added to allow misclassification of difficult or noisy examples; the resulting margin is called soft.
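A small scikit-learn sketch of the soft margin: the parameter C controls how much slack (misclassification) is tolerated; the dataset and C values below are illustrative assumptions.

```python
# Illustrative sketch: a soft-margin linear SVM in scikit-learn.
# C trades margin width against slack (tolerance of noisy examples).
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=2, n_redundant=0,
                           flip_y=0.1, random_state=2)  # some label noise

for C in [0.01, 1.0, 100.0]:
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C}: {clf.n_support_.sum()} support vectors, "
          f"training accuracy {clf.score(X, y):.3f}")
```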
Non-linear SVMs

• Datasets that are linearly separable with some noise work out great.
• But what are we going to do if the dataset is just too hard?
• How about mapping the data to a higher-dimensional space?

[Figures: 1-D examples on the x axis; in the last panel the data are mapped to the (x, x²) plane]

Non-linear SVMs: Feature spaces

General idea: the original feature space can always be mapped to some higher-dimensional feature space where the training set is separable:

  Φ: x → φ(x)
Nonlinear SVM - Overview

• SVM locates a separating hyperplane in the feature space and classifies points in that space.
• It does not need to represent the space explicitly; it simply defines a kernel function.
• The kernel function plays the role of the dot product in the feature space.
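A small sketch of the kernel trick in scikit-learn: on concentric-circle data (a stand-in example, not from the deck) a linear kernel fails while an RBF kernel separates the classes without ever constructing the higher-dimensional space explicitly.

```python
# Illustrative sketch: a non-linear SVM via the kernel trick.
# The RBF kernel implicitly computes dot products in a higher-dimensional
# feature space without representing that space explicitly.
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Concentric circles: impossible to separate with a straight line in 2-D
X, y = make_circles(n_samples=300, factor=0.3, noise=0.05, random_state=3)

linear = SVC(kernel="linear").fit(X, y)
rbf = SVC(kernel="rbf").fit(X, y)

print("linear kernel training accuracy:", round(linear.score(X, y), 3))
print("RBF kernel training accuracy:   ", round(rbf.score(X, y), 3))
```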
Properties of SVM

• Flexibility in choosing a similarity function
• Sparseness of solution when dealing with large data sets: only support vectors are used to specify the separating hyperplane
• Ability to handle large feature spaces: complexity does not depend on the dimensionality of the feature space
• Overfitting can be controlled by the soft margin approach
• Nice math property: a simple convex optimization problem which is guaranteed to converge to a single global solution
• Feature selection
Weakness of SVM

• It is sensitive to noise: a relatively small number of mislabeled examples can dramatically decrease the performance.
• It only considers two classes. How to do multi-class classification with SVM?

  Answer:
  1) With output arity m, learn m SVMs:
     SVM 1 learns "Output == 1" vs "Output != 1"
     SVM 2 learns "Output == 2" vs "Output != 2"
     ...
     SVM m learns "Output == m" vs "Output != m"
  2) To predict the output for a new input, predict with each SVM and find out which one puts the prediction the furthest into the positive region.
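The one-vs-rest scheme described above is available directly in scikit-learn; the sketch below (an added illustration, not the deck's Azure ML workflow) trains one linear SVM per class and predicts the class whose SVM scores the point furthest into the positive region.

```python
# Illustrative sketch: the one-vs-rest scheme described above.
# OneVsRestClassifier trains one SVM per class and predicts the class whose
# SVM gives the largest decision-function value for the new point.
from sklearn.datasets import load_iris
from sklearn.multiclass import OneVsRestClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)   # 3 classes, so m = 3 SVMs are learned

ovr = OneVsRestClassifier(SVC(kernel="linear")).fit(X, y)

print("number of binary SVMs:", len(ovr.estimators_))
print("decision scores for one sample:", ovr.decision_function(X[:1]))
print("predicted class:", ovr.predict(X[:1]))
```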
Sec 5: Case Study
Hands-on exercise on Azure Machine Learning Studio

Overfitting
Thanks!
Contact us:

Sandeep Khurana
Founder
Quant-Leap Consulting
Hyderabad

QLCLLP@gmail.com
www.quantleapconsulting.com
