Professional Documents
Culture Documents
There are 23 variables used to predict weather a customer will default on its payment or not
Attribute information
X1 X2 X3 X4 X5 X6 – X11 X12 – X17 X18 – X23
Amount of the Gender Education Marital Age History of past Amount of bill Amount of previous
given credit status payment statement payment
Logistic K – Nearest
Decision Tree
Regression Neighbor (KNN)
Decision tree builds classification It is a statistical method for k-Nearest Neighbor is a lazy learning
or regression models in the form analyzing a data set in which there algorithm which stores all instances
of a tree structure are one or more independent correspond to training data points in
The tree is constructed in a top- variables that determine an n-dimensional space
down recursive divide-and- outcome
conquer manner
3
Classification Method Results
Decision Tree Logistic Regression KNN Clustering
Predicted Predicted Predicted
0 1 0 1 0 1
Actual 0 34 3 Actual 0 34 3 Actual 0 32 5
1 7 6 1 8 5 1 10 3
4
Ensemble Model
• Majority Voting
• We assigned the prediction for the observation as predicted by the majority of models
• Weighted Average
• Instead of taking simple average, we took weights based on the accuracy of the three models
5
Applications
•Fraud Detection • Identification of bank frauds such as money laundering or credit card default
6
Thank you