Ting Millette,
Wachovia Bank
Copyright © 2007, SAS Institute Inc. All rights reserved.
ABSTRACT
Methods
Logistic Regression
Dependent variable: whether or not the
customer is a good prospect
Independent variables: the customer’s age,
gender, zip code, income, marital status,
children, car, savings account, checking
account, and mortgage
Two-factor interactions, polynomial terms
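The slide does not say which interactions or polynomial degrees were used, so the following is only a minimal sketch of the idea: given numeric inputs, it adds a squared term for each input and a product term for each pair. The function name `expand_features` and the sample values are hypothetical.

```python
from itertools import combinations


def expand_features(row):
    """Return the original inputs plus squared terms and all pairwise products.

    `row` maps a variable name to a numeric value. Categorical inputs such as
    gender or zip code would need to be encoded numerically first (assumption).
    """
    names = sorted(row)
    expanded = dict(row)
    # Second-degree polynomial terms, one per input.
    for name in names:
        expanded[f"{name}^2"] = row[name] ** 2
    # Two-factor interactions, one per unordered pair of inputs.
    for a, b in combinations(names, 2):
        expanded[f"{a}*{b}"] = row[a] * row[b]
    return expanded


# Hypothetical customer with two numeric inputs.
example = expand_features({"age": 40.0, "income": 3.0})
```

With ten inputs, this expansion produces 10 squared terms and 45 pairwise products, which is why variable selection usually accompanies it.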
Decision Tree
Split search - binary, multiway, L-way splits
Splitting Criteria - Gini index, Pearson
chi-square, entropy
Stopping Rule - Postpruning, Prepruning
Leaf Size:5, Max Branch:2, Max Depth:6
Min Categorical levels:5, Number of Rules:5
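To make two of the splitting criteria listed above concrete, here is a small sketch that computes the Gini index and entropy from class counts and scores a candidate split by its impurity reduction. It illustrates the textbook definitions only and is not the software's internal implementation; the function names and counts are hypothetical.

```python
import math


def gini(counts):
    """Gini index of a node, given the number of cases in each class."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def entropy(counts):
    """Entropy (in bits) of a node, given the number of cases in each class."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def split_gain(parent, children, impurity):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    total = sum(parent)
    weighted = sum(sum(child) / total * impurity(child) for child in children)
    return impurity(parent) - weighted


# Hypothetical node with 5 buyers and 5 non-buyers, split perfectly in two.
parent_counts = [5, 5]
child_counts = [[5, 0], [0, 5]]
gini_gain = split_gain(parent_counts, child_counts, gini)
entropy_gain = split_gain(parent_counts, child_counts, entropy)
```

A split search evaluates this gain for every candidate split and keeps the largest; the leaf size, branch, and depth limits above bound how far that search goes.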
Methods
Neural Networks
Model selection criteria: Profit/loss,
Misclassification Rate, Average Error
Architecture: GLIM, MLP, ORBFEQ,
NRBFEH, NRBFEW, NRBFEQ
Direct Connection: No
Number of Hidden Units: 3
Max Iterations: 20
MODEL SETUP
Confusion Matrix
Lift Chart
ROC Chart
Profit Chart
Model Assessment
Lift Chart
Regression has a lift of
175%
Neural network has a lift
of 310%
Decision tree has a lift of
335%
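The slides do not state the depth (fraction of the file) at which these lifts were read, so the sketch below treats it as a parameter. It uses the usual definition of lift: the response rate among the top-scored customers divided by the overall response rate. The function name and the sample data are hypothetical.

```python
def lift_at_depth(scores, labels, depth):
    """Lift for the top `depth` fraction of cases ranked by model score.

    `labels` holds 1 for a buyer and 0 for a non-buyer.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(len(ranked) * depth)))
    top_rate = sum(label for _, label in ranked[:n_top]) / n_top
    base_rate = sum(labels) / len(labels)
    return top_rate / base_rate


# Hypothetical scores for ten customers, three of whom are buyers.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
lift_top_20 = lift_at_depth(scores, labels, depth=0.2)
```

In this toy data both customers in the top 20% are buyers against a 30% base rate, so the lift is about 3.33, which would be reported as 333%.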
Model Assessment
R-square and Average Square Error are better
measures of model fit than the F test.

Tool            Target  Target Event  Root ASE    Valid: Root ASE  Test: Root ASE
Neural Network  buyer   1             0.4198573   0.375608677      0.377804163
Regression      buyer   1             0.42919849  0.390205283      0.359309433
Tree            buyer   1             0.29295261  0.305571021      0.341392826

Decision Tree outperforms the Logistic Regression
and Neural Network models, with smaller
average squared error and a smaller
misclassification rate.

Tool            Target  Target Event  Misclassification Rate  Valid: Misclassification Rate  Test: Misclassification Rate
Neural Network  buyer   1             0.230769231             0.177570094                    0.175925926
Regression      buyer   1             0.237762238             0.177570094                    0.166666667
Tree            buyer   1             0.097902098             0.112149533                    0.148148148
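As a reminder of what the two statistics compared here measure, the sketch below computes root average squared error and misclassification rate from predicted probabilities and actual outcomes. The 0.5 default cutoff and the sample values are assumptions for illustration and are not taken from the models in this deck.

```python
import math


def root_ase(probs, labels):
    """Square root of the average squared error between prediction and outcome."""
    squared_errors = [(p - y) ** 2 for p, y in zip(probs, labels)]
    return math.sqrt(sum(squared_errors) / len(labels))


def misclassification_rate(probs, labels, cutoff=0.5):
    """Share of cases whose predicted class (prob >= cutoff) differs from the outcome."""
    wrong = sum((p >= cutoff) != bool(y) for p, y in zip(probs, labels))
    return wrong / len(labels)


# Hypothetical predictions for four customers (1 = buyer, 0 = non-buyer).
probs = [0.9, 0.2, 0.6, 0.4]
labels = [1, 0, 0, 1]
example_root_ase = root_ase(probs, labels)
example_misclassification = misclassification_rate(probs, labels)
```

Lower is better for both; computing them separately on training, validation, and test data shows whether a model's advantage holds up out of sample.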
Validation:
Predicted probability is 90% versus a
random probability of 44% with a cutoff
point of 0.55; the decision tree model has a
lift of 335%, which means it is more than 3
times as accurate as targeting with no
model.
Optimization:
C = Fixed Cost + Per Customer Cost * Mailing Rate * Population Size
E = Expected Num. of Buyers * Avg. Cust. Value
  = (Lift * Mailing Rate) * Avg. Cust. Value
Max Profit = E - C or ROI = E / C
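The cost, expected-value, and profit formulas above can be worked through as a short calculation. This is a sketch with made-up campaign figures; the expected number of buyers is passed in directly, because the slide does not spell out how lift and mailing rate are scaled to a buyer count. All function names and numbers are hypothetical.

```python
def campaign_cost(fixed_cost, per_customer_cost, mailing_rate, population_size):
    """C = Fixed Cost + Per Customer Cost * Mailing Rate * Population Size."""
    return fixed_cost + per_customer_cost * mailing_rate * population_size


def expected_value(expected_buyers, avg_customer_value):
    """E = Expected Num. of Buyers * Avg. Cust. Value."""
    return expected_buyers * avg_customer_value


def profit_and_roi(cost, value):
    """Return (E - C, E / C) for a campaign."""
    return value - cost, value / cost


# Hypothetical campaign: mail 20% of 10,000 customers.
cost = campaign_cost(fixed_cost=1000.0, per_customer_cost=0.5,
                     mailing_rate=0.2, population_size=10000)
value = expected_value(expected_buyers=300, avg_customer_value=20.0)
profit, roi = profit_and_roi(cost, value)
```

Repeating the calculation over a range of mailing rates, with the expected buyers taken from the model's lift at each depth, is one way to look for the rate that maximizes profit.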
Conclusion
Ting Millette
Wachovia Bank
831-277-1276
Ting.millette@wachovia.com
Questions?