
SAS Analytics:

The Power to
Deliver Profitable
Business Results
Analytics Consulting
SAS Institute – Darius Baer, Jim Hornell, &
Ross Bettinger

Copyright © 2004, SAS Institute Inc. All rights reserved. April 12, 2005
SAS is a registered trademark or trademark of SAS Institute Inc. in the USA and other countries. ® indicates USA registration.
Other brand and product names are registered trademarks or Trademarks of their respective companies
Beyond BI with SAS Analytics

Objective

 Discuss the value of analytics as part of
the solution to business problems
 Demonstrate two examples of using
analytics to solve business problems



Agenda
 Overview
• Why Analytics?
• Business Problems that can be addressed with analytics
• Analytic approaches to solving business problems
• Introduction to the two examples

 Marketing Performance Optimization / Trade Promotion Optimization
 Bank Call Center Text Mining
 Conclusion
Volumes of Data – How to Extract
Maximum Utility
[Figure: progression from Data to Information, Knowledge, and Intelligence,
supporting Operational Decisions. ETL (Sums and Means) gives Hindsight; OLAP
(Drilldown) gives Insight; Advanced Analytics (Statistical Predictions) gives
Foresight.]
 Exponential growth of corporate data and computing power in the
past two decades
• ETL with sums and means provides hindsight from corporate measurements
• OLAP with drilldown provides insight from the ETL data warehouse
• Only advanced analytics with statistical predictions provides foresight from
the ETL data warehouse
 Data Availability + Computing Power + Advanced Analytics →
Competitive Advantage and Best Decisions
Interpreting the Variability of a
Population
 Means are useful. Understanding the distribution around the
mean and what contributes to that distribution is essential to
compare populations and make predictions

 Statistical techniques “predict” the future by apportioning
variance in the population to explanatory variables
 As sales change over time in a well-defined pattern, future sales
can be predicted
 If the likelihood of buying a product is associated with
demographic characteristics, then we can predict how likely a
particular individual is to buy that product
 With a goal of maximum profits and knowing constraints within
which a company operates, we can solve a series of linear (or
non-linear) equations to obtain an optimal solution
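
As a minimal sketch of the first two points above (not taken from the presentation), the Python snippet below fits a straight-line trend to monthly sales by ordinary least squares, reports the share of variance that the trend explains (R squared), and projects the next month. The sales figures and the function names fit_line and r_squared are hypothetical and chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys that the fitted line accounts for."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_resid / ss_total


# Hypothetical monthly sales (illustrative numbers, not from the slides).
months = [1, 2, 3, 4, 5, 6]
sales = [12.0, 14.0, 15.0, 19.0, 20.0, 22.0]

slope, intercept = fit_line(months, sales)
explained = r_squared(months, sales, slope, intercept)
forecast_month_7 = intercept + slope * 7

print(f"trend: {slope:.2f} per month, R^2 = {explained:.3f}")
print(f"forecast for month 7: {forecast_month_7:.1f}")
```

The unexplained remainder (1 minus R squared) is the variance that other explanatory variables, such as demographics or promotions, would have to account for.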
The Problem Defines the
Solution
 Business executives and analysts have always made
operational decisions
• Intuition and experience can be used
• Sums and means can provide an historical direction
• OLAP and drilldown can provide a better or more detailed perspective
• Only advanced analytics can provide a sophisticated point of view on the
future of the business

 The problem provides processes and parameters that must be
addressed by the solution
• How would you make the business decision if you did not have advanced
analytics?
• How can you structure your analysis to follow that process and use those
parameters?



Problem Defines Solution –
Example 1
 Railroad must have efficient schedules to move freight
• Before computers, colored strings on a bulletin board were used – time on the
X-axis and distance on the Y-axis
• Constraints included no crossing of trains except at sidings and stations
 With computers, the business analyst could manipulate the trains
and visualize on the screen
• However, there was no guarantee of a “best” decision that produced optimal
usage of the tracks to move the most freight in the minimum amount of time
 With analytics, one takes the problem and goal as stated above
• The trains impose constraints such as:
− Minimum and maximum departure and arrival times
− Minimum and maximum speeds
− Departure and arrival stations
− Available routes
• The schedule is solved for using an operations research (OR) algorithm
with PROC NETFLOW and displayed visually on a screen
• The user can interact with the analytic result and modify it as desired
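
PROC NETFLOW solves network flow problems. The slides do not show the actual model, so the following is only a sketch of the simplest special case: routing one train (one unit of flow) over a track network at minimum total travel time, which reduces to a shortest-path problem. It is written in Python rather than SAS, uses Dijkstra's algorithm, and the station names, travel times, and the function shortest_route are all hypothetical.

```python
import heapq


def shortest_route(graph, source, target):
    """Dijkstra's algorithm: cheapest path for a single unit of flow."""
    best = {source: 0.0}
    previous = {}
    heap = [(0.0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == target:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry, a cheaper path was already found
        for neighbor, hours in graph.get(node, {}).items():
            new_cost = cost + hours
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                previous[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if target not in best:
        return None, float("inf")
    path = [target]
    while path[-1] != source:
        path.append(previous[path[-1]])
    return path[::-1], best[target]


# Hypothetical track segments with travel times in hours.
TRACKS = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"D": 5.0},
    "C": {"B": 1.0, "D": 8.0},
    "D": {},
}

route, hours = shortest_route(TRACKS, "A", "D")
print(route, hours)
```

A real schedule adds the constraints listed above (time windows, speed limits, no crossing except at sidings), which is why a general optimizer is used instead of a single shortest-path search.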
Problem Defines Solution –
Example 2
 Herbicide producer wants to deliver time-sensitive herbicide to
farmers immediately prior to the planting of the corn
• Chemical company uses hindsight as to when the farmers planted the corn in
previous years
• Business experts also have a “sense” for whether the planting will be earlier or
later than previous years
 Since the problem is to know beforehand when the farmers will
plant their corn → Go visit the farmers!
• Farmer walks out of house in the morning and sticks wet finger in air to gauge
temperature, kicks dirt to gauge moisture, and looks over horizon to see if
neighbors are planting their corn.
 With analytics, one takes the problem and understands the process
• A linear regression approach was used in each of 98 agricultural districts
with the following inputs:
− Daily temperatures, combined as necessary into groups of days
− Precipitation amounts, grouped as appropriate
− Records of previous years' plantings
• Each year and each district provide a regression equation
• A model selection approach yielded a limited set of predictive
equations for the current year, with forecasts within 2-3 days for
95 out of the 98 districts
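
The model selection step can be sketched as follows. This is an illustration only, not the procedure used in the engagement: for one hypothetical district it fits a one-variable regression for each candidate input and keeps the input with the smallest leave-one-year-out forecast error. All data values and names (fit_line, leave_one_out_error, the temperature and precipitation series) are invented for the example.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_error(xs, ys):
    """Mean absolute error when each year is forecast from the other years."""
    errors = []
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(abs(ys[i] - (intercept + slope * xs[i])))
    return sum(errors) / len(errors)


# Hypothetical history for one district (six years, invented values).
candidates = {
    "temperature": [48.0, 52.0, 50.0, 55.0, 47.0, 53.0],   # spring mean, deg F
    "precipitation": [3.1, 2.0, 4.2, 2.8, 3.6, 2.5],       # spring total, inches
}
planting_day = [128.0, 120.0, 125.0, 115.0, 131.0, 119.0]  # day of year

errors = {name: leave_one_out_error(xs, planting_day)
          for name, xs in candidates.items()}
best_input = min(errors, key=errors.get)

for name, err in errors.items():
    print(f"{name}: mean absolute error {err:.2f} days")
print("selected input:", best_input)
```

Scoring each candidate on years it was not fitted to is one way to keep the "limited set of predictive equations" from simply memorizing past seasons.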
Analytic approaches to solving
business problems
 The best solutions often combine a number of analytic techniques
(as necessary) with business rules that also constrain the solution
 SAS/OR – Finds the optimal solution in a system of constraints
 Enterprise Miner – Predictive modeling, e.g., which customers
are most profitable and/or most likely to respond to an offer
 ETS and HPF – Forecasting, e.g., what are the future sales or
demand based on history and other related factors
 SAS/STAT – Regression, ANOVA, Factor Analysis – how can
we explain the largest amount of variance using statistical
techniques
Business Cases
 Marketing Performance Optimization /
Trade Promotion Optimization
• Understand and predict the ROI on promotions, advertising
and other mass marketing tactics
• What’s the optimum mix of marketing tactics?

 Bank Call Center Text Mining


• Explore use of text mining to add value to Bank modeling
efforts to predict attrition
• Analyze call center comments for additional lift in predicting
attrition from primary accounts



– MPO/TPO –
Marketing Performance
Optimization
Trade Promotion Optimization
Jim Hornell
Analytical Consultant

April 12, 2005


“Half of my
advertising is wasted;
I just don’t know
which half.”

-- John Wanamaker, retail pioneer in the late 1800s



Questions, with historically few
answers
 Marketers have tried – for years –
to understand and predict the ROI
on promotions, advertising and
other mass marketing tactics
• How much does each marketing
tactic contribute?
• What is the effect of events and
activities I cannot control?
• What is the “right” level of spend?
Overall? By tactic?
• How do seasonality and geography
affect results?
• What’s the optimum mix of
marketing tactics?



“The transformation of TPM [Trade Promotion
Modeling], in conjunction with MMM [Market Mix
Modeling], from a tactical to a more overarching
and encompassing strategic function is well on
the way.
At this very moment…the question of full
functionality is less of an ‘if’, but ‘when.’”

-- Michael Forhez and Charlie Chase, in ‘Consumer Goods Technology’, March 2005.



The “When” is Now
 MPO/TPO is designed to:
• Calculate the business impact of
multiple marketing channels.
− In isolation
− In combination
• Consider any and all potential variables
- controllable and uncontrollable
• Allow for changes in variables and
desired outcomes with minimal effort
• Predict future business outcomes
based on specific marketing mix and
promotional scenarios
• Provide the platform for marketing mix
optimization



Standard solutions vs.
MPO/TPO
Standard Solutions          MPO/TPO
 Analytic shortcomings     Calculates impact of multiple variables – alone and in combination
 Too fragile               Analytic framework exists – extremely robust
 Inflexible                Change input and target variables as needed
 Fixed in time             Accounts for changing marketplace activity
 Not forward looking       Designed to be forward looking – predicts future outcomes



The MPO/TPO Offering
 Foundational elements include:
• Flexible data model
• Model automation procedures
• User interface elements
− Interactive
− Web based
• Executable Master Marketing and Promotional Plan
• Marketing campaign scenario forecasts to test effectiveness and
cross product cannibalism

 Customized elements include:


• Client-specific data inventory
• Coverage of client specific markets and segments
• Coverage of client specific products
• Customized interface reflecting client needs
Sample Variables for a Financial
Client
 The MPO/TPO offering considers the effect of multiple variables,
across multiple geographies, on marketing performance
• Product transaction data
• Advertising data
• Promotion data
• Direct marketing data
• Econometric data
• Demographic composition and segment distribution
• Share of market
• Share of voice
• PR activity
• Event / sponsorship activity
• Distribution data
• Brand data



Sample Variables for a CPG
Client
 The MPO/TPO offering considers the effect of multiple
variables, across multiple distributors, on trade
promotion performance
• Syndicated data (AC Nielsen, IRI)
• Shipment and Order history
• Promotion calendars
• Fund allocations
• Pricing
• Brand/category/market development index



The User Interface



Accesses the Modeling
Procedure
 Assimilates past business
history using:
• Singular Value
Decomposition
• Linear regression with
Lagged Values
• Dynamic Neural Network
Modeling
 By correlation rather than
causal modeling
 Resulting in Week by
Week Forecasts over your
planning horizon.
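
To illustrate "linear regression with lagged values" driven by correlation, here is a small Python sketch. It is not the MPO/TPO modeling procedure. It searches a few candidate lags for the one at which weekly spend correlates most strongly with sales, then fits a line at that lag and forecasts the next week. The spend series is invented, and the sales series is deliberately constructed to respond two weeks after spend so that the behavior is easy to follow.

```python
def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def lagged_pairs(driver, response, lag):
    """Pair driver[t - lag] with response[t] for every week t >= lag."""
    return driver[: len(driver) - lag], response[lag:]


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical weekly ad spend; sales are built to react two weeks later.
spend = [10, 0, 30, 5, 20, 0, 15, 25, 0, 10, 35, 5]
sales = [100, 100] + [100 + 2 * s for s in spend[:-2]]

# Try lags of one to three weeks and keep the most strongly correlated one.
correlations = {lag: pearson(*lagged_pairs(spend, sales, lag))
                for lag in range(1, 4)}
best_lag = max(correlations, key=correlations.get)

slope, intercept = fit_line(*lagged_pairs(spend, sales, best_lag))
# Next week's forecast uses the spend already observed best_lag weeks before it.
next_week_sales = intercept + slope * spend[len(spend) - best_lag]
print(best_lag, round(next_week_sales, 1))
```

Because the lag is chosen by correlation alone, the fit describes association in the history and makes no causal claim, which matches the caveat on the slide.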



Which Links Business Results to
Advertising and Promotional Expenditures
Market         Volume    Lift        Incremental Volume Lift
New York        9,500    2,400/pt    1,322/pt
Boston         41,378    5,200/pt    2,676/pt
Philadelphia   42,855    2,150/pt      641/pt
[Chart: Sales Volume Lift vs. Spend, with Lift Rate on the vertical axis.
Moving away from a growing condition towards a plateau condition:
Incremental Lift = 0]

Insight: Different market areas demonstrate varying upside ad potential
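
The difference between lift per point and incremental lift per point can be sketched with a saturating response curve. The exponential form, the ceiling, and the rate below are assumptions made purely for illustration; the slides do not state the functional form used by MPO/TPO.

```python
import math


def volume_lift(points, ceiling, rate):
    """Hypothetical saturating response: lift approaches `ceiling` as spend grows."""
    return ceiling * (1.0 - math.exp(-rate * points))


def average_lift_per_point(points, ceiling, rate):
    """Total lift divided by the points spent so far."""
    return volume_lift(points, ceiling, rate) / points


def incremental_lift_per_point(points, ceiling, rate):
    """Extra lift gained from one more point at the current spend level."""
    return volume_lift(points + 1, ceiling, rate) - volume_lift(points, ceiling, rate)


CEILING, RATE = 50_000.0, 0.05  # illustrative values only

for pts in (10, 50, 200):
    avg = average_lift_per_point(pts, CEILING, RATE)
    inc = incremental_lift_per_point(pts, CEILING, RATE)
    print(f"{pts:>3} pts: average {avg:8.1f}/pt, incremental {inc:8.1f}/pt")
```

On such a curve the incremental figure is always below the average and shrinks toward zero at the plateau, so a market whose incremental lift is still high has more upside for additional spend.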



The Value Equation
Estimated Benefits
Assumptions:
 Marketing Trade Spend: $100,000,000 (held constant)
 % of Marketing and Promotions Impacted: 25% – 35% – 45%
 10-20% improvement based on prior client experience
[Chart: Improvement in Trade Spend ($MM, axis from $0 to $10) vs.
% Improvement (10%, 15%, 20%), one series per % of Impacted Promos
(25%, 35%, 45%)]
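
The chart values are not given numerically, but they appear consistent with a simple product: spend times the share of promotions impacted times the percentage improvement. Under that assumption (an inference from the axis range, not a formula stated on the slide), the grid can be reproduced with a few lines of Python. The function name estimated_benefit_mm is hypothetical.

```python
TRADE_SPEND_MM = 100.0  # $100,000,000 marketing trade spend, held constant


def estimated_benefit_mm(share_impacted, improvement):
    """Assumed benefit in $MM: spend x share of promos impacted x improvement."""
    return TRADE_SPEND_MM * share_impacted * improvement


for share in (0.25, 0.35, 0.45):
    row = [estimated_benefit_mm(share, imp) for imp in (0.10, 0.15, 0.20)]
    print(f"{share:.0%} impacted: " + ", ".join(f"${b:.2f}MM" for b in row))
```

Under this reading the estimated benefit ranges from $2.5MM (25% impacted, 10% improvement) to $9MM (45% impacted, 20% improvement), which fits within the $0 to $10MM axis.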



Delivery and Implementation
 SAS Software Foundation and Analytics
 Consulting for customization to business needs
• Requirements
− Client data access
− Customized analytics
− Customized reporting
• Design
• Customized Development
• Testing, Documentation, and Installation

 With Domain Partners


• THMG, Thompson Hill Marketing Group
• CSC, Computer Sciences Corporation



The Commencement of a New
Era
 Advertising and promotional spending is coming
under increased scrutiny
 Getting the spend “right” is a complex problem
 More and more data are available
• Robust data management, sophisticated
modeling, and content expertise are ‘must haves’
to predict results and optimize spending
 SAS has assembled the right software, partners, and
experience to make this work
 Questions??



Bank Call Center
Text Mining
Ross Bettinger
Analytical Consultant

April 12, 2005


How Can Text Mining Add
Value?
Text mining can reveal hidden concepts not previously known
 Clusters of terms may contain information about a customer’s
behavior unavailable from structured data
 Information content in clusters can be used to inform business
decisions
• Warranty: Do I see a trend of product failures from customer comments?
• Surveys: What do employees say about the reorganization? How do we use
that information to improve employee productivity?
• Medical: Are the proper medications being prescribed for patients based on their
verbal statements to the doctor?
• Insurance: What are the characteristics of fraudulent claims based on the text on
the claim?
• Call Center: Do I have enough drop-down categories to cover the information I
get from the free-form fields?
• Marketing: What are my customers thinking? What are their wants and needs?
Objective

 Explore use of text mining to add value to
Bank modeling efforts to predict attrition
• Loss of deposits → less money to loan at interest →
adverse impact on Bank’s profits
 Analyze call center comments for additional lift in
predicting attrition from primary accounts
• Information in unstructured text may add significant
value to model performance when combined with
“traditional” data mining practices



Agenda

 Discuss SEMMA methodology to build predictive
attrition models
• Sample, Explore, Modify, Model, Assess
 Discuss results of exploratory data analysis to
justify sampling approach
• Unusual properties of Bank call center data require
creativity
 Build data mining (DM) and text mining (TM) models
• Compare individual DM and TM models with the combined DM + TM model



Sampling

 Bank call center data collected from May, 2003-June, 2003
(Numbers altered for confidentiality)
• 900,000 records at account level supplied to SAS
• Chose existing primary customers (750,000 records)
• Multiple calls per account required consolidation of data and
comments to single account-level observation
− After consolidation:
    600,000 accounts in good standing
      9,000 voluntary attritors (1.47% attrition rate)
      4,500 involuntary attritors (0.73% attrition rate)
    ------------
    613,500 accounts used in analysis



Exploratory Data Analysis

 Findings
• Attritions are a “rare event” (voluntary attrition rate = 1.47%)
• Significant imbalance in comments
− 40% Blank, 30% Direct Mail

• Strong concentration of comments into few classes will affect
performance of text mining models



EDA (continued)
 Observe similar distribution of comments among voluntary
attritors and nonattritors
 Since the distribution of blank and “Direct Mail” comments is similar
in both groups, we will assume that these two kinds of comments may be
removed without affecting the analysis so that other
comments may “speak”



EDA (Text Mining Node)

 Using complete data produced two clusters
• 20% sample of voluntary attritors, good accounts
− Blank comment
− Mostly “Direct Mail” terms
 Omitting blank and “Direct Mail” comments eliminates
imbalance in comments, reveals more clusters (20% sample)
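
The filtering step, and the way free text becomes structured input, can be sketched in Python as follows. This is a simplified stand-in for the Text Mining node, not its implementation: it drops blank and "Direct Mail" comments, then builds a plain term-count matrix from what remains. The sample comments and the helper names is_informative and tokenize are invented.

```python
import re
from collections import Counter


def is_informative(comment):
    """Drop blank comments and the dominant 'Direct Mail' class."""
    cleaned = comment.strip().lower()
    return cleaned not in ("", "direct mail")


def tokenize(comment):
    """Lowercase the comment and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", comment.lower())


# Hypothetical call center comments.
comments = [
    "",
    "Direct Mail",
    "customer moving, closing account",
    "Direct Mail",
    "   ",
    "fee dispute, closing account",
    "new baby, wants savings account",
]

kept = [c for c in comments if is_informative(c)]
term_totals = Counter(term for c in kept for term in tokenize(c))
vocabulary = sorted(term_totals)

# One row per kept comment, one column per term: structured data for modeling.
matrix = [[tokenize(c).count(term) for term in vocabulary] for c in kept]

print(len(comments), "comments,", len(kept), "kept")
print(term_totals.most_common(3))
```

With the two dominant classes removed, the remaining rows differ from one another enough for clustering to separate them, which is the effect the slide describes.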



Modify

 Perform “optimal binning” of interval variables with
respect to the target variable to change them into ordinal
variables
• Represent each continuous variable as a set of ordered indicator
variables to better concentrate the target variable into a small
number of bins
• Variables Age_Yrs, Cust_Tenure_Mo, N_Phone_Calls were
transformed
− For example, Age_Yrs was binned into the following intervals:
0-24, 24-38, 38-75, 75+
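
Applying the Age_Yrs bins can be sketched in Python as below. The sketch only applies the cut points listed above; it does not reproduce the search that chose them with respect to the target. Two details are assumptions: intervals are treated as closed on the left (an age of exactly 24 falls in 24-38), and "ordered indicator variables" are read as cumulative (thermometer) indicators.

```python
from bisect import bisect_right

AGE_EDGES = [24, 38, 75]                         # cut points from the slide
AGE_LABELS = ["0-24", "24-38", "38-75", "75+"]


def age_bin(age_yrs):
    """Ordinal bin index 0..3; intervals assumed closed on the left."""
    return bisect_right(AGE_EDGES, age_yrs)


def ordered_indicators(age_yrs):
    """Cumulative indicators: position i is 1 when the age reaches edge i."""
    index = age_bin(age_yrs)
    return [1 if index > i else 0 for i in range(len(AGE_EDGES))]


for age in (10, 24, 30, 80):
    print(age, AGE_LABELS[age_bin(age)], ordered_indicators(age))
```

The cumulative encoding keeps the ordering of the bins visible to the model, unlike one-of-n dummy variables, which treat the bins as unordered categories.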



Model

 Modeled voluntary attrition to predict who would
deliberately close account
 Partitioned data
• 50% Training / 25% Validation / 25% Test (Holdout)
 Built stratified models based on voluntary attrition
• Used all voluntary attritors (N=9,000), randomly-selected
nonattritors (N=9,000)
• Data Mining model (no text-based information)
• Text Mining model (only text-based information)
• Hybrid Data + Text Mining model
− structured data + structured text-based information
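
The sampling and partitioning described above can be sketched in Python as follows. It is scaled down (100 attritors instead of 9,000) and is one plausible reading of the slide, not the Enterprise Miner configuration: all attritors are kept, an equal number of nonattritors is drawn at random, and each class is split 50/25/25 separately so that every partition stays balanced. Record identifiers, the seed, and the function split_50_25_25 are hypothetical.

```python
import random


def split_50_25_25(records, rng):
    """Shuffle a copy of the records and cut it 50% / 25% / 25%."""
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_train = len(shuffled) // 2
    n_valid = len(shuffled) // 4
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_valid],
            shuffled[n_train + n_valid:])


rng = random.Random(2005)  # fixed seed so the sketch is repeatable

# Scaled-down hypothetical accounts: (account id, attrition flag).
attritors = [(f"A{i}", 1) for i in range(100)]
nonattritors = [(f"N{i}", 0) for i in range(5000)]

# Keep every attritor and an equal-sized random sample of nonattritors.
sampled_nonattritors = rng.sample(nonattritors, len(attritors))

parts = {"train": [], "validation": [], "test": []}
for group in (attritors, sampled_nonattritors):
    for name, chunk in zip(parts, split_50_25_25(group, rng)):
        parts[name].extend(chunk)

for name, rows in parts.items():
    print(name, len(rows), "records,", sum(flag for _, flag in rows), "attritors")
```

Because the modeling sample is balanced 1:1, scores from such a model overstate the true attrition probability and are usually adjusted back to the population rate before use.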



Assess

 Results for test (holdout) dataset
Model    Node   Misclas   AUC     Lift
DM       NN     .3808     .6632   1.56
TM       Tree   .4135     .5884   1.28
Hybrid   NN     .3840     .6578   1.62   (Best Model)
− Misclas is misclassification rate
− AUC is area under ROC curve
− Lift is top 5% lift
 Hybrid model has similar misclassification rate, AUC as
DM model but higher lift
 Conclude that combining DM + TM provides strongest
performance in predicting voluntary attrition
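
For readers who want to see how the three measures in the table are defined, here is a small Python sketch that computes them from model scores. The labels and scores are invented and unrelated to the Bank results, and the 0.5 classification threshold is an assumption.

```python
def misclassification_rate(y_true, scores, threshold=0.5):
    """Share of records whose thresholded prediction disagrees with the label."""
    wrong = sum((s >= threshold) != bool(y) for y, s in zip(y_true, scores))
    return wrong / len(y_true)


def auc(y_true, scores):
    """Area under the ROC curve: chance that a positive outscores a negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def top_fraction_lift(y_true, scores, fraction=0.05):
    """Response rate in the top-scored fraction divided by the overall rate."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))
    top_rate = sum(y for _, y in ranked[:k]) / k
    base_rate = sum(y_true) / len(y_true)
    return top_rate / base_rate


# Hypothetical holdout set: 20 accounts, scores from 1.00 down to 0.05.
scores = [(20 - i) / 20 for i in range(20)]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]

misclas = misclassification_rate(labels, scores)
area = auc(labels, scores)
lift_5 = top_fraction_lift(labels, scores)
print(f"Misclas {misclas:.4f}  AUC {area:.4f}  Lift {lift_5:.2f}")
```

Lift looks only at the highest-scored accounts, so a model can match another on misclassification rate and AUC and still rank the riskiest few percent better, which is the pattern the table shows for the Hybrid model.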



Applying Results of Text
Mining
 Combine blank, “Direct Mail”, and Text Miner-clustered
comments to determine voluntary attrition “lift”



Applying Results of Text
Mining (cont’d)
 Use cluster membership as “trigger”
• Cluster 3 has lift of 4.59
− Terms:
– Trigger is life cycle event: marriage, birth of
child, buying a home, death, …
• Cluster 5 has lift of 2.37
− Terms:
– Trigger is financial distress: bankruptcy
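
Cluster lift here is taken to mean the attrition rate inside a cluster divided by the overall attrition rate. The slides do not give the cluster sizes, so the counts below are hypothetical, and the trigger threshold of 2.0 is an arbitrary illustrative choice. Only the overall totals (9,000 voluntary attritors among 613,500 accounts) come from the Sampling slide.

```python
def cluster_lift(cluster_attritors, cluster_size, total_attritors, total_accounts):
    """Attrition rate inside the cluster relative to the overall rate."""
    cluster_rate = cluster_attritors / cluster_size
    overall_rate = total_attritors / total_accounts
    return cluster_rate / overall_rate


OVERALL = {"total_attritors": 9_000, "total_accounts": 613_500}

# Hypothetical clusters: (voluntary attritors, accounts in cluster).
clusters = {"cluster_a": (60, 1_000), "cluster_b": (15, 1_000)}

lifts = {name: cluster_lift(attritors, size, **OVERALL)
         for name, (attritors, size) in clusters.items()}

# Flag clusters whose lift meets an (assumed) trigger threshold.
TRIGGER_THRESHOLD = 2.0
triggers = [name for name, value in lifts.items() if value >= TRIGGER_THRESHOLD]

for name, value in lifts.items():
    print(f"{name}: lift {value:.2f}")
print("trigger clusters:", triggers)
```

A lift of 4.59, as reported for Cluster 3, means accounts whose comments fall in that cluster attrite at about 4.6 times the overall rate.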



Applying Results of Text
Mining (cont’d)
 Combine blank, “Direct Mail”, and Text Miner-clustered
comments to determine involuntary attrition “lift”



Concept Linking

 Which terms are related to “dep”?



Value Proposition

 Use Enterprise Miner to extract information from
“structured” data
 Use Text Miner to turn “unstructured” text into
“structured” data for “traditional” data mining
 Used together, Enterprise Miner and Text Miner give you an
unbeatable combination for business advantage



Conclusion
 Hindsight with ETL and Sums & Means is Good
• Important to get a view into your data
 Insight with OLAP and Drilldown is Better
• You obtain a better sense of where your business is now
and at whatever level of summary or detail you want
 Foresight with Analytics is Best
• You gain confidence about where your business is going in
the future so that you can take appropriate action now to be
prepared.

Beyond BI with SAS Analytics

