Chapter 11
Multiple Regression and Model
Building
Learning Objectives
1. Explain the Linear Multiple Regression Model
2. Describe Inference About Individual Parameters
3. Test Overall Significance
4. Explain Estimation and Prediction
5. Describe Various Types of Models
6. Describe Model Building
7. Explain Residual Analysis
8. Describe Regression Pitfalls
Types of
Regression Models
[Diagram: regression models divide into simple models (1 explanatory variable) and multiple models (2+ explanatory variables); each kind is either linear or non-linear]
Models With Two or More
Quantitative Variables
Multiple Regression Model
General form:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_k x_k + \varepsilon$
y is the dependent (response) variable
$x_1, x_2, \ldots, x_k$ are the k independent (explanatory) variables
First-Order Model With
2 Independent Variables
Relationship between 1 dependent and 2
independent variables is a linear function
Model
$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$
Assumes no interaction between x1 and x2
Effect of x1 on E(y) is the same regardless of x2
values
Population Multiple
Regression Model
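Bivariate model:
$y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \varepsilon_i$
[Figure: response plane above the (x1, x2) plane with intercept $\beta_0$; the observed y at $(x_{1i}, x_{2i})$ differs from the plane by $\varepsilon_i$]
Response plane: $E(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i}$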
Sample Multiple
Regression Model
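Bivariate model:
$y_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i} + \hat{\varepsilon}_i$
[Figure: fitted response plane above the (x1, x2) plane with intercept $\hat{\beta}_0$; the observed y at $(x_{1i}, x_{2i})$ differs from the plane by $\hat{\varepsilon}_i$]
Response plane: $\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_{1i} + \hat{\beta}_2 x_{2i}$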
No Interaction
E(y) = 1 + 2x1 + 3x2
[Plot: E(y) versus x1, one parallel line for each value of x2]
x2 = 3: E(y) = 1 + 2x1 + 3(3) = 10 + 2x1
x2 = 2: E(y) = 1 + 2x1 + 3(2) = 7 + 2x1
x2 = 1: E(y) = 1 + 2x1 + 3(1) = 4 + 2x1
Effect (slope) of x1 on E(y) does not depend on x2 value
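The parallel lines can be checked numerically. The sketch below evaluates the slide's model E(y) = 1 + 2x1 + 3x2 at x2 = 1, 2, 3; the function names are illustrative and not part of the slides.

```python
def expected_y(x1, x2):
    """E(y) = 1 + 2*x1 + 3*x2, the no-interaction model from the slide."""
    return 1 + 2 * x1 + 3 * x2


def slope_in_x1(x2, step=1.0):
    """Change in E(y) per unit increase in x1, with x2 held fixed."""
    return (expected_y(step, x2) - expected_y(0.0, x2)) / step


# Slope and intercept of the E(y)-versus-x1 line for each x2 on the slide.
slopes = {x2: slope_in_x1(x2) for x2 in (1, 2, 3)}
intercepts = {x2: expected_y(0, x2) for x2 in (1, 2, 3)}

print(slopes)      # the slope is 2.0 for every x2
print(intercepts)  # intercepts 4, 7, 10 match the three lines
```

Only the intercept shifts with x2, which is what "no interaction" means here.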
Parameter Estimation
Regression Modeling
Steps
1. Hypothesize Deterministic Component
2. Estimate Unknown Model Parameters
3. Specify Probability Distribution of Random
Error Term
Estimate Standard Deviation of Error
4. Evaluate Model
5. Use Model for Prediction & Estimation
First-Order Model
Worksheet
Case, i yi x1i x2i
1 1 1 3
2 4 8 5
3 1 3 2
4 3 5 6
: : : :
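As a rough illustration of step 2 (estimating the unknown parameters), the sketch below fits the first-order model by least squares through the normal equations X'Xb = X'y. It uses only the four worksheet rows that are visible; the remaining rows are elided in the worksheet, so the estimates are illustrative only and are not the values in the computer output. The helper names are assumptions, and in practice a library regression routine would normally replace the hand-written solver.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_first_order(y, x1, x2):
    """Least-squares estimates (b0, b1, b2) from the normal equations."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)


# Only the four visible worksheet rows (assumption: the elided rows are ignored).
y = [1, 4, 1, 3]
x1 = [1, 8, 3, 5]
x2 = [3, 5, 2, 6]

b0, b1, b2 = fit_first_order(y, x1, x2)
residuals = [yi - (b0 + b1 * u + b2 * v) for yi, u, v in zip(y, x1, x2)]
print(b0, b1, b2)
```

A quick sanity check on any least-squares fit with an intercept is that the residuals sum to zero and are uncorrelated with each x column.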
2. Slope ($\hat{\beta}_2$)
Number of responses to ad is expected to increase
by .2805 (28.05 responses) for each 1-unit (1,000) increase in
circulation, holding ad size constant
Estimation of $\sigma^2$
$s^2 = \dfrac{SSE}{n - (k + 1)}$
$s = \sqrt{s^2} = \sqrt{\dfrac{SSE}{n - (k + 1)}}$
Calculating $s^2$ and s
Example
You work in advertising for the
New York Times. You want to
find the effect of ad size (sq.
in.), x1, and newspaper
circulation (000), x2, on the
number of ad responses (00), y.
Find SSE, $s^2$, and s.
Analysis of Variance
Computer Output
Source            DF    SS          MS          F        P
Regression         2    9.249736    4.624868    55.44    .0043
Residual Error     3     .250264     .083421
Total              5    9.5
SSE = .250264 (Residual Error SS); $s^2$ = .083421 (Residual Error MS)
$s^2 = \dfrac{SSE}{n - (k + 1)} = \dfrac{.250264}{6 - 3} = .083421$
$s = \sqrt{.083421} = .2888$
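The same arithmetic as a short sketch, using SSE, n, and k from the analysis of variance output above; the function name is illustrative.

```python
import math


def error_variance(sse, n, k):
    """s^2 = SSE / (n - (k + 1)) for a model with k independent variables."""
    return sse / (n - (k + 1))


sse, n, k = 0.250264, 6, 2   # values from the analysis of variance output
s2 = error_variance(sse, n, k)
s = math.sqrt(s2)

print(round(s2, 6), round(s, 4))  # 0.083421 0.2888
```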
Evaluating the Model
Evaluating Multiple
Regression Model Steps
1. Examine variation measures
2. Test parameter significance
Individual coefficients
Overall model
3. Do residual analysis
Variation Measures
Multiple Coefficient of
Determination
Proportion of variation in y explained by all x
variables taken together
$R_a^2 = 1 - \left[\dfrac{n - 1}{n - (k + 1)}\right](1 - R^2)$
Estimation of $R^2$ and $R_a^2$
Example
You work in advertising for the
New York Times. You want to
find the effect of ad size (sq. in.),
x1, and newspaper circulation
(000), x2, on the number of ad
responses (00), y. Find $R^2$ and
$R_a^2$.
Excel Computer Output
Solution
[Excel regression output with $R^2$ and $R_a^2$ highlighted]
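Both measures can be recomputed from the analysis of variance table shown earlier (SSE = .250264, Total SS = 9.5, n = 6, k = 2). The sketch assumes the standard definition $R^2 = 1 - SSE / SS_{Total}$, which the slides do not spell out, so the results are derived here rather than read off the Excel output.

```python
def r_squared(sse, ss_total):
    """Multiple coefficient of determination, assuming R^2 = 1 - SSE / SS(Total)."""
    return 1 - sse / ss_total


def adjusted_r_squared(r2, n, k):
    """R_a^2 = 1 - [(n - 1) / (n - (k + 1))] * (1 - R^2)."""
    return 1 - (n - 1) / (n - (k + 1)) * (1 - r2)


r2 = r_squared(sse=0.250264, ss_total=9.5)
ra2 = adjusted_r_squared(r2, n=6, k=2)

print(round(r2, 4), round(ra2, 4))  # 0.9737 0.9561
```

The adjusted value is smaller because it penalizes the two fitted slopes relative to the six observations.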
Testing Parameters
Inference for an Individual
Parameter
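Confidence Interval
$\hat{\beta}_i \pm t_{\alpha/2}\, s_{\hat{\beta}_i}$, df = n - (k + 1)
Hypothesis Test
$H_0: \beta_i = 0$
$H_a: \beta_i \neq 0$ (or < or >)
Test Statistic
$t = \dfrac{\hat{\beta}_i}{s_{\hat{\beta}_i}}$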
Confidence Interval
Example
You work in advertising for the
New York Times. You want to
find the effect of ad size (sq. in.),
x1, and newspaper circulation
(000), x2, on the number of ad
responses (00), y. Find a 95%
confidence interval for $\beta_1$.
Excel Computer Output
Solution
[Excel regression output with $\hat{\beta}_1$ and $s_{\hat{\beta}_1}$ highlighted]
Confidence Interval
Solution
$.204921 \pm 3.182(.058822)$
$.0177 \le \beta_1 \le .3921$
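A minimal sketch of the interval calculation, using the estimate, standard error, and tabled critical value 3.182 from the solution above; the function name is illustrative.

```python
def confidence_interval(estimate, std_error, t_critical):
    """Return (lower, upper) for estimate +/- t_critical * std_error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin


# Values from the solution slide: estimate, its standard error, tabled t for df = 3.
lower, upper = confidence_interval(0.204921, 0.058822, 3.182)

print(round(lower, 4), round(upper, 4))  # 0.0177 0.3921
```

Because the interval excludes 0, it agrees with rejecting $H_0: \beta_1 = 0$ in a two-tailed test at the .05 level.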
Hypothesis Test Example
You work in advertising for the
New York Times. You want to find
the effect of ad size (sq. in.), x1, and
newspaper circulation (000), x2,
on the number of ad responses
(00), y. Test the hypothesis that the
mean ad response increases as
circulation increases (ad size
constant). Use $\alpha$ = .05.
Hypothesis Test
Solution
$H_0: \beta_2 = 0$
$H_a: \beta_2 > 0$
$\alpha = .05$
df = 6 - 3 = 3
Critical Value(s): reject $H_0$ if t > 2.353 (upper-tail area .05)
Test Statistic:
$t = \dfrac{\hat{\beta}_2}{s_{\hat{\beta}_2}} = \dfrac{.280492}{.068602} = 4.089$
Decision:
Reject at $\alpha$ = .05
Conclusion:
There is evidence the mean ad response increases as circulation increases
Excel Computer Output
Solution
[Excel regression output with $t = \hat{\beta}_2 / s_{\hat{\beta}_2}$ and its P-Value highlighted]
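The one-tailed test for the circulation slope can be sketched as follows. The estimate, standard error, and critical value 2.353 come from the solution; the names are illustrative.

```python
def t_statistic(estimate, std_error):
    """t = estimate / standard error, for testing H0: beta = 0."""
    return estimate / std_error


T_CRITICAL = 2.353  # tabled upper-tail value for alpha = .05, df = 6 - 3 = 3

t = t_statistic(0.280492, 0.068602)
reject_h0 = t > T_CRITICAL  # upper-tailed test, Ha: beta_2 > 0

print(round(t, 3), reject_h0)  # 4.089 True
```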
Testing Overall Significance
Shows if there is a linear relationship
between all x variables together and y
Hypotheses
$H_0: \beta_1 = \beta_2 = \cdots = \beta_k = 0$
No linear relationship
$H_a$: At least one coefficient is not 0
At least one x variable affects y
Testing Overall Significance
Test Statistic
$F = \dfrac{MS(\text{Model})}{MS(\text{Error})}$
Degrees of Freedom
$\nu_1 = k$, $\nu_2 = n - (k + 1)$
k = Number of independent variables
n = Sample size
Testing Overall
Significance Example
You work in advertising for the
New York Times. You want to
find the effect of ad size (sq. in.),
x1, and newspaper circulation
(000), x2, on the number of ad
responses (00), y. Conduct the
global F-test of model
usefulness. Use $\alpha$ = .05.
Testing Overall Significance
Computer Output
Analysis of Variance
Source     DF    Sum of Squares    Mean Square    F Value    Prob>F
Model       2    9.2497            4.6249         55.440     0.0043
Error       3    0.2503            0.0834
C Total     5    9.5000
Model DF = k = 2; Error DF = n - (k + 1) = 3
Mean Square column: MS(Model) = 4.6249, MS(Error) = 0.0834
Testing Overall Significance
Solution
$H_0: \beta_1 = \beta_2 = 0$
$H_a$: At least 1 not zero
$\alpha = .05$
$\nu_1 = 2$, $\nu_2 = 3$
Critical Value(s): reject $H_0$ if F > 9.55 (upper-tail area .05)
Test Statistic:
$F = \dfrac{4.6249}{.0834} = 55.44$
Decision:
Reject at $\alpha$ = .05
Conclusion:
There is evidence at least 1 of the coefficients is not zero
Testing Overall Significance
Computer Output Solution
In the Analysis of Variance output, F Value = 55.440 is the ratio MS(Model) / MS(Error), and Prob>F = 0.0043 is the P-Value
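The global F-test can be reproduced from the mean squares. The sketch uses the unrounded mean squares from the earlier analysis of variance output (4.624868 and .083421), the tabled critical value 9.55, and the reported P-Value; the names are illustrative.

```python
def f_statistic(ms_model, ms_error):
    """Global F statistic: MS(Model) / MS(Error)."""
    return ms_model / ms_error


F_CRITICAL = 9.55   # tabled upper-tail value for alpha = .05 with (2, 3) df
ALPHA = 0.05
P_VALUE = 0.0043    # Prob>F reported in the computer output

f_value = f_statistic(4.624868, 0.083421)

# The two decision rules are equivalent and should agree.
reject_by_critical_value = f_value > F_CRITICAL
reject_by_p_value = P_VALUE < ALPHA

print(round(f_value, 2), reject_by_critical_value, reject_by_p_value)
```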
Interaction Models
Types of
Regression Models
[Diagram: regression models classified by explanatory variable: 1 quantitative variable, 2 or more quantitative variables, or 1 qualitative variable]
[Plot: E(y) versus x1 for the interaction model E(y) = 1 + 2x1 + 3x2 + 4x1x2, one line for each value of x2]
x2 = 0: E(y) = 1 + 2x1 + 3(0) + 4x1(0) = 1 + 2x1
Effect (slope) of x1 on E(y) depends on x2 value
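A small numerical check of that statement, assuming the interaction model E(y) = 1 + 2x1 + 3x2 + 4x1x2 implied by the surviving x2 = 0 line. The values x2 = 1 and x2 = 2 are chosen here for illustration and are not taken from the plot.

```python
def expected_y(x1, x2):
    """E(y) = 1 + 2*x1 + 3*x2 + 4*x1*x2 (interaction model, assumed form)."""
    return 1 + 2 * x1 + 3 * x2 + 4 * x1 * x2


def slope_in_x1(x2):
    """Change in E(y) for a one-unit increase in x1, with x2 held fixed."""
    return expected_y(1, x2) - expected_y(0, x2)


slopes = {x2: slope_in_x1(x2) for x2 in (0, 1, 2)}
print(slopes)  # {0: 2, 1: 6, 2: 10}
```

The slope equals 2 + 4x2, so the lines are no longer parallel once the x1x2 term is present.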
Interaction Model Worksheet
[Computer output with F and P-Value highlighted]
Interaction Test
Solution
$H_0: \beta_3 = 0$
$H_a: \beta_3 \neq 0$
$\alpha = .05$
df = n - (k + 1) = 6 - 4 = 2
Critical Value(s): reject $H_0$ if t < -4.303 or t > 4.303 (area .025 in each tail)
Test Statistic:
$t = \dfrac{\hat{\beta}_3}{s_{\hat{\beta}_3}} = 1.8528$
Decision:
Do not reject at $\alpha$ = .05
Conclusion:
There is no evidence of interaction
Second-Order Models
$E(y) = \beta_0 + \beta_1 x + \beta_2 x^2$
Linear effect: $\beta_1 x$
Second-Order Model
Relationships
[Four plots of y versus x1: two curves with $\beta_2 > 0$ (opening upward) and two with $\beta_2 < 0$ (opening downward)]
Second-Order Model
Worksheet
Case, i yi xi $x_i^2$
1 1 1 1
2 4 8 64
3 1 3 9
4 3 5 25
: : : :
Create $x^2$ column.
Run regression with y, x, $x^2$.
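The "create the squared column" step might look like the sketch below, using the four visible worksheet rows. The function name is illustrative, and the regression itself is left to whatever routine is in use.

```python
def second_order_design(x):
    """Return design rows [1, x, x^2] for the model E(y) = b0 + b1*x + b2*x^2."""
    return [[1, value, value ** 2] for value in x]


# The four visible worksheet rows; the remaining rows are elided in the source.
x = [1, 8, 3, 5]
y = [1, 4, 1, 3]

x_squared = [value ** 2 for value in x]
design = second_order_design(x)

print(x_squared)  # [1, 64, 9, 25], matching the worksheet's squared column
print(design[1])  # [1, 8, 64]
```

Although the model is curved in x, it is still linear in the parameters, so the same least-squares machinery applies once the squared column exists.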
2nd-Order Model Example
[Computer output with F and P-Value highlighted]
$\beta_2$ Parameter Test Solution
$\beta_2$ test indicates curvilinear relationship exists
[Computer output with t and P-Value highlighted]
Second-Order Model With
2 Independent Variables
Relationship between 1 dependent and 2
independent variables is a quadratic
function
Useful 1st model if non-linear relationship
suspected
Model
$E(y) = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i} x_{2i} + \beta_4 x_{1i}^2 + \beta_5 x_{2i}^2$
Second-Order Model
Relationships
[Two surface plots of y over x1 and x2: one with $\beta_4 + \beta_5 > 0$, one with $\beta_4 + \beta_5 < 0$]
Nested Models
Comparing Nested Models
A reduced (nested) model contains a subset of the terms in the complete (full) model
Tests the contribution of a set of x variables to the
relationship with y
Null hypothesis $H_0: \beta_{g+1} = \cdots = \beta_k = 0$
Variables in the set do not significantly improve the
model when all other variables are included
Used in selecting x variables or models
Part of most computer programs
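A sketch of the nested-model comparison, assuming the standard partial F statistic $F = \dfrac{(SSE_R - SSE_C)/(k - g)}{SSE_C / (n - (k + 1))}$, which these slides do not show. Here g is the number of x terms kept in the reduced model. As a check with numbers already in the deck, the complete model is the first-order ad model (SSE = .250264, k = 2, n = 6) and the reduced model keeps only the intercept (g = 0), whose SSE is the Total sum of squares, 9.5.

```python
def nested_f(sse_reduced, sse_complete, n, k, g):
    """Partial F statistic for H0: beta_(g+1) = ... = beta_k = 0.

    k is the number of x terms in the complete model and g the number
    kept in the reduced model; df are (k - g) and n - (k + 1).
    """
    numerator = (sse_reduced - sse_complete) / (k - g)
    denominator = sse_complete / (n - (k + 1))
    return numerator / denominator


# Intercept-only reduced model versus the first-order ad model.
f_value = nested_f(sse_reduced=9.5, sse_complete=0.250264, n=6, k=2, g=0)

print(round(f_value, 2))  # 55.44
```

With g = 0 the comparison reduces to the global F-test, which is why the result matches the F value in the analysis of variance output.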
Selecting Variables
in Model Building
A butterfly flaps its wings in Japan, which causes it
to rain in Nebraska. -- Anonymous
Residual Plot
for Equal Variance
[Figure: two residual plots against x; a fan-shaped pattern indicates unequal variance]
Standardized residuals used typically.
Residual Plot
for Independence
[Figure: two plots of standardized (student) residuals against x]
Regression Pitfalls
Parameter Estimability
Number of different x-values must be at least one
more than order of model
Multicollinearity
Two or more x-variables in the model are correlated
Extrapolation
Predicting y-values outside sampled range
Correlated Errors
Multicollinearity
Extrapolation
[Figure: y versus x; interpolation falls inside the sampled range of x, extrapolation falls on either side of it]
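Two of these pitfalls lend themselves to quick screening checks. The sketch below computes the correlation between x1 and x2 as a simple multicollinearity indicator and flags prediction points outside the sampled range. It uses only the four visible rows of the first-order worksheet, so the numbers are illustrative, and the function names are assumptions.

```python
import math


def pearson_r(a, b):
    """Sample correlation coefficient between two equal-length lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    sab = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    saa = sum((u - mean_a) ** 2 for u in a)
    sbb = sum((v - mean_b) ** 2 for v in b)
    return sab / math.sqrt(saa * sbb)


def is_extrapolation(value, sampled):
    """True when value lies outside the range of the sampled x-values."""
    return not (min(sampled) <= value <= max(sampled))


# Visible worksheet rows only; the remaining rows are elided in the source.
x1 = [1, 8, 3, 5]
x2 = [3, 5, 2, 6]

r = pearson_r(x1, x2)
print(round(r, 2))               # about 0.67 for these four rows
print(is_extrapolation(10, x1))  # True: 10 is above the sampled maximum of 8
print(is_extrapolation(4, x1))   # False: 4 lies inside the sampled range
```

A pairwise correlation is only a first screen; it will not reveal a variable that is nearly a combination of several others.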
Conclusion
1. Explained the Linear Multiple Regression Model
2. Described Inference About Individual Parameters
3. Tested Overall Significance
4. Explained Estimation and Prediction
5. Described Various Types of Models
6. Described Model Building
7. Explained Residual Analysis
8. Described Regression Pitfalls