Chapter Topics
Purpose of Regression Analysis
No Relationship
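$Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$
where $Y_i$ is the dependent (response) variable, $\beta_0$ is the population Y intercept, $\beta_1$ is the population slope coefficient, $X_i$ is the independent (explanatory) variable, and $\varepsilon_i$ is the random error.
Population regression line (conditional mean): $\mu_{Y|X} = \beta_0 + \beta_1 X_i$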
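(continued)
[Figure: the population regression line $\mu_{Y|X} = \beta_0 + \beta_1 X_i$ (conditional mean) plotted against X. An observed value of Y is $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, where $\varepsilon_i$ is the random error.]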
$Y_i = b_0 + b_1 X_i + e_i$
where $b_0$ is the sample Y intercept, $b_1$ is the sample slope coefficient, and $e_i$ is the residual.
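(continued)
$b_0$ and $b_1$ are the values that minimize the sum of the squared residuals:
$\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 = \sum_{i=1}^{n} e_i^2$
$b_0$ provides an estimate of $\beta_0$.
$b_1$ provides an estimate of $\beta_1$.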
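(continued)
[Figure: the sample regression line $\hat{Y}_i = b_0 + b_1 X_i$ and the population regression line $\mu_{Y|X} = \beta_0 + \beta_1 X_i$ plotted against X. For one observed value, $Y_i = b_0 + b_1 X_i + e_i$ and $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$.]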
Interpretation of the Slope and the Intercept
$\beta_1 = \dfrac{\Delta E(Y \mid X)}{\Delta X}$ measures the change in the average value of Y as a result of a one-unit change in X.
Interpretation of the Slope and the Intercept (continued)
$b_0 = \hat{E}(Y \mid X = 0)$ is the estimated average value of Y when the value of X is zero.
$b_1 = \dfrac{\Delta \hat{E}(Y \mid X)}{\Delta X}$ is the estimated change in the average value of Y as a result of a one-unit change in X.
Store   Square Feet   Annual Sales ($1000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
[Figure: scatter diagram (Excel output) of annual sales ($1000) against square feet]
               Coefficients
Intercept      1636.414726
X Variable 1   1.486633657
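The coefficients in the Excel output can be checked by hand. The following is a minimal Python sketch of the least-squares formulas applied to the seven produce stores; the function name `least_squares` and the variable names are illustrative choices, not part of the original slides.

```python
# Produce-store data from the example: square feet and annual sales ($1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]


def least_squares(x, y):
    """Return (b0, b1), the intercept and slope minimizing the sum of squared residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sum of cross-products and sum of squares of X about their means.
    ssxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    ssx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar
    return b0, b1


b0, b1 = least_squares(square_feet, annual_sales)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}")
```

The printed values should agree with the Intercept and X Variable 1 entries above after rounding to three decimals.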
[Figure: scatter diagram with the fitted line $\hat{Y}_i = 1636.415 + 1.487 X_i$; horizontal axis: Square Feet]
Interpretation of Results: Example
$\hat{Y}_i = 1636.415 + 1.487 X_i$
The slope of 1.487 means that for each increase of
one unit in X, we predict the average of Y to
increase by an estimated 1.487 units.
The model estimates that for each increase of one
square foot in the size of the store, the expected
annual sales are predicted to increase by $1,487
(1.487 × $1000, since sales are recorded in thousands of dollars).
Measure of Variation: The Sum of Squares
SST = SSR + SSE
Total sample variability = Explained variability + Unexplained variability
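Measure of Variation: The Sum of Squares (continued)
SST = $\sum_{i=1}^{n} (Y_i - \bar{Y})^2$
SSR = $\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2$
SSE = $\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2$
[Figure: the three deviations shown for one observation at $X_i$, relative to the fitted line and $\bar{Y}$]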
Explanatory Power of Regression
[Figure: overlapping circles labeled Sales and Sizes]
Variations in store sizes not used in explaining variation in sales
Variations in sales not explained by sizes, attributed to the error term (SSE)
Variations in sales explained by sizes, or variations in sizes used in explaining variation in sales (SSR)
ANOVA
             df      SS    MS                  F         Significance F
Regression   p       SSR   MSR = SSR/p         MSR/MSE   P-value of the F test
Residuals    n-p-1   SSE   MSE = SSE/(n-p-1)
Total        n-1     SST
Measures of Variation
The Sum of Squares: Example
Excel Output for Produce Stores
ANOVA
             df   SS            MS          F          Significance F
Regression   1    30380456.12   30380456    81.17909   0.000281201
Residual     5    1871199.595   374239.92
Total        6    32251655.71
The df column gives the regression (explained), error (residual), and total degrees of freedom. The SS column gives SSR, SSE, and SST.
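The ANOVA entries follow directly from the definitions of SST, SSR, and SSE. A minimal Python sketch, assuming the seven produce-store observations from the example and p = 1 explanatory variable (variable names are illustrative):

```python
# Produce-store data from the example: square feet and annual sales ($1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
p = 1  # number of explanatory variables
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

# Least-squares slope and intercept.
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales)) / ssx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * x for x in square_feet]

# Sums of squares: total, explained (regression), unexplained (error).
sst = sum((y - y_bar) ** 2 for y in annual_sales)
ssr = sum((yh - y_bar) ** 2 for yh in y_hat)
sse = sum((y - yh) ** 2 for y, yh in zip(annual_sales, y_hat))

# Mean squares and the F statistic.
msr = ssr / p
mse = sse / (n - p - 1)
f_stat = msr / mse

print(f"SSR = {ssr:.2f}, SSE = {sse:.2f}, SST = {sst:.2f}, F = {f_stat:.3f}")
```

The computed SSR and SSE add up to SST, which is the decomposition SST = SSR + SSE.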
The Coefficient of Determination
$r^2 = \dfrac{SSR}{SST} = \dfrac{\text{Regression Sum of Squares}}{\text{Total Sum of Squares}}$
Explanatory Power of Regression
[Figure: overlapping circles labeled Sales and Sizes]
$r^2 = \dfrac{SSR}{SSR + SSE}$
Coefficients of Determination ($r^2$) and Correlation (r)
[Figure: scatter plots of Y against X, each with the fitted line $\hat{Y}_i = b_0 + b_1 X_i$. Labeled panels: $r^2 = 1$, r = +1; $r^2 = 1$, r = -1; $r^2 = 0$, r = 0.]
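Standard error of the estimate:
$S_{YX} = \sqrt{\dfrac{SSE}{n-2}} = \sqrt{\dfrac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{n-2}}$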
Measures of Variation: Produce Store Example
Excel Output for Produce Stores
Regression Statistics
Multiple R          0.9705572
R Square            0.94198129   ($r^2$ = .94)
Adjusted R Square   0.93037754
Standard Error      611.751517   ($S_{YX}$)
Observations        7
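The regression statistics can be recovered from the sums of squares in the ANOVA output. A short Python sketch using only the SSR, SSE, and SST values reported by Excel for the produce stores; the adjusted $r^2$ line uses the standard formula for one explanatory variable, which the slides do not spell out.

```python
import math

# Sums of squares from the Excel ANOVA output for the produce stores.
ssr = 30380456.12
sse = 1871199.595
sst = 32251655.71
n = 7  # observations

r_squared = ssr / sst                      # coefficient of determination
multiple_r = math.sqrt(r_squared)          # reported by Excel as "Multiple R"
s_yx = math.sqrt(sse / (n - 2))            # standard error of the estimate
# Standard adjustment for one explanatory variable (an assumption, not on the slide).
adjusted_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - 2)

print(f"r^2 = {r_squared:.4f}, S_YX = {s_yx:.2f}")
```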
Linear Regression Assumptions
Normality
Residual Analysis
Purposes
Examine linearity
Evaluate violations of assumptions
[Figure: residuals e plotted against X, two panels: Not Linear and Linear]
Studentized Residual
$SR_i = \dfrac{e_i}{S_{YX} \sqrt{1 - h_i}}$
where $h_i = \dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}$
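To make the studentized residual concrete, the sketch below evaluates $h_i$ and $SR_i$ for the seven produce stores. It is an illustration under the formulas above; the variable names (`leverage`, `studentized`) are chosen here and do not come from the slides.

```python
import math

# Produce-store data from the example: square feet and annual sales ($1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales)) / ssx
b0 = y_bar - b1 * x_bar

# Residuals e_i and the standard error of the estimate S_YX.
residuals = [y - (b0 + b1 * x) for x, y in zip(square_feet, annual_sales)]
s_yx = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))

# h_i = 1/n + (X_i - X_bar)^2 / sum (X_i - X_bar)^2
leverage = [1 / n + (x - x_bar) ** 2 / ssx for x in square_feet]

# SR_i = e_i / (S_YX * sqrt(1 - h_i))
studentized = [e / (s_yx * math.sqrt(1 - h)) for e, h in zip(residuals, leverage)]

for store, (h, sr) in enumerate(zip(leverage, studentized), start=1):
    print(f"store {store}: h = {h:.3f}, SR = {sr:.3f}")
```

Store 4 (5,555 square feet) lies farthest from $\bar{X}$, so it has the largest $h_i$, and its residual is scaled up the most.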
[Figure: studentized residuals SR plotted against X, two panels: Heteroscedasticity and Homoscedasticity]
Residual Analysis: Excel Output for Produce Stores Example
Observation   Predicted Y    Residuals
1             4202.344417    -521.3444173
2             3928.803824    -533.8038245
3             5822.775103     830.2248971
4             9894.664688    -351.6646882
5             3557.14541     -239.1454103
6             4918.90184      644.0981603
7             3588.364717     171.6352829
[Figure: residual plot (Excel output) of the residuals against Square Feet]
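Residual Analysis for Independence
The Durbin-Watson statistic measures violation of the independence assumption:
$D = \dfrac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2}$
Should be close to 2.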
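A minimal Python sketch of the Durbin-Watson statistic follows. The function name `durbin_watson` and the toy residual sequences are hypothetical, chosen only to show how alternating, constant, and mixed residual patterns move the statistic away from or toward 2; they are not data from the slides.

```python
def durbin_watson(residuals):
    """Return D = sum_{i=2..n} (e_i - e_{i-1})^2 / sum_{i=1..n} e_i^2."""
    if len(residuals) < 2:
        raise ValueError("need at least two residuals")
    numerator = sum(
        (curr - prev) ** 2 for prev, curr in zip(residuals, residuals[1:])
    )
    denominator = sum(e ** 2 for e in residuals)
    if denominator == 0:
        raise ValueError("all residuals are zero")
    return numerator / denominator


# Toy residuals ordered in time (hypothetical values for illustration).
alternating = [1, -1, 1, -1]   # sign flips every period: D well above 2
constant = [1, 1, 1, 1]        # identical neighbours: D = 0
mixed = [1, -1, -1, 1]         # no systematic pattern: D = 2

print(durbin_watson(alternating), durbin_watson(constant), durbin_watson(mixed))
```

Values near 0 point toward positive autocorrelation and values near 4 toward negative autocorrelation, which is why the statistic should be close to 2.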
Durbin-Watson Statistic in PHStat
Critical values of the Durbin-Watson statistic ($\alpha$ = .05)
       p=1            p=2
n      dL     dU      dL     dU
15     1.08   1.36    .95    1.54
16     1.10   1.37    .98    1.54
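Using the Durbin-Watson Statistic
H0: no autocorrelation
Decision regions for the statistic, which ranges from 0 to 4:
0 to dL: Reject H0 (positive autocorrelation)
dL to dU: Inconclusive
dU to 4-dU: Do not reject H0 (no autocorrelation)
4-dU to 4-dL: Inconclusive
4-dL to 4: Reject H0 (negative autocorrelation)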
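The decision regions can be written as a small lookup. This is a sketch, assuming the caller supplies dL and dU from a Durbin-Watson table; the function name `durbin_watson_decision` and the handling of values exactly on a boundary are choices made here, not rules stated on the slides.

```python
def durbin_watson_decision(d, d_lower, d_upper):
    """Classify a Durbin-Watson statistic d using table values dL and dU."""
    if not 0 <= d <= 4:
        raise ValueError("the Durbin-Watson statistic lies between 0 and 4")
    if d < d_lower:
        return "reject H0 (positive autocorrelation)"
    if d <= d_upper:
        return "inconclusive"
    if d < 4 - d_upper:
        return "do not reject H0 (no autocorrelation)"
    if d <= 4 - d_lower:
        return "inconclusive"
    return "reject H0 (negative autocorrelation)"


# Table values for n = 15, p = 1 at the .05 level: dL = 1.08, dU = 1.36.
for d in (0.9, 1.2, 2.0, 2.8, 3.0):
    print(d, durbin_watson_decision(d, 1.08, 1.36))
```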
Residual Analysis for Independence
Graphical Approach
[Figure: residuals e plotted against Time, two panels: Not Independent (cyclical pattern) and Independent (no particular pattern)]
Test statistic:
$t = \dfrac{b_1 - \beta_1}{S_{b_1}}$, where $S_{b_1} = \dfrac{S_{YX}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$, with d.f. = n - 2
Square Feet   Annual Sales ($000)
1,726         3,681
1,542         3,395
2,816         6,653
5,555         9,543
1,292         3,318
2,208         5,563
1,313         3,760
Estimated Regression Equation: $\hat{Y}_i = 1636.415 + 1.487 X_i$
The slope of this model is 1.487.
Is square footage of the store affecting its annual sales?
Test Statistic:
From Excel Printout: $b_1$, $S_{b_1}$, and $t = b_1 / S_{b_1}$
Critical values: ±2.5706 (.025 in each rejection region)
Decision: Reject H0
Conclusion: There is evidence that square footage affects annual sales.
Confidence interval estimate for the slope: $b_1 \pm t_{n-2} S_{b_1}$
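The slope test and the interval for the slope can be worked through numerically. The sketch below assumes the produce-store data and takes the critical value 2.5706 (5 degrees of freedom, .025 in each tail) from the slides rather than computing it; variable names are illustrative.

```python
import math

# Produce-store data from the example: square feet and annual sales ($1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales)) / ssx
b0 = y_bar - b1 * x_bar

# Standard error of the estimate and standard error of the slope.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
s_yx = math.sqrt(sse / (n - 2))
s_b1 = s_yx / math.sqrt(ssx)

# t statistic for H0: beta_1 = 0.
t_stat = (b1 - 0) / s_b1

# Critical value for 5 degrees of freedom, taken from the slides.
t_crit = 2.5706
reject_h0 = abs(t_stat) > t_crit

# 95% confidence interval for the slope: b1 +/- t * S_b1.
ci_lower = b1 - t_crit * s_b1
ci_upper = b1 + t_crit * s_b1

print(f"t = {t_stat:.4f}, reject H0: {reject_h0}")
print(f"95% CI for the slope: ({ci_lower:.4f}, {ci_upper:.4f})")
```

Because the interval excludes zero, it leads to the same decision as the t test.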
Test statistic: $F = \dfrac{SSR / 1}{SSE / (n - 2)}$
Relationship between a t Test and an F Test
H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$ (linear dependency)
$(t_{n-2})^2 = F_{1, n-2}$
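H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$
$\alpha$ = .05
numerator df = 1, denominator df = 7 - 2 = 5
From Excel Printout
ANOVA
             df   SS            MS            F        Significance F
Regression   1    30380456.12   30380456.12   81.179   0.000281
Residual     5    1871199.595   374239.919
Total        6    32251655.71
Critical value: $F_{1,5}$ = 6.61 at $\alpha$ = .05
Decision: Reject H0, since F = 81.179 exceeds 6.61.
Conclusion: There is evidence that square footage affects annual sales.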
Purpose of Correlation Analysis
[Figure: scatter plots illustrating r = -1, r = -.6, r = 0, r = .6, and r = 1]
Features of $\rho$ and r
Unit free
Range between -1 and 1
The closer to -1, the stronger the
negative linear relationship
The closer to 1, the stronger the positive
linear relationship
The closer to 0, the weaker the linear
relationship
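Hypotheses
H0: $\rho = 0$ (no correlation)
H1: $\rho \neq 0$ (correlation)
Test statistic
$t = \dfrac{r - \rho}{\sqrt{\dfrac{1 - r^2}{n - 2}}}$
where $r = \sqrt{r^2} = \dfrac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}}$ (r takes the sign of $b_1$)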
Is there any evidence of a linear relationship between the annual sales of a store and its square footage at .05 level of significance?
Regression Statistics
Multiple R          0.9705572
R Square            0.94198129
Adjusted R Square   0.93037754
Standard Error      611.751517
Observations        7
Example: Produce Stores Solution
$t = \dfrac{r}{\sqrt{\dfrac{1 - r^2}{n - 2}}} = \dfrac{.9706}{\sqrt{\dfrac{1 - .9420}{5}}} = 9.0099$
Critical Value(s): ±2.5706 (.025 in each rejection region)
Decision: Reject H0
Conclusion: There is evidence of a linear relationship at 5% level of significance
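The correlation test can be reproduced from the raw data. A minimal Python sketch, assuming the seven produce-store observations and taking the critical value 2.5706 from the slides; variable names are illustrative.

```python
import math

# Produce-store data from the example: square feet and annual sales ($1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n

# Sample correlation coefficient r.
ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales))
ssx = sum((x - x_bar) ** 2 for x in square_feet)
ssy = sum((y - y_bar) ** 2 for y in annual_sales)
r = ssxy / math.sqrt(ssx * ssy)

# t statistic for H0: rho = 0, with n - 2 degrees of freedom.
t_stat = r / math.sqrt((1 - r ** 2) / (n - 2))

# Critical value for 5 degrees of freedom, taken from the slides.
t_crit = 2.5706
reject_h0 = abs(t_stat) > t_crit

print(f"r = {r:.4f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```

The t statistic matches the one obtained when testing the slope, since both tests address the same linear relationship.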
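Confidence interval estimate for the mean of Y at a particular $X_i$:
$\hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{\dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
where $t_{n-2}$ is the t value from the table with df = n - 2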
Prediction of Individual Values
Prediction interval for individual response $Y_i$ at a particular $X_i$
Addition of one increases width of interval from that for the mean of Y
$\hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{1 + \dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}$
Interval Estimates for Different Values of X
[Figure: the fitted line $\hat{Y}_i = b_0 + b_1 X_i$ plotted against X with two bands around it, marked at a given X: the confidence interval for the mean of Y and the wider prediction interval for an individual $Y_i$]
Store   Square Feet   Annual Sales ($000)
1       1,726         3,681
2       1,542         3,395
3       2,816         6,653
4       5,555         9,543
5       1,292         3,318
6       2,208         5,563
7       1,313         3,760
Predict the annual sales for a store with 2000 square feet.
Regression Model Obtained: $\hat{Y}_i = 1636.415 + 1.487 X_i$
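Confidence interval estimate for the mean of Y:
$S_{YX}$ = 611.75, $t_{n-2} = t_5$ = 2.5706
$\hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{\dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4610.45 \pm 612.66$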
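Prediction interval for the annual sales of an individual 2,000-square-foot store
Predicted Sales $\hat{Y}_i = 1636.415 + 1.487 X_i = 4610.45$ ($000)
$\bar{X}$ = 2350.29, $S_{YX}$ = 611.75, $t_{n-2} = t_5$ = 2.5706
$\hat{Y}_i \pm t_{n-2} S_{YX} \sqrt{1 + \dfrac{1}{n} + \dfrac{(X_i - \bar{X})^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}} = 4610.45 \pm 1687.68$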
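Both interval half-widths can be recomputed from the data. The sketch below assumes the produce-store observations, a store of 2,000 square feet, and the critical value 2.5706 from the slides; variable names are illustrative. With unrounded coefficients the point prediction comes out slightly below the 4610.45 shown on the slides, which appears to reflect rounded coefficients, while the half-widths agree closely.

```python
import math

# Produce-store data from the example: square feet and annual sales ($1000).
square_feet = [1726, 1542, 2816, 5555, 1292, 2208, 1313]
annual_sales = [3681, 3395, 6653, 9543, 3318, 5563, 3760]

n = len(square_feet)
x_bar = sum(square_feet) / n
y_bar = sum(annual_sales) / n
ssx = sum((x - x_bar) ** 2 for x in square_feet)
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(square_feet, annual_sales)) / ssx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(square_feet, annual_sales))
s_yx = math.sqrt(sse / (n - 2))

x_new = 2000          # store size to predict for
t_crit = 2.5706       # t value for 5 degrees of freedom, from the slides
y_hat = b0 + b1 * x_new

# Shared term: 1/n + (X_i - X_bar)^2 / sum (X_i - X_bar)^2
h = 1 / n + (x_new - x_bar) ** 2 / ssx

# Half-width for the mean of Y, and for an individual Y (adds one under the root).
ci_half_width = t_crit * s_yx * math.sqrt(h)
pi_half_width = t_crit * s_yx * math.sqrt(1 + h)

print(f"point prediction: {y_hat:.2f}")
print(f"confidence interval for the mean: +/- {ci_half_width:.2f}")
print(f"prediction interval for an individual store: +/- {pi_half_width:.2f}")
```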
Pitfalls of Regression Analysis
Chapter Summary