Decision Making
Simple Linear Regression
Lecture Outlines
Scatter Plots
Correlation Analysis
Simple Linear Regression Model
Estimation and Significance Testing
Coefficient of Determination
Confidence and Prediction Intervals
Analysis of Residuals
403.7
Regression Analysis?
Regression analysis is used for modeling the mean of a response variable Y as a function of predictor variables X1, X2, ..., Xk.
When k = 1, it is called simple regression analysis.
Random Sample
Y: Response Variable,
X: Predictor Variable
For each unit in a random sample of size n, the pair (X, Y) is observed, resulting in a random sample:
(x1, y1), (x2, y2), ..., (xn, yn)
Scatter Plot
A scatter plot is a graphical display of the sample (x1, y1), (x2, y2), ..., (xn, yn) as n points in two dimensions.
It suggests whether there is a relationship between X and Y.
[Figure: scatter plot of PeopleM (vertical axis) against Nielsen (horizontal axis)]
[Figure: scatter plot of Yesterda (vertical axis) against Today (horizontal axis)]
Y X
, and
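Estimation
Simple linear regression analysis estimates the mean of Y (linear trend)
$\mu_{Y|X} = \alpha + \beta x$
by
$\hat{y} = a + bx$
where
$a = \bar{y} - b\bar{x}$
and
$b = \dfrac{\sum (x - \bar{x})(y - \bar{y})}{\sum (x - \bar{x})^2}$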
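The formulas for a and b can be sketched in a few lines of Python. This is only an illustration: the function name `fit_line` and the five data points are made up for the example and do not come from the lecture.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y-hat = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Slope: sum of cross-deviations over sum of squared x-deviations.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    b = s_xy / s_xx
    # Intercept: the fitted line passes through (x_bar, y_bar).
    a = y_bar - b * x_bar
    return a, b


# Illustrative data only (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = fit_line(xs, ys)
print(f"a = {a:.2f}, b = {b:.2f}")  # prints: a = 2.20, b = 0.60
```

For these points the sums are 6 (numerator) and 10 (denominator), so b = 0.6 and a = 4 - 0.6 * 3 = 2.2.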
Standard deviation
Standard deviation (s) of the sample of n points in the scatter plot around the estimated regression line $\hat{y} = a + bx$ is:
$s = \sqrt{\dfrac{\sum (y - \hat{y})^2}{n - 2}}$
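As a rough sketch of this formula, the snippet below computes s for a small made-up data set. The helper name `residual_sd` and the data are assumptions for illustration; the values of a and b are the least-squares estimates for these particular points.

```python
import math


def residual_sd(xs, ys, a, b):
    """Standard deviation of the points around the line y-hat = a + b*x."""
    n = len(xs)
    # Sum of squared vertical distances from each point to the line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Divide by n - 2 because two parameters (a and b) were estimated.
    return math.sqrt(sse / (n - 2))


# Illustrative data only (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
a, b = 2.2, 0.6  # least-squares estimates for these points
s = residual_sd(xs, ys, a, b)
print(f"s = {s:.4f}")
```

Here the squared residuals sum to 2.4, so s is the square root of 2.4 / 3 = 0.8.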
$H_0: \beta = 0$ vs. $H_a: \beta \neq 0$
compute the t-statistic and its p-value:
$t\text{-statistic} = \dfrac{b - 0}{s_b}$
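A minimal sketch of the t-statistic computation follows. The slide does not spell out $s_b$; the code assumes the usual standard error of the slope, $s_b = s / \sqrt{\sum (x - \bar{x})^2}$. The function name `slope_t_statistic` and the data are made up for the example.

```python
import math


def slope_t_statistic(xs, ys):
    """Return (b, s_b, t) for testing H0: beta = 0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # Residual standard deviation with n - 2 degrees of freedom.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    # Assumed formula: standard error of the slope.
    s_b = s / math.sqrt(s_xx)
    t = (b - 0) / s_b
    return b, s_b, t


# Illustrative data only (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
b, s_b, t = slope_t_statistic(xs, ys)
print(f"b = {b:.2f}, s_b = {s_b:.4f}, t = {t:.4f}")
```

The p-value comes from a t distribution with n - 2 degrees of freedom. The standard library has no t distribution, so in practice one option is `scipy.stats.t.sf(abs(t), n - 2) * 2` for the two-sided test, if SciPy is available.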
Coefficient of Determination: $R^2$
$R^2$ quantifies how well the estimated model $\hat{y} = a + bx$ fits the data: it is the proportion of the variation in Y explained by the model.
Rule of thumb:
$R^2$ > 85% = significant model
$R^2$ < 85% = model is perceived as inadequate
A low $R^2$ suggests a need for additional predictors for modeling the mean of Y.
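The slide gives no formula for $R^2$. The sketch below assumes the standard definition, one minus the ratio of residual variation to total variation in Y. The function name `r_squared` and the data are made up for illustration.

```python
def r_squared(xs, ys):
    """Proportion of the variation in y explained by the fitted line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # Variation left unexplained by the line.
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    # Total variation of y around its mean.
    sst = sum((y - y_bar) ** 2 for y in ys)
    return 1 - sse / sst


# Illustrative data only (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r2 = r_squared(xs, ys)
print(f"R^2 = {r2:.0%}")  # prints: R^2 = 60%
```

With an $R^2$ of 60%, this toy model would fall below the 85% rule of thumb and be judged inadequate.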
Correlation Coefficient: r
The correlation coefficient r is a number between -1 and 1 whose magnitude is the square root of $R^2$.
The closer r is to -1 or 1, the stronger the linear trend.
Its sign is positive for an increasing trend (slope b is positive).
Its sign is negative for a decreasing trend (slope b is negative).
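The sign behavior of r can be checked numerically. The sketch below uses the standard sample correlation formula, which the slide does not state, on two made-up data sets: one increasing and one decreasing. The function name `correlation` is an assumption for the example.

```python
import math


def correlation(xs, ys):
    """Sample correlation coefficient r between xs and ys."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    # s_xy carries the sign of the slope b, so r and b share a sign.
    return s_xy / math.sqrt(s_xx * s_yy)


# Illustrative data only (not from the lecture).
xs = [1, 2, 3, 4, 5]
r_up = correlation(xs, [2, 4, 5, 4, 5])    # increasing trend
r_down = correlation(xs, [5, 4, 5, 4, 2])  # decreasing trend
print(f"r_up = {r_up:.4f}, r_down = {r_down:.4f}")
print(f"r_up squared = {r_up ** 2:.2f}")
```

Both data sets give the same $R^2$ (0.6), but r is positive for the first and negative for the second, which is why r cannot be read off $R^2$ alone.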
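2. compute:
$\hat{y} \pm t \cdot \mathrm{s.e.}(\hat{y})$
where t is the critical value of the t distribution with n - 2 degrees of freedom.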
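What is s.e.($\hat{y}$)?
i.e. the standard error of $\hat{y}$ at a given predictor value $x_0$
For estimating the mean of Y,
$\mathrm{s.e.}(\hat{y}) = s \sqrt{\dfrac{1}{n} + \dfrac{(\bar{x} - x_0)^2}{\sum (x - \bar{x})^2}}$
For predicting Y,
$\mathrm{s.e.}(\hat{y}) = s \sqrt{1 + \dfrac{1}{n} + \dfrac{(\bar{x} - x_0)^2}{\sum (x - \bar{x})^2}}$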
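The two standard errors differ only by the extra 1 under the square root. A minimal sketch, with a made-up function name `standard_errors` and made-up data, shows the effect at a chosen value x0:

```python
import math


def standard_errors(xs, ys, x0):
    """Return (se_mean, se_pred) of y-hat at predictor value x0."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    sse = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(sse / (n - 2))
    # Term shared by both formulas; it grows as x0 moves away from x_bar.
    core = 1 / n + (x_bar - x0) ** 2 / s_xx
    se_mean = s * math.sqrt(core)      # for estimating the mean of Y
    se_pred = s * math.sqrt(1 + core)  # for predicting a single Y
    return se_mean, se_pred


# Illustrative data only (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
se_mean, se_pred = standard_errors(xs, ys, x0=3)
print(f"se_mean = {se_mean:.4f}, se_pred = {se_pred:.4f}")
```

The prediction standard error is always the larger of the two, because a single new observation varies around the mean in addition to the uncertainty in the estimated line.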
Analysis of Residuals
Residuals are defined as:
$e_i = y_i - \hat{y}_i, \quad i = 1, 2, \ldots, n$
Under the model assumptions, the residuals should appear normally distributed.
Analysis of Residuals (cont)
A plot of the residuals $e_i$ against the observed predictor values $x_i$ helps check the homogeneity of variance assumption.
Random appearance = homogeneity of variance assumption is valid.
Non-random appearance = homogeneity assumption is not valid and the variance depends on the predictor values.
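The residuals themselves are simple to compute. The sketch below, with a made-up function name `residuals` and made-up data, lists each residual next to its predictor value, which is the raw material for the residual plot described above.

```python
def residuals(xs, ys):
    """Return the residuals e_i = y_i - y-hat_i from the least-squares line."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = s_xy / s_xx
    a = y_bar - b * x_bar
    # Observed value minus fitted value for each point.
    return [y - (a + b * x) for x, y in zip(xs, ys)]


# Illustrative data only (not from the lecture).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
es = residuals(xs, ys)
for x, e in zip(xs, es):
    print(f"x = {x}, e = {e:+.1f}")
```

Least-squares residuals always sum to zero (up to rounding), so the check is on their pattern, not their average: a spread that widens or narrows as x increases would suggest the variance depends on the predictor.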