Linear Regression
Linear regression is a statistical procedure that uses relationships to predict unknown Y scores based on the X scores from a correlated variable.
Predicted Y Scores
The symbol Y′ stands for a predicted Y score. Each Y′ is our best prediction of the Y score at a corresponding X, based on the linear relationship that is summarized by the regression line.
$Y' = bX + a$
$a = \bar{Y} - (b)(\bar{X})$
Example 1
For the following data set, calculate the linear regression equation.
X:  1  2  3  4  5  6
Y:  8  6  6  5  1  3
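The calculation for Example 1 can be sketched in a few lines of Python. The intercept follows the formula for a given above; the slope formula does not appear in this section, so the sketch assumes the usual computational form b = [N(ΣXY) − (ΣX)(ΣY)] / [N(ΣX²) − (ΣX)²]. The function name `regression_line` is illustrative only.

```python
# Data from Example 1
X = [1, 2, 3, 4, 5, 6]
Y = [8, 6, 6, 5, 1, 3]

def regression_line(xs, ys):
    """Return (b, a) for the regression line Y' = bX + a."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: assumed computational formula (not shown in the text above)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: a = mean(Y) - (b)(mean(X))
    a = sum_y / n - b * (sum_x / n)
    return b, a

b, a = regression_line(X, Y)
print(f"Y' = {b:.3f}X + {a:.3f}")  # prints: Y' = -1.171X + 8.933
```

The negative slope reflects that Y tends to decrease as X increases in this data set.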
Errors in Prediction
Variance
The variance of the Y scores around Y′ is the average squared difference between the actual Y scores and their corresponding predicted Y′ scores.
This variance ($S_{Y'}^2$) is one way to describe the average error when using linear regression to predict Y scores.
$S_{Y'}^2 = S_Y^2 (1 - r^2)$
Standard Error of the Estimate
$S_{Y'} = S_Y \sqrt{1 - r^2}$
Example 3
Using the same data set, calculate the standard error of the estimate.
X:  1  2  3  4  5  6
Y:  8  6  6  5  1  3
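A minimal Python sketch of the Example 3 calculation follows. It assumes that S_Y is the descriptive standard deviation (sum of squared deviations divided by N, not N − 1) and that r is computed with the usual computational formula for the Pearson correlation. The helper names `pearson_r` and `std_dev` are illustrative only.

```python
import math

# Same data set as the earlier examples
X = [1, 2, 3, 4, 5, 6]
Y = [8, 6, 6, 5, 1, 3]

def pearson_r(xs, ys):
    """Pearson correlation via the computational formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def std_dev(scores):
    """Descriptive standard deviation (divides by N; an assumption here)."""
    n = len(scores)
    mean = sum(scores) / n
    return math.sqrt(sum((s - mean) ** 2 for s in scores) / n)

r = pearson_r(X, Y)
s_y = std_dev(Y)
# Standard error of the estimate: S_Y' = S_Y * sqrt(1 - r^2)
s_y_pred = s_y * math.sqrt(1 - r ** 2)
print(round(r, 3), round(s_y, 3), round(s_y_pred, 3))
```

Under these assumptions r is about −0.883, S_Y is about 2.267, and the standard error of the estimate is about 1.066.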
Assumption 1
The first assumption of linear regression is that the data are homoscedastic.
Homoscedasticity
Homoscedasticity occurs when the Y scores are spread out to the same degree at every X.
Heteroscedasticity
Heteroscedasticity occurs when the spread in Y is not equal throughout the relationship.
Assumption 2
The second assumption of linear regression is that the Y scores at each X form an approximately normal distribution.
When we do not use the relationship, we use the overall mean of the Y scores ($\bar{Y}$) as everyone's predicted Y. The error here is the difference between the actual Y scores and the $\bar{Y}$ that we predict they got ($Y - \bar{Y}$). When we do not use the relationship to predict scores, our error is $S_Y^2$.
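The comparison can be illustrated with the data set used in the examples. This sketch computes the average squared error twice: once predicting the mean of Y for everyone, and once predicting Y′ from the regression line. It assumes variances divide by N and uses the usual computational formula for the slope; the variable names are illustrative only.

```python
# Data set used in the examples
X = [1, 2, 3, 4, 5, 6]
Y = [8, 6, 6, 5, 1, 3]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Regression line Y' = bX + a (slope formula is an assumption here)
b = (n * sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y)) / (
    n * sum(x * x for x in X) - sum(X) ** 2
)
a = mean_y - b * mean_x

# Error when the relationship is ignored: variance of Y around the mean
error_without = sum((y - mean_y) ** 2 for y in Y) / n
# Error when the relationship is used: variance of Y around Y'
error_with = sum((y - (b * x + a)) ** 2 for x, y in zip(X, Y)) / n

print(round(error_without, 3), round(error_with, 3))
```

For this data the error drops from about 5.139 to about 1.137 when the relationship is used.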
$r = \dfrac{N(\Sigma XY) - (\Sigma X)(\Sigma Y)}{\sqrt{[N(\Sigma X^2) - (\Sigma X)^2]\,[N(\Sigma Y^2) - (\Sigma Y)^2]}}$
Example 4
Using the same data set, calculate the proportion of variance accounted for and the proportion of variance not accounted for.
X:  1  2  3  4  5  6
Y:  8  6  6  5  1  3
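As a rough sketch of Example 4, the code below assumes that the proportion of variance accounted for is r² and the proportion not accounted for is 1 − r², with r computed from the formula for r given above.

```python
import math

# Same data set as the earlier examples
X = [1, 2, 3, 4, 5, 6]
Y = [8, 6, 6, 5, 1, 3]

n = len(X)
sum_x, sum_y = sum(X), sum(Y)
sum_xy = sum(x * y for x, y in zip(X, Y))
sum_x2 = sum(x * x for x in X)
sum_y2 = sum(y * y for y in Y)

# Pearson r from the computational formula
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

accounted_for = r ** 2          # proportion of variance accounted for
not_accounted_for = 1 - r ** 2  # proportion of variance not accounted for
print(round(accounted_for, 3), round(not_accounted_for, 3))
```

For this data set roughly 78% of the variance in Y is accounted for and roughly 22% is not.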
Example 5
A researcher measures how positive a person's mood is and how creative he or she is, obtaining the interval scores in the table:

Participant  Mood (X)  Creativity (Y)
1            10        7
2            8         6
3            9         11
4            6         4
5            5         5
6            3         7
7            7         4
8            2         5
9            4         6
10           1         4
C) If your prediction is in error, what is the amount of error you expect to have?
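One way to approach part C is to treat the expected amount of error as the standard error of the estimate. The sketch below applies that reading to the Example 5 scores, assuming S_Y divides by N and r comes from the usual computational formula; the variable names are illustrative only.

```python
import math

# Example 5 scores: mood (X) and creativity (Y) for participants 1 to 10
mood = [10, 8, 9, 6, 5, 3, 7, 2, 4, 1]
creativity = [7, 6, 11, 4, 5, 7, 4, 5, 6, 4]

n = len(mood)
sum_x, sum_y = sum(mood), sum(creativity)
sum_xy = sum(x * y for x, y in zip(mood, creativity))
sum_x2 = sum(x * x for x in mood)
sum_y2 = sum(y * y for y in creativity)

# Pearson r between mood and creativity
r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
)

# Descriptive standard deviation of Y (divides by N; an assumption here)
mean_y = sum_y / n
s_y = math.sqrt(sum((y - mean_y) ** 2 for y in creativity) / n)

# Standard error of the estimate: S_Y' = S_Y * sqrt(1 - r^2)
s_y_pred = s_y * math.sqrt(1 - r ** 2)
print(round(r, 3), round(s_y_pred, 3))
```

Under these assumptions r is about 0.491, and predictions of creativity are expected to be off by about 1.76 points on average.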