
ANOVA Table and Prediction intervals

(note: the actual calculations/formulas are shown below, but on homework and
exams you need only read Excel printouts to answer most questions about
regression)
Step 1: Decide which variable is x and which is y
y = the variable that depends on the other variable;
    or, the variable that you are trying to predict.

x = the variable whose values affect the other variable;
    or, the variable whose values help predict the other variable.

Example:

Does # units sold depend on price?
    y = # units sold
    x = price

Or, the problem might be stated as:
Use this data to predict the # units sold at a given price.
    y = # units sold
    x = price

Step 2: Obtain data:


  i     x      y
  1     x_1    y_1
  2     x_2    y_2
  3     x_3    y_3
  :     :      :
  n     x_n    y_n

Example:

  i     x      y
  1     10     990
  2     20     980
  3     30     970
  4     40     950
  5     50     920

5 values for each variable, so n = 5

Step 3: Obtain the five sums


  x          y          xy            x^2          y^2
  x_1        y_1        x_1 y_1       x_1^2        y_1^2
  x_2        y_2        x_2 y_2       x_2^2        y_2^2
  x_3        y_3        x_3 y_3       x_3^2        y_3^2
  :          :          :             :            :
  x_n        y_n        x_n y_n       x_n^2        y_n^2
  ------     ------     ---------     --------     --------
  sum x_i    sum y_i    sum x_i y_i   sum x_i^2    sum y_i^2

Example:

  x       y        xy         x^2      y^2
  10      990      9,900      100      980,100
  20      980      19,600     400      960,400
  30      970      29,100     900      940,900
  40      950      38,000     1,600    902,500
  50      920      46,000     2,500    846,400
  ---     -----    -------    -----    ---------
  150     4,810    142,600    5,500    4,630,300
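The five sums can also be checked with a few lines of Python. This is only a sketch for verifying the table by hand; the variable names are chosen here for illustration and are not part of the handout or of Excel.

```python
# Example data from Step 2 (x = price, y = # units sold)
x = [10, 20, 30, 40, 50]
y = [990, 980, 970, 950, 920]

n = len(x)
sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))   # sum of the products x_i * y_i
sum_x2 = sum(xi ** 2 for xi in x)               # sum of the squared x values
sum_y2 = sum(yi ** 2 for yi in y)               # sum of the squared y values

print(n, sum_x, sum_y, sum_xy, sum_x2, sum_y2)
# prints: 5 150 4810 142600 5500 4630300
```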

Step 4: Find the estimated coefficients (note: the actual calculations are shown, but
you only need to read Excel printouts for homework and the next exam)
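Formulas:

$$ b_1 = \frac{\sum x_i y_i - \frac{1}{n}\left(\sum x_i\right)\left(\sum y_i\right)}{\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2} $$

$$ b_0 = \bar{y} - b_1 \bar{x} $$

Example:

$$ b_1 = \frac{142{,}600 - \frac{1}{5}(150)(4{,}810)}{5{,}500 - \frac{1}{5}(150)^2} = \frac{-1{,}700}{1{,}000} = -1.7 $$

$$ b_0 = \bar{y} - b_1 \bar{x} = \frac{4{,}810}{5} - (-1.7)\frac{150}{5} = 1{,}013 $$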
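As a cross-check on the hand calculation, the two coefficient formulas can be sketched in Python. The function name `fit_line` is hypothetical (it is not an Excel or library function); it simply applies the formulas for b1 and b0 given above.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line y = b0 + b1*x."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    sum_x2 = sum(xi ** 2 for xi in x)

    numerator = sum_xy - sum_x * sum_y / n    # numerator of the b1 formula
    denominator = sum_x2 - sum_x ** 2 / n     # denominator of the b1 formula
    b1 = numerator / denominator
    b0 = sum_y / n - b1 * sum_x / n           # y-bar minus b1 times x-bar
    return b0, b1


x = [10, 20, 30, 40, 50]
y = [990, 980, 970, 950, 920]
b0, b1 = fit_line(x, y)
print(f"b0 = {b0:.1f}, b1 = {b1:.1f}")  # b0 = 1013.0, b1 = -1.7
```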

Step 5: Find Sums of Squares and s^2 (note: the actual calculations are shown, but
you only need to read Excel printouts for homework and the next exam)
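Formulas:

$$ SST = \sum y_i^2 - \frac{1}{n}\left(\sum y_i\right)^2 $$

$$ SSR = b_1 \times (\text{numerator of } b_1) $$

$$ SSE = SST - SSR $$

$$ s^2 = \frac{SSE}{n-2}, \qquad s = \sqrt{s^2} $$

Example:

$$ SST = 4{,}630{,}300 - \frac{1}{5}(4{,}810)^2 = 3{,}080 $$

$$ SSR = (-1.7)(-1{,}700) = 2{,}890 $$

$$ SSE = 3{,}080 - 2{,}890 = 190 $$

$$ s^2 = \frac{190}{5-2} = 63.3333, \qquad s = \sqrt{63.3333} = 7.958224 $$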
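The same quantities can be reproduced from the five sums and the slope. The sketch below starts from the example values already computed in the handout; the variable names are illustrative assumptions.

```python
import math

# Example values from Steps 3 and 4 of the handout
n = 5
sum_x, sum_y = 150, 4810
sum_xy, sum_y2 = 142600, 4630300
b1 = -1.7

sst = sum_y2 - sum_y ** 2 / n               # total sum of squares
ssr = b1 * (sum_xy - sum_x * sum_y / n)     # b1 times the numerator of b1
sse = sst - ssr                             # error sum of squares
s2 = sse / (n - 2)                          # estimate of the error variance
s = math.sqrt(s2)                           # standard error of the estimate

print(f"SST = {sst:.0f}, SSR = {ssr:.0f}, SSE = {sse:.0f}")
print(f"s^2 = {s2:.4f}, s = {s:.6f}")
```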

Step 6: Create ANOVA table (note: the actual calculations are shown, but you only
need to read Excel printouts for homework and the next exam)
  Source        d.f.    SS      MS                   F
  Regression    1       SSR     MSR = SSR/d.f.       MSR/MSE
  Error         n-2     SSE     MSE = SSE/(n-2)
  Total         n-1     SST

Example:

n = 5, so n-2 = 3 and n-1 = 4

  Source        d.f.    SS      MS                   F
  Regression    1       2890    2890/1 = 2890        45.6316
  Error         3       190     190/3 = 63.3333
  Total         4       3080
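For readers who want to see how the table entries follow from SSR, SSE and n, here is a minimal sketch. The helper `anova_table` is a hypothetical name introduced for illustration; it is not how Excel builds its printout.

```python
def anova_table(ssr, sse, n):
    """Return the simple-linear-regression ANOVA rows as a list of tuples:
    (source, d.f., SS, MS, F), with None where a cell is left blank."""
    msr = ssr / 1            # regression d.f. is 1
    mse = sse / (n - 2)      # error d.f. is n - 2
    f_stat = msr / mse
    return [
        ("Regression", 1, ssr, msr, f_stat),
        ("Error", n - 2, sse, mse, None),
        ("Total", n - 1, ssr + sse, None, None),
    ]


rows = anova_table(ssr=2890, sse=190, n=5)
print(f"{'Source':<12}{'d.f.':>5}{'SS':>8}{'MS':>12}{'F':>10}")
for source, df, ss, ms, f in rows:
    ms_txt = "" if ms is None else f"{ms:.4f}"
    f_txt = "" if f is None else f"{f:.4f}"
    print(f"{source:<12}{df:>5}{ss:>8}{ms_txt:>12}{f_txt:>10}")
```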

Step 7: Conduct F and t tests. (Note: these tests give exactly the same conclusions for
Simple Linear Regression; but they differ for Multiple Linear Regression; the F test is
explained in Chapter 17 page 679) (note: the actual calculations are shown, but you
only need to read Excel printouts for homework and the next exam)
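H0: $\beta_1 = 0$
H1: $\beta_1 \neq 0$

$$ F\text{-statistic} = \frac{MSR}{MSE} $$

Critical (table) F value = $F_{\alpha}(1, n-2)$

$$ t\text{-statistic} = \frac{b_1}{s_{b_1}} \qquad (\text{d.f.} = n-2) $$

where

$$ s_{b_1} = \frac{s}{\sqrt{\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2}} $$

(the expression under the square root is the denominator in the formula for b1)

Example: Suppose $\alpha = .05$

$$ F\text{-statistic} = \frac{MSR}{MSE} = \frac{2890}{63.3333} = 45.6316 $$

Critical F = $F_{.05}(1, 3)$ = 10.13

$$ s_{b_1} = \frac{7.958224}{\sqrt{1000}} = .251661 $$

$$ t\text{-statistic} = \frac{b_1}{s_{b_1}} = \frac{-1.7}{.251661} = -6.75511 $$

$t_{\alpha/2} = t_{.025}$ (d.f. = n-2 = 3) = 3.182

Conclusion: Reject H0, since 45.6316 > 10.13 (F test) and |-6.75511| > 3.182 (t test);
there is a significant relationship between y and x.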
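The two tests can be sketched in Python as follows. The critical values 10.13 and 3.182 are simply the table values quoted in the example (they are typed in, not computed), and the variable names are illustrative assumptions. The last line shows why the two tests always agree in simple linear regression: the square of the t-statistic equals the F-statistic.

```python
import math

# Example values from the handout
b1 = -1.7
msr = 2890.0
mse = 190.0 / 3          # 63.3333
sxx = 1000.0             # denominator of the b1 formula

# Critical values read from the tables for alpha = .05 (typed in, not computed)
f_crit = 10.13           # F.05(1, 3)
t_crit = 3.182           # t.025 with 3 d.f.

f_stat = msr / mse
s_b1 = math.sqrt(mse) / math.sqrt(sxx)   # s divided by sqrt of b1's denominator
t_stat = b1 / s_b1

reject_by_f = f_stat > f_crit
reject_by_t = abs(t_stat) > t_crit

print(f"F = {f_stat:.4f}, reject H0: {reject_by_f}")
print(f"t = {t_stat:.5f}, reject H0: {reject_by_t}")
print(f"t^2 = {t_stat ** 2:.4f}")        # matches F for simple linear regression
```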

Step 8: Calculate r^2 (note: the actual calculations are shown, but you only need to
read Excel printouts for homework and the next exam)
$$ r^2 = \frac{SSR}{SST} $$

Interpretation: r^2 is the proportion (or %) of the variation in the y variable that is
explained by the changing values of the x variable.

Example:

$$ r^2 = \frac{2890}{3080} = .9383 \quad (\text{or, } 93.83\%) $$

Interpretation: 93.83% of the variation in the # of units sold (the y variable) can be
attributed to the changing values of price (the x variable).
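A short sketch of the same arithmetic, using the sums of squares from the example (variable names are illustrative). Because SSE = SST - SSR, the ratio can equivalently be written as 1 - SSE/SST.

```python
# Sums of squares from the example
ssr, sse, sst = 2890, 190, 3080

r2 = ssr / sst
r2_alt = 1 - sse / sst     # same value, since SSE = SST - SSR

print(f"r^2 = {r2:.4f} ({r2:.2%})")  # r^2 = 0.9383 (93.83%)
```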

Step 9: Use the regression equation for prediction and/or estimation


(Prediction and confidence intervals do require a little more than just reading the
Excel printout. You also need the sum of squares of the x deviations that appears in
the denominator of the fraction under the square root sign. Use Excel to calculate
this value and then plug it into the prediction or confidence interval formula.)

For a given (i.e., particular) value of x (call it xg), the estimated y value for this x value
is found by simply putting xg into the estimated regression equation:

$$ \hat{y} = \text{estimated } y \ (\text{when } x = x_g) = b_0 + b_1 x_g $$
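In code, plugging a value into the estimated regression equation is a one-line function. The name `predict` is hypothetical, and the coefficients are the ones found in Step 4.

```python
def predict(b0, b1, x_g):
    """Estimated y at the given x value, using the fitted line b0 + b1*x."""
    return b0 + b1 * x_g


# Coefficients from Step 4, evaluated at a price of 35
y_hat = predict(b0=1013, b1=-1.7, x_g=35)
print(f"{y_hat:.1f}")  # 953.5
```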

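Confidence Interval for the average of all y values whenever x = xg:

$$ \hat{y} \pm t_{\alpha/2}\, s \sqrt{\frac{1}{n} + \frac{(x_g - \bar{x})^2}{\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2}} \qquad (\text{d.f.} = n-2) $$

Note: $\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2$ is the denominator of the calculation for b1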

Example: to estimate the average sales, for all times in the future when
the price is set at xg = $35 using a 95% confidence interval:

$$ \hat{y} = b_0 + b_1 x_g = 1013 - 1.7(35) = 953.5 $$

95% confidence, so $\alpha = .05$ and $t_{\alpha/2} = t_{.025}$ (n-2 = 3 d.f.) = 3.182

$$ \text{Interval} = 953.5 \pm (3.182)(7.9582)\sqrt{\frac{1}{5} + \frac{(35 - 30)^2}{1000}} = 953.5 \pm 12.01175 $$

(here 3.182 is $t_{\alpha/2}$ and 1000 is the denominator of b1)
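The confidence interval calculation can be sketched as a small function. The name `mean_confidence_interval` and its parameters are assumptions made for illustration; `sxx` stands for the denominator of b1, and `t_crit` is the table value 3.182, typed in rather than computed.

```python
import math


def mean_confidence_interval(b0, b1, s, n, x_bar, sxx, x_g, t_crit):
    """Confidence interval for the average y at x = x_g.

    sxx is the denominator of the b1 formula; t_crit is t(alpha/2) with
    n - 2 degrees of freedom, read from a t table.
    """
    y_hat = b0 + b1 * x_g
    margin = t_crit * s * math.sqrt(1 / n + (x_g - x_bar) ** 2 / sxx)
    return y_hat - margin, y_hat + margin


# Values from the handout example (95% confidence, price = 35)
low, high = mean_confidence_interval(
    b0=1013, b1=-1.7, s=7.958224, n=5, x_bar=30, sxx=1000, x_g=35, t_crit=3.182
)
print(f"{low:.2f} to {high:.2f}")  # 941.49 to 965.51
```

The endpoints are just 953.5 minus and plus the margin of about 12.01 found above.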

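Prediction Interval for a single y value when x = xg:

$$ \hat{y} \pm t_{\alpha/2}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_g - \bar{x})^2}{\sum x_i^2 - \frac{1}{n}\left(\sum x_i\right)^2}} \qquad (\text{d.f.} = n-2) $$

The extra 1 under the square root sign is the only difference from the confidence
interval formula.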

Example: to estimate the sales for a particular week in which the price
is set at xg = $35 using a 95% prediction interval:

$$ \hat{y} = b_0 + b_1 x_g = 1013 - 1.7(35) = 953.5 \quad (\text{same as for conf. interval}) $$

95% confidence, so $\alpha = .05$ and $t_{\alpha/2} = t_{.025}$ (n-2 = 3 d.f.) = 3.182

$$ \text{Interval} = 953.5 \pm (3.182)(7.9582)\sqrt{1 + \frac{1}{5} + \frac{(35 - 30)^2}{1000}} = 953.5 \pm 28.02742 $$

(here 3.182 is $t_{\alpha/2}$ and 1000 is the denominator of b1)
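A matching sketch for the prediction interval follows. As before, the function name and parameters are illustrative assumptions, and the t value is typed in from the table. The only change from the confidence interval calculation is the extra 1 under the square root, which is why the margin here is wider.

```python
import math


def prediction_interval(b0, b1, s, n, x_bar, sxx, x_g, t_crit):
    """Prediction interval for a single y value at x = x_g.

    Identical to the confidence interval for the average y, except for
    the extra 1 under the square root.
    """
    y_hat = b0 + b1 * x_g
    margin = t_crit * s * math.sqrt(1 + 1 / n + (x_g - x_bar) ** 2 / sxx)
    return y_hat - margin, y_hat + margin


# Values from the handout example (95% prediction interval, price = 35)
pi_low, pi_high = prediction_interval(
    b0=1013, b1=-1.7, s=7.958224, n=5, x_bar=30, sxx=1000, x_g=35, t_crit=3.182
)
pi_margin = (pi_high - pi_low) / 2
print(f"{pi_low:.2f} to {pi_high:.2f} (margin {pi_margin:.2f})")
```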
