SUMMARY OUTPUT

Regression Statistics
Multiple R           0.931468903
R Square             0.867634318
Adjusted R Square    0.855225035
Standard Error       4153.770855
Observations         36

Note: If the Adjusted R Square decreases with the addition of a variable, then the added variable is harming your model and you should eliminate it.

ANOVA
              df    SS             MS          F              Significance F
Regression    3     3619064866     1.21E+09    69.91816842    3.84E-14
Residual      32    552121994.1    17253812
Total         35    4171186860

                 Coefficients    Standard Error    t Stat      P-value        Lower 95%    Upper 95%
Intercept        -1292.518556    11842.92945       -0.10914    0.913774235    -25415.8     22830.74
Machine Hours    43.57264654     3.629219357       12.00607    2.16354E-13    36.18017     50.96512
Batches          883.9995737     83.15073842       10.63129    4.96484E-12    714.6271     1053.372
Attendance       62.56808263     115.7174928       0.540697    0.592460622    -173.141     298.2769

RESIDUAL OUTPUT
Observation    Predicted Overhead Cost    Residuals
1              98988.60293                809.3970697
2              85483.89809                2320.10191
3              92316.45563                1364.544367
4              82264.42281                -2.422813371
5              100324.7233                6643.276651
6              107411.2358                513.7641725
7              115353.2529                1933.747144
8              75508.01215                1359.987853
9              104568.8617                1432.138274
10             88082.77892                655.2210812
11             106951.6134                -1121.613359
12             96689.71433                -7959.714326
13             104724.673                 -4100.672951
14             99366.62052                -509.6205216
15             101325.1602                1296.839818
16             112780.2166                -4721.216597
17             111286.9019                -1232.901859
18             90400.27066                1491.729342
19             97178.04261                1514.957392
20             108331.6286                2198.37135
21             98204.9765                 -1321.9765
22             95344.37893                4248.621074
23             98265.99636                -3701.996357
24             102463.7959                3288.204102
25             91537.72977                1686.27023
26             83108.81321                -7710.813212
27             117174.2335                -4037.233482
28             95683.57998                -10074.57998
29             101188.6953                -2690.695336
30             97938.87526                3864.124737
31             84887.97591                3483.024092
32             95455.06928                6963.930722
33             113447.22                  3735.780043
34             110291.6959                -2463.695857
35             92489.31373                -4457.313728
36             112640.5645                5302.435457
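
The note about Adjusted R Square can be checked by hand. The sketch below assumes the standard formula, Adjusted R^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), where n is the number of observations and k the number of explanatory variables. The function name is made up for illustration; the R Square values are the ones reported in the three-variable and two-variable SUMMARY OUTPUT blocks.

```python
def adjusted_r_square(r_square: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1.0 - (1.0 - r_square) * (n_obs - 1) / (n_obs - n_predictors - 1)


# R Square values copied from the two regression outputs (36 observations each).
adj_three = adjusted_r_square(0.867634318, 36, 3)  # Machine Hours, Batches, Attendance
adj_two = adjusted_r_square(0.866425021, 36, 2)    # Machine Hours, Batches

print("Three variables:", round(adj_three, 6))
print("Two variables:  ", round(adj_two, 6))
print("Attendance lowers Adjusted R Square:", adj_three < adj_two)
```

R Square itself rises slightly when Attendance is added, but the penalty for the extra variable outweighs that gain, which is why the adjusted figure is the one to compare.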

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.930819542
R Square             0.866425021
Adjusted R Square    0.858329567
Standard Error       4108.99309
Observations         36

ANOVA
              df    SS             MS             F              Significance F
Regression    2     3614020661     1807010330     107.0261279    3.75374E-15
Residual      33    557166199.1    16883824.22
Total         35    4171186860

                 Coefficients    Standard Error    t Stat         P-value        Lower 95%       Upper 95%
Intercept        3996.678209     6603.650932       0.605222512    0.549170949    -9438.550632    17431.90705
Machine Hours    43.53639812     3.5894837         12.12887472    1.04645E-13    36.23353862     50.83925761
Batches          883.6179252     82.25140753       10.74289124    2.6114E-12     716.2761784     1050.959672

RESIDUAL OUTPUT
Observation    Predicted Overhead Cost    Residuals
1              98391.35059                1406.649409
2              85522.33322                2281.666779
3              92723.59538                957.4046174
4              82428.09201                -166.0920107
5              100227.9028                6740.097234
6              107869.3954                55.60458248
7              114933.4723                2353.52769
8              75117.13407                1750.865928
9              104910.3677                1090.632275
10             88553.83418                184.1658179
11             107350.1156                -1520.115584
12             96596.62525                -7866.62525
13             105066.7568                -4442.756845
14             99463.45145                -606.4514511
15             101603.311                 1018.688967
16             112800.1888                -4741.188802
17             111055.5759                -1001.575934
18             90940.05044                951.9495609
19             96712.63492                1980.365084
20             107725.8963                2804.10374
21             97610.85237                -727.8523694
22             95313.15629                4279.843712
23             97794.73098                -3230.730981
24             102930.3164                2821.683607
25             92070.54941                1153.450589
26             82713.38076                -7315.38076
27             117634.1764                -4497.176373
28             96150.08087                -10541.08087
29             101277.6428                -2779.64283
30             98155.91213                3647.087872
31             84741.835                  3629.165001
32             95609.88762                6809.112379
33             113279.0892                3903.910818
34             109936.5195                -2108.519545
35             92022.1465                 -3990.146503
36             112227.6396                5715.360447
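
The ANOVA figures for the two-variable model are linked by simple arithmetic. As a rough sketch (variable names are illustrative, not part of the workbook), the mean squares, F, R Square and Standard Error can all be rebuilt from the two sums of squares and their degrees of freedom:

```python
# Sums of squares and degrees of freedom from the two-variable ANOVA table.
ss_regression, df_regression = 3614020661.0, 2
ss_residual, df_residual = 557166199.1, 33

# Mean square = sum of squares / degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# F is the ratio of explained to unexplained mean squares.
f_ratio = ms_regression / ms_residual

# R Square is the explained share of the total sum of squares.
r_square = ss_regression / (ss_regression + ss_residual)

# The Standard Error of the regression is the square root of the residual mean square.
standard_error = ms_residual ** 0.5

print(f"F = {f_ratio:.4f}")
print(f"R Square = {r_square:.6f}")
print(f"Standard Error = {standard_error:.3f}")
```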

Hansa is the production manager at a pharma company. One of her KPIs is keeping the overhead costs under control.
The accounts department gives her the overhead cost numbers AFTER the month is over. By then, it is too late to do anything.
To take pre-emptive action, she wants to predict the Overhead cost before the month is over.
She believes the O/H costs may be related to three variables:
1. The number of hours for which the machines have run in the factory
2. The number of batches of tablets produced. (After each batch, the machines stop working, and some adjustments are made)
3. Average attendance % of workers
She has information about the first two around the start of the month, when the production plan is firmed up.

If a relationship can be found between Overhead cost and these variables, then Hansa can choose to cut down on
some discretionary expenditure, so that the current month's Overhead cost is under control.
Here is data from the past 3 years. Can you help Hansa find a solution?
Month    Machine Hours    Batches    Attendance    Overhead Cost
1        1539             31         93            99798
2        1284             29         83            87804
3        1490             27         77            93681
4        1355             22         81            82262
5        1500             35         85            106968
6        1777             30         76            107925
7        1716             41         90            117287
8        1045             29         90            76868
9        1364             47         78            106001
10       1516             21         76            88738
11       1623             37         77            105830
12       1376             37         85            88730
13       1327             49         78            100624
14       1178             50         82            98857
15       1491             37         79            102622
16       1667             41         83            108059
17       1769             34         87            110054
18       1104             44         75            91892
19       1196             46         91            98693
20       1794             29         93            110530
21       1379             38         93            96883
22       1448             32         84            99593
23       1505             32         91            94564
24       1420             42         76            105752
25       1475             27         75            93224
26       1118             34         90            75398
27       1433             58         76            113137
28       1589             26         76            85609
29       1585             32         82            98498
30       1493             33         80            101803
31       1124             36         86            88371
32       1536             28         81            102419
33       1678             41         86            117183
34       1723             35         89            107828
35       1413             30         91            88032
36       1390             54         90            117943

SOLUTION:
Overhead Predicted = 3997 + 43.5 (Machine Hours) + 883.6 (Batches)

Only m/c hrs:    Predicted Cost = 48621 + 34.7 (m/c hours)
m/c & Batch:     Predicted Cost = 3996.678 + 43.536 (m/c hours) + 883.618 (Batches)
All Three:       Predicted Cost = -1292.52 + 43.57 (m/c hours) + 884.00 (Batches) + 62.57 (Attendance)
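
To see how the three candidate equations behave on a real month, the sketch below applies each of them to month 1 of the data (1539 machine hours, 31 batches, 93% attendance, actual overhead 99798). The dictionary keys and variable names are illustrative. The coefficients are the ones reported in the regression outputs, except for the one-variable equation, which is taken as written in the solution.

```python
# Month 1 inputs and actual overhead cost from the data sheet.
machine_hours, batches, attendance = 1539, 31, 93
actual_cost = 99798

# Each candidate model applied to the same month.
predictions = {
    "Only m/c hrs": 48621 + 34.7 * machine_hours,
    "m/c & Batch": 3996.678209 + 43.53639812 * machine_hours + 883.6179252 * batches,
    "All Three": (-1292.518556 + 43.57264654 * machine_hours
                  + 883.9995737 * batches + 62.56808263 * attendance),
}

# Residual = actual cost - predicted cost, as in the RESIDUAL OUTPUT tables.
for name, predicted in predictions.items():
    print(f"{name:13s} predicted = {predicted:10.2f}  residual = {actual_cost - predicted:9.2f}")
```

For the two regression-based models the printed values should agree with observation 1 in the corresponding RESIDUAL OUTPUT tables.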

CORREL function tells us the correlation:
Close to zero means very little correlation
Close to +1 means high positive correlation
Close to -1 means high negative correlation

[Scatter chart: Overhead vs Machine Hours]    Correlation = 0.63188453
[Scatter chart: Overhead vs Batches]          Correlation = 0.52054353
[Scatter chart: Overhead vs Attendance]       Correlation = 0.01823866
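
For readers working outside a spreadsheet, the quantity that CORREL reports is the Pearson correlation coefficient. The following is a minimal pure-Python sketch of that calculation; the function name `correl` simply mirrors the spreadsheet function and is not a library call. The sample uses only the first six months of the data, so its result will differ from the full 36-month figure.

```python
from math import sqrt


def correl(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    dev_x = sum((x - mean_x) ** 2 for x in xs)
    dev_y = sum((y - mean_y) ** 2 for y in ys)
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cross / sqrt(dev_x * dev_y)


# First six months of Machine Hours and Overhead Cost from the data sheet.
machine_hours = [1539, 1284, 1490, 1355, 1500, 1777]
overhead_cost = [99798, 87804, 93681, 82262, 106968, 107925]
print(round(correl(machine_hours, overhead_cost), 4))
```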

Summary
Multiple R    R-square    Adjusted R-square    StErr of estimate
0.9308        0.8664      0.8583               4108.993

ANOVA Table
               Degrees of freedom    Sum of squares    Mean of squares    F-Ratio     p-value
Explained      2                     3614020661        1807010330         107.0261    <0.0001
Unexplained    33                    557166199.1       16883824.22

Regression Table
                 Coefficient    Standard Error    t-value    p-value    Lower 95%    Upper 95%
Constant         3996.678       6603.651          0.6052     0.5492     -9438.551    17431.907
Machine Hours    43.536         3.589             12.1289    <0.0001    36.234       50.839
Batches          883.618        82.251            10.7429    <0.0001    716.276      1050.96

Our conclusions
Predicted Cost = 3996.678 + 43.536 (Machine Hours) + 883.618 (Batches)
R-square of 0.87 indicates that about 87% of the variance in Cost is explained by these two variables.

Std Error of estimate (=4109) indicates that about 2/3 of predicted costs will fall within plus-minus 4109 of actual costs.
About 95% of predicted costs will fall within plus-minus 8218 (two standard errors) of actual costs.
Estimation of population's regression equation:
Machine Hours and Batches are both significant explainers, because
the p-value is smaller than 10%.
Also, the t-value is larger than 3.
Also, the confidence interval does not include zero.
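
The three significance checks listed in the conclusions all come from the coefficient and its standard error. The sketch below recomputes the t-value and the 95% confidence interval for each row of the Regression Table. It assumes a two-sided 95% critical value of about 2.0345 for Student's t with 33 degrees of freedom, which is consistent with the reported intervals but is not stated in the workbook. The function names are illustrative.

```python
# Assumed two-sided 95% critical value of Student's t with 33 degrees of freedom.
T_CRITICAL_33_DF = 2.0345


def t_value(coefficient: float, std_error: float) -> float:
    """t-value = coefficient / standard error."""
    return coefficient / std_error


def confidence_interval(coefficient: float, std_error: float,
                        t_critical: float = T_CRITICAL_33_DF):
    """95% interval = coefficient +/- t_critical * standard error."""
    margin = t_critical * std_error
    return coefficient - margin, coefficient + margin


# (coefficient, standard error) pairs from the two-variable regression output.
rows = {
    "Constant": (3996.678209, 6603.650932),
    "Machine Hours": (43.53639812, 3.5894837),
    "Batches": (883.6179252, 82.25140753),
}

for name, (coef, se) in rows.items():
    lower, upper = confidence_interval(coef, se)
    includes_zero = lower <= 0 <= upper
    print(f"{name:14s} t = {t_value(coef, se):8.4f}  "
          f"CI = ({lower:.3f}, {upper:.3f})  includes zero: {includes_zero}")
```

Only the Constant has an interval that straddles zero, which matches its small t-value and large p-value.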

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.993208114
R Square             0.986462357
Adjusted R Square    0.983561433
Standard Error       352.748277
Observations         18

ANOVA
              df    SS             MS          F              Significance F
Regression    3     126939063.6    42313021    340.0511387    2.6E-13
Residual      14    1742038.857    124431.3
Total         17    128681102.5

                        Coefficients    Standard Error    t Stat      P-value        Lower 95%    Upper 95%
Intercept               -4880.839034    1800.817392       -2.71035    0.016910427    -8743.21     -1018.47
Population              0.102088211     0.015684223       6.508975    1.38122E-05    0.068449     0.135728
Advertising             120.4333299     9.93753637        12.11903    8.23941E-09    99.11943     141.7472
Previous_Advertising    269.5364867     9.84992192        27.36433    1.48129E-13    248.4105     290.6625

RESIDUAL OUTPUT
Observation    Predicted Sales    Residuals
1              16083.88494        -370.8849402
2              12708.15385        228.8461462
3              12825.6401         46.35990062
4              16310.29931        -83.29930731
5              15890.47949        -502.4794887
6              12913.64014        266.3598625
7              17219.24103        -20.24103189
8              20597.52432        76.47567636
9              19606.26641        743.733593
10             14938.67785        -494.677853
11             17463.33395        66.6660549
12             16771.10968        -60.10968493
13             9395.129542        319.8704577
14             12172.48156        75.51844485
15             13937.42528        -81.42527873
16             15458.01438        -173.0143828
17             16000.88158        -380.8815805
18             16814.81659        343.1834119

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.972365562
R Square             0.945494787
Adjusted R Square    0.938227425
Standard Error       683.8026462
Observations         18

ANOVA
              df    SS          MS          F           Significance F
Regression    2     1.22E+08    60833656    130.1015    3.34E-10
Residual      15    7013791     467586.1
Total         17    1.29E+08

                        Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept               6670.49354      592.6303          11.25574    1.03E-08    5407.332     7933.655
Advertising             117.7730459     19.24762          6.118837    1.96E-05    76.74772     158.7984
Previous_Advertising    261.7672155     18.95336          13.81112    6.19E-10    221.3691     302.1653

RESIDUAL OUTPUT
Observation    Predicted Sales    Residuals
1              16878.97092        -1165.97
2              13672.43354        -735.434
3              13541.32792        -669.328
4              16747.8653         -520.865
5              16290.10569        -902.106
6              13541.32792        -361.328
7              17336.73053        -137.731
8              20543.26792        130.7321
9              19496.64308        853.3569
10             14850.164          -406.164
11             17336.73053        193.2695
12             16421.21131        289.7887
13             9392.606168        322.3938
14             11696.06886        551.9311
15             13332.00295        523.997
16             14758.61207        526.3879
17             15255.92538        364.0746
18             16015.0059         1142.994

SUMMARY OUTPUT

Regression Statistics
Multiple R           0.512236
R Square             0.262386
Adjusted R Square    0.164037
Standard Error       2515.512
Observations         18

ANOVA
              df    SS          MS          F           Significance F
Regression    2     33764074    16882037    2.667915    0.102028
Residual      15    94917028    6327802
Total         17    1.29E+08

               Coefficients    Standard Error    t Stat      P-value     Lower 95%    Upper 95%
Intercept      6183.303        12514.08          0.494108    0.628387    -20489.8     32856.44
Population     0.050079        0.111023          0.451067    0.658394    -0.18656     0.286718
Advertising    160.4131        70.09632          2.288466    0.03704     11.00629     309.8198

RESIDUAL OUTPUT
Observation    Predicted Sales    Residuals
1              14527.54           1185.464
2              13687.11           -750.111
3              15419.19           -2547.19
4              16313.05           -86.0521
5              13925.99           1462.014
6              15462.36           -2282.36
7              17265.6            -66.6039
8              18107.28           2566.719
9              14933.22           5416.776
10             15794.63           -1350.63
11             17385.34           144.6579
12             12683.52           4027.476
13             13741.21           -4026.21
14             14583.43           -2335.43
15             15427.01           -1571.01
16             15613.16           -328.165
17             15949.92           -329.916
18             16287.42           870.5819

Can you predict sales of a restaurant in its 19th year?

Data on Sales and Other Potentially Relevant Variables for a Particular Restaurant are below.

Year    Sales       Population    Advertising    Previous_Advertising    Estimated
1       15713.00    102558.00     20.00          30.00                   16083.88482
2       12937.00    101792.00     15.00          20.00                   12708.15373
3       12872.00    104347.00     25.00          15.00                   12825.63997
4       16227.00    106180.00     30.00          25.00                   16310.29918
5       15388.00    106562.00     15.00          30.00                   15890.47936
6       13180.00    105209.00     25.00          15.00                   12913.64001
7       17199.00    109185.00     35.00          25.00                   17219.2409
8       20674.00    109976.00     40.00          35.00                   20597.5242
9       20350.00    110659.00     20.00          40.00                   19606.26628
10      14444.00    111844.00     25.00          20.00                   14938.67772
11      17530.00    111576.00     35.00          25.00                   17463.33381
12      16711.00    113784.00     5.00           35.00                   16771.10955
13      9715.00     112482.00     12.00          5.00                    9395.129402
14      12248.00    116487.00     16.00          12.00                   12172.48141
15      13856.00    117316.00     21.00          16.00                   13937.42514
16      15285.00    117830.00     22.00          21.00                   15458.01424
17      15620.00    118148.00     24.00          22.00                   16000.88144
18      17158.00    118481.00     26.00          24.00                   16814.81645
19      ???????     121069.00     28.00          26.00                   17858.96037 (Ans)

Estimated Sales = -4880.839 + 0.10208821*(Population) + 120.4333*(Advertising this yr) + 269.536487*(Advertising prev yr)
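
The year-19 answer follows directly from plugging that year's inputs into the estimated equation. A small sketch, with an illustrative function name and the coefficients taken from the three-variable sales regression output:

```python
def estimated_sales(population: float, advertising: float,
                    previous_advertising: float) -> float:
    """Fitted sales equation using the coefficients from the regression output."""
    return (-4880.839034
            + 0.102088211 * population
            + 120.4333299 * advertising
            + 269.5364867 * previous_advertising)


# Year 19 inputs from the data sheet: population, this year's and last year's advertising.
year_19 = estimated_sales(121069, 28, 26)
print(f"Estimated sales for year 19: {year_19:.2f}")

# Sanity check against a year with known sales (year 1: actual sales 15713).
year_1 = estimated_sales(102558, 20, 30)
print(f"Year 1 estimate: {year_1:.2f}, residual: {15713 - year_1:.2f}")
```

Because the printed coefficients are rounded, the result may differ from the spreadsheet's Estimated column in the last few decimal places.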

Month    Cost     Units
1        45623    601
2        46507    738
3        43343    686
4        46495    736
5        47317    756
6        41172    498
7        43974    828
8        44290    671
9        29297    305
10       47244    637
11       43185    499
12       42658    578
13       39178    641
14       41198    452
15       43505    674
16       35805    475
17       39181    536
18       40248    527
19       28157    275
20       34761    495
21       45148    568
22       33447    418
23       45686    694
24       45296    653
25       37179    471
26       41199    669
27       31259    298
28       37705    399
29       42757    549
30       47332    863
31       44914    764
32       46105    800
33       45972    609
34       46295    667
35       45218    705
36       45357    637

[Scatter chart: Units vs Cost, linear trendline]
y = 30.533x + 23651
R^2 = 0.7359

[Scatter chart: Units vs Cost, polynomial trendline]
y = -0.06x^2 + 98.35x + 5792.8
R^2 = 0.8216

Conclusion:
Sometimes, a non-linear relationship exists between two things.
In such cases, we do non-linear regression.
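
To get a feel for how the two trendlines differ, the sketch below evaluates both chart equations at a few unit volumes. It assumes x is Units and y is Cost, which is how the chart axes appear to be laid out. The function names are illustrative, and the quadratic coefficient is only shown to two decimals on the chart, so these predictions are rough.

```python
def linear_trend(units: float) -> float:
    """Linear trendline from the chart: y = 30.533x + 23651."""
    return 30.533 * units + 23651


def quadratic_trend(units: float) -> float:
    """Polynomial trendline from the chart: y = -0.06x^2 + 98.35x + 5792.8."""
    return -0.06 * units ** 2 + 98.35 * units + 5792.8


# Compare the two fits at a low, a middle and a high unit volume.
for units in (300, 601, 850):
    print(f"units = {units:4d}  linear = {linear_trend(units):9.2f}  "
          f"quadratic = {quadratic_trend(units):9.2f}")
```

The quadratic curve rises steeply at low volumes and flattens at high volumes, which the straight line cannot capture.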
