Regression
QUESTION 1
(i) Correlation Coefficient r
a) A perfect correlation
A perfect correlation can be of two types: perfect positive correlation and perfect negative
correlation. A perfect positive correlation means that the correlation coefficient is r = +1
(Jain and Ohri, 2010). Both variables then have an exact positive linear relationship with
each other; for example, the relationship of supply and price. On the other hand, a perfect
negative correlation means that the correlation coefficient is r = -1 (Jain and Ohri, 2010).
Both variables then have an exact inverse relationship with each other; for example, the
relationship of demand and supply. Figure 1 demonstrates examples of each, where all points
lie on a single line.
c) A null correlation
A null (or zero) correlation indicates no linear relationship between two variables X and Y.
In this case the correlation coefficient r is zero. Figure 3 illustrates an example of null
correlation, where the values are scattered and do not lie on a single line.
XS = Sales (000)
YA = Advert (000)

         XS      YA     XS·YA     XS²      YA²
         50       8       400     2500      64
         54      31      1674     2916     961
         52      12       624     2704     144
         21      12       252      441     144
         70      15      1050     4900     225
Total   247      78      4000    13461    1538
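$$
r = \frac{n\sum X_S Y_A - \sum X_S \sum Y_A}{\sqrt{n\sum X_S^2 - \left(\sum X_S\right)^2}\,\sqrt{n\sum Y_A^2 - \left(\sum Y_A\right)^2}}
$$

$$
\begin{aligned}
r &= \frac{5(4000) - (247)(78)}{\sqrt{5(13461) - (247)^2}\,\sqrt{5(1538) - (78)^2}} \\
  &= \frac{20000 - 19266}{\sqrt{67305 - 61009}\,\sqrt{7690 - 6084}} \\
  &= \frac{734}{\sqrt{6296}\,\sqrt{1606}} \\
  &= \frac{734}{(79.347)(40.075)} \\
  &= \frac{734}{3179.839} \\
  &= 0.231
\end{aligned}
$$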
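The calculation above can be checked with a short Python sketch. It applies the same raw-sum formula to the five data pairs from the table; the function name `pearson_r` and the variable names are illustrative choices, not part of the original assignment.

```python
from math import sqrt

# Data pairs from the Question 1 table (both in thousands).
sales = [50, 54, 52, 21, 70]     # XS
advert = [8, 31, 12, 12, 15]     # YA


def pearson_r(x, y):
    """Product-moment correlation using the raw-sum (computational) formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator


r = pearson_r(sales, advert)
print(round(r, 3))  # 0.231
```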
Discussion on Results
A correlation coefficient of r = 0.231, which is close to 0, represents a relatively weak
positive association between sales and advertising. This suggests that the company has not
developed an adequate policy for advertising its products/services. It is therefore
recommended that the company increase its advertising budget in upcoming years.
QUESTION 2
Marks (%)    Frequency (f)    Midpoints (x)    Cumulative Freq. (c.f)
 0 - 10            15               5                  15
10 - 20            14              15                  29
20 - 30            18              25                  47
30 - 40            18              35                  65
40 - 50            20              45                  85
50 - 60             8              55                  93
60 - 70            18              65                 111
70 - 80            24              75                 135
80 - 90            11              85                 146
90 - 100            5              95                 151
Total             151
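$$
\text{Median} = L_1 + \left(\frac{\frac{\sum f}{2} - B_M}{F_m}\right) \times w
$$

$$
\begin{aligned}
\text{Median} &= 40 + \left(\frac{\frac{151}{2} - 65}{20}\right) \times 10 \\
              &= 40 + \left(\frac{75.5 - 65}{20}\right) \times 10 \\
              &= 40 + \left(\frac{10.5}{20}\right) \times 10 \\
              &= 40 + 5.25 \\
              &= 45.25
\end{aligned}
$$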
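$$
\text{Mode} = L_m + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) \times w
$$

$$
\begin{aligned}
\text{Mode} &= 70 + \left(\frac{6}{6 + 13}\right) \times 10 \\
            &= 70 + 3.158 \\
            &= 73.158
\end{aligned}
$$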
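As a cross-check on the mode, the grouped-data formula can be sketched in Python. The sketch assumes equal class widths of 10 and takes the differences as the modal frequency minus the frequency of the class before and the class after it; the function name `grouped_mode` is a hypothetical helper.

```python
# Frequency table from Question 2: class lower bounds and frequencies.
lower_bounds = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
freqs = [15, 14, 18, 18, 20, 8, 18, 24, 11, 5]
width = 10  # assumption: every class has the same width


def grouped_mode(lower_bounds, freqs, width):
    """Mode of grouped data: L_m + d1 / (d1 + d2) * w."""
    m = max(range(len(freqs)), key=freqs.__getitem__)   # index of the modal class
    f_prev = freqs[m - 1] if m > 0 else 0               # frequency of the class before
    f_next = freqs[m + 1] if m < len(freqs) - 1 else 0  # frequency of the class after
    d1 = freqs[m] - f_prev
    d2 = freqs[m] - f_next
    return lower_bounds[m] + d1 / (d1 + d2) * width


mode = grouped_mode(lower_bounds, freqs, width)
print(round(mode, 3))  # 73.158
```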
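$$
Q_1 = L_{Q_1} + \left(\frac{\frac{\sum f}{4} - B_{Q_1}}{f_{Q_1}}\right) \times w
$$

$$
\begin{aligned}
Q_1 &= 20 + \left(\frac{\frac{151}{4} - 29}{18}\right) \times 10 \\
    &= 20 + \left(\frac{37.75 - 29}{18}\right) \times 10 \\
    &= 20 + \left(\frac{8.75}{18}\right) \times 10 \\
    &= 20 + (0.4861) \times 10 \\
    &= 20 + 4.861 \\
    &= 24.861
\end{aligned}
$$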
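$$
Q_3 = L_{Q_3} + \left(\frac{\frac{3\sum f}{4} - B_{Q_3}}{f_{Q_3}}\right) \times w
$$

$$
\begin{aligned}
Q_3 &= 70 + \left(\frac{\frac{3(151)}{4} - 111}{24}\right) \times 10 \\
    &= 70 + \left(\frac{113.25 - 111}{24}\right) \times 10 \\
    &= 70 + \left(\frac{2.25}{24}\right) \times 10 \\
    &= 70 + (0.09375) \times 10 \\
    &= 70 + 0.9375 \\
    &= 70.9375
\end{aligned}
$$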
$$
\begin{aligned}
Q.D &= \frac{1}{2}\left(Q_3 - Q_1\right) \\
    &= \frac{1}{2}\left(70.9375 - 24.861\right) \\
    &= \frac{46.0765}{2} \\
    &= 23.03825
\end{aligned}
$$
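The median, Q1 and Q3 all use the same interpolation inside the class that contains the target position, so one helper can reproduce the three results and the quartile deviation. This is a minimal sketch assuming equal class widths; `grouped_quantile` is a hypothetical name. Because it keeps Q1 unrounded, the quartile deviation differs from the hand calculation in the fourth decimal place.

```python
# Frequency table from Question 2: class lower bounds and frequencies.
lower_bounds = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
freqs = [15, 14, 18, 18, 20, 8, 18, 24, 11, 5]
width = 10  # assumption: every class has the same width


def grouped_quantile(lower_bounds, freqs, width, p):
    """Interpolated quantile of grouped data: L + (p*N - B) / f * w."""
    target = p * sum(freqs)   # position of the quantile, e.g. N/2 for the median
    cumulative = 0            # B: cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f >= target:
            return lower + (target - cumulative) / f * width
        cumulative += f
    raise ValueError("p must lie between 0 and 1")


median = grouped_quantile(lower_bounds, freqs, width, 0.50)
q1 = grouped_quantile(lower_bounds, freqs, width, 0.25)
q3 = grouped_quantile(lower_bounds, freqs, width, 0.75)
quartile_deviation = (q3 - q1) / 2

print(median)                        # 45.25
print(round(q1, 3))                  # 24.861
print(q3)                            # 70.9375
print(round(quartile_deviation, 3))  # 23.038
```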
QUESTION 3
(i) Product Moment Coefficient of Correlation
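          Y      X3     Y·X3       Y²     X3²
         16       5       80      256      25
         10       7       70      100      49
         33       7      231     1089      49
         15       3       45      225       9
         77       9      693     5929      81
         59       1       59     3481       1
         75       8      600     5625      64
         57       3      171     3249       9
         88      12     1056     7744     144
         26      15      390      676     225
Total   456      70     3395    28374     656

$$
r = \frac{n\sum Y X_3 - \sum Y \sum X_3}{\sqrt{n\sum Y^2 - \left(\sum Y\right)^2}\,\sqrt{n\sum X_3^2 - \left(\sum X_3\right)^2}}
$$

$$
\begin{aligned}
r &= \frac{10(3395) - (456)(70)}{\sqrt{10(28374) - (456)^2}\,\sqrt{10(656) - (70)^2}} \\
  &= \frac{33950 - 31920}{\sqrt{283740 - 207936}\,\sqrt{6560 - 4900}} \\
  &= \frac{2030}{\sqrt{75804}\,\sqrt{1660}} \\
  &= \frac{2030}{(275.325)(40.743)} \\
  &= \frac{2030}{11217.566} \\
  &= 0.181
\end{aligned}
$$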
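The same coefficient can be cross-checked with the deviation-score form of the formula, which subtracts the means first instead of working from raw sums. The sketch below uses the ten (Y, X3) pairs from the table; `pearson_r_deviation` is an illustrative name.

```python
from math import sqrt

# Data pairs from the Question 3 table.
copies_sold = [16, 10, 33, 15, 77, 59, 75, 57, 88, 26]   # Y
competing_books = [5, 7, 7, 3, 9, 1, 8, 3, 12, 15]       # X3


def pearson_r_deviation(x, y):
    """Product-moment correlation from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


r = pearson_r_deviation(competing_books, copies_sold)
print(round(r, 3))  # 0.181
```

Both forms are algebraically equivalent, so agreement with the hand calculation mainly confirms the column totals.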
Interpretation of Results
A correlation coefficient of r = 0.181, which is close to zero, does not show a strong
association between Y (number of copies sold) and X3 (number of competing books). This means
the number of copies sold does not depend much on the number of competing books.
Variables Entered                                          Variables Removed    Method
Cost, Number of Competing, Pages, Advertising Budget (a)   .                    Enter

R          R Square    Adjusted R Square    Std. Error of the Estimate
.959 (a)   .921        .857                 10.97068
Tab. 3 ANOVA (b)
Model         Sum of Squares    df    Mean Square    F         Sig.
Regression    6978.620          4     1744.655       14.496    .006 (a)
Residual      601.780           5     120.356
Total         7580.400          9
Tab. 4 Coefficients (a)
                          Unstandardised Coefficients    Standardized Coefficients
Model 1                   B          Std. Error          Beta                         t         Sig.
(Constant)                82.227     81.735                                           1.006     .361
Pages                     .126       .032                .670                         3.918     .011
Advertising Budget        -.484      2.877               -.206                        -.168     .873
Number of Competing       .428       7.909               .063                         .054      .959
Cost                      -4.946     2.808               -.456                        -1.761    .139
On the other hand, Adjusted R-Squared indicates statistical shrinkage. The adjusted R-Square
accounts for the number of predictor variables and penalizes each additional predictor
(Albright, 2013). It is the proportion of variance in the dependent variable explained by the
independent variables, adjusted for the number of predictors, and can support the selection
of an appropriate model. In this regression model the shrinkage of .064 (.921 - .857) is quite
low, which indicates that the independent variables are relevant to the dependent variable.
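The R-Square, adjusted R-Square and shrinkage figures can be reproduced from the ANOVA sums of squares. This is a sketch under two assumptions that are not stated explicitly in the output: n = 10 observations (the ten data rows in Question 3) and k = 4 predictors (the four variables entered).

```python
# Sums of squares as printed in the ANOVA table (Tab. 3).
ss_regression = 6978.620
ss_total = 7580.400

n = 10  # assumption: number of observations (ten data rows in Question 3)
k = 4   # assumption: number of predictors entered in the model

r_square = ss_regression / ss_total
adjusted_r_square = 1 - (1 - r_square) * (n - 1) / (n - k - 1)
shrinkage = r_square - adjusted_r_square

print(round(r_square, 3))           # 0.921
print(round(adjusted_r_square, 3))  # 0.857
print(round(shrinkage, 3))          # 0.064
```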
c. Significance of the Model-Significance F
Table 3 shows the statistical significance of the independent variables (cost, number of
competing books, pages, and advertising budget) as joint predictors of the dependent variable
(copies sold) through the F and Sig. values. The Analysis of Variance (ANOVA) in table 3 gives
a p-value of .006, which is below the 0.05 level, so the model as a whole is statistically
significant. This indicates that the association among the variables is unlikely to be due to
chance. Many researchers believe that the significance level should be below the 0.05 mark
(Upton and Cook, 2001; Caldwell, 2009). The high F value of 14.496 (1744.655/120.356) likewise
supports a significant relationship among the variables.
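The F ratio quoted above is the regression mean square divided by the residual mean square. The sketch below recomputes it from the sums of squares in table 3 and compares the Sig. value as printed in that table (.006) with the 0.05 level; the degrees of freedom (4 and 5) are an assumption based on four predictors and ten observations.

```python
# Values as printed in the ANOVA table (Tab. 3).
ss_regression = 6978.620
ss_residual = 601.780
reported_sig = 0.006   # Sig. column of Tab. 3

df_regression = 4      # assumption: number of predictors
df_residual = 5        # assumption: 10 observations - 4 predictors - 1

ms_regression = ss_regression / df_regression   # about 1744.655
ms_residual = ss_residual / df_residual         # about 120.356
f_ratio = ms_regression / ms_residual

alpha = 0.05
model_is_significant = reported_sig < alpha

print(round(f_ratio, 3))     # 14.496
print(model_is_significant)  # True
```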
d. Interpretation of the Model-based on Size and Sign
Table 4 gives the coefficients of the regression equation. The regression equation of this
model can be written as follows.
Y = 82.227 + .126 (pages) - .484 (advertising budget) + .428 (number of competing) - 4.946
(cost)
It is evident from the Sig. column that, except for pages with a significance value of .011,
none of the other independent variables (cost, number of competing, and advertising budget)
is a significant predictor of the dependent variable (copies sold). Similarly, the largest
Beta value, 0.670 for pages, also indicates that pages is the best predictor of copies sold.
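The reading of the Sig. column can be sketched as a simple filter over the coefficient table. The values are copied from table 4; the dictionary layout and the 0.05 threshold variable are illustrative. Each t value is the coefficient divided by its standard error, so ratios recomputed from the rounded table entries may differ slightly from the printed t values.

```python
# (B, Std. Error, Sig.) for each term, as printed in Tab. 4.
coefficients = {
    "(Constant)": (82.227, 81.735, 0.361),
    "Pages": (0.126, 0.032, 0.011),
    "Advertising Budget": (-0.484, 2.877, 0.873),
    "Number of Competing": (0.428, 7.909, 0.959),
    "Cost": (-4.946, 2.808, 0.139),
}

alpha = 0.05
significant = []
for name, (b, std_error, sig) in coefficients.items():
    t_ratio = b / std_error   # approximate t, from rounded table entries
    print(f"{name:<22} t = {t_ratio:7.3f}   Sig. = {sig:.3f}")
    if name != "(Constant)" and sig < alpha:
        significant.append(name)

print(significant)  # ['Pages']
```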
(iv) Predicting Number of Copies Sold for a Book
It is given that Pages = 350, Budget = 35, Competing books = 6, and Cost = 6.
Putting these values in the regression equation below:
Y = 82.227 + .126 (pages) - .484 (advertising budget) + .428 (number of competing) - 4.946
(cost)
Y = 82.227 + .126 (350) - .484 (35) + .428 (6) - 4.946 (6)
Y = 82.227 + 44.1 - 16.94 + 2.568 - 29.676
Y = 82.279
Therefore, approximately 82 copies are predicted to be sold for the information provided.
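The substitution can be checked with a small Python sketch of the fitted equation. The coefficient values come from table 4; the function name `predict_copies` and the keyword names are hypothetical.

```python
# Unstandardised coefficients from Tab. 4.
INTERCEPT = 82.227
COEFFICIENTS = {
    "pages": 0.126,
    "advertising_budget": -0.484,
    "competing_books": 0.428,
    "cost": -4.946,
}


def predict_copies(**values):
    """Evaluate the fitted regression equation for one book."""
    return INTERCEPT + sum(COEFFICIENTS[name] * value for name, value in values.items())


predicted = predict_copies(pages=350, advertising_budget=35, competing_books=6, cost=6)
print(round(predicted, 3))  # 82.279
```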
References
Albright, B. (2013). Essentials of Mathematical Statistics. USA, Burlington: Jones & Bartlett
Publishers.
Caldwell, S. (2009). Statistics Unplugged. 3rd edition. USA, Belmont: Cengage Learning.
Upton, G.J.G. and Cook, I.T. (2001). Introducing Statistics. 2nd edition. Oxford: Oxford
University Press.
Jain, T.R. and Ohri, V.K. (2010). Statistics for Economics. India: FK Publications.