
SIMPLE CORRELATION AND REGRESSION ANALYSIS

Correlation Analysis is used to measure the strength of the association between numerical variables.

Coefficient of Correlation is used to indicate the strength of the linear relationship between two variables (x and y), independent of their respective scales of measurement. The measure of linear correlation commonly used in statistics is called the Pearson product-moment coefficient of correlation.

Coefficient of determination (r²) is used to determine how well the least square regression line fits the sample data. It is very useful in assessing how much the errors in predicting y can be reduced by using the information provided by x.

How to compute the value of the coefficient of determination

Compute the coefficient of correlation and square its value.
The value of the coefficient of correlation (r) ranges from -1 to 1; therefore the value of the coefficient of determination (r²) is between 0 and 1.

Find out if there is a correlation between advertising expenses and sales of the company. Compute the correlation coefficient and coefficient of determination.

Month   Advertising (X)   Sales (Y)
Jan     1.2               21.2
Feb     1.4               21.8
Mar     0.5               17.0
Apr     2.1               25.5
May     2.0               26.2
Jun     1.6               22.5
Jul     1.0               19.5
Aug     0.6               17.3
Sep     0.8               17.5
Oct     1.8               24.0
Nov     1.9               23.8
Dec     1.5               22.3

Using the same example, compute the correlation coefficient and coefficient of determination. Interpret the results.

Month   Advertising (X)   Sales (Y)   XY       X²      Y²
Jan     1.2               21.2        25.44    1.44    449.44
Feb     1.4               21.8        30.52    1.96    475.24
Mar     0.5               17.0        8.50     0.25    289.00
Apr     2.1               25.5        53.55    4.41    650.25
May     2.0               26.2        52.40    4.00    686.44
Jun     1.6               22.5        36.00    2.56    506.25
Jul     1.0               19.5        19.50    1.00    380.25
Aug     0.6               17.3        10.38    0.36    299.29
Sep     0.8               17.5        14.00    0.64    306.25
Oct     1.8               24.0        43.20    3.24    576.00
Nov     1.9               23.8        45.22    3.61    566.44
Dec     1.5               22.3        33.45    2.25    497.29
Total   16.4              258.6       372.16   25.72   5682.14
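The value of r can be traced from the column totals. The sketch below assumes the usual computational form of the Pearson product-moment formula, which is the form that the XY, X² and Y² columns suggest; n = 12 is the number of months.

```latex
r = \frac{n\sum XY - \sum X \sum Y}
         {\sqrt{\left[n\sum X^{2} - \left(\sum X\right)^{2}\right]
                \left[n\sum Y^{2} - \left(\sum Y\right)^{2}\right]}}
  = \frac{12(372.16) - (16.4)(258.6)}
         {\sqrt{\left[12(25.72) - (16.4)^{2}\right]
                \left[12(5682.14) - (258.6)^{2}\right]}}
  = \frac{224.88}{\sqrt{(39.68)(1311.72)}}
  \approx \frac{224.88}{228.14}
  \approx 0.9857
```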

Interpretation: There is a very high degree of positive correlation between the advertising expenses and sales of R & B Construction Company.

Coefficient of Determination (r²)

r² = (0.9857)² = 0.9716 ≈ 0.97

Interpretation: Using the least square regression equation, approximately 97.16% of the variability in y (the dependent variable) can be explained by x (the independent variable). This means that the least square regression line does a good job of predicting the sales of the company given the advertising expenses.
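Both coefficients can be reproduced from the raw monthly data. The following is a minimal Python sketch, assuming the computational form of the Pearson formula; the function name pearson_r is only an illustrative choice.

```python
import math

# Monthly data from the table above (Jan through Dec).
x = [1.2, 1.4, 0.5, 2.1, 2.0, 1.6, 1.0, 0.6, 0.8, 1.8, 1.9, 1.5]               # advertising
y = [21.2, 21.8, 17.0, 25.5, 26.2, 22.5, 19.5, 17.3, 17.5, 24.0, 23.8, 22.3]  # sales


def pearson_r(xs, ys):
    """Pearson product-moment correlation from the column sums."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(a * b for a, b in zip(xs, ys))
    sum_xx = sum(a * a for a in xs)
    sum_yy = sum(b * b for b in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2))
    return numerator / denominator


r = pearson_r(x, y)
r2 = r ** 2  # coefficient of determination
print(round(r, 4), round(r2, 4))  # 0.9857 0.9716
```

Squaring r is enough here because there is only one independent variable.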

Regression Analysis is used primarily for the purpose of prediction. It helps to determine the relationship that may exist between variables. The goal is to develop a statistical model that can be used to predict the values of a dependent variable based on the values of at least one independent variable. The value being predicted or explained is called the dependent variable, while the variable that is used to predict or explain the dependent variable is called the independent variable.

Linear Regression is the method used to determine the relationship between two variables through a linear equation called the least-square regression equation.

The Least Square Regression Equation

Y = a + bX

where
a is the y-intercept (the value of y at the point where x is equal to zero)
b is the slope of the line that represents the equation
X is the independent variable used to predict Y
Y is the dependent variable to be predicted

Scatter Diagram is a graph in which each observation is represented by a dot. The explanatory or independent variable is scaled on the x-axis and the dependent or explained variable is scaled on the y-axis.

Trend line is a line that represents the series of plotted points, in which the sum of the vertical distances of the points above the line is approximately equal to the sum of the vertical distances of the points below the line.
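The balancing property in this definition can be checked numerically. The sketch below is an illustration only: it assumes the advertising and sales figures from the earlier example and fits the line with the standard least-squares formulas for a and b, for which the distances above and below the line balance exactly (up to floating-point rounding).

```python
# Advertising (x) and sales (y) figures from the example, Jan through Dec.
x = [1.2, 1.4, 0.5, 2.1, 2.0, 1.6, 1.0, 0.6, 0.8, 1.8, 1.9, 1.5]
y = [21.2, 21.8, 17.0, 25.5, 26.2, 22.5, 19.5, 17.3, 17.5, 24.0, 23.8, 22.3]

n = len(x)
sum_xy = sum(xi * yi for xi, yi in zip(x, y))
sum_xx = sum(xi * xi for xi in x)

# Standard least-squares slope and intercept (an assumption of this sketch).
b = (n * sum_xy - sum(x) * sum(y)) / (n * sum_xx - sum(x) ** 2)
a = (sum(y) - b * sum(x)) / n

# Vertical distance of each point from the line (positive means above the line).
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
above = sum(res for res in residuals if res > 0)
below = -sum(res for res in residuals if res < 0)

print(round(above, 3), round(below, 3))  # the two totals agree
```

A trend line drawn by eye will only balance approximately, which is why the definition says "approximately equal".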

The management at R & B Construction Company is interested in determining the extent of the relationship between advertising and sales. The company's general manager has collected the following data on advertising expenditures and gross sales for the past twelve months:

Month   Advertising (X)   Sales (Y)
Jan     1.2               21.2
Feb     1.4               21.8
Mar     0.5               17.0
Apr     2.1               25.5
May     2.0               26.2
Jun     1.6               22.5
Jul     1.0               19.5
Aug     0.6               17.3
Sep     0.8               17.5
Oct     1.8               24.0
Nov     1.9               23.8
Dec     1.5               22.3

There are three trend lines drawn in the scatter diagram. Trend line B is the best trend line for the following reasons:

a. The line passes through the points.
b. The line approximates the general direction of the points.

Using trend line B, at an advertising cost of 2.5, the sales are 27.5. This means that when the advertising cost is P250,000, the sales of the company will amount to P2,750,000.

Finding the least square regression equation

Month   Advertising (X)   Sales (Y)   XY       X²      Y²
Jan     1.2               21.2        25.44    1.44    449.44
Feb     1.4               21.8        30.52    1.96    475.24
Mar     0.5               17.0        8.50     0.25    289.00
Apr     2.1               25.5        53.55    4.41    650.25
May     2.0               26.2        52.40    4.00    686.44
Jun     1.6               22.5        36.00    2.56    506.25
Jul     1.0               19.5        19.50    1.00    380.25
Aug     0.6               17.3        10.38    0.36    299.29
Sep     0.8               17.5        14.00    0.64    306.25
Oct     1.8               24.0        43.20    3.24    576.00
Nov     1.9               23.8        45.22    3.61    566.44
Dec     1.5               22.3        33.45    2.25    497.29
Total   16.4              258.6       372.16   25.72   5682.14
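From these totals, the slope b and the intercept a can be worked out as follows. The formulas shown are the standard least-squares ones and are assumed here to be the ones intended, with n = 12.

```latex
b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^{2} - \left(\sum X\right)^{2}}
  = \frac{12(372.16) - (16.4)(258.6)}{12(25.72) - (16.4)^{2}}
  = \frac{224.88}{39.68}
  \approx 5.67

a = \frac{\sum Y - b\sum X}{n}
  = \frac{258.6 - 5.6673(16.4)}{12}
  = \frac{165.6563}{12}
  \approx 13.80
```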

Least Square Regression Equation

Y = a + bx

Y = 13.80 + 5.67x

Using the least square regression equation, solve for y when x = 2.5.

Y = 13.80 + 5.67x
= 13.80 + 5.67(2.5)
Y = 27.975
When the advertising cost is P250,000, the sales of the company amount to P2,797,500.

Using the least square regression equation, solve for y when x = 3.0 and x = 3.7.

When x = 3.0

Y = 13.80 + 5.67x
= 13.80 + 5.67(3.0)
Y = 30.81

When the advertising cost amounts to P300,000, the sales of the company will be P3,081,000.

When x = 3.7

Y = 13.80 + 5.67x
= 13.80 + 5.67(3.7)
Y = 34.779

When the advertising cost amounts to P370,000, the sales of the company will be P3,477,900.
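The fit and the three predictions can be reproduced with a short Python sketch. The helper names fit_line and predict are illustrative, the coefficients are rounded to two decimals to match the worked figures, and the factor of 100,000 pesos per unit is inferred from the pairing of x = 2.5 with P250,000.

```python
# Advertising (x) and sales (y) figures from the table, Jan through Dec.
x = [1.2, 1.4, 0.5, 2.1, 2.0, 1.6, 1.0, 0.6, 0.8, 1.8, 1.9, 1.5]
y = [21.2, 21.8, 17.0, 25.5, 26.2, 22.5, 19.5, 17.3, 17.5, 24.0, 23.8, 22.3]

PESOS_PER_UNIT = 100_000  # inferred: x = 2.5 corresponds to P250,000


def fit_line(xs, ys):
    """Return (a, b) of the least-squares line Y = a + bX."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(xi * yi for xi, yi in zip(xs, ys))
    sum_xx = sum(xi * xi for xi in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b


def predict(x_value, a, b):
    """Predicted Y for a given X on the line Y = a + bX."""
    return a + b * x_value


a, b = fit_line(x, y)
a2, b2 = round(a, 2), round(b, 2)  # 13.8 and 5.67, as in the worked equation

for x_new in (2.5, 3.0, 3.7):
    y_hat = predict(x_new, a2, b2)
    print(f"x = {x_new}: Y = {y_hat:.3f}, sales = P{round(y_hat * PESOS_PER_UNIT):,}")
```

With the unrounded coefficients the predictions differ slightly in the third decimal place. All three x values also lie above the largest observed advertising figure (2.1), so these are extrapolations beyond the data.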
