
1. Differentiate between correlation and regression. Explain with suitable examples using data.
Correlation is a measure of the degree of relatedness of variables. It can help a business
researcher determine, for example, whether the stocks of two companies rise and fall in any
related manner. For a sample of pairs of data, correlation analysis can yield a numerical value
that represents the degree of relatedness of the two stock prices over time.
Several measures of correlation are available, the selection of which depends mostly on the level
of data being analyzed. Ideally, researchers would like to solve for ρ, the population coefficient
of correlation. However, because researchers virtually always deal with sample data, this section
introduces a widely used sample coefficient of correlation, r. The statistic r is the Pearson
product-moment correlation coefficient, named after Karl Pearson.
Regression analysis is the process of constructing a mathematical function that can be used to
predict or determine one variable by another variable or other variables. The most elementary
regression model is called simple regression or bivariate regression, involving two variables in
which one variable is predicted by another variable. The variable to be predicted is called the
dependent variable and is designated as y. The predictor is called the independent variable, or
explanatory variable, and is designated as x.
The regression line
The equation of the regression line is
y = mx + b
where m = slope of the line
b = y-intercept of the line
Example: Suppose a department store gives in-service training to its salesmen, which is
followed by a test. Suppose 9 salesmen have taken the test. Their test scores,
along with the sales made by each salesman, are as follows:
Test scores:             14   19   24   21   26   22   15   20   19
Sales (in INR 1,000):    31   36   48   37   50   45   33   41   39

The management is considering whether it should terminate the services of the salesmen who did
not do well in the test. Whether terminating the services of those salesmen who did not do well
in the test, or who secured the lowest scores, is justified can be found out by calculating the
correlation between the test scores and the sales made by them.
However, if the management wants to fix a minimum sales volume of a certain amount, let us say
Rs 30,000 from each salesman, what is the minimum test score below which a salesman's services
would be terminated? For this, the line of regression is used to find the minimum test score.
Let X denote the test scores of the salesmen and Y denote their corresponding sales (in Rs 1,000).
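Calculation for Correlation and Regression

Here x = X - X̄ = X - 20 and y = Y - Ȳ = Y - 40 are the deviations from the means.

  X      Y      x      y      x²     y²     xy
 14     31     -6     -9     36     81     54
 19     36     -1     -4      1     16      4
 24     48      4      8     16     64     32
 21     37      1     -3      1      9     -3
 26     50      6     10     36    100     60
 22     45      2      5      4     25     10
 15     33     -5     -7     25     49     35
 20     41      0      1      0      1      0
 19     39     -1     -1      1      1      1

ΣX = 180, ΣY = 360, Σx = 0, Σy = 0
Σx² = 120, Σy² = 346, Σxy = 193
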
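The column totals of the calculation table can be checked with a few lines of Python. This is only a minimal sketch for verification; the variable names (`dx`, `dy`, `sum_x2`, and so on) are illustrative and not part of the original working.

```python
# Test scores (X) and sales in Rs 1,000 (Y) for the 9 salesmen in the example.
X = [14, 19, 24, 21, 26, 22, 15, 20, 19]
Y = [31, 36, 48, 37, 50, 45, 33, 41, 39]

n = len(X)
mean_x = sum(X) / n  # 180 / 9 = 20.0
mean_y = sum(Y) / n  # 360 / 9 = 40.0

# Deviations from the means (the x and y columns of the table).
dx = [xi - mean_x for xi in X]
dy = [yi - mean_y for yi in Y]

# Column totals used in the later formulas.
sum_x2 = sum(d * d for d in dx)
sum_y2 = sum(d * d for d in dy)
sum_xy = sum(a * b for a, b in zip(dx, dy))

print(sum(X), sum(Y), sum_x2, sum_y2, sum_xy)  # 180 360 120.0 346.0 193.0
```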

X̄ = ΣX / N = 180 / 9 = 20
Ȳ = ΣY / N = 360 / 9 = 40
byx = Coefficient of regression of Y on X
= Σxy / Σx² = 193 / 120 = 1.6083
bxy = Coefficient of regression of X on Y
= Σxy / Σy² = 193 / 346 = 0.5578
Karl Pearson's correlation coefficient r between X and Y is given by:
r² = byx · bxy = 1.6083 × 0.5578 = 0.8971
r = √0.8971 = 0.9471
Since the regression coefficients are positive, r is also positive:
r = +0.9471
Alternatively:
r = Σxy / √(Σx² · Σy²)
= 193 / √(120 × 346)
= 193 / √41520
= 193 / 203.7646
= 0.9471
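As a cross-check, the two regression coefficients and both routes to r can be computed directly from the data. The sketch below is illustrative only and its names are assumptions. At full precision r comes out near 0.9472, which the hand calculation reports as 0.9471.

```python
import math

# Test scores (X) and sales in Rs 1,000 (Y) from the example.
X = [14, 19, 24, 21, 26, 22, 15, 20, 19]
Y = [31, 36, 48, 37, 50, 45, 33, 41, 39]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

sum_x2 = sum((xi - mean_x) ** 2 for xi in X)
sum_y2 = sum((yi - mean_y) ** 2 for yi in Y)
sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(X, Y))

# Regression coefficients of Y on X and of X on Y.
b_yx = sum_xy / sum_x2
b_xy = sum_xy / sum_y2

# Route 1: r is the square root of b_yx * b_xy, taking the sign of the coefficients.
r_from_coeffs = math.copysign(math.sqrt(b_yx * b_xy), b_yx)

# Route 2: the direct product-moment formula.
r_direct = sum_xy / math.sqrt(sum_x2 * sum_y2)

print(round(b_yx, 4), round(b_xy, 4))
print(r_from_coeffs, r_direct)
```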
Thus, we see that there is a very high degree of positive correlation between the test scores (X)
and the sales (in Rs 1,000) (Y). This justifies the proposal for the termination of service of those
with low test scores.
Regression Equations
To obtain the test score (X) for given sales (Y), we use the equation of the line of regression of
X on Y.
The equation of the line of regression of X on Y is:
X - X̄ = bxy (Y - Ȳ)
X - 20 = 0.5578 (Y - 40) = 0.5578Y - 22.312
X = 0.5578Y - 22.312 + 20
X = 0.5578Y - 2.312        (i)
Hence, to ensure the continuation of service, the minimum test score (X) corresponding to a
minimum sales volume (Y) of Rs 30,000 = 30 ('000 Rs) is obtained by putting Y = 30 in (i)
and is given by:
X = 0.5578 × 30 - 2.312 = 16.734 - 2.312
= 14.422 ≈ 14
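The same cut-off can be reproduced in code. The following sketch fits the line of regression of X on Y from the raw data and evaluates it at Y = 30; the function name `min_test_score` is a hypothetical helper introduced here for illustration.

```python
# Test scores (X) and sales in Rs 1,000 (Y) from the example.
X = [14, 19, 24, 21, 26, 22, 15, 20, 19]
Y = [31, 36, 48, 37, 50, 45, 33, 41, 39]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

sum_y2 = sum((yi - mean_y) ** 2 for yi in Y)
sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(X, Y))

# Regression of X on Y: X - mean_x = b_xy * (Y - mean_y).
b_xy = sum_xy / sum_y2
intercept = mean_x - b_xy * mean_y  # about -2.312


def min_test_score(sales_thousands):
    """Test score predicted by the X-on-Y line for a given sales volume."""
    return intercept + b_xy * sales_thousands


print(round(min_test_score(30), 3))  # 14.422
```

Because X is being estimated from Y, the X-on-Y line is used here rather than the Y-on-X line solved for X; the two lines coincide only when the correlation is perfect.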

To estimate the sales volume (Y) of a salesman with a given test score (X), we use the line of
regression of Y on X, which is given by:
Y - Ȳ = byx (X - X̄)
Y - 40 = 1.6083 (X - 20) = 1.6083X - 32.1660
Y = 1.6083X - 32.1660 + 40
Y = 1.6083X + 7.8340
For a test score of X = 14.422, this gives:
Y = 1.6083 × 14.422 + 7.8340 = 23.195 + 7.8340
Y = 31.029 ≈ 31
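A corresponding sketch for the line of regression of Y on X is given below. It assumes the same nine data pairs, and `predict_sales` is a hypothetical helper name used only for illustration.

```python
# Test scores (X) and sales in Rs 1,000 (Y) from the example.
X = [14, 19, 24, 21, 26, 22, 15, 20, 19]
Y = [31, 36, 48, 37, 50, 45, 33, 41, 39]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n

sum_x2 = sum((xi - mean_x) ** 2 for xi in X)
sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(X, Y))

# Regression of Y on X: Y - mean_y = b_yx * (X - mean_x).
b_yx = sum_xy / sum_x2
intercept = mean_y - b_yx * mean_x  # about 7.833


def predict_sales(test_score):
    """Sales (in Rs 1,000) predicted by the Y-on-X line for a given test score."""
    return intercept + b_yx * test_score


print(round(predict_sales(14.422), 3))  # 31.029
```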
