
MEASURES OF RELATIONSHIP

Some variables in nature are related in such a way that if we know one of
them, the others can be estimated. For example, bright parents will most likely have
bright children, so if we know the IQ of the parents, we can make an educated guess about
their children's IQ. The farther you travel in a vehicle, the more gasoline you consume.
The higher the sun is above the horizon, the shorter the shadows of objects.

CORRELATION

A correlation is a relationship or association between two variables.

A direct or positive relationship between two variables
implies that an increase in the value of one of the variables
corresponds to an increase in the value of the other variable.

An inverse or negative relationship between two variables
means that an increase in the value of one variable corresponds
to a decrease in the value of the other variable.

A zero relationship exists between two variables if an
increase in one is not accompanied by either an increase or a
decrease in the other.

In the language of statistics, the relationship between two variables is
termed the correlation between the two variables. Thus, we have,
correspondingly, positive correlation, negative correlation and zero
correlation. We say that there is a positive correlation between achievement
in English and achievement in Mathematics, a negative correlation between
pressure and volume at constant temperature, and zero (or no) correlation
between mental ability (IQ) and weight.

These conclusions are descriptive and they may not be sufficient to
understand the meaning of correlation. There is a need to be more precise
in expressing relationships between two variables. To be more precise
means to be able to express this relationship in numerical terms.

A correlation coefficient is a numerical measure of the
linear relationship between two variables.
Based on the formula derived by Karl Pearson, the correlation
coefficient has a range extending from -1 to +1.

-1 -0.5 0 +0.5 +1

Consider the number line. Correlation coefficients between +0.5 and
+1 are considered highly positive, while coefficients between -1 and -0.5 are
considered highly negative. Coefficients between 0 and +0.5 are considered
mildly positive, while coefficients between -0.5 and 0 are considered mildly
negative. Finally, coefficients close to zero imply that no correlation exists
between the two variables. A more precise meaning attached to each
coefficient is dealt with in inferential statistics.
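The rough ranges on the number line can be sketched as a small Python helper. This is only an illustration: the function name describe_r and the cut-off near_zero = 0.1 for "close to zero" are assumptions, since the text gives no exact threshold.

```python
def describe_r(r, near_zero=0.1):
    """Label a correlation coefficient using the rough ranges in the text.

    near_zero is an assumed cut-off for "close to zero"; the text gives none.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and +1")
    if abs(r) < near_zero:
        return "no correlation"
    if r >= 0.5:
        return "highly positive"
    if r > 0:
        return "mildly positive"
    if r <= -0.5:
        return "highly negative"
    return "mildly negative"


for value in (0.8, 0.3, 0.02, -0.3, -0.9):
    print(value, describe_r(value))
```

These labels are descriptive only; whether a coefficient is meaningfully different from zero is a question for inferential statistics.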

Each correlation coefficient is computed using its own derived
formula. Which one is used depends on the type of data one is dealing
with. Recall that there are 4 types of data: nominal
dichotomous, ordinal, interval and ratio.

PEARSON PRODUCT-MOMENT CORRELATION

By assuming a linear relationship between two quantities x and y, the
famous British statistician Karl Pearson derived a formula for expressing the
correlation between x and y as a number. The formula is named in
his honor: the Pearson Product-Moment Correlation Coefficient.

The Pearson Product-Moment Correlation coefficient rxy is a measure
of the linear correlation of two variables which are either ratio or interval.

$$
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n - 1)\, s_x\, s_y} = \frac{\sum_{i=1}^{n} z_{x_i}\, z_{y_i}}{n - 1}
$$

where xi = any x value
yi = any y value
x̄ = mean of x
ȳ = mean of y
sx = standard deviation of x
sy = standard deviation of y
n = number of pairs of x and y
zx = standard score for x
zy = standard score for y

The term n – 1 is used for samples, while n is used when dealing with
populations. The location of n – 1 or n in the denominator makes rxy dependent on the
size of the sample.
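As a minimal sketch of the definition above, the following Python code computes rxy for a sample (divisor n – 1) in both forms, from the deviations and from the standard scores. The function names and the five data pairs are made up for illustration.

```python
import math


def mean_and_sd(values):
    """Return the mean and the sample standard deviation (divisor n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return mean, sd


def pearson_r(xs, ys):
    """Pearson r from the sum of cross-products of deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    mean_x, s_x = mean_and_sd(xs)
    mean_y, s_y = mean_and_sd(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)


def pearson_r_from_z(xs, ys):
    """Pearson r as the sum of products of standard scores over n - 1."""
    n = len(xs)
    mean_x, s_x = mean_and_sd(xs)
    mean_y, s_y = mean_and_sd(ys)
    z_x = [(x - mean_x) / s_x for x in xs]
    z_y = [(y - mean_y) / s_y for y in ys]
    return sum(a * b for a, b in zip(z_x, z_y)) / (n - 1)


# Made-up data: 5 pairs of (x, y)
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(pearson_r(xs, ys))         # approximately 0.7746
print(pearson_r_from_z(xs, ys))  # same value, up to rounding
```

Both functions use the same n – 1 divisor for sx, sy and the final quotient, which is what makes the two forms agree.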

A more convenient form is derived by expanding the numerator and simplifying sx
and sy.

$$
r_{xy} = \frac{n \sum_{i=1}^{n} x_i y_i - \sum_{i=1}^{n} x_i \sum_{i=1}^{n} y_i}{\sqrt{\left[ n \sum_{i=1}^{n} x_i^2 - \left( \sum_{i=1}^{n} x_i \right)^2 \right] \left[ n \sum_{i=1}^{n} y_i^2 - \left( \sum_{i=1}^{n} y_i \right)^2 \right]}}
$$

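The raw-score form needs only five running sums, which is why it is convenient for hand or calculator work. A minimal Python sketch follows; the function name and the data are made up for illustration.

```python
import math


def pearson_r_raw(xs, ys):
    """Pearson r from raw sums: no means or standard deviations needed."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when x or y has no variation")
    return numerator / denominator


# Made-up data: sums are 15, 20, 66, 55 and 86
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
print(pearson_r_raw(xs, ys))  # 30 / sqrt(1500), approximately 0.7746
```

For these data the numerator is 5(66) – (15)(20) = 30 and the denominator is the square root of (275 – 225)(430 – 400) = 1500.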

SPEARMAN’S RANK CORRELATION COEFFICIENT

When the two variables to be correlated are both measured on the
ordinal scale, Spearman's Rank Correlation Coefficient is used. For
example, 10 candidates for a managerial position were ranked on their
presentation of a business plan.
The British psychologist Charles Spearman (1863-1945) derived a
formula for the rank correlation rs. The formula is

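$$
r_s = 1 - \frac{6 \sum_{i=1}^{n} (x_i - y_i)^2}{n (n^2 - 1)}
$$

where xi and yi are the two ranks given to the i-th item and n is the number of pairs of ranks.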
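A minimal Python sketch of the rank formula is given below. It assumes the data are already ranks with no ties, since the formula is exact only in that case. The function name and the two sets of ranks (imagined as coming from two judges) are made up for illustration.

```python
def spearman_rs(rank_x, rank_y):
    """Spearman's rank correlation from two lists of untied ranks."""
    if len(rank_x) != len(rank_y) or len(rank_x) < 2:
        raise ValueError("need two equal-length lists with at least 2 ranks")
    n = len(rank_x)
    # Sum of squared differences between paired ranks
    sum_d2 = sum((x - y) ** 2 for x, y in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


# Made-up ranks of 5 candidates from two hypothetical judges
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]
print(spearman_rs(judge_a, judge_b))  # sum of d^2 is 4, so rs = 1 - 24/120 = 0.8
```

Identical rankings give rs = +1 and exactly reversed rankings give rs = -1.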

PHI COEFFICIENT

When both variables are nominal and dichotomous, the most appropriate correlation
coefficient to use is the Phi Coefficient. Refer to the Table of Data for the Phi
Coefficient:

                        Variable x
                        1         2         Total
Variable y      1       a         b         a + b
                2       c         d         c + d
            Total       a + c     b + d

The Phi Coefficient rφ is

$$
r_\varphi = \frac{ad - bc}{\sqrt{(a + b)(c + d)(a + c)(b + d)}}
$$

This formula was first derived by Karl Pearson in 1901.

The Phi coefficient rφ is the measure of the correlation between
two real nominal dichotomous variables.
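The Phi Coefficient can be computed directly from the four cell counts of the table. A minimal Python sketch follows; the function name and the counts are made up for illustration.

```python
import math


def phi_coefficient(a, b, c, d):
    """Phi coefficient from the four cell counts of a 2 x 2 table."""
    # Product of the two row totals and the two column totals
    margins = (a + b) * (c + d) * (a + c) * (b + d)
    if margins == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (a * d - b * c) / math.sqrt(margins)


# Made-up counts: a = 30, b = 10, c = 10, d = 30
print(phi_coefficient(30, 10, 10, 30))  # (900 - 100) / sqrt(40**4) = 0.5
```

The sign of rφ depends on how the categories are coded as 1 and 2, so only its size is usually interpreted for purely nominal variables.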

POINT-BISERIAL CORRELATION
There is another correlation that is a special case of the Pearson product-moment
correlation. It is called the Point-Biserial Correlation rpb. It correlates a real dichotomous
variable with an interval variable. For example, the score x in a test can be correlated with
gender y, categorized as male (1) or female (0).

The formula is derived from the Pearson r:

$$
r_{pb} = \frac{\bar{x}_1 - \bar{x}_0}{s_x} \sqrt{\frac{n_1 n_0}{n (n - 1)}}
$$


where x̄1 = the mean of the x values labeled 1 in the real dichotomous variable y

x̄0 = the mean of the x values labeled 0 in the real dichotomous variable y

n1 = the number of samples labeled 1 in y

n0 = the number of samples labeled 0 in y

n = the total number of samples, n = n0 + n1

sx = the standard deviation of all the x values


The point-biserial correlation measures the correlation
between a real dichotomous variable and an interval variable.
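A minimal Python sketch of the point-biserial formula is shown below. It assumes y is coded 0 or 1 and that sx is the sample standard deviation (divisor n – 1), which matches the n(n – 1) term under the square root. The function name and the scores are made up for illustration.

```python
import math
import statistics


def point_biserial(xs, ys):
    """Point-biserial correlation between scores xs and 0/1 labels ys."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    group1 = [x for x, y in zip(xs, ys) if y == 1]
    group0 = [x for x, y in zip(xs, ys) if y == 0]
    n1, n0 = len(group1), len(group0)
    n = n1 + n0
    if n != len(xs):
        raise ValueError("ys must contain only 0 and 1")
    if n1 == 0 or n0 == 0:
        raise ValueError("both groups must contain at least one value")
    s_x = statistics.stdev(xs)  # sample standard deviation, divisor n - 1
    if s_x == 0:
        raise ValueError("r is undefined when x has no variation")
    mean_diff = statistics.mean(group1) - statistics.mean(group0)
    return (mean_diff / s_x) * math.sqrt(n1 * n0 / (n * (n - 1)))


# Made-up test scores with gender coded male = 1, female = 0
scores = [2, 4, 6, 8]
gender = [0, 0, 1, 1]
print(point_biserial(scores, gender))  # approximately 0.8944
```

Applying the Pearson formula to the same scores and 0/1 codes gives the same value, 4 / sqrt(20), which illustrates that rpb is a special case of the Pearson r.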
