Correlation and Association
Pearson's Correlation
Special Correlations
Partial, Part (Semi-partial) and Multiple Correlation
Testing Difference between Correlation Coefficients
Methods for Ordinal Data: Spearman's rho and
Kendall's tau
Additional Methods for Ordinal Data: Goodman and
Kruskal's Gamma, Kendall's Coefficient of Concordance
Methods with Categorical/Nominal Data
Basic Aspects of Correlation:
Direction, Strength, and Linearity

Direction: Positive, negative, and no correlation

Strength: $r^2$ (r squared) = % of variance explained

Linearity: Linear relations and nonlinear relations

Types of Correlation and Association
Continuous variables, ordinal variables, nominal variables
Testing Hypotheses for Correlation
Rho ($\rho$): population correlation
Correlation and Causality
(a) X causes Y, (b) Y causes X, (c) a third variable Z causes
both X and Y, and X and Y are not causally related, (d) X and Y
cause each other cyclically (one after another, generally
over time), (e) X causes Z, which causes Y, but X may or may not
directly cause Y, (f) X and Y are not causally related and the
correlation is spurious.
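Pearson's Correlation
Karl Pearson (1896)
Logic of Pearson's r: the covariance of X and Y divided by the product of their standard deviations
$$\mathrm{Cov}_{xy} = \sigma_{xy} = \frac{\sum_{i=1}^{N} (X_i - \bar{X})(Y_i - \bar{Y})}{N}$$
$$\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}, \qquad r = \frac{\mathrm{Cov}_{XY}}{S_X S_Y}$$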

Small Data Example
(6,12),(11,15),(5,10), (12,19), (16,16), (10,17), (8,12),
(7,11), (14,18), (11,20)

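$$\mathrm{Cov}_{xy} = s_{xy} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} = \frac{86}{10 - 1} = \frac{86}{9} = 9.55$$
$$r = \frac{\mathrm{Cov}_{XY}}{S_X S_Y} = \frac{9.55}{3.527 \times 3.559} = \frac{9.55}{12.555} = +0.761$$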
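The hand computation for the small data example can be checked with a short script. This is a minimal sketch in plain Python; the function name `pearson_r` is an illustrative choice and not something taken from the slides.

```python
import math

# The ten (X, Y) pairs from the small data example.
pairs = [(6, 12), (11, 15), (5, 10), (12, 19), (16, 16),
         (10, 17), (8, 12), (7, 11), (14, 18), (11, 20)]


def pearson_r(pairs):
    """Return (covariance, S_X, S_Y, r), using n - 1 in the denominators."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sum of cross-products and sums of squares of the deviations.
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sum_xx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sum_yy = sum((y - mean_y) ** 2 for _, y in pairs)
    cov_xy = sum_xy / (n - 1)
    s_x = math.sqrt(sum_xx / (n - 1))
    s_y = math.sqrt(sum_yy / (n - 1))
    return cov_xy, s_x, s_y, cov_xy / (s_x * s_y)


cov_xy, s_x, s_y, r = pearson_r(pairs)
print(f"cov = {cov_xy:.3f}, S_X = {s_x:.3f}, S_Y = {s_y:.3f}, r = {r:.4f}")
```

The sum of cross-products is 86, so the covariance is 86/9, and the resulting r agrees with the slide value of about 0.761.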
Adjusted Correlation Coefficient: Unbiased coefficient
$$r_{adj} = \sqrt{1 - \frac{(1 - r^2)(n - 1)}{n - 2}}$$
$$r_{adj} = \sqrt{1 - \frac{(1 - 0.761^2)(10 - 1)}{10 - 2}} = \sqrt{1 - \frac{0.4207 \times 9}{8}} = \sqrt{1 - 0.4733} = 0.7257$$
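The adjustment is a one-line calculation once the square root is in place. The sketch below assumes the formula r_adj = sqrt(1 - (1 - r^2)(n - 1)/(n - 2)); the helper name `adjusted_r` and the clamping of negative values to zero are assumptions of this example.

```python
import math


def adjusted_r(r, n):
    """Adjusted correlation: sqrt(1 - (1 - r^2)(n - 1) / (n - 2))."""
    value = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # For very small r the bracket can be negative; clamping to 0 is an
    # assumption of this sketch, not something stated on the slide.
    return math.sqrt(max(value, 0.0))


r_adj = adjusted_r(0.7611, 10)
print(f"adjusted r = {r_adj:.4f}")
```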
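Hypothesis Testing About Population
Correlation: Significance of Correlation
$$H_0: \rho = 0$$
$$H_A: \rho \neq 0$$
Null: population correlation is zero
$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$$

$$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.7611\sqrt{10 - 2}}{\sqrt{1 - 0.7611^2}} = 3.3188$$
Two-tailed critical values with df = n - 2 = 8: t(8) = 2.306 at p = .05; t(8) = 3.355 at p = .01
The obtained t is significant at the .05 level (3.3188 > 2.306) but not at the .01 level
Decision: reject the null hypothesis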
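The test statistic can be reproduced as follows. This is a minimal sketch; the function name `t_for_r` is hypothetical, and the .05 critical value for 8 degrees of freedom is copied from the slide rather than computed.

```python
import math


def t_for_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n = 10
t = t_for_r(0.7611, n)
T_CRIT_05 = 2.306  # two-tailed .05 critical value for df = 8, from the slide
significant = abs(t) > T_CRIT_05
print(f"t({n - 2}) = {t:.4f}, significant at .05: {significant}")
```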
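Testing Hypothesis That
Population Correlation is a Specific Value

Fisher's Z transformation of r:
$$r' = 0.5 \log_e \left( \frac{1 + r}{1 - r} \right), \qquad \rho' = 0.5 \log_e \left( \frac{1 + \rho}{1 - \rho} \right)$$

$$z = \frac{r' - \rho'}{S_{r'}} = \frac{r' - \rho'}{1 / \sqrt{n - 3}}$$

$$H_0: \rho = .5 \qquad z = \frac{.998 - .549}{1 / \sqrt{10 - 3}} = 1.189$$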
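A short script can confirm the value of z for the running example (r = 0.7611, n = 10, hypothesized population correlation .5). It is a sketch using only the standard library; `math.atanh` is the same function as Fisher's transformation, and the function name `z_test_rho` is an illustrative choice.

```python
import math
from statistics import NormalDist


def z_test_rho(r, rho0, n):
    """z test of H0: rho = rho0 using Fisher's transformation."""
    r_prime = math.atanh(r)        # 0.5 * ln((1 + r) / (1 - r))
    rho_prime = math.atanh(rho0)
    se = 1 / math.sqrt(n - 3)      # standard error of r'
    z = (r_prime - rho_prime) / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


z, p_two_sided = z_test_rho(0.7611, 0.5, 10)
print(f"z = {z:.3f}, two-sided p = {p_two_sided:.3f}")
```

Because z is below 1.96, the hypothesis that the population correlation equals .5 is not rejected at the .05 level.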
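Confidence Interval (CI) on
Population Correlation

$$p\{r_1 \le \rho \le r_2\} = 1 - \alpha$$

$$CI(\rho') = r' \pm (z_{\alpha/2} S_{r'}) = r' \pm \frac{z_{\alpha/2}}{\sqrt{n - 3}}$$

The limits on $\rho'$ are converted back to the correlation scale to give the interval for $\rho$:
$$CI(\rho) = .253 \le \rho \le .94$$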
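The interval for the running example (r = 0.7611, n = 10) can be reproduced by building the interval on the transformed scale and converting the limits back with the hyperbolic tangent. This is a sketch under the usual normal approximation; `ci_rho` is a hypothetical helper name.

```python
import math
from statistics import NormalDist


def ci_rho(r, n, level=0.95):
    """Confidence interval for rho via Fisher's transformation."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z_crit / math.sqrt(n - 3)
    r_prime = math.atanh(r)
    # Back-transform both limits from the r' scale to the r scale.
    return math.tanh(r_prime - half_width), math.tanh(r_prime + half_width)


lower, upper = ci_rho(0.7611, 10)
print(f"95% CI for rho: {lower:.3f} to {upper:.3f}")
```

The interval is not symmetric around r, which is expected because the back-transformation is nonlinear.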

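Power Calculation of Correlation and
Determining the Sample Size
Power is defined as
$$\text{Power} = \Phi\left( \frac{z_\delta}{S_{r'}} - z_{1-\alpha} \right) \quad \text{for an upper-sided test}$$

$$\text{Power} = \Phi\left( -\frac{z_\delta}{S_{r'}} - z_{1-\alpha} \right) \quad \text{for a lower-sided test}$$

$$\text{Power} = \Phi\left( \frac{z_\delta}{S_{r'}} - z_{1-\alpha/2} \right) + \Phi\left( -\frac{z_\delta}{S_{r'}} - z_{1-\alpha/2} \right) \quad \text{for a two-sided test}$$

where $z_\delta = r' - \rho'$ and $\Phi$ is the standard normal cumulative distribution function.

Sample size is estimated by
$$n = 3 + \left( \frac{z_{1-\alpha} + z_{\text{power}}}{z_\delta} \right)^2$$
where $z_{\text{power}}$ is the standard normal quantile at the desired power.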
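The power and sample-size calculations can be sketched with the standard normal approximation on Fisher's transformed scale. The function names and the inputs (an alternative correlation of .5 tested against 0) are hypothetical choices for illustration, and the formulas used are the usual textbook forms, which may differ in notation from the original slide.

```python
import math
from statistics import NormalDist

ND = NormalDist()


def power_upper(r_alt, rho0, n, alpha=0.05):
    """Approximate power of the upper-sided test of H0: rho = rho0."""
    z_delta = math.atanh(r_alt) - math.atanh(rho0)
    se = 1 / math.sqrt(n - 3)
    return ND.cdf(z_delta / se - ND.inv_cdf(1 - alpha))


def power_two_sided(r_alt, rho0, n, alpha=0.05):
    """Approximate power of the two-sided test of H0: rho = rho0."""
    z_delta = math.atanh(r_alt) - math.atanh(rho0)
    se = 1 / math.sqrt(n - 3)
    z_crit = ND.inv_cdf(1 - alpha / 2)
    return ND.cdf(z_delta / se - z_crit) + ND.cdf(-z_delta / se - z_crit)


def sample_size(r_alt, rho0, alpha=0.05, power=0.80):
    """Smallest n for a one-sided test with the requested power."""
    z_delta = math.atanh(r_alt) - math.atanh(rho0)
    n = 3 + ((ND.inv_cdf(1 - alpha) + ND.inv_cdf(power)) / z_delta) ** 2
    return math.ceil(n)


n_needed = sample_size(0.5, 0.0)
print("n needed:", n_needed)
print("one-sided power at that n:", round(power_upper(0.5, 0.0, n_needed), 3))
print("two-sided power at that n:", round(power_two_sided(0.5, 0.0, n_needed), 3))
```

The two-sided test has lower power than the one-sided test at the same n, because its critical value is larger.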
Assumptions Underlying
Pearson's r
Independence among pairs of scores.
The population of X and the population of
Y each follow a normal distribution, and the
population of pairs of X and Y scores has a
bivariate normal distribution.
Factors Influencing Correlation

Sampling from a restricted range

Combining heterogeneous subgroups
Presence of bivariate outliers
Absence of linearity
Special Correlations
(Correlations for Dichotomous Data
on Either or Both Variables)
Pearson's Correlation
Point-biserial correlation ($r_{pb}$)
One variable is continuous (e.g., height) and the other variable is
dichotomous (e.g., sex)
$r_{pb}$, t-test, and effect size
Phi Coefficient ($\phi$)
Both variables are dichotomous (e.g., sex and buying a product)
Phi and chi-square: $\chi^2 = n\phi^2$

Pearson's formula can be used for computation (some texts provide a separate
formula, which is not needed)
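The link between phi, chi-square, and Pearson's formula can be illustrated with a small 2 x 2 table. The counts below are hypothetical and chosen only for illustration; the sketch computes phi from the cell counts, checks that chi-square equals n times phi squared, and confirms that Pearson's r on 0/1 coded scores gives the same value as phi.

```python
import math

# Hypothetical 2 x 2 table of counts (not from the slides):
#          y = 0   y = 1
# x = 0    a = 30  b = 10
# x = 1    c = 15  d = 25
a, b, c, d = 30, 10, 15, 25
n = a + b + c + d

margins = (a + b) * (c + d) * (a + c) * (b + d)
phi = (a * d - b * c) / math.sqrt(margins)
chi_square = n * (a * d - b * c) ** 2 / margins

# Expand the table into 0/1 coded scores and apply Pearson's formula.
x = [0] * (a + b) + [1] * (c + d)
y = [0] * a + [1] * b + [0] * c + [1] * d
mean_x, mean_y = sum(x) / n, sum(y) / n
sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sum_xx = sum((xi - mean_x) ** 2 for xi in x)
sum_yy = sum((yi - mean_y) ** 2 for yi in y)
r = sum_xy / math.sqrt(sum_xx * sum_yy)

print(f"phi = {phi:.4f}, Pearson r = {r:.4f}")
print(f"chi-square = {chi_square:.4f}, n * phi^2 = {n * phi ** 2:.4f}")
```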
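Non-Pearson Correlations
Biserial correlation ($r_b$)
One variable is continuous (e.g., height) and the other variable is dichotomous
(e.g., sex)
$$r_b = \left( \frac{\bar{Y}_1 - \bar{Y}_0}{\sigma_Y} \right) \left( \frac{P_0 P_1}{h} \right)$$
where $P_0$ and $P_1$ are the proportions in the two groups and $h$ is the ordinate of the standard normal curve at the point dividing $P_0$ and $P_1$.
Tetrachoric Correlation ($r_{tet}$)
Both variables are dichotomous (e.g., sex and buying a product)

$$r_{tet} = \cos\theta$$
Computed as:
$$r_{tet} = \cos\left( \frac{\pi}{1 + \sqrt{ad / bc}} \right)$$
where a, b, c, and d are the cell frequencies of the 2 x 2 table and the angle is in radians.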
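Partial, Part (Semi-partial) and
Multiple Correlation
Partial Correlation ($r_p$)
Pearson's correlation between two variables X and Y, controlled for a third
variable Z ($r_{XY.Z}$).
Z can be one variable or a set of variables (Z1, Z2, ..., Zk).

$$r_p = r_{XY.Z} = \frac{r_{XY} - r_{XZ} r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}} \qquad t = \frac{r_p \sqrt{n - v}}{\sqrt{1 - r_p^2}}$$
where v is the total number of variables involved (v = 3 with a single control variable), so that t has n - v degrees of freedom.

Semi-partial Correlation ($r_{sp}$)
Either X or Y is controlled for Z while computing the correlation between X and Y
$$r_{sp} = r_{X(Y.Z)} = \frac{r_{XY} - r_{XZ} r_{YZ}}{\sqrt{1 - r_{YZ}^2}}$$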
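Both coefficients can be computed directly from the three pairwise correlations. The sketch below uses hypothetical correlations and a hypothetical sample size, and it assumes the standard first-order formulas with Y adjusted for Z in the semi-partial case and v = 3 variables in the t test.

```python
import math


def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation of X and Y with Z removed from both."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def semipartial_r(r_xy, r_xz, r_yz):
    """Semi-partial correlation of X with Y, Z removed from Y only."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_yz ** 2)


def t_partial(r_p, n, v=3):
    """t statistic for a partial correlation, df = n - v."""
    return r_p * math.sqrt(n - v) / math.sqrt(1 - r_p ** 2)


# Hypothetical inputs, chosen only for illustration.
r_xy, r_xz, r_yz, n = 0.60, 0.50, 0.40, 30
r_p = partial_r(r_xy, r_xz, r_yz)
r_sp = semipartial_r(r_xy, r_xz, r_yz)
t = t_partial(r_p, n)
print(f"partial = {r_p:.4f}, semi-partial = {r_sp:.4f}, t({n - 3}) = {t:.3f}")
```

The semi-partial value is never larger in absolute size than the partial value, because its denominator omits one of the two correction terms.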
Multiple Correlation: Correlation between Y and a set of X variables
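Testing Difference Between
Correlation Coefficients
Testing Difference in Two Independent Correlations
$$\text{Fisher's } z = r' = \tanh^{-1}(r) = 0.5 \log_e \left( \frac{1 + r}{1 - r} \right)$$

$$Z = \frac{r'_1 - r'_2}{\sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}}$$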
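A sketch of the two-sample comparison follows. The correlations and sample sizes are hypothetical, and the function name `z_two_independent` is an illustrative choice.

```python
import math
from statistics import NormalDist


def z_two_independent(r1, n1, r2, n2):
    """Z test for the difference between two independent correlations."""
    diff = math.atanh(r1) - math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = diff / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Hypothetical values: r = .60 (n = 50) versus r = .30 (n = 60).
z, p_two_sided = z_two_independent(0.60, 50, 0.30, 60)
print(f"Z = {z:.3f}, two-sided p = {p_two_sided:.3f}")
```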
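Testing Difference in More Than Two
Independent Correlations
$$\chi^2 = \sum_{j=1}^{k} (n_j - 3) r'^2_j - \frac{\left[ \sum_{j=1}^{k} (n_j - 3) r'_j \right]^2}{\sum_{j=1}^{k} (n_j - 3)}$$
with k - 1 degrees of freedom.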
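The chi-square statistic for k independent correlations weights each transformed correlation by n - 3. The sketch below uses hypothetical inputs, and the function name is an assumption of this example. With only two groups the statistic equals the square of the two-sample Z, which gives a convenient check.

```python
import math


def chi_square_k_correlations(rs, ns):
    """Chi-square test of equality for k independent correlations."""
    weights = [n - 3 for n in ns]
    zs = [math.atanh(r) for r in rs]  # Fisher's transformation
    sum_wz2 = sum(w * z ** 2 for w, z in zip(weights, zs))
    sum_wz = sum(w * z for w, z in zip(weights, zs))
    chi_square = sum_wz2 - sum_wz ** 2 / sum(weights)
    return chi_square, len(rs) - 1  # statistic and degrees of freedom


# Hypothetical correlations and sample sizes for three groups.
chi_square, df = chi_square_k_correlations([0.60, 0.30, 0.45], [50, 60, 40])
print(f"chi-square({df}) = {chi_square:.3f}")
```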
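Testing Difference in Two
Nonindependent Correlations
Difference between two dependent correlations
(e.g., correlations obtained on the same sample)
Steiger (1980)

$$T = (r_{xy} - r_{xz}) \sqrt{ \frac{(n - 1)(1 + r_{yz})}{2 \left( \frac{n - 1}{n - 3} \right) |R| + \bar{r}^2 (1 - r_{yz})^3} }$$
where $|R| = 1 - r_{xy}^2 - r_{xz}^2 - r_{yz}^2 + 2 r_{xy} r_{xz} r_{yz}$ and $\bar{r} = (r_{xy} + r_{xz}) / 2$; T is referred to a t distribution with n - 3 degrees of freedom.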
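The test for two dependent correlations that share the variable X can be sketched as follows. The inputs are hypothetical, and the code assumes the usual form of the statistic in which |R| is the determinant of the 3 x 3 correlation matrix and the mean of the two compared correlations enters the denominator.

```python
import math


def steiger_t(r_xy, r_xz, r_yz, n):
    """T statistic for comparing r_xy and r_xz from the same sample."""
    det_r = (1 - r_xy ** 2 - r_xz ** 2 - r_yz ** 2
             + 2 * r_xy * r_xz * r_yz)      # determinant of R
    r_bar = (r_xy + r_xz) / 2               # mean of the two correlations
    numerator = (n - 1) * (1 + r_yz)
    denominator = (2 * ((n - 1) / (n - 3)) * det_r
                   + r_bar ** 2 * (1 - r_yz) ** 3)
    t = (r_xy - r_xz) * math.sqrt(numerator / denominator)
    return t, n - 3                         # statistic and df


# Hypothetical correlations from one sample of 100 cases.
t, df = steiger_t(r_xy=0.50, r_xz=0.30, r_yz=0.40, n=100)
print(f"T({df}) = {t:.3f}")
```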
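Methods for Ordinal Data:
Spearman's rho and Kendall's tau
Spearman's rho is Pearson's correlation computed on ranked data
Kendall's tau Coefficient ($\tau$)
Kendall's tau is based on inversions in the data
$$\tau = \frac{n_C - n_D}{n(n - 1)/2}$$
where $n_C$ and $n_D$ are the numbers of concordant and discordant pairs.

$$z = \frac{\tau}{\sqrt{\frac{2(2n + 5)}{9n(n - 1)}}}$$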
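Counting concordant and discordant pairs is the core of Kendall's tau. The sketch below uses a small hypothetical set of tie-free ranks, so the simple formula with n(n - 1)/2 in the denominator applies; the function name `kendall_tau` is an illustrative choice.

```python
import math
from itertools import combinations


def kendall_tau(x, y):
    """Return (n_C, n_D, tau, z) for data without ties."""
    n = len(x)
    n_c = n_d = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            n_c += 1      # pair ordered the same way on X and Y
        elif sign < 0:
            n_d += 1      # pair ordered in opposite ways (an inversion)
    tau = (n_c - n_d) / (n * (n - 1) / 2)
    z = tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    return n_c, n_d, tau, z


# Hypothetical ranks given to six objects by two judges.
x = [1, 2, 3, 4, 5, 6]
y = [2, 1, 4, 3, 6, 5]
n_c, n_d, tau, z = kendall_tau(x, y)
print(f"n_C = {n_c}, n_D = {n_d}, tau = {tau:.3f}, z = {z:.3f}")
```

Three of the 15 pairs are inversions here, which gives tau = (12 - 3)/15 = 0.6.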
Additional Methods for Ordinal Data
Goodman and Kruskal's Gamma
$$G = \frac{n_C - n_D}{n_C + n_D}$$

$$z = G \sqrt{\frac{n_C + n_D}{n (1 - G^2)}}$$
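Kendall's Coefficient of Concordance

$$W = \frac{\text{Variance of } R_j}{\text{Maximum possible value for the variance of } R_j}$$

$$W = \frac{12 \sum T_j^2}{k^2 n (n^2 - 1)} - \frac{3(n + 1)}{n - 1}$$
where $T_j$ is the sum of the ranks given to object j, k is the number of judges, and n is the number of objects ranked.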
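The coefficient of concordance can be computed from the rank totals of the objects. The ranks below are hypothetical (three judges ranking four objects without ties), and the sketch assumes that k is the number of judges and n the number of objects.

```python
def kendalls_w(ranks):
    """Kendall's W from a list of rankings, one list per judge."""
    k = len(ranks)       # number of judges
    n = len(ranks[0])    # number of objects ranked
    # T_j: sum of the ranks given to object j across judges.
    totals = [sum(judge[j] for judge in ranks) for j in range(n)]
    sum_t2 = sum(t ** 2 for t in totals)
    return 12 * sum_t2 / (k ** 2 * n * (n ** 2 - 1)) - 3 * (n + 1) / (n - 1)


# Hypothetical rankings of four objects by three judges.
ranks = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]
w = kendalls_w(ranks)
print(f"W = {w:.4f}")
```

Perfect agreement among the judges gives W = 1, which is a quick sanity check on the formula.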
Methods with Categorical/
Nominal Data
Chi-square test for independence
Phi coefficient, contingency coefficient (C) and
Cramér's V
Odds ratio
Likelihood ratio test
Cochran-Mantel-Haenszel statistic
Cohen's Kappa
