
Chapter 5
Correlation and Association
Outline
Pearson's Correlation
Special Correlations
Partial, Part (Semi-partial) and Multiple Correlation
Testing Difference between Correlation Coefficients
Methods for Ordinal Data: Spearman's rho and Kendall's tau
Additional Methods for Ordinal Data: Goodman and Kruskal's Gamma, Kendall's Coefficient of Concordance
Methods with Categorical/Nominal Data
Basic Aspects of Correlation:
Direction, Strength, and Linearity

Direction: Positive, negative, and no correlation

Strength: $r^2$ (r squared) = % of variance shared by the two variables

Linearity: Linear relations and nonlinear relations


Types of Correlations and Association
Continuous variables, ordinal variables, nominal
variables
Testing Hypothesis for Correlation
Rho ($\rho$): population correlation
Correlation and Causality
(a) X causes Y, (b) Y causes X, (c) a third variable Z causes
both X and Y, and X and Y are not causally related, (d) X and Y
cause each other cyclically (one after another, generally
over time), (e) X causes Z, which causes Y, and X may or may not
directly cause Y, (f) X and Y are not causally related and the
correlation is spurious.
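Pearson's Correlation
Karl Pearson (1896)
Logic of Pearson's r:
Population covariance: $\mathrm{Cov}_{xy} = \sigma_{xy} = \frac{\sum_{i=1}^{N}(X_i - \bar{X})(Y_i - \bar{Y})}{N}$
Population correlation: $\rho_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}$
Sample covariance: $\mathrm{Cov}_{xy} = s_{xy} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$
Sample correlation: $r = \frac{\mathrm{Cov}_{XY}}{S_X S_Y}$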
Small Data Example
(6,12),(11,15),(5,10), (12,19), (16,16), (10,17), (8,12),
(7,11), (14,18), (11,20)
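$\mathrm{Cov}_{xy} = s_{xy} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{n - 1} = \frac{86}{10 - 1} = \frac{86}{9} = 9.55$
$r = \frac{\mathrm{Cov}_{XY}}{S_X S_Y} = \frac{9.55}{3.527 \times 3.559} = \frac{9.55}{12.555} = +0.761$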
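The arithmetic of the small data example can be checked with a short Python sketch. It uses only the ten pairs listed above; the function and variable names are illustrative assumptions, not part of the original slides.

```python
import math

# The ten (X, Y) pairs from the small data example.
pairs = [(6, 12), (11, 15), (5, 10), (12, 19), (16, 16),
         (10, 17), (8, 12), (7, 11), (14, 18), (11, 20)]

def pearson_r(pairs):
    """Return (covariance, s_x, s_y, r), all using n - 1 denominators."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sum of cross-products of deviations (86 for this data set).
    sp = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    ss_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    ss_y = sum((y - mean_y) ** 2 for _, y in pairs)
    cov = sp / (n - 1)
    s_x = math.sqrt(ss_x / (n - 1))
    s_y = math.sqrt(ss_y / (n - 1))
    return cov, s_x, s_y, cov / (s_x * s_y)

cov_xy, s_x, s_y, r = pearson_r(pairs)
print(f"cov = {cov_xy:.3f}, s_x = {s_x:.3f}, s_y = {s_y:.3f}, r = {r:.4f}")
```

The slide values appear to be truncated rather than rounded, so the printed figures may differ in the last digit.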
Adjusted Correlation Coefficient: Unbiased coefficient
$r_{adj} = \sqrt{1 - \frac{(1 - r^2)(n - 1)}{n - 2}}$
$r_{adj} = \sqrt{1 - \frac{(1 - 0.761^2)(10 - 1)}{10 - 2}} = \sqrt{1 - \frac{0.4207 \times 9}{8}} = \sqrt{1 - 0.4733} = 0.7257$
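As a quick check of the adjusted coefficient, the following sketch applies the formula to r = 0.7611 and n = 10. Returning 0.0 when the bracketed term turns negative is an assumption of this sketch, not something stated in the slides.

```python
import math

def adjusted_r(r, n):
    """Adjusted correlation: sqrt(1 - (1 - r^2)(n - 1) / (n - 2))."""
    value = 1 - (1 - r ** 2) * (n - 1) / (n - 2)
    # Assumption: for small |r| the term can be negative; report 0.0 then.
    return math.sqrt(value) if value > 0 else 0.0

r_adj = adjusted_r(0.7611, 10)
print(f"r_adj = {r_adj:.4f}")
```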
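Hypothesis Testing About Population
Correlation: Significance of Correlation
$H_0: \rho = 0$
$H_A: \rho \neq 0$
Null: population correlation is zero
$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$
$t = \frac{r\sqrt{n - 2}}{\sqrt{1 - r^2}} = \frac{0.7611\sqrt{10 - 2}}{\sqrt{1 - 0.7611^2}} = 3.3188$
Critical values (two-tailed): t(8) = 2.306 at p = .05; t(8) = 3.355 at p = .01
The t is significant at the .05 level but not at the .01 level
Decision: reject Null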
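The test statistic can be reproduced with a few lines of Python. This is a minimal sketch: the two-tailed .05 critical value of t(8), 2.306, is hard-coded rather than computed, and the function name is an illustrative assumption.

```python
import math

def t_for_r(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

t_value = t_for_r(0.7611, 10)
T_CRIT_05 = 2.306  # two-tailed critical value of t(8) at the .05 level

reject_null = abs(t_value) > T_CRIT_05
print(f"t(8) = {t_value:.4f}, reject H0 at .05: {reject_null}")
```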
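Testing Hypothesis That
Population Correlation is a Specific Value

Fisher's Z transformation of r.
$r' = .5\log_e\left(\frac{1 + r}{1 - r}\right)$
$\rho' = .5\log_e\left(\frac{1 + \rho}{1 - \rho}\right)$
$z = \frac{r' - \rho'}{S_{r'}} = \frac{r' - \rho'}{\frac{1}{\sqrt{n - 3}}}$
$H_0: \rho = .5$
$z = \frac{.998 - .549}{.378} = 1.189$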
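A short sketch of this test, using `math.atanh` for Fisher's transformation (it equals .5 log_e((1 + r)/(1 - r))). The function name is an illustrative assumption; the inputs are the worked example's r = 0.7611, hypothesized value .5, and n = 10.

```python
import math

def fisher_z_test(r, rho0, n):
    """z statistic for H0: rho = rho0 via Fisher's transformation."""
    r_prime = math.atanh(r)        # transformed sample correlation
    rho_prime = math.atanh(rho0)   # transformed hypothesized value
    se = 1 / math.sqrt(n - 3)      # standard error of r'
    return (r_prime - rho_prime) / se

z_value = fisher_z_test(0.7611, 0.5, 10)
print(f"z = {z_value:.3f}")
```

Since |z| is below 1.96, the hypothesis that the population correlation is .5 would not be rejected at the two-tailed .05 level.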
Confidence Interval (CI) on
Population Correlation

$p\{r_1 \le \rho \le r_2\} = 1 - \alpha$

$CI(\rho') = r' \pm (z_{\alpha/2}\, S_{r'}) = r' \pm z_{\alpha/2}\frac{1}{\sqrt{n - 3}}$

$CI(\rho) = .253 \le \rho \le .94$
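The interval can be sketched in Python by building the limits on the transformed scale and converting them back with `math.tanh`. The 95% level is an assumption inferred from the reported limits; `statistics.NormalDist` supplies the normal quantile.

```python
import math
from statistics import NormalDist

def r_confidence_interval(r, n, level=0.95):
    """CI for rho: limits on Fisher's r' scale, back-transformed with tanh."""
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    r_prime = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(r_prime - half_width), math.tanh(r_prime + half_width)

lower, upper = r_confidence_interval(0.7611, 10)
print(f"{lower:.3f} <= rho <= {upper:.3f}")
```

The interval is asymmetric around r because the back-transformation is nonlinear.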


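Power Calculation of Correlation and
Determining the Sample Size
Power ($1 - \beta$) is defined as
$1 - \beta = \Phi\left(\frac{z_\delta}{S_{r'}} - z_{1-\alpha}\right)$   For upper-sided test
$1 - \beta = \Phi\left(-z_{1-\alpha} - \frac{z_\delta}{S_{r'}}\right)$   For lower-sided test
$1 - \beta = \Phi\left(\frac{z_\delta}{S_{r'}} - z_{1-\alpha/2}\right) + \Phi\left(-\frac{z_\delta}{S_{r'}} - z_{1-\alpha/2}\right)$   For two-sided test
Where $z_\delta = r' - \rho'$ and $\Phi$ is the standard normal cumulative distribution function
Sample size is estimated by $n = 3 + \left(\frac{z_{1-\alpha} + z_{1-\beta}}{z_\delta}\right)^2$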
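The following is a minimal sketch of the upper-sided power and sample-size calculations, assuming the usual Fisher-transformation reading of the formulas above: power is the standard normal CDF of (r' - rho')/S_r' minus the critical z, and n = 3 plus the squared ratio of summed quantiles to the transformed difference. The alternative value .5, the null value 0, alpha = .05, and target power .80 are hypothetical inputs chosen for illustration.

```python
import math
from statistics import NormalDist

ND = NormalDist()  # standard normal distribution

def power_upper(r_alt, rho0, n, alpha=0.05):
    """Approximate power of the upper-sided test of H0: rho = rho0."""
    delta = math.atanh(r_alt) - math.atanh(rho0)  # difference on r' scale
    se = 1 / math.sqrt(n - 3)
    return ND.cdf(delta / se - ND.inv_cdf(1 - alpha))

def sample_size(r_alt, rho0, alpha=0.05, power=0.80):
    """Smallest n reaching the target power (upper-sided test)."""
    delta = math.atanh(r_alt) - math.atanh(rho0)
    n = 3 + ((ND.inv_cdf(1 - alpha) + ND.inv_cdf(power)) / delta) ** 2
    return math.ceil(n)

n_needed = sample_size(0.5, 0.0)
achieved = power_upper(0.5, 0.0, n_needed)
print(n_needed, round(achieved, 3))
```

For a two-sided test the same sketch would use the quantile at 1 - alpha/2 in place of 1 - alpha.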
Assumptions Underlying
Pearson's r
Independence among pairs of scores.
The population of X and the population of
Y follow a normal distribution, and the
population of pairs of X and Y scores has a
bivariate normal distribution.
Factors Influencing Correlation
Coefficient
Sampling from a restricted range
Combining heterogeneous subgroups
Presence of bivariate outliers
Absence of linearity
Special Correlations
(Correlations for Dichotomous Data
on Either or Both Variables)
Pearson's Correlation
Point-biserial correlation ($r_{pb}$)
One variable is continuous (e.g., height) and another variable is
dichotomous (e.g., sex)
$r_{pb}$, t-test, and effect size
Phi Coefficient ($\phi$)
Both variables are dichotomous (e.g., sex and buying a product)
Phi and chi-square: $\chi^2 = n\phi^2$
Pearson's formula is used for computation (some texts provide a separate
formula, which is not needed)
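To illustrate that Pearson's formula applied to 0/1 codes yields phi, and that chi-square equals n times phi squared, here is a small sketch. The 2x2 counts are hypothetical, invented only for illustration.

```python
import math

# Hypothetical 2x2 counts (illustrative only): keys are (X, Y) coded 0/1.
table = {(0, 0): 20, (0, 1): 10, (1, 0): 5, (1, 1): 15}
pairs = [xy for xy, count in table.items() for _ in range(count)]
n = len(pairs)

def pearson_r(pairs):
    """Pearson's r; the n - 1 terms cancel, so raw sums are enough."""
    m = len(pairs)
    mean_x = sum(x for x, _ in pairs) / m
    mean_y = sum(y for _, y in pairs) / m
    sp = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    ss_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    ss_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return sp / math.sqrt(ss_x * ss_y)

phi = pearson_r(pairs)  # phi is Pearson's r on the 0/1 codes

# Chi-square for independence from observed and expected frequencies.
row = {i: sum(c for (x, _), c in table.items() if x == i) for i in (0, 1)}
col = {j: sum(c for (_, y), c in table.items() if y == j) for j in (0, 1)}
chi2 = sum((c - row[x] * col[y] / n) ** 2 / (row[x] * col[y] / n)
           for (x, y), c in table.items())

print(f"phi = {phi:.4f}, chi2 = {chi2:.4f}, n*phi^2 = {n * phi ** 2:.4f}")
```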
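Non-Pearson Correlations
Biserial correlation ($r_b$)
One variable is continuous (e.g., height) and another variable is dichotomous
(e.g., sex)
$r_b = \left[\frac{\bar{Y}_1 - \bar{Y}_0}{S_Y}\right]\left[\frac{P_0 P_1}{h}\right]$
where $h$ is the ordinate (height) of the standard normal curve at the point that divides $P_0$ and $P_1$
Tetrachoric Correlation ($r_{tet}$)
Both variables are dichotomous (e.g., sex and buying a product)
Computed as: $r_{tet} = \cos\left(\frac{180^\circ}{1 + \sqrt{\frac{ad}{bc}}}\right)$
where $a$, $b$, $c$, $d$ are the cell frequencies of the 2 x 2 table, with $a$ and $d$ on the main diagonal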
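The cosine formula for the tetrachoric correlation can be sketched as below, reading it as cos(180 degrees / (1 + sqrt(ad/bc))) with a and d on the main diagonal of the 2x2 table. The cell counts are hypothetical, and raising an error when b or c is zero is an assumption of this sketch.

```python
import math

def tetrachoric_approx(a, b, c, d):
    """Cosine approximation: cos(180 deg / (1 + sqrt(ad / bc)))."""
    if b * c == 0:
        # Assumption: the ratio ad/bc is undefined with an empty off-diagonal cell.
        raise ValueError("b and c must both be nonzero")
    return math.cos(math.pi / (1 + math.sqrt((a * d) / (b * c))))

# Hypothetical cell counts, for illustration only.
r_tet = tetrachoric_approx(20, 10, 5, 15)
print(f"r_tet = {r_tet:.3f}")
```

When ad equals bc the angle is 90 degrees and the estimate is zero, which matches the case of no association.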
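Partial, Part (Semi-partial) and
Multiple Correlation
Partial Correlation ($r_p$)
Pearson's correlation between two variables X and Y, controlled for a third
variable Z ($r_{XY.Z}$).
Z can be one variable or a set of variables ($Z_1, Z_2, \ldots, Z_k$).
$r_p = r_{XY.Z} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{(1 - r_{XZ}^2)(1 - r_{YZ}^2)}}$
$t = \frac{r_P\sqrt{n - v}}{\sqrt{1 - r_P^2}}$, where $v$ is the total number of variables involved (X, Y, and the controlled Z variables)
Semi-partial Correlation ($r_{sp}$)
Either X or Y is controlled for Z while computing the correlation between X and Y
$r_{SP} = r_{X(Y.Z)} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{1 - r_{YZ}^2}}$
Multiple Correlation: Correlation between Y and a set of X variables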
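The partial and semi-partial formulas above can be compared side by side in a short sketch. The three input correlations and n = 50 are hypothetical, and taking v = 3 (X, Y, and one control variable Z) for the t statistic is an assumption of this sketch.

```python
import math

def partial_r(r_xy, r_xz, r_yz):
    """Partial correlation r_XY.Z: Z removed from both X and Y."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

def semipartial_r(r_xy, r_xz, r_yz):
    """Semi-partial correlation r_X(Y.Z): Z removed from Y only."""
    return (r_xy - r_xz * r_yz) / math.sqrt(1 - r_yz ** 2)

def t_for_partial(r_p, n, v=3):
    """t statistic with n - v degrees of freedom (v = 3 assumed here)."""
    return r_p * math.sqrt(n - v) / math.sqrt(1 - r_p ** 2)

# Hypothetical correlations, for illustration only.
r_p = partial_r(0.6, 0.5, 0.4)
r_sp = semipartial_r(0.6, 0.5, 0.4)
t_p = t_for_partial(r_p, n=50)
print(f"r_p = {r_p:.4f}, r_sp = {r_sp:.4f}, t = {t_p:.3f}")
```

The two coefficients share a numerator, so the semi-partial value is never larger in absolute size than the partial value.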
Testing Difference Between
Correlation Coefficients
Testing Difference in Two Independent Correlations
Fisher's $z = r' = \tanh^{-1}(r) = .5\log_e\left(\frac{1 + r}{1 - r}\right)$
$z = \frac{r'_1 - r'_2}{\sqrt{\frac{1}{n_1 - 3} + \frac{1}{n_2 - 3}}}$
Testing Difference in More Than Two Independent Correlations
$\chi^2 = \sum_{j=1}^{k}(n_j - 3)\, r'^2_j - \frac{\left[\sum_{j=1}^{k}(n_j - 3)\, r'_j\right]^2}{\sum_{j=1}^{k}(n_j - 3)}$
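Both tests for independent correlations can be sketched with Fisher-transformed values. The correlations and sample sizes below are hypothetical. With only two groups, the chi-square statistic reduces to the square of the two-sample z, which gives a convenient self-check.

```python
import math

def z_two_independent(r1, n1, r2, n2):
    """z for the difference between two independent correlations."""
    diff = math.atanh(r1) - math.atanh(r2)
    return diff / math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))

def chi2_k_independent(rs, ns):
    """Chi-square for homogeneity of k independent correlations."""
    weights = [n - 3 for n in ns]
    r_primes = [math.atanh(r) for r in rs]
    weighted_sq = sum(w * rp ** 2 for w, rp in zip(weights, r_primes))
    weighted_sum = sum(w * rp for w, rp in zip(weights, r_primes))
    return weighted_sq - weighted_sum ** 2 / sum(weights)

# Hypothetical values, for illustration only.
z_diff = z_two_independent(0.6, 50, 0.3, 60)
chi2_two = chi2_k_independent([0.6, 0.3], [50, 60])
print(f"z = {z_diff:.3f}, chi2 = {chi2_two:.3f}")
```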
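Testing Difference in Two
Nonindependent Correlations
Difference between two dependent correlations
(e.g., correlations obtained on the same sample)
Steiger (1980)
$T = (r_{xy} - r_{xz})\sqrt{\frac{(n - 1)(1 + r_{yz})}{2\left(\frac{n - 1}{n - 3}\right)|R| + \bar{r}^2 (1 - r_{yz})^3}}$
where $|R| = 1 - r_{xy}^2 - r_{xz}^2 - r_{yz}^2 + 2 r_{xy} r_{xz} r_{yz}$ and $\bar{r} = \frac{r_{xy} + r_{xz}}{2}$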
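A minimal sketch of this T statistic follows, assuming the standard definitions |R| = 1 - r_xy^2 - r_xz^2 - r_yz^2 + 2 r_xy r_xz r_yz (the determinant of the 3x3 correlation matrix) and r-bar = (r_xy + r_xz)/2. The three correlations and n = 100 are hypothetical.

```python
import math

def dependent_r_t(r_xy, r_xz, r_yz, n):
    """T for two dependent correlations that share the variable X."""
    det_r = (1 - r_xy ** 2 - r_xz ** 2 - r_yz ** 2
             + 2 * r_xy * r_xz * r_yz)          # determinant |R|
    r_bar = (r_xy + r_xz) / 2                   # mean of the two correlations
    numerator = (n - 1) * (1 + r_yz)
    denominator = (2 * ((n - 1) / (n - 3)) * det_r
                   + r_bar ** 2 * (1 - r_yz) ** 3)
    return (r_xy - r_xz) * math.sqrt(numerator / denominator)

# Hypothetical correlations from one sample of n = 100.
T = dependent_r_t(0.5, 0.3, 0.4, 100)
print(f"T = {T:.3f}")
```

The statistic is usually referred to a t distribution with n - 3 degrees of freedom.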
Methods for Ordinal Data:
Spearman's rho and Kendall's tau
Spearman's rho is Pearson's correlation for ranked data
Kendall's tau Coefficient ($\tau$)
Kendall's tau is based on inversions in the data
$\tau = \frac{n_C - n_D}{\frac{n(n - 1)}{2}}$
$z = \frac{\tau}{\sqrt{\frac{2(2n + 5)}{9n(n - 1)}}}$
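Counting concordant and discordant pairs directly makes the tau formula concrete. The sketch below uses two hypothetical sets of five untied ranks; the formula as given has no correction for ties, so tied data would need a different variant.

```python
import math
from itertools import combinations

def concordance_counts(x, y):
    """Count concordant (n_C) and discordant (n_D) pairs; ties count as neither."""
    n_c = n_d = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            n_c += 1
        elif sign < 0:
            n_d += 1
    return n_c, n_d

def kendall_tau(x, y):
    """Return (tau, z) using the formulas for untied data."""
    n = len(x)
    n_c, n_d = concordance_counts(x, y)
    tau = (n_c - n_d) / (n * (n - 1) / 2)
    z = tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
    return tau, z

# Hypothetical ranks with no ties, for illustration only.
tau, z_tau = kendall_tau([1, 2, 3, 4, 5], [2, 1, 4, 3, 5])
print(f"tau = {tau:.2f}, z = {z_tau:.3f}")
```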
Additional Methods for Ordinal
Data
Goodman and Kruskal's Gamma
$G = \frac{n_C - n_D}{n_C + n_D}$
$z = G\sqrt{\frac{n_C + n_D}{n(1 - G^2)}}$
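Gamma uses the same concordant and discordant counts but drops tied pairs from the denominator, which suits ordinal data with many ties. The sketch below uses six hypothetical observations on two three-level ordinal variables.

```python
import math
from itertools import combinations

def goodman_kruskal_gamma(x, y):
    """Return (G, z, n_C, n_D); pairs tied on X or Y are ignored."""
    n_c = n_d = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            n_c += 1
        elif sign < 0:
            n_d += 1
    gamma = (n_c - n_d) / (n_c + n_d)
    n = len(x)
    z = gamma * math.sqrt((n_c + n_d) / (n * (1 - gamma ** 2)))
    return gamma, z, n_c, n_d

# Hypothetical ordinal scores with ties, for illustration only.
x_levels = [1, 1, 2, 2, 3, 3]
y_levels = [1, 2, 1, 3, 2, 3]
G, z_G, n_conc, n_disc = goodman_kruskal_gamma(x_levels, y_levels)
print(f"G = {G:.4f}, z = {z_G:.4f}")
```

The z formula divides by 1 - G squared, so this sketch would fail for a perfect association (G equal to 1 or -1).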

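Kendall's Coefficient of Concordance
$W = \frac{\text{Variance of } R_j}{\text{Maximum possible variance of } R_j}$
$W = \frac{12\sum T_j^2}{k^2 n(n^2 - 1)} - \frac{3(n + 1)}{n - 1}$
where $T_j$ is the sum of the ranks given to object $j$, $k$ is the number of judges, and $n$ is the number of objects ranked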
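The computational formula for W can be sketched as follows, taking T_j as the rank total of object j across k judges who each rank n objects without ties. The three judges' rankings are hypothetical.

```python
def kendalls_w(rankings):
    """Kendall's W from a list of rankings, one list of ranks per judge."""
    k = len(rankings)        # number of judges
    n = len(rankings[0])     # number of objects ranked
    # T_j: total of the ranks each object received across judges.
    totals = [sum(judge[j] for judge in rankings) for j in range(n)]
    sum_sq = sum(t ** 2 for t in totals)
    return 12 * sum_sq / (k ** 2 * n * (n ** 2 - 1)) - 3 * (n + 1) / (n - 1)

# Hypothetical rankings of four objects by three judges.
rankings = [
    [1, 2, 3, 4],
    [2, 1, 3, 4],
    [1, 3, 2, 4],
]
W = kendalls_w(rankings)
print(f"W = {W:.4f}")
```

Identical rankings from every judge give W = 1, the upper bound of the coefficient.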
Methods with Categorical/
Nominal Data
Chi-square test for independence
Phi coefficient, contingency coefficient (C) and
Cramer's V
Odds ratio
Likelihood Ratio test
Cochran-Mantel-Haenszel statistic
Cohen's Kappa
