
PS 601 Notes Part II

Statistical Tests

Notes Version - March 8, 2005 (Revised)

Slide # 1

Statistical tests

We can use the properties of probability density functions to make probability statements about the likelihood of events occurring.
The standard normal curve provides us with a scale or benchmark for the likelihood of being at (or above or below) any point on the scale.

Slide # 2

Standard normal values

Note for instance that if we look at the value 1.5 under the standard normal table, we find the value .4332.
This means that the probability of having a standard normal value greater than 1.5 is .5 - .4332 = .0668.

Slide # 3

In Applied Terms

If IQ has a mean of 100, and a standard deviation of 20, what is the probability that any given individual's IQ will be greater than or equal to 130?

Standardize the score of 130:

Z_i = (X_i - X̄) / s = (130 - 100) / 20 = 30 / 20 = 1.5

Look up 1.5 in the standard normal table:

P(Z_i ≥ 1.5) = .5 - .4332 = .0668
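As a quick check on the table lookup, the same probability can be computed directly; a minimal sketch using SciPy, with the mean (100), standard deviation (20), and cutoff (130) from this example:

    from scipy.stats import norm

    # P(IQ >= 130) when IQ ~ Normal(mean=100, sd=20)
    p = norm.sf(130, loc=100, scale=20)   # survival function = 1 - CDF
    print(round(p, 4))                    # 0.0668

    # equivalently, standardize first and use the standard normal
    z = (130 - 100) / 20
    print(round(norm.sf(z), 4))           # 0.0668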



Slide # 4

Two-tailed hypotheses

In general our hypothesis is: did the sample come from some particular population?
If the sample mean is too high or too low, we suspect that it did not.
Thus, we must check to see if the sample mean is either significantly higher, or significantly lower.
This is called a two-tailed test.
When in doubt, most tests are best done as two-tailed ones.

Slide # 5

The One Tailed Hypothesis

Sometimes we suspect, or hypothesize, a direction.
e.g. The average income for West Virginia will be significantly lower than the country as a whole.
H_A: X̄ < μ
This is a one-tailed test.
We ignore the tail in the direction not hypothesized.

Slide # 6

The Z-test

The z-test is based upon the standard normal distribution.

Z = (X̄ - μ) / (σ / √n)

In this case we are making statements about the sample mean, instead of the actual data values.

Slide # 7

The Z-test (cont.)

Note that the Z-test is based upon two parts.

The standard normal transformation:

Z_i = (X_i - μ) / σ

The standard deviation of the sampling distribution:

σ_X̄ = σ / √n
Slide # 8

The Z-test: an example

Suppose that you took a sample of 25 people off the street in Morgantown and found that their mean personal income is $24,379.
And you have information that the national average for personal income per capita is $31,632 in 2003.
Is the Morgantown average significantly different from the national average?

Sources:
(1) Economagic
(2) US Bureau of Economic Analysis

Slide # 9

What to conclude?

Should you conclude that West Virginia is lower than the national average?
Is it significantly lower?
Could it simply be a randomly bad sample?
Assume that it is not a poor sampling technique.
How do you decide?

Slide # 10

Example (cont.)

We will hypothesize that WV income is lower than the national average.

H_A: WVInc < USInc (Alternate Hypothesis)
H_0: WVInc = USInc (Null Hypothesis)

Since we know the national average ($31,632) and the standard deviation ($15,000; OK, I made this up to make the problem simpler), we can use the z-test to decide if WV is indeed statistically significantly lower than the nation.

Slide # 11

Example (cont.)

Using the z-test, we get:

z = (X̄ - μ) / (σ / √n) = (24379 - 31632) / (15000 / √25) = -7253 / 3000 = -2.42
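A minimal sketch of the same calculation in Python (the figures are the assumed values above: sample mean 24,379, national mean 31,632, σ = 15,000, n = 25):

    import math
    from scipy.stats import norm

    xbar, mu, sigma, n = 24379, 31632, 15000, 25
    z = (xbar - mu) / (sigma / math.sqrt(n))   # about -2.42
    p_one_tailed = norm.cdf(z)                 # P(Z <= -2.42), roughly 0.0078
    print(round(z, 2), round(p_one_tailed, 4))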

OK so what?

Slide # 12

The Probability of a Type I error

We would like to infer that WV had a lower income than the national average, but we must examine whether we simply got these numbers by chance in a random sample.
We would like to not make mistakes when we make statistical decisions.
We know we will.
With statistical inference, we have the ability to decide how often we find it acceptable to be wrong by random chance.
Thus we set the probability of making a Type I error.

P(Type I error) = α = ?
By convention α = .05

Slide # 13

The Critical Value of Z (cont.)

OK, now we know z.
We know that we can make probability statements about z, since it is from the standard normal distribution.
We know that if z = 1.96 then the area out in the tail past 1.96 is equal to .025.
This means that the likelihood of obtaining a value of z > 1.96 by random chance in any given sample is less than .025.

Slide # 14

The Critical Values of Z to memorize

Two-tailed hypothesis:
Reject the null (H_0) if z ≥ 1.96, or z ≤ -1.96.

One-tailed hypothesis:
If H_A is X̄ > μ, then reject H_0 if z ≥ 1.645.
If H_A is X̄ < μ, then reject H_0 if z ≤ -1.645.
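These cutoffs come straight from the standard normal distribution; a short sketch of how they can be recovered with SciPy rather than memorized:

    from scipy.stats import norm

    alpha = 0.05
    print(norm.ppf(1 - alpha / 2))   # 1.96  (two-tailed cutoff)
    print(norm.ppf(1 - alpha))       # 1.645 (one-tailed cutoff)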

Slide # 15

Z test example (cont.)

Suppose we decided to look at a different state, say Oregon.
Would you try a 1-tailed test?
Which way? H_A: X̄ > μ or H_A: X̄ < μ?
Without an a priori reason to hypothesize higher or lower, use the 2-tailed test.
Assume Oregon has a mean of 29,340, and that we collected a somewhat larger sample, say 100.
Using the z-test, we get:

z = (X̄ - μ) / (σ / √n) = (29340 - 31632) / (15000 / √100) = -2292 / 1500 = -1.528

What would we conclude? What if n=25? 1000?
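A small sketch of how the conclusion changes with sample size, using the same assumed mean and standard deviation:

    import math
    from scipy.stats import norm

    xbar, mu, sigma = 29340, 31632, 15000
    for n in (25, 100, 1000):
        z = (xbar - mu) / (sigma / math.sqrt(n))
        # two-tailed: reject H0 at alpha = .05 when |z| >= 1.96
        print(n, round(z, 3), abs(z) >= norm.ppf(0.975))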
Slide # 16

The applicability of the z-test

We frequently run into a problem with trying to do a z-test.
The sample size may be below the number needed for the CLT to apply (N ~ 30).
While the population mean (μ) may be frequently available, the population standard deviation (σ) frequently is not.
Thus we use our best estimate of the population standard deviation: the sample standard deviation (s).

Slide # 17

The t test

When we cannot use the population standard deviation, we must employ a different statistical test.
Think of it this way:
The sample standard deviation is biased a little low, but we know that as the sample size gets larger, it becomes closer to the true value.
As a result, we need a sampling distribution that makes small sample estimates conservative, but gets closer to the normal distribution as the sample size gets larger, and the sample standard deviation more closely resembles the population standard deviation.
Thus we need the Student's t.

Slide # 18

The t-test (cont.)

The t-test is a very similar formula:

t = (X̄ - μ) / (s / √n)

Note the two differences:
using s instead of σ
The resultant is a value that has a t-distribution instead of a standard normal one.

Slide # 19

The t distribution

The t distribution is a statistical distribution that varies according to the number of degrees of freedom (sample size - 1).
As df gets larger, the t approximates the normal distribution.
For practical purposes, the t-distribution with samples greater than 100 can be viewed as a normal distribution.

Slide # 20

Selecting the critical value: t-dist

Selecting the critical value of the t-distribution requires these steps:
Determine whether one- or two-tailed test.
Select α level (α = .05).
Determine degrees of freedom (n - 1).
Find value for t in the appropriate column (or the appropriate table, if one- and two-tailed tests are given as separate tables).
Critical value of t is at the intersection of the df row and the α-level column.
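The same lookup can be done in software instead of a printed table; a minimal sketch (assuming α = .05 and a sample of n = 25):

    from scipy.stats import t

    alpha, n = 0.05, 25
    df = n - 1
    print(t.ppf(1 - alpha / 2, df))   # two-tailed critical value, about 2.064
    print(t.ppf(1 - alpha, df))       # one-tailed critical value, about 1.711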

Slide # 21

Interpreting t-value

The t-test formula gives you a value that you can compare to the critical value.
If conducting a two-tailed test: if the calculated t-value is outside the range of -t to +t, we conclude that the sample is significantly different from the population.
Note that a t-value that exceeds the critical value means that the probability of that t is less than the selected α-level.
Hence if t > C.V. of t, then p(t) < α (say .05).

Slide # 22

Interpreting t-value: one-tailed test

The t-test formula gives you a value that you can compare to the critical value.
If conducting a one-tailed test: if the calculated t-value is greater than the critical value of t, or less than -(critical value of t), we conclude that the sample is significantly different from the population.
Choice of +t or -t is determined by the one-tailed test direction.
Note that a t-value that exceeds the critical value means that the probability of that t is less than the selected α-level.
Hence if t > C.V. of t, then p(t) < α (say .05).

Slide # 23

T-test example

Suppose we decided to look at Oregon, but do not know the population standard deviation.
Would you try a 1-tailed test?
Which way? H_A: X̄ > μ or H_A: X̄ < μ?
Like the z-test, without an a priori reason to hypothesize higher or lower, use the 2-tailed test.
Assume Oregon has a mean of 29,340, and that we collected a sample of 169.
Using the t-test, we get:

t_(n-1) = (X̄ - μ) / (s / √n) = (29340 - 31632) / (15000 / √169) = -2292 / 1153.85 = -1.9864

What would we conclude? What if n=25? 1000?
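A hedged sketch of the same test in SciPy, treating 15,000 as the sample standard deviation assumed above:

    import math
    from scipy.stats import t

    xbar, mu, s, n = 29340, 31632, 15000, 169
    t_stat = (xbar - mu) / (s / math.sqrt(n))   # about -1.986
    crit = t.ppf(0.975, df=n - 1)               # two-tailed critical value, about 1.974
    print(round(t_stat, 3), round(crit, 3), abs(t_stat) >= crit)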


Slide # 24

Two-sample t-test

Frequently we need to compare the means of two different samples.
Is one group higher/lower than some other group?
e.g. Is the income of blacks significantly lower than that of whites?
The two-sample t difference of means test is the typical way to address this question.

Slide # 25

Examples

Is the income of blacks lower than that of whites?
Are teachers' salaries in West Virginia and Mississippi alike?
Is there any difference between the background well and the monitoring well of a landfill?

Slide # 26

The Difference of means Test

Frequently we wish to ask questions that compare two groups.
Is the mean of A larger (smaller) than B?
Are A's different (or treated differently) than B's?
Are A and B from the same population?
To answer these common types of questions we use the standard two-sample t-test.

Slide # 27

The Difference of means Test

The standard two-sample t-test is:

t = (X̄₁ - X̄₂) / √( s₁²/n₁ + s₂²/n₂ )
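A minimal sketch of this test on two made-up samples; scipy's ttest_ind with equal_var=False computes the unequal-variance statistic shown above:

    from scipy.stats import ttest_ind

    # hypothetical samples; any two lists of measurements would do
    group_a = [24.1, 27.5, 22.3, 30.2, 26.8, 25.0]
    group_b = [31.4, 29.9, 33.0, 28.7, 32.1, 30.5]

    result = ttest_ind(group_a, group_b, equal_var=False)   # unequal-variance t-test
    print(result.statistic, result.pvalue)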

Slide # 28

The standard two-sample t-test

In order to conduct the two-sample t-test, we only need the two samples.
Population data is not required.
We are not asking whether the two samples are from some large population.
We are asking whether they are from the same population, whatever it may be.

Slide # 29

Assumptions about the variance

The standard two-sample t-test makes no assumptions about the variances of the underlying populations.
Hence we refer to the standard test as the unequal variance test.
If we can assume that the variances of the two populations are the same, then we can use a more powerful test: the equal variance t-test.

Slide # 30

The Equal Variance test

If the variances from the two samples are the same, we may use a more powerful variation:

t = (X̄₁ - X̄₂) / ( s_t · √(1/n₁ + 1/n₂) )

where

s_t² = [ (n₁ - 1)s₁² + (n₂ - 1)s₂² ] / (n₁ + n₂ - 2)
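A short sketch of the pooled calculation itself, on hypothetical data; it should match scipy.stats.ttest_ind(a, b, equal_var=True):

    import math

    def pooled_t(a, b):
        n1, n2 = len(a), len(b)
        m1, m2 = sum(a) / n1, sum(b) / n2
        s1sq = sum((x - m1) ** 2 for x in a) / (n1 - 1)
        s2sq = sum((x - m2) ** 2 for x in b) / (n2 - 1)
        # pooled standard deviation, exactly as in the formula above
        st = math.sqrt(((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2))
        return (m1 - m2) / (st * math.sqrt(1 / n1 + 1 / n2))

    # hypothetical samples
    print(pooled_t([24.1, 27.5, 22.3, 30.2], [31.4, 29.9, 33.0, 28.7]))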

Slide # 31

Which test to Use?

In order to choose the appropriate two-sample t-test, we must decide if we think the variances are the same.
Hence we perform a preliminary statistical test: the equal variance F-test.

Slide # 32

The Equal Variance F-test

One of the fortunate properties in statistics is that the ratio of two variances will have an F distribution.
Thus with this knowledge, we can perform a simple test:

F(n₁-1, n₂-1) = s²_larger / s²_smaller
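A rough sketch of this preliminary check, assuming both samples are roughly normal (the variances and sample sizes here are made up):

    from scipy.stats import f

    def variance_ratio_test(s1_sq, n1, s2_sq, n2):
        # put the larger sample variance on top
        if s1_sq < s2_sq:
            s1_sq, n1, s2_sq, n2 = s2_sq, n2, s1_sq, n1
        F = s1_sq / s2_sq
        p_two_sided = min(1.0, 2 * f.sf(F, n1 - 1, n2 - 1))   # sf = upper-tail probability
        return F, p_two_sided

    print(variance_ratio_test(s1_sq=9.2, n1=12, s2_sq=4.1, n2=15))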
Slide # 33

Interpretation of F-test

If we find that P(F) > .05, we conclude that the variances are equal.
If we find that P(F) ≤ .05, we conclude that the variances are unequal.
We then select the equal- or unequal-variance t-test accordingly.

(Figure: the F distribution)

Slide # 34

Degrees of freedom

Note that the degrees of freedom are different across the two tests.

Equal variance test:
df = n₁ + n₂ - 2

Unequal variance test:
df = a complicated real number, not an integer
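For reference (not on the original slide), the unequal-variance degrees of freedom are usually computed with the Welch-Satterthwaite approximation:

df = ( s₁²/n₁ + s₂²/n₂ )² / [ (s₁²/n₁)²/(n₁ - 1) + (s₂²/n₂)²/(n₂ - 1) ]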

Slide # 35

Contingency Tables

Often we have limited measurement of our data.
Contingency tables are a means of looking at the impact of nominal and ordinal measures on each other.
They are called contingency tables because one variable's value is contingent upon the other.
Also called cross-tabulation or cross-tabs.

Slide # 36

Contingency Tables

The procedure is quite simple and intuitively appealing.
Construct a table with the independent variable across the top and the dependent variable on the side.
This works fairly well for low numbers of categories (r, c < 6 or so).

Slide # 37

Contingency Tables: An example

Presidents are often suspected of using military force to enhance their popularity.
What do you suppose the data actually look like?
Any conjectures?
Let's categorize presidents as using force, or not, and as having popularity above and below 50%.
Are there definition problems here?
Which is independent and which is dependent?

Slide # 38

Contingency Tables: Presidential Approval

                         Presidential Approval
Use of Military Force    < 50%         > 50%         Total
Not Used                 16  (70%)     28  (41%)     44  (48%)
Used                      7  (30%)     40  (59%)     47  (52%)
Total                    23 (100%)     68 (100%)     91 (100%)

(Column percentages in parentheses.)

Slide # 39

Measures of Independence

Are the variables actually contingent upon each other?
Is the use of force contingent upon the president's level of popularity?
We would like to know if these variables are independent of each other, or does the use of force actually depend upon the level of approval that the president has at that time?

Slide # 40

χ² Test of Independence

The χ² Test of Independence gives us a test of statistical significance.
It is accomplished by comparing the actual observed values to those you would expect to see if the two variables are independent.

Slide # 41

χ² Test of Independence: Formula

χ² = Σ (over i = 1..r, j = 1..c) of (O_ij - E_ij)² / E_ij

where

E_ij = (RowTotal × ColumnTotal) / GrandTotal

Slide # 42

Chi-Square Table (χ²)

Cell              Observed   Expected           Obs - Exp   (O-E)²   (O-E)²/E
Not Used, < 50%      16      44*23/91 = 11.12     4.88       23.81     2.14
Not Used, > 50%      28      44*68/91 = 32.88    -4.88       23.81     0.72
Used, < 50%           7      47*23/91 = 11.88    -4.88       23.81     2.00
Used, > 50%          40      47*68/91 = 35.12     4.88       23.81     0.68

χ² = sum of the last column = 5.55
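A quick sketch that reproduces this result from the observed counts; correction=False gives the uncorrected χ² computed above:

    from scipy.stats import chi2_contingency

    observed = [[16, 28],   # force not used: approval < 50%, > 50%
                [ 7, 40]]   # force used
    chi2, p, df, expected = chi2_contingency(observed, correction=False)
    print(round(chi2, 2), round(p, 4), df)   # about 5.55, p about .018, df = 1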

Slide # 43

Interpreting the χ²

The table gives us a χ² of 5.55 with 1 degree of freedom [d.f. = (r-1)*(c-1)].
The critical value of χ² with 1 degree of freedom is 3.84 (see χ² Table).
We therefore conclude that Presidential popularity and use of force are related.
We technically reject the null hypothesis that Presidential popularity and use of force are independent.
Note: χ² is influenced by sample size.
It ranges from 0.0 to ∞.

Slide # 44

Corrected χ² measures

Small tables have slightly biased measures of χ².
If there are cell frequencies that are low, then there are some adjustments to make that correct the probability estimates that χ² provides.

Slide # 45

Yates' Corrected χ²

For use with a 2x2 table with low cell frequencies (5 < n < 10):

χ²_Yates = Σ (over i = 1..r, j = 1..c) of ( |f_o - f_e| - .5 )² / f_e

If there are any cell frequencies < 5, the χ² is invalid.
Use Fisher's Exact Test.
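Both adjustments are available in SciPy; a minimal sketch, reusing the earlier presidential-approval counts purely for illustration (correction=True applies the Yates adjustment for a 2x2 table, and fisher_exact handles the very-small-count case):

    from scipy.stats import chi2_contingency, fisher_exact

    observed = [[16, 28],
                [ 7, 40]]
    chi2_yates, p_yates, _, _ = chi2_contingency(observed, correction=True)
    odds_ratio, p_fisher = fisher_exact(observed)
    print(round(chi2_yates, 2), round(p_yates, 4), round(p_fisher, 4))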


Slide # 46

Measures of Association

Not only do we want to see whether the variables of a cross-tabulation are independent, we often want to see if the relationship is a strong or weak one.
To do this, we use what are referred to as measures of association.
The level of measurement determines what measure of association we might use.

Slide # 47

Measures of Association

We group them according to whether the variables are nominal or ordinal.
If one variable is nominal, use nominal measures.
If both are ordinal, use an ordinal measure.
If either is interval, generally we use a different statistical design.

Slide # 48

Measures based on χ²

Contingency Coefficient
Cramer's V
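The slide does not give the formulas; the usual definitions are sketched below for reference, with chi2 the test statistic, n the total sample size, and r, c the table dimensions:

    import math

    def contingency_coefficient(chi2, n):
        # C = sqrt( chi2 / (chi2 + n) )
        return math.sqrt(chi2 / (chi2 + n))

    def cramers_v(chi2, n, r, c):
        # V = sqrt( chi2 / (n * min(r - 1, c - 1)) )
        return math.sqrt(chi2 / (n * min(r - 1, c - 1)))

    print(contingency_coefficient(5.55, 91), cramers_v(5.55, 91, 2, 2))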

Slide # 49

Yule's Q

May be used on any 2x2 table, nominal or ordinal.
If we define our table with cell counts as:

a  b
c  d

Yule's Q is calculated as:

Q = (ad - bc) / (ad + bc)

Q ranges from -1.0 to +1.0.
Q compares concordant pairs to discordant pairs.
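A one-line sketch using the cell counts from the earlier presidential-approval table (a = 16, b = 28, c = 7, d = 40):

    def yules_q(a, b, c, d):
        return (a * d - b * c) / (a * d + b * c)

    print(yules_q(16, 28, 7, 40))   # about 0.53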

Slide # 50

Phi


Slide # 51

Gamma

Will equal 1.0 if any cell is empty


Slide # 52

Lambda

Asymmetric measure of association.
Calculation depends on whether the column variable or the row variable is independent.

Slide # 53

Ordinal Measures

Goodman & Kruskal's Gamma

For Ordinal x Ordinal tables.
May also be used if one of the variables is a nominal dichotomy.

Slide # 54

Lambda

Asymmetric

Slide # 55

Tau-b & Tau-c

Similar to Gamma.
If r = c, use tau-b; if r ≠ c, use tau-c.

Slide # 56
