
NONPARAMETRIC STATISTICS

Nonparametric Statistics
Nonparametric, or distribution-free, methods
Hypothesis-testing procedures that assume no knowledge whatsoever about the distributions of the underlying populations, except that they are continuous.
Suitable for ordinal data
Based on the analysis of ranks

Ex:
Two judges rank 5 brands of soda by assigning a rank of 1 to the best soda, 2 to the second best, and so forth.
A nonparametric test can determine whether there is any agreement between the two judges.
1. Sign Test
The sign test is used to test hypotheses on a population median $\tilde{\mu}$.
In testing $H_0: \tilde{\mu} = \tilde{\mu}_0$ against the appropriate alternative, on a random sample of size n:
Replace each sample value exceeding $\tilde{\mu}_0$ with (+).
Replace each sample value less than $\tilde{\mu}_0$ with (-).
When a sample value equals $\tilde{\mu}_0$, exclude it from the analysis.
The test statistic for the sign test is the binomial random variable X, representing the number of (+) signs in the random sample. If $H_0$ is true, the probability of a (+) sign is p = 1/2.
1. Sign Test
One-sided
$H_0: \tilde{\mu} = \tilde{\mu}_0$
$H_1: \tilde{\mu} < \tilde{\mu}_0$
Reject $H_0$ in favor of $H_1$ if the proportion of (+) signs is sufficiently less than 1/2, that is, when x is small.
$P = P(X \le x \text{ when } p = 1/2)$
If the computed P-value is less than or equal to the significance level $\alpha$, reject $H_0$ in favor of $H_1$.
1. Sign Test
One-sided
$H_0: \tilde{\mu} = \tilde{\mu}_0$
$H_1: \tilde{\mu} > \tilde{\mu}_0$
Reject $H_0$ in favor of $H_1$ if the proportion of (+) signs is sufficiently greater than 1/2, that is, when x is large.
$P = P(X \ge x \text{ when } p = 1/2)$
If the computed P-value is less than or equal to the significance level $\alpha$, reject $H_0$ in favor of $H_1$.
1. Sign Test
Two-sided
$H_0: \tilde{\mu} = \tilde{\mu}_0$
$H_1: \tilde{\mu} \ne \tilde{\mu}_0$
Reject $H_0$ in favor of $H_1$ if the proportion of (+) signs is significantly less than or greater than 1/2, that is, when x is sufficiently small or sufficiently large.
If $x < n/2$: $P = 2P(X \le x \text{ when } p = 1/2)$
If $x > n/2$: $P = 2P(X \ge x \text{ when } p = 1/2)$
If the computed P-value is less than or equal to the significance level $\alpha$, reject $H_0$ in favor of $H_1$.
1. Sign Test
If n > 10, use the normal curve approximation with
$\mu = np$ and $\sigma = \sqrt{npq}$, where p = q = 1/2 under $H_0$.
Example 1

The following data represent the number of hours that a rechargeable hedge trimmer operates before a recharge is required:
1,5 2,2 0,9 1,3 2,0 1,6 1,8 1,5 2,0 1,2 1,7
Use the sign test to test the hypothesis at the 0,05 level of significance that this particular trimmer operates with a median of 1,8 hours before requiring a recharge.
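The sign-test procedure can be sketched for Example 1 as follows. This is a minimal illustration using only the Python standard library; the function name `sign_test_p_value` and its `alternative` argument are hypothetical, and the decimal commas of the data are written as decimal points.

```python
from math import comb

def sign_test_p_value(sample, median0, alternative="two-sided"):
    """Exact sign-test P-value under H0: median = median0 (so p = 1/2)."""
    plus = sum(1 for v in sample if v > median0)   # number of (+) signs, x
    minus = sum(1 for v in sample if v < median0)  # number of (-) signs
    n = plus + minus                               # values equal to median0 are excluded

    def lower(x):  # P(X <= x) for X ~ binomial(n, 1/2)
        return sum(comb(n, k) for k in range(0, x + 1)) / 2 ** n

    def upper(x):  # P(X >= x) for X ~ binomial(n, 1/2)
        return sum(comb(n, k) for k in range(x, n + 1)) / 2 ** n

    if alternative == "less":
        p = lower(plus)
    elif alternative == "greater":
        p = upper(plus)
    else:
        # Two-sided rule from the slides; x == n/2 is capped at 1 (an assumption).
        p = 2 * lower(plus) if plus < n / 2 else 2 * upper(plus)
        p = min(1.0, p)
    return plus, n, p

hours = [1.5, 2.2, 0.9, 1.3, 2.0, 1.6, 1.8, 1.5, 2.0, 1.2, 1.7]
x, n, p = sign_test_p_value(hours, 1.8)
print(f"x = {x}, n = {n}, P = {p:.4f}")
```

The value 1,8 is dropped, leaving n = 10 with x = 3 plus signs. The two-sided P-value is well above 0,05, so under this sketch the hypothesis of a median of 1,8 hours is not rejected.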
Example 2
A taxi company is trying to decide whether the use of radial tires instead of regular belted tires improves fuel economy. Sixteen cars are equipped with radial tires and driven over a prescribed test course. Without changing drivers, the same cars are then equipped with the regular belted tires and driven once again over the test course. The gasoline consumption, in kilometers per liter, is given as follows:

Car            1    2    3    4    5    6    7    8
Radial Tires   4,2  4,7  6,6  7,0  6,7  4,5  5,7  6,0
Belted Tires   4,1  4,9  6,2  6,9  6,8  4,4  5,7  5,8

Car            9    10   11   12   13   14   15   16
Radial Tires   7,4  4,9  6,1  5,2  5,7  6,9  6,8  4,9
Belted Tires   6,9  4,9  6,0  4,9  5,3  6,5  7,1  4,8

Can we conclude at the 0,05 level of significance that cars equipped with radial tires obtain better fuel economy than those equipped with regular belted tires?
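For paired data such as Example 2, the sign test is applied to the sign of each radial-minus-belted difference. The sketch below computes both the exact binomial P-value and the normal approximation. The 0.5 continuity correction is an assumption not stated in the slides, and the variable names are illustrative.

```python
from math import comb, erf, sqrt

radial = [4.2, 4.7, 6.6, 7.0, 6.7, 4.5, 5.7, 6.0,
          7.4, 4.9, 6.1, 5.2, 5.7, 6.9, 6.8, 4.9]
belted = [4.1, 4.9, 6.2, 6.9, 6.8, 4.4, 5.7, 5.8,
          6.9, 4.9, 6.0, 4.9, 5.3, 6.5, 7.1, 4.8]

# Sign of each difference (radial - belted); ties are excluded.
plus = sum(r > b for r, b in zip(radial, belted))
minus = sum(r < b for r, b in zip(radial, belted))
n = plus + minus

# H1: radial tires give better economy, so reject for large x = plus.
p_exact = sum(comb(n, k) for k in range(plus, n + 1)) / 2 ** n

# Normal approximation (n > 10): mu = np, sigma = sqrt(npq), p = q = 1/2.
mu = n * 0.5
sigma = sqrt(n * 0.5 * 0.5)
z = (plus - 0.5 - mu) / sigma            # continuity correction (assumption)
p_normal = 0.5 * (1 - erf(z / sqrt(2)))  # upper-tail area P(Z > z)

print(f"x = {plus}, n = {n}, exact P = {p_exact:.4f}, normal P = {p_normal:.4f}")
```

Both P-values fall below 0,05, which under these assumptions supports the conclusion that radial tires improve fuel economy.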
2. Signed-Rank Test
The sign test utilizes only the (+) and (-) signs of the differences between the observations and $\tilde{\mu}_0$; it does not consider the magnitudes of these differences.
The signed-rank test, or Wilcoxon signed-rank test, utilizes both direction and magnitude.
Use the signed-rank test table of critical values.


2. Signed-rank Test
In testing $H_0: \tilde{\mu} = \tilde{\mu}_0$ against the appropriate alternative, on a random sample of size n:
Subtract $\tilde{\mu}_0$ from each sample value, discarding all differences (d) equal to zero.
Rank the remaining differences without regard to sign.
Rank 1 is assigned to the smallest absolute d, 2 to the next smallest, and so on. When the absolute values of two or more differences are the same, assign each the average of the ranks they would otherwise receive.
If $H_0: \tilde{\mu} = \tilde{\mu}_0$ is true, the total of the ranks corresponding to the positive differences (w+) should nearly equal the total of the ranks corresponding to the negative differences (w-).
Let w be the smaller of w+ and w-. Reject the null hypothesis when the value of the appropriate statistic W+, W-, or W is sufficiently small.
2. Signed-rank Test
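One Sample, and Two Samples with Paired Observations

H0                                 H1                                   Compute
$\tilde{\mu} = \tilde{\mu}_0$      $\tilde{\mu} < \tilde{\mu}_0$        w+
                                   $\tilde{\mu} > \tilde{\mu}_0$        w-
                                   $\tilde{\mu} \ne \tilde{\mu}_0$      w
$\tilde{\mu}_1 = \tilde{\mu}_2$    $\tilde{\mu}_1 < \tilde{\mu}_2$      w+
                                   $\tilde{\mu}_1 > \tilde{\mu}_2$      w-
                                   $\tilde{\mu}_1 \ne \tilde{\mu}_2$    w

The null hypothesis is rejected if the computed value w+, w-, or w is less than or equal to the appropriate tabled value.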
Example 3
Rework example 1 by using signed-rank test.
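The signed-rank statistics for the hedge trimmer data of Example 1 can be computed with a short sketch like the one below. The helper names `midranks` and `signed_rank` are hypothetical, and rounding the differences to six decimals is an implementation choice made here so that equal magnitudes tie exactly in floating point.

```python
def midranks(values):
    """Rank from 1 upward; tied values share the average of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def signed_rank(sample, median0):
    # Rounding removes floating-point noise so that equal |d| values tie exactly.
    d = [round(x - median0, 6) for x in sample]
    d = [v for v in d if v != 0]                  # discard zero differences
    ranks = midranks([abs(v) for v in d])         # rank without regard to sign
    w_plus = sum(r for v, r in zip(d, ranks) if v > 0)
    w_minus = sum(r for v, r in zip(d, ranks) if v < 0)
    return w_plus, w_minus, min(w_plus, w_minus)

hours = [1.5, 2.2, 0.9, 1.3, 2.0, 1.6, 1.8, 1.5, 2.0, 1.2, 1.7]
w_plus, w_minus, w = signed_rank(hours, 1.8)
print(f"w+ = {w_plus}, w- = {w_minus}, w = {w}")
```

For the two-sided alternative the statistic to compare against the signed-rank table is w, using the n = 10 nonzero differences.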
Example 4
It is claimed that a college senior can increase his score in the major field area of the graduate record examination by at least 50 points if he is provided with sample problems in advance. To test this claim, 20 college seniors are divided into 10 pairs such that each matched pair has almost the same overall quality point average for their first 3 years in college. Sample problems and answers are provided at random to one member of each pair 1 week prior to the examination. Test the null hypothesis at $\alpha$ = 0,05 that sample problems increase the scores by 50 points against the alternative hypothesis that the increase is less than 50 points.
The examination scores are given as follows:

Pair                       1    2    3    4    5    6    7    8    9    10
With sample problems       531  621  663  579  451  660  591  719  543  575
Without sample problems    509  540  688  502  424  683  568  728  530  524
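For Example 4 the null value is a difference of 50 points, so one way to set up the signed-rank test is to rank the quantities d_i - 50, where d_i is the paired difference (with minus without). The sketch below follows that setup. The variable names are illustrative, and because the alternative is "less than 50" the statistic of interest is w+.

```python
with_problems = [531, 621, 663, 579, 451, 660, 591, 719, 543, 575]
without_problems = [509, 540, 688, 502, 424, 683, 568, 728, 530, 524]
d0 = 50  # hypothesized increase under H0

# Shifted paired differences d_i - d0; zeros would be discarded.
d = [a - b - d0 for a, b in zip(with_problems, without_problems)]
d = [v for v in d if v != 0]

# Rank |d_i - d0| from 1 upward, averaging the ranks of tied values.
abs_d = [abs(v) for v in d]
ranks = [sum(a < x for a in abs_d) + (sum(a == x for a in abs_d) + 1) / 2
         for x in abs_d]

w_plus = sum(r for v, r in zip(d, ranks) if v > 0)
w_minus = sum(r for v, r in zip(d, ranks) if v < 0)
print(f"n = {len(d)}, w+ = {w_plus}, w- = {w_minus}")
```

The computed w+ is then compared with the one-sided critical value for n = 10 in the signed-rank table.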
3. Wilcoxon Rank-Sum Test
The Wilcoxon rank-sum test is an appropriate alternative to the two-sample t-test.
In testing $H_0: \tilde{\mu}_1 = \tilde{\mu}_2$ against the appropriate alternative:
Select a random sample from each population, with $n_1$ the size of the smaller sample and $n_2$ the size of the larger sample.
Arrange the $n_1 + n_2$ observations of both samples in ascending order, and substitute a rank of 1, 2, ..., $n_1 + n_2$ for each observation.
$w_1$ = the sum of the ranks corresponding to the $n_1$ observations in the smaller sample
$w_2$ = the sum of the ranks corresponding to the $n_2$ observations in the larger sample
Use the table of critical values for the Wilcoxon rank-sum test.
3. Wilcoxon Rank-Sum Test
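$$w_1 + w_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2}$$
$$w_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2} - w_1$$
$$u_1 = w_1 - \frac{n_1(n_1 + 1)}{2} \quad \text{or} \quad u_2 = w_2 - \frac{n_2(n_2 + 1)}{2}$$

H0                                 H1                                   Compute
$\tilde{\mu}_1 = \tilde{\mu}_2$    $\tilde{\mu}_1 < \tilde{\mu}_2$      u1
                                   $\tilde{\mu}_1 > \tilde{\mu}_2$      u2
                                   $\tilde{\mu}_1 \ne \tilde{\mu}_2$    u

Here u is the smaller of u1 and u2. Reject $H_0$ if the observed value of u1, u2, or u is less than or equal to the tabled critical value.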
Example 5
The nicotine content of two brands of cigarettes, measured in milligrams, was found to be as follows:

Brand A   2,1  4,0  6,3  5,4  4,8  3,7  6,1  3,3
Brand B   4,1  0,6  3,1  2,5  4,0  6,2  1,6  2,2  1,9  5,4

Test the hypothesis, at the 0,05 level of significance, that the median nicotine contents of the two brands are equal against the alternative that they are unequal.
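The rank-sum quantities for Example 5 can be sketched as below. Brand A is the smaller sample, so it plays the role of sample 1. Averaging the ranks of tied observations is carried over from the signed-rank procedure and is an assumption here, since the rank-sum slides do not discuss ties.

```python
brand_a = [2.1, 4.0, 6.3, 5.4, 4.8, 3.7, 6.1, 3.3]             # smaller sample, n1 = 8
brand_b = [4.1, 0.6, 3.1, 2.5, 4.0, 6.2, 1.6, 2.2, 1.9, 5.4]   # larger sample, n2 = 10
pooled = brand_a + brand_b

def midrank(x):
    """Rank of x in the pooled sample; ties get the average rank."""
    return sum(v < x for v in pooled) + (sum(v == x for v in pooled) + 1) / 2

n1, n2 = len(brand_a), len(brand_b)
w1 = sum(midrank(x) for x in brand_a)
w2 = (n1 + n2) * (n1 + n2 + 1) / 2 - w1   # the rank sums must total (n1+n2)(n1+n2+1)/2
u1 = w1 - n1 * (n1 + 1) / 2
u2 = w2 - n2 * (n2 + 1) / 2
u = min(u1, u2)                            # statistic for the two-sided alternative

print(f"w1 = {w1}, w2 = {w2}, u1 = {u1}, u2 = {u2}, u = {u}")
```

The value of u is then compared with the two-sided critical value for n1 = 8 and n2 = 10 in the rank-sum table.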
3. Wilcoxon Rank-Sum Test
Normal theory Approximation for Two Samples
When both $n_1$ and $n_2$ exceed 8, the sampling distribution of $U_1$ (or $U_2$) approaches the normal distribution with
mean $\mu_{U_1} = \dfrac{n_1 n_2}{2}$ and variance $\sigma_{U_1}^2 = \dfrac{n_1 n_2 (n_1 + n_2 + 1)}{12}$.
When $n_2$ is greater than 20, use the statistic
$$Z = \frac{U_1 - \mu_{U_1}}{\sigma_{U_1}}$$
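The normal approximation translates directly into a few lines of code. In the sketch below the function names are hypothetical, and the sample sizes and the observed u1 are made-up values chosen only to illustrate the arithmetic.

```python
from math import erf, sqrt

def rank_sum_z(u1, n1, n2):
    """Standardize U1 using its mean n1*n2/2 and variance n1*n2*(n1+n2+1)/12."""
    mean = n1 * n2 / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12
    return (u1 - mean) / sqrt(variance)

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

# Hypothetical inputs: n1 = 12, n2 = 25 (n2 > 20), observed u1 = 100.
z = rank_sum_z(u1=100, n1=12, n2=25)
p_lower = normal_cdf(z)   # one-sided P-value when small u1 supports H1
print(f"z = {z:.3f}, lower-tail P = {p_lower:.4f}")
```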

The use of the Wilcoxon rank-sum test is not restricted to nonnormal populations. It can be used in place of the two-sample t-test when the populations are normal, although the power will be smaller. The Wilcoxon rank-sum test is always superior to the t-test for decidedly nonnormal populations.
4. Kruskal-Wallis Test
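To test the null hypothesis $H_0$ that k independent samples are from identical populations, compute
$$h = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{r_i^2}{n_i} - 3(n+1),$$
where $r_i$ is the assumed value of $R_i$, the sum of the ranks of the $n_i$ observations in the i-th sample, for i = 1, 2, ..., k, and $n = n_1 + n_2 + \cdots + n_k$.
If h falls in the critical region $H > \chi^2_{\alpha}$ with v = k - 1 degrees of freedom, reject $H_0$ at the $\alpha$-level of significance; otherwise, fail to reject $H_0$.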
Example 6
In an experiment to determine which of three different missile systems is preferable, the propellant burning rate is measured. The data, after coding, are given in the table below. Use the Kruskal-Wallis test and a significance level of $\alpha$ = 0,05 to test the hypothesis that the propellant burning rates are the same for the three missile systems.

Missile System 1   24,0  16,7  22,8  19,8  18,9
Missile System 2   23,2  19,8  18,1  17,6  20,2  17,8
Missile System 3   18,4  19,1  17,3  17,3  19,7  18,9  18,8  19,3
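The Kruskal-Wallis statistic for Example 6 can be sketched as follows. The grouping of the 19 readings into 5, 6 and 8 values for systems 1, 2 and 3 is how the columns of the table are read here and should be treated as an assumption. Tied readings receive the average of their ranks.

```python
systems = {
    1: [24.0, 16.7, 22.8, 19.8, 18.9],
    2: [23.2, 19.8, 18.1, 17.6, 20.2, 17.8],
    3: [18.4, 19.1, 17.3, 17.3, 19.7, 18.9, 18.8, 19.3],
}
pooled = [x for sample in systems.values() for x in sample]
n = len(pooled)

def midrank(x):
    """Rank of x among all n observations; ties get the average rank."""
    return sum(v < x for v in pooled) + (sum(v == x for v in pooled) + 1) / 2

# r_i: sum of the ranks of the observations in sample i.
rank_sums = {k: sum(midrank(x) for x in sample) for k, sample in systems.items()}

h = (12 / (n * (n + 1))
     * sum(r ** 2 / len(systems[k]) for k, r in rank_sums.items())
     - 3 * (n + 1))

print(rank_sums)
print(f"h = {h:.3f}")
```

With k = 3 samples there are v = 2 degrees of freedom, and the standard chi-squared table gives a critical value of 5,991 at the 0,05 level. The computed h is far below that value, so under this reading of the data the hypothesis of equal burning rates is not rejected.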
5. Runs Test
The runs test is a useful technique for testing the null hypothesis $H_0$ that the observations have indeed been drawn at random.
A run is a subsequence of one or more identical symbols representing a common property of the data.
The runs test divides the data into two mutually exclusive categories.
Let $n_1$ be the number of symbols associated with the category that occurs the least, and $n_2$ the number of symbols that belong to the other category.
The sample size is $n = n_1 + n_2$.


5. Runs Test
Suppose 12 people are polled to find out if they use a certain product.
We would question the assumed randomness of the sample if all 12 people were of the same sex.
Consider the sequence observed in the experiment:
M M F F F M F F M M M M
The groupings of identical symbols are called runs; this sequence contains five runs.
The runs test for randomness is based on the random variable V, the total number of runs that occur in the complete sequence.
Use table L.19 for the runs test, which gives $P(V \le v^* \text{ when } H_0 \text{ is true})$.
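Counting runs, and reproducing the kind of probability the runs table provides, can be sketched as below. The exact distribution of V used in `runs_pmf` is the standard combinatorial result for arrangements of two kinds of symbols; it is not stated in the slides and is included here as an assumption, and both function names are hypothetical.

```python
from math import comb

def count_runs(seq):
    """Number of runs: one more than the number of adjacent symbol changes."""
    return 1 + sum(a != b for a, b in zip(seq, seq[1:]))

def runs_pmf(v, n1, n2):
    """P(V = v) when all arrangements of n1 and n2 symbols are equally likely."""
    total = comb(n1 + n2, n1)
    if v < 2:
        return 0.0
    if v % 2 == 0:                      # even v = 2k: k runs of each symbol
        k = v // 2
        return 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1) / total
    k = (v - 1) // 2                    # odd v = 2k + 1: k runs of one, k + 1 of the other
    return (comb(n1 - 1, k - 1) * comb(n2 - 1, k)
            + comb(n1 - 1, k) * comb(n2 - 1, k - 1)) / total

sequence = "MMFFFMFFMMMM"
v = count_runs(sequence)
n1 = min(sequence.count("M"), sequence.count("F"))   # least frequent category
n2 = max(sequence.count("M"), sequence.count("F"))
p_lower = sum(runs_pmf(i, n1, n2) for i in range(2, v + 1))   # P(V <= v)

print(f"v = {v}, n1 = {n1}, n2 = {n2}, P(V <= v) = {p_lower:.3f}")
```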
Example 7
A machine is adjusted to dispense acrylic paint thinner into a container. Would you say that the amount of paint thinner being dispensed by this machine varies randomly if the contents of the next 15 containers are measured and found to be
3,6 3,9 4,1 3,6 3,8 3,7 3,4 4,0 3,8 4,1 3,9 4,0 3,8 4,2 4,1
liters? Use a 0,1 level of significance.
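Numerical data such as Example 7 must first be reduced to two categories. A common choice, assumed here because the slides do not say how to dichotomize, is to mark each measurement as above ("a") or below ("b") the sample median and to omit values equal to the median.

```python
from statistics import median

liters = [3.6, 3.9, 4.1, 3.6, 3.8, 3.7, 3.4, 4.0,
          3.8, 4.1, 3.9, 4.0, 3.8, 4.2, 4.1]

m = median(liters)
# "a" = above the median, "b" = below; values equal to the median are dropped.
symbols = ["a" if x > m else "b" for x in liters if x != m]

# Number of runs: one more than the number of adjacent symbol changes.
v = 1 + sum(s != t for s, t in zip(symbols, symbols[1:]))
n1 = min(symbols.count("a"), symbols.count("b"))   # least frequent category
n2 = max(symbols.count("a"), symbols.count("b"))

print(" ".join(symbols))
print(f"median = {m}, v = {v}, n1 = {n1}, n2 = {n2}")
```

The observed v, together with n1 and n2, is then looked up in the runs table to obtain the P-value for comparison with the 0,1 level.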


5. Runs Test
The runs test, although less powerful, can also be used as an alternative to the Wilcoxon two-sample test to test the claim that two random samples come from populations having the same distribution and equal means.
When $n_1$ and $n_2$ increase in size, the sampling distribution of V approaches the normal distribution with mean
$$\mu_V = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$
and variance
$$\sigma_V^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}.$$
When $n_1$ and $n_2$ are both greater than 10, we could use the statistic
$$Z = \frac{V - \mu_V}{\sigma_V}$$
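As a quick numerical check of the large-sample formulas, the sketch below standardizes an observed number of runs. The function name `runs_z` and the inputs (n1 = 11, n2 = 14, v = 6) are hypothetical values used only for illustration.

```python
from math import sqrt

def runs_z(v, n1, n2):
    """Standardize the number of runs V using its large-sample mean and variance."""
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    variance = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    return (v - mean) / sqrt(variance)

# Hypothetical inputs with both sample sizes above 10.
z = runs_z(v=6, n1=11, n2=14)
print(f"z = {z:.3f}")
```

A strongly negative z indicates fewer runs than randomness would suggest (clustering), while a strongly positive z indicates too many runs (alternation).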
Tolerance Limits
Two-sided tolerance limits
For any distribution of measurements, two-sided tolerance limits are indicated by the smallest and largest observations in a sample of size n, where n is determined so that one can assert with $100(1-\gamma)\%$ confidence that at least the proportion $1-\alpha$ of the distribution is included between the sample extremes.
One-sided tolerance limits
For any distribution of measurements, a one-sided tolerance limit is indicated by the smallest (largest) observation in a sample of size n, where n is determined so that one can assert with $100(1-\gamma)\%$ confidence that at least the proportion $1-\alpha$ of the distribution will exceed the smallest (be less than the largest) observation in the sample.
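The required sample size n is normally read from a table, but it can also be found numerically. The sketch below relies on standard order-statistic results that are not given in the slides and are therefore assumptions here: for a continuous distribution, the chance that the sample range covers less than a proportion q is n q^(n-1) - (n-1) q^n, and the chance that all n observations fall in a region of probability q is q^n. The function names are hypothetical, `coverage` stands for 1 - alpha, and `gamma` is one minus the confidence level.

```python
def n_two_sided(coverage, gamma):
    """Smallest n so that [min, max] covers at least `coverage` with confidence 1 - gamma."""
    n = 2
    # P(range covers less than `coverage`) = n*q^(n-1) - (n-1)*q^n with q = coverage.
    while n * coverage ** (n - 1) - (n - 1) * coverage ** n > gamma:
        n += 1
    return n

def n_one_sided(coverage, gamma):
    """Smallest n so that at least `coverage` exceeds the minimum with confidence 1 - gamma."""
    n = 1
    # P(less than `coverage` lies above the minimum) = coverage^n.
    while coverage ** n > gamma:
        n += 1
    return n

# 95% confidence (gamma = 0.05) that at least 95% of the distribution is covered.
print(n_two_sided(0.95, 0.05), n_one_sided(0.95, 0.05))
```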
Rank Correlation Coefficient
Also known as the Spearman rank correlation coefficient, denoted by $r_s$.
A nonparametric measure of association between two variables X and Y is given by the rank correlation coefficient
$$r_s = 1 - \frac{6}{n(n^2 - 1)} \sum_{i=1}^{n} d_i^2,$$
where $d_i$ is the difference between the ranks assigned to $x_i$ and $y_i$, and n is the number of pairs of data.
The value of $r_s$ ranges from -1 to +1.
A value of -1 or +1 indicates perfect association between X and Y; a value close to 0 indicates that the variables are uncorrelated.
Example 8
The table below shows the milligrams of tar and nicotine found in 10 brands of cigarettes. Calculate the rank correlation coefficient to measure the degree of relationship between tar and nicotine content in cigarettes.

Cigarette Brand   Tar Content   Nicotine Content
Viceroy           14            0,9
Marlboro          17            1,1
Chesterfield      28            1,6
Kool              17            1,3
Kent              16            1,0
Raleigh           13            0,8
Old Gold          24            1,5
Philip Morris     25            1,4
Oasis             18            1,2
Players           31            2,0
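The rank correlation coefficient for Example 8 can be computed with a short sketch like the one below. The helper name `midranks` is hypothetical, and tied tar values (the two brands at 17) are given the average of their ranks, as in the signed-rank procedure.

```python
tar = [14, 17, 28, 17, 16, 13, 24, 25, 18, 31]
nicotine = [0.9, 1.1, 1.6, 1.3, 1.0, 0.8, 1.5, 1.4, 1.2, 2.0]

def midranks(values):
    """Rank from 1 upward; tied values share the average of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

n = len(tar)
# d_i: difference between the rank of x_i (tar) and the rank of y_i (nicotine).
d = [rx - ry for rx, ry in zip(midranks(tar), midranks(nicotine))]
sum_d2 = sum(v ** 2 for v in d)
r_s = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(f"sum of d^2 = {sum_d2}, r_s = {r_s:.3f}")
```

The result is close to +1, indicating a strong positive association between tar and nicotine content.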
