
Test 13

The Kolmogorov-Smirnov Test for
Two Independent Samples
(Nonparametric Test Employed with Ordinal Data)
I. Hypothesis Evaluated with Test and Relevant Background
Information
Hypothesis evaluated with test Do two independent samples represent two different populations?
Relevant background information on test  The Kolmogorov-Smirnov test for two independent samples was developed by Smirnov (1939). Daniel (1990) notes that because of the similarity between Smirnov's test and a goodness-of-fit test developed by Kolmogorov (1933) (the Kolmogorov-Smirnov goodness-of-fit test for a single sample (Test 7)), the test to be discussed in this chapter is often referred to as the Kolmogorov-Smirnov test for two independent samples (although other sources (Conover (1980, 1999)) simply refer to it as the Smirnov test).
Daniel (1990), Marascuilo and McSweeney (1977), and Siegel and Castellan (1988) note that when a nondirectional/two-tailed alternative hypothesis is evaluated, the Kolmogorov-Smirnov test for two independent samples is sensitive to any kind of distributional difference (i.e., a difference with respect to location/central tendency, dispersion/variability, skewness, and kurtosis). When a directional/one-tailed alternative hypothesis is evaluated, the test evaluates the relative magnitude of the scores in the two distributions.
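The sensitivity of the two-tailed test to any kind of distributional difference can be illustrated with a small computational sketch. The data below are hypothetical (not from the text): the two samples share the same mean and median but differ sharply in dispersion, yet the maximum distance between their cumulative proportion distributions is still substantial.

```python
# Hypothetical data: both samples have mean and median 5, but sample_b is
# much more spread out than sample_a.
sample_a = [4, 5, 5, 6]
sample_b = [1, 3, 7, 9]

def cumulative_proportion(sample, x):
    """S(X): proportion of scores in the sample that are <= x."""
    return sum(score <= x for score in sample) / len(sample)

# The largest vertical distance between the two cumulative distributions.
points = sorted(set(sample_a) | set(sample_b))
M = max(abs(cumulative_proportion(sample_a, x) - cumulative_proportion(sample_b, x))
        for x in points)
print(M)  # 0.5 -> a sizable KS distance produced purely by a dispersion difference
```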
As is the case with the Kolmogorov-Smirnov goodness-of-fit test for a single sample discussed earlier in the book, computation of the test statistic for the Kolmogorov-Smirnov test for two independent samples involves the comparison of two cumulative frequency distributions. Whereas the Kolmogorov-Smirnov goodness-of-fit test for a single sample compares the cumulative frequency distribution of a single sample with a hypothesized theoretical or empirical cumulative frequency distribution, the Kolmogorov-Smirnov test for two independent samples compares the cumulative frequency distributions of two independent samples. If, in fact, the two samples are derived from the same population, the two cumulative frequency distributions would be expected to be identical or reasonably similar to one another. The test protocol for the Kolmogorov-Smirnov test for two independent samples is based on the principle that if there is a significant difference at any point along the two cumulative frequency distributions, the researcher can conclude there is a high likelihood the samples are derived from different populations.
The Kolmogorov-Smirnov test for two independent samples is categorized as a test of
ordinal data because it requires that cumulative frequency distributions be constructed (which
requires that within each distribution scores be arranged in order of magnitude). Further
clarification of the defining characteristics of a cumulative frequency distribution can be found
in the Introduction, and in Section I of the Kolmogorov-Smirnov goodness-of-fit test for a single sample.

Copyright 2004 by Chapman & Hall/CRC

Handbook of Parametric and Nonparametric Statistical Procedures

Since the Kolmogorov-Smirnov test for two independent samples represents
a nonparametric alternative to the t test for two independent samples (Test 11), the most common situation in which a researcher might elect to employ the Kolmogorov-Smirnov test to evaluate a hypothesis about two independent samples (where the dependent variable represents interval/ratio measurement) is when there is reason to believe that the normality and/or homogeneity of variance assumptions of the t test have been saliently violated. The Kolmogorov-Smirnov test for two independent samples is based on the following assumptions: a) All of the observations in the two samples are randomly selected and independent of one another; and b) The scale of measurement is at least ordinal.

II. Example
Example 13.1 is identical to Examples 11.1/12.1 (which are evaluated with the t test for two independent samples and the Mann-Whitney U test (Test 12)).
Example 13.1  In order to assess the efficacy of a new antidepressant drug, ten clinically depressed patients are randomly assigned to one of two groups. Five patients are assigned to Group 1, which is administered the antidepressant drug for a period of six months. The other five patients are assigned to Group 2, which is administered a placebo during the same six-month period. Assume that prior to introducing the experimental treatments, the experimenter confirmed that the level of depression in the two groups was equal. After six months elapse all ten subjects are rated by a psychiatrist (who is blind with respect to a subject's experimental condition) on their level of depression. The psychiatrist's depression ratings for the five subjects in each group follow (the higher the rating, the more depressed a subject): Group 1: 11, 1, 0, 2, 0; Group 2: 11, 11, 5, 8, 4. Do the data indicate that the antidepressant drug is effective?

III. Null versus Alternative Hypotheses


Prior to reading the null and alternative hypotheses to be presented in this section, the reader should take note of the following: a) The protocol for the Kolmogorov-Smirnov test for two independent samples requires that a cumulative probability distribution be constructed for each of the samples. The test statistic is defined by the point that represents the greatest vertical distance at any point between the two cumulative probability distributions; and b) Within the framework of the null and alternative hypotheses, the notation Fj(X) represents the population distribution from which the jth sample/group is derived. Fj(X) can also be conceptualized as representing the cumulative probability distribution for the population from which the jth sample/group is derived.
Null hypothesis          H0: F1(X) = F2(X) for all values of X

(The distribution of data in the population that Sample 1 is derived from is consistent with the distribution of data in the population that Sample 2 is derived from. Another way of stating the null hypothesis is as follows: At no point is the greatest vertical distance between the cumulative probability distribution for Sample 1 (which is assumed to be the best estimate of the cumulative probability distribution of the population from which Sample 1 is derived) and the cumulative probability distribution for Sample 2 (which is assumed to be the best estimate of the cumulative probability distribution of the population from which Sample 2 is derived) larger than what would be expected by chance, if the two samples are derived from the same population.)

Alternative hypothesis          H1: F1(X) ≠ F2(X) for at least one value of X

(The distribution of data in the population that Sample 1 is derived from is not consistent with the distribution of data in the population that Sample 2 is derived from. Another way of stating this alternative hypothesis is as follows: There is at least one point where the greatest vertical distance between the cumulative probability distribution for Sample 1 (which is assumed to be the best estimate of the cumulative probability distribution of the population from which Sample 1 is derived) and the cumulative probability distribution for Sample 2 (which is assumed to be the best estimate of the cumulative probability distribution of the population from which Sample 2 is derived) is larger than what would be expected by chance, if the two samples are derived from the same population. At the point of maximum deviation separating the two cumulative probability distributions, the cumulative probability for Sample 1 is either significantly greater than or less than the cumulative probability for Sample 2. This is a nondirectional alternative hypothesis and it is evaluated with a two-tailed test.)

H1: F1(X) > F2(X) for at least one value of X


(The distribution of data in the population that Sample 1 is derived from is not consistent with the distribution of data in the population that Sample 2 is derived from. Another way of stating this alternative hypothesis is as follows: There is at least one point where the greatest vertical distance between the cumulative probability distribution for Sample 1 (which is assumed to be the best estimate of the cumulative probability distribution of the population from which Sample 1 is derived) and the cumulative probability distribution for Sample 2 (which is assumed to be the best estimate of the cumulative probability distribution of the population from which Sample 2 is derived) is larger than what would be expected by chance, if the two samples are derived from the same population. At the point of maximum deviation separating the two cumulative probability distributions, the cumulative probability for Sample 1 is significantly greater than the cumulative probability for Sample 2. This is a directional alternative hypothesis and it is evaluated with a one-tailed test.)
or

H1: F1(X) < F2(X) for at least one value of X


(The distribution of data in the population that Sample 1 is derived from is not consistent with the distribution of data in the population that Sample 2 is derived from. Another way of stating this alternative hypothesis is as follows: There is at least one point where the greatest vertical distance between the cumulative probability distribution for Sample 1 (which is assumed to be the best estimate of the cumulative probability distribution of the population from which Sample 1 is derived) and the cumulative probability distribution for Sample 2 (which is assumed to be the best estimate of the cumulative probability distribution of the population from which Sample 2 is derived) is larger than what would be expected by chance, if the two samples are derived from the same population. At the point of maximum deviation separating the two cumulative probability distributions, the cumulative probability for Sample 1 is significantly less than the cumulative probability for Sample 2. This is a directional alternative hypothesis and it is evaluated with a one-tailed test.)
Note: Only one of the above noted alternative hypotheses is employed. If the alternative
hypothesis the researcher selects is supported, the null hypothesis is rejected.

IV. Test Computations


As noted in Sections I and III, the test protocol for the Kolmogorov-Smirnov test for two independent samples contrasts the two sample cumulative probability distributions with one another. Table 13.1 summarizes the steps that are involved in the analysis. There are a total of n = 10 scores, with n1 = 5 scores in Group 1 and n2 = 5 scores in Group 2.
Table 13.1 Calculation of Test Statistic for Kolmogorov-Smirnov Test
for Two Independent Samples for Example 13.1

  A (X1)    B: S1(X)      C (X2)    D: S2(X)      E: S1(X) - S2(X)
  0, 0      2/5 = .40               0/5 = .00     .40 - .00 = .40
  1         3/5 = .60               0/5 = .00     .60 - .00 = .60
  2         4/5 = .80               0/5 = .00     .80 - .00 = .80
            4/5 = .80     4         1/5 = .20     .80 - .20 = .60
            4/5 = .80     5         2/5 = .40     .80 - .40 = .40
            4/5 = .80     8         3/5 = .60     .80 - .60 = .20
  11        5/5 = 1.00    11, 11    5/5 = 1.00    1.00 - 1.00 = .00
The values represented in the columns of Table 13.1 are summarized below.
The values of the psychiatrist's depression ratings for the subjects in Group 1 are recorded
in Column A. Note that there are five scores recorded in Column A, and that if the same score
is assigned to more than one subject in Group 1, each of the scores of that value is recorded in
the same row in Column A.
Each value in Column B represents the cumulative proportion associated with the value of the X score recorded in Column A. The notation S1(X) is commonly employed to represent the cumulative proportions for Group/Sample 1 recorded in Column B. The value in Column B for any row is obtained as follows: a) The Group 1 cumulative frequency for the score in that row (i.e., the frequency of occurrence of all scores in Group 1 equal to or less than the score in that row) is divided by the total number of scores in Group 1 (n1 = 5). To illustrate, in the case of Row 1, the score 0 is recorded twice in Column A. Thus, the cumulative frequency is equal to 2, since there are 2 scores in Group 1 that are equal to 0 (a depression rating score cannot be less than 0). Thus, the cumulative frequency 2 is divided by n1 = 5, yielding 2/5 = .40. The value .40 in Column B represents the cumulative proportion in Group 1 associated with a score of 0. It means that the proportion of scores in Group 1 that is equal to 0 is .40. The proportion of scores in Group 1 that is larger than 0 is .60 (since 1 - .40 = .60). In the case of Row 2, the score 1 is recorded in Column A. The cumulative frequency is equal to 3, since there are 3 scores in Group 1 that are equal to or less than 1 (2 scores of 0 and a score of 1). Thus, the cumulative frequency 3 is divided by n1 = 5, yielding 3/5 = .60. The value .60 in Column B represents the cumulative proportion in Group 1 associated with a score of 1. It means that the proportion of scores in Group 1 that is equal to or less than 1 is .60. The proportion of scores in Group 1 that is larger than 1 is .40 (since 1 - .60 = .40). In the case of Row 3, the score 2 is recorded in Column A. The cumulative frequency is equal to 4, since there are 4 scores in Group 1 that are equal to or less than 2 (two scores of 0, a score of 1, and a score of 2). Thus, the cumulative frequency 4 is divided by n1 = 5, yielding 4/5 = .80. The value .80 in Column B represents the cumulative proportion in Group 1 associated with a score of 2. It means that the proportion of scores in Group 1 that is equal to or less than 2 is .80. The proportion of scores in Group 1 that is larger than 2 is .20 (since 1 - .80 = .20). Note that the value of the cumulative proportion in Column B remains .80 in Rows 4, 5, and 6, since until a new score is
recorded in Column A, the cumulative proportion recorded in Column B will remain the same. In the case of Row 7, the score 11 is recorded in Column A. The cumulative frequency is equal to 5, since there are 5 scores in Group 1 that are equal to or less than 11 (i.e., all of the scores in Group 1 are equal to or less than 11). Thus, the cumulative frequency 5 is divided by n1 = 5, yielding 5/5 = 1. The value 1 in Column B represents the cumulative proportion in Group 1 associated with a score of 11. It means that the proportion of scores in Group 1 that is equal to or less than 11 is 1. The proportion of scores in Group 1 that is larger than 11 is 0 (since 1 - 1 = 0).
The values of the psychiatrist's depression ratings for the subjects in Group 2 are recorded in Column C. Note that there are five scores recorded in Column C, and if the same score is assigned to more than one subject in Group 2, each of the scores of that value is recorded in the same row in Column C.
Each value in Column D represents the cumulative proportion associated with the value of the X score recorded in Column C. The notation S2(X) is commonly employed to represent the cumulative proportions for Group/Sample 2 recorded in Column D. The value in Column D for any row is obtained as follows: a) The Group 2 cumulative frequency for the score in that row (i.e., the frequency of occurrence of all scores in Group 2 equal to or less than the score in that row) is divided by the total number of scores in Group 2 (n2 = 5). To illustrate, in the case of Rows 1, 2, and 3, no score is recorded in Column C. Thus, the cumulative frequencies for each of those rows are equal to 0, since up to that point in the analysis there are no scores recorded for Group 2. Consequently, for each of the first three rows, the cumulative frequency 0 is divided by n2 = 5, yielding 0/5 = 0. In each of the first three rows, the value 0 in Column D represents the cumulative proportion for Group 2 up to that point in the analysis. For each of those rows, the proportion of scores in Group 2 that remain to be analyzed is 1 (since 1 - 0 = 1). In the case of Row 4, the score 4 is recorded in Column C. The cumulative frequency is equal to 1, since there is 1 score in Group 2 that is equal to or less than 4 (i.e., the score 4 in that row). Thus, the cumulative frequency 1 is divided by n2 = 5, yielding 1/5 = .20. The value .20 in Column D represents the cumulative proportion in Group 2 associated with a score of 4. It means that the proportion of scores in Group 2 that is equal to or less than 4 is .20. The proportion of scores in Group 2 that is larger than 4 is .80 (since 1 - .20 = .80). In the case of
Row 5, the score 5 is recorded in Column C. The cumulative frequency is equal to 2, since there are 2 scores in Group 2 that are equal to or less than 5 (the scores of 4 and 5). Thus, the cumulative frequency 2 is divided by n2 = 5, yielding 2/5 = .40. The value .40 in Column D represents the cumulative proportion in Group 2 associated with a score of 5. It means that the proportion of scores in Group 2 that is equal to or less than 5 is .40. The proportion of scores in Group 2 that is larger than 5 is .60 (since 1 - .40 = .60). In the case of Row 6, the score 8 is recorded in Column C. The cumulative frequency is equal to 3, since there are 3 scores in Group 2 that are equal to or less than 8 (the scores of 4, 5, and 8). Thus, the cumulative frequency 3 is divided by n2 = 5, yielding 3/5 = .60. The value .60 in Column D represents the cumulative proportion in Group 2 associated with a score of 8. It means that the proportion of scores in Group 2 that is equal to or less than 8 is .60. The proportion of scores in Group 2 that is larger than 8 is .40 (since 1 - .60 = .40). In the case of Row 7, the score 11 is recorded twice in Column C. The cumulative frequency is equal to 5, since there are 5 scores in Group 2 that are equal to or less than 11 (i.e., all of the scores in Group 2 are equal to or less than 11). Thus, the cumulative frequency 5 is divided by n2 = 5, yielding 5/5 = 1. The value 1 in Column D represents the cumulative proportion in Group 2 associated with a score of 11. It means that the proportion of scores in Group 2 that is equal to or less than 11 is 1. The proportion of scores in Group 2 that is larger than 11 is 0 (since 1 - 1 = 0).

The values in Column E are difference scores between the cumulative proportions recorded in Column B for Group 1 and Column D for Group 2. Thus, for Row 1 the entry in Column E is .40, which represents the Column B cumulative proportion of .40 for Group 1, minus 0, which represents the Column D cumulative proportion for Group 2. For Row 2 the entry in Column E is .60, which represents the Column B cumulative proportion of .60 for Group 1, minus 0, which represents the Column D cumulative proportion for Group 2. The same procedure is employed with the remaining five rows in the table.
As noted in Section III, the test statistic for the Kolmogorov-Smirnov test for two independent samples is defined by the greatest vertical distance at any point between the two cumulative probability distributions. The largest absolute value obtained in Column E will represent the latter value. The notation M will be employed for the test statistic. In Table 13.1 the largest absolute value is .80 (which is recorded in Row 3). Therefore, M = .80.¹
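The computation summarized in Table 13.1 can be sketched in a few lines of Python (the function and variable names below are illustrative, not from the source):

```python
# Example 13.1 data (from the text).
group1 = [11, 1, 0, 2, 0]   # antidepressant group
group2 = [11, 11, 5, 8, 4]  # placebo group

def cumulative_proportion(sample, x):
    """S(X): proportion of scores in the sample that are <= x."""
    return sum(score <= x for score in sample) / len(sample)

# Evaluate both cumulative distributions at every observed score and take
# the largest absolute vertical distance (the test statistic M).
points = sorted(set(group1) | set(group2))
M = max(abs(cumulative_proportion(group1, x) - cumulative_proportion(group2, x))
        for x in points)
print(M)  # 0.8, matching Row 3 of Table 13.1 (.80 - .00 = .80)
```

For the same two-sided statistic, scipy.stats.ks_2samp(group1, group2) can serve as a cross-check.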

V. Interpretation of the Test Results


The test statistic for the Kolmogorov-Smirnov test for two independent samples is evaluated with Table A23 (Table of Critical Values for the Kolmogorov-Smirnov Test for Two Independent Samples) in the Appendix. If, at any point along the two cumulative probability distributions, the greatest distance (i.e., the value of M) is equal to or greater than the tabled critical value recorded in Table A23, the null hypothesis is rejected. The critical values in Table A23 are listed in reference to the values of n1 and n2. For n1 = 5 and n2 = 5, the tabled critical two-tailed .05 and .01 values are M.05 = .800 and M.01 = .800, and the tabled critical one-tailed .05 and .01 values are M.05 = .600 and M.01 = .800.²
The following guidelines are employed in evaluating the null hypothesis for the Kolmogorov-Smirnov test for two independent samples.
a) If the nondirectional alternative hypothesis H1: F1(X) ≠ F2(X) is employed, the null hypothesis can be rejected if the computed absolute value of the test statistic is equal to or greater than the tabled critical two-tailed M value at the prespecified level of significance.
b) If the directional alternative hypothesis H1: F1(X) > F2(X) is employed, the null hypothesis can be rejected if the computed absolute value of the test statistic is equal to or greater than the tabled critical one-tailed M value at the prespecified level of significance. Additionally, the difference between the two cumulative probability distributions must be such that in reference to the point that represents the test statistic, the cumulative probability for Sample 1 must be larger than the cumulative probability for Sample 2 (which will result in a positive sign for the value of M).
c) If the directional alternative hypothesis H1: F1(X) < F2(X) is employed, the null hypothesis can be rejected if the computed absolute value of the test statistic is equal to or greater than the tabled critical one-tailed M value at the prespecified level of significance. Additionally, the difference between the two cumulative probability distributions must be such that in reference to the point that represents the test statistic, the cumulative probability for Sample 1 must be less than the cumulative probability for Sample 2 (which will result in a negative sign for the value of M).
The above guidelines will now be employed in reference to the computed test statistic M = .80.
a) If the nondirectional alternative hypothesis H1: F1(X) ≠ F2(X) is employed, the null hypothesis can be rejected at both the .05 and .01 levels, since the absolute value M = .80 is equal to the tabled critical two-tailed values M.05 = .800 and M.01 = .800.
b) If the directional alternative hypothesis H1: F1(X) > F2(X) is employed, the null hypothesis can be rejected at both the .05 and .01 levels, since the absolute value M = .80 is
greater than or equal to the tabled critical one-tailed values M.05 = .600 and M.01 = .800. Additionally, since in Row 3 of Table 13.1 [S1(X) = .80] > [S2(X) = 0], the data are consistent with the alternative hypothesis H1: F1(X) > F2(X). In other words, in computing the value of M, the cumulative proportion for Sample 1 is larger than the cumulative proportion for Sample 2 (which results in a positive sign for the value of M).
c) If the directional alternative hypothesis H1: F1(X) < F2(X) is employed, the null hypothesis cannot be rejected, since in order for the latter alternative hypothesis to be supported, in computing the value of M, the cumulative proportion for Sample 2 must be larger than the cumulative proportion for Sample 1 (which would result in a negative sign for the value of M, which is not the case in Row 3 of Table 13.1).
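The sign checks in guidelines b) and c) can be made concrete by computing the two one-sided distances separately. A minimal sketch for Example 13.1 (variable names are illustrative, not from the source):

```python
group1 = [11, 1, 0, 2, 0]   # Example 13.1, Group 1
group2 = [11, 11, 5, 8, 4]  # Example 13.1, Group 2

def s(sample, x):
    """Cumulative proportion of scores in the sample that are <= x."""
    return sum(v <= x for v in sample) / len(sample)

points = sorted(set(group1) | set(group2))
d_plus = max(s(group1, x) - s(group2, x) for x in points)   # largest positive deviation
d_minus = max(s(group2, x) - s(group1, x) for x in points)  # largest negative deviation

# d_plus = 0.8 and d_minus = 0.0: the maximum deviation is positive, which is
# consistent with the directional alternative hypothesis H1: F1(X) > F2(X).
print(d_plus, d_minus)
```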
A summary of the analysis of Example 13.1 with the Kolmogorov-Smirnov test for two independent samples follows: It can be concluded that there is a high likelihood the two groups are derived from different populations. More specifically, the data indicate that the depression ratings for Group 1 (i.e., the group that receives the antidepressant medication) are significantly less than the depression ratings for Group 2 (the placebo group).
When the same set of data is evaluated with the t test for two independent samples and the Mann-Whitney U test (i.e., Examples 11.1/12.1), in the case of both of the latter tests, the null hypothesis can only be rejected (and only at the .05 level) if the researcher employs a directional alternative hypothesis which predicts a lower level of depression for Group 1. The latter result is consistent with the result obtained with the Kolmogorov-Smirnov test, in that the directional alternative hypothesis H1: F1(X) > F2(X) is supported. Note, however, that the latter directional alternative hypothesis is supported at both the .05 and .01 levels when the Kolmogorov-Smirnov test is employed. In addition, the nondirectional alternative hypothesis is supported at both the .05 and .01 levels with the Kolmogorov-Smirnov test, but is not supported when the t test and Mann-Whitney U test are used. Although the results obtained with the Kolmogorov-Smirnov test for two independent samples are not identical with the results obtained with the t test for two independent samples and the Mann-Whitney U test, they are reasonably consistent.
It should be noted that in most instances when the Kolmogorov-Smirnov test for two independent samples and the t test for two independent samples are employed to evaluate the same set of data, the Kolmogorov-Smirnov test will provide a less powerful test of an alternative hypothesis. Thus, although it did not turn out to be the case for Examples 11.1/13.1, if a significant difference is present, the t test will be the more likely of the two tests to detect it. Siegel and Castellan (1988) note that when compared with the t test for two independent samples, the Kolmogorov-Smirnov test has a power efficiency (which is defined in Section VII of the Wilcoxon signed-ranks test (Test 6)) of .95 for small sample sizes, and a slightly lower power efficiency for larger sample sizes.

VI. Additional Analytical Procedures for the Kolmogorov-Smirnov
Test for Two Independent Samples and/or Related Tests
1. Graphical method for computing the Kolmogorov-Smirnov test statistic  Conover (1980, 1999) employs a graphical method for computing the Kolmogorov-Smirnov test statistic that is based on the same logic as the graphical method which is briefly discussed for computing the test statistic for the Kolmogorov-Smirnov goodness-of-fit test for a single sample. The method involves constructing a graph of the cumulative probability distribution for each sample and measuring the point of maximum distance between the two cumulative probability distributions. The latter graph is similar to the one depicted in Figure 7.1. Daniel (1990) describes a graphical method that employs a graph referred to as a pair chart as an alternative
way of computing the Kolmogorov-Smirnov test statistic. The latter method is attributed to
Hodges (1958) and Quade (1973) (who cites Drion (1952) as having developed the pair chart).
2. Computing sample confidence intervals for the Kolmogorov-Smirnov test for two independent samples  The same procedure that is described for computing a confidence interval for cumulative probabilities for the sample distribution that is evaluated with the Kolmogorov-Smirnov goodness-of-fit test for a single sample can be employed to compute a confidence interval for cumulative probabilities for either one of the samples that are evaluated with the Kolmogorov-Smirnov test for two independent samples. Specifically, Equation 7.1 is employed to compute the upper and lower limits for each of the points in a confidence interval. Thus, for each sample, Mα is added to and subtracted from each of the S(X) values. Note that the value of Mα employed in constructing a confidence interval for each of the samples is derived from Table A21 (Table of Critical Values for the Kolmogorov-Smirnov Goodness-of-Fit Test for a Single Sample) in the Appendix. Thus, if one is computing a 95% confidence interval for each of the samples, the tabled critical two-tailed value M.05 = .563 for n1 = n2 = 5 is employed to represent Mα in Equation 7.1.
Note the notation Sj(X) is used to represent the points on a cumulative probability distribution for the Kolmogorov-Smirnov test for two independent samples, while the notation S(Xi) is used to represent the points on the cumulative probability distribution for the sample evaluated with the Kolmogorov-Smirnov goodness-of-fit test for a single sample. In the case of the latter test, there is only one sample for which a confidence interval can be computed, while in the case of the Kolmogorov-Smirnov test for two independent samples, a confidence interval can be constructed for each of the independent samples.
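A sketch of the confidence-interval computation for Sample 1 of Example 13.1, assuming Equation 7.1 simply adds Mα to and subtracts Mα from each S(X) value (with results clipped to the [0, 1] range a cumulative probability must occupy); the .563 critical value is the one quoted in the text:

```python
M_ALPHA = 0.563  # tabled critical two-tailed M.05 value for n = 5 (per the text)
group1 = [11, 1, 0, 2, 0]  # Example 13.1, Group 1

band = {}
for x in sorted(set(group1)):
    s = sum(v <= x for v in group1) / len(group1)   # S1(X)
    # Lower and upper limits, clipped so they remain valid probabilities.
    band[x] = (max(s - M_ALPHA, 0.0), min(s + M_ALPHA, 1.0))

for x, (lower, upper) in band.items():
    print(x, lower, upper)
```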

3. Large sample chi-square approximation for a one-tailed analysis of the Kolmogorov-Smirnov test for two independent samples  Siegel and Castellan (1988) note that Goodman (1954) has shown that Equation 13.1 (which employs the chi-square distribution) can provide a good approximation for large sample sizes when a one-tailed/directional alternative hypothesis is evaluated.³

χ² = 4M²[n1n2/(n1 + n2)]     (Equation 13.1)

The computed value of chi-square is evaluated with Table A4 (Table of the Chi-Square Distribution) in the Appendix. The degrees of freedom employed in the analysis will always be df = 2. The tabled critical one-tailed .05 and .01 chi-square values in Table A4 for df = 2 are χ².05 = 5.99 and χ².01 = 9.21. If the computed value of chi-square is equal to or greater than either of the aforementioned values, the null hypothesis can be rejected at the appropriate level of significance (i.e., the directional alternative hypothesis that is consistent with the data will be supported). Although our sample size is too small for the large sample approximation, for purposes of illustration we will use it. When the appropriate values for Example 13.1 are substituted in Equation 13.1, the value χ² = 6.4 is computed. Since χ² = 6.4 is larger than χ².05 = 5.99 but less than χ².01 = 9.21, the null hypothesis can be rejected, but only at the .05 level. Thus, the directional alternative hypothesis H1: F1(X) > F2(X) is supported at the .05 level. Note that when the tabled critical values in Table A23 are employed, the latter alternative hypothesis is also supported at the .01 level. The latter is consistent with the fact that Siegel and Castellan (1988) note that when Equation 13.1 is employed with small sample sizes, it tends to yield a conservative result (i.e., it is less likely to reject a false null hypothesis).
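As an arithmetic check on the approximation (with the understanding that Equation 13.1 is here taken to be χ² = 4M²[n1n2/(n1 + n2)], the form that reproduces the 6.4 value reported in the text):

```python
# Chi-square approximation for a one-tailed KS analysis (Equation 13.1 as
# reconstructed above: chi^2 = 4 * M^2 * (n1 * n2) / (n1 + n2)).
M = 0.80        # test statistic from Table 13.1
n1, n2 = 5, 5   # sample sizes for Example 13.1

chi_square = 4 * M**2 * (n1 * n2) / (n1 + n2)
print(round(chi_square, 6))  # 6.4

# Tabled critical one-tailed chi-square values for df = 2 (per the text).
print(chi_square >= 5.99)  # True  -> reject H0 at the .05 level
print(chi_square >= 9.21)  # False -> cannot reject at the .01 level
```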


VII. Additional Discussion of the Kolmogorov-Smirnov Test
for Two Independent Samples
1. Additional comments on the Kolmogorov-Smirnov test for two independent samples
a) Daniel (1990) states that if for both populations a continuous dependent variable is evaluated, the Kolmogorov-Smirnov test for two independent samples yields exact probabilities. He notes, however, that Noether (1963, 1967) has demonstrated that if a discrete dependent variable is evaluated, the test tends to be conservative (i.e., is less likely to reject a false null hypothesis); b) Sprent (1993) notes that the Kolmogorov-Smirnov test for two independent samples may not be as powerful as tests which focus on whether or not there is a difference on a specific distributional characteristic such as a measure of central tendency and/or variability. Siegel and Castellan (1988) state that the Kolmogorov-Smirnov test for two independent samples is more powerful than the chi-square test for r × c tables (Test 16) and the median test for independent samples (Test 16e). They also note that for small sample sizes, the Kolmogorov-Smirnov test has a higher power efficiency than the Mann-Whitney U test, but as the sample size increases the opposite becomes true with regard to the power efficiency of the two tests; and c) Conover (1980, 1999) and Hollander and Wolfe (1999) provide a more detailed discussion of the theory underlying the Kolmogorov-Smirnov test for two independent samples.

VIII. Additional Examples Illustrating the Use of the Kolmogorov-Smirnov
Test for Two Independent Samples
Since Examples 11.4 and 11.5 in Section VIII of the t test for two independent samples employ the same data as Example 13.1, the Kolmogorov-Smirnov test for two independent samples will yield the same result if employed to evaluate the latter two examples. In addition, the Kolmogorov-Smirnov test can be employed to evaluate Examples 11.2 and 11.3. Since different data are employed in the latter examples, the result obtained with the Kolmogorov-Smirnov test will not be the same as that obtained for Example 13.1. Example 11.2 is evaluated below with the Kolmogorov-Smirnov test for two independent samples. Table 13.2 summarizes the analysis.
Table 13.2 Calculation of Test Statistic for Kolmogorov-Smirnov Test
for Two Independent Samples for Example 11.2

The obtained value of the test statistic is M = .60, since .60 is the largest absolute value for
a difference score recorded in Column E of Table 13.2. Since n1 = 5 and n2 = 5, we
employ the same critical values used in evaluating Example 13.1. If the nondirectional
alternative hypothesis H1: F1(X) ≠ F2(X) is employed, the null hypothesis cannot be
rejected at the .05 level, since M = .60 is less than the tabled critical two-tailed value M.05 =
.800. The data are consistent with the directional alternative hypothesis H1: F1(X) < F2(X),
since in Row 3 of Table 13.2 [S1(X) = .40] < [S2(X) = 1]. In other words, in computing
the value of M, the cumulative proportion for Sample 2 is larger than the cumulative
proportion for Sample 1 (resulting in a negative sign for the computed value of M). The
directional alternative hypothesis H1: F1(X) < F2(X) is supported at the .05 level, since
M = .60 is equal to the tabled critical one-tailed value M.05 = .600. It is not, however,
supported at the .01 level, since M = .60 is less than the tabled critical one-tailed value M.01
= .800. The directional alternative hypothesis H1: F1(X) > F2(X) is not supported, since it
is not consistent with the data (i.e., the sign of the value computed for M is not positive).

When the null hypothesis H0: μ1 = μ2 is evaluated with the t test for two independent
samples, the only alternative hypothesis which is supported (but only at the .05 level) is the
directional alternative hypothesis H1: μ1 > μ2. The latter result (indicating higher scores in
Group 1) is consistent with the result that is obtained when the Kolmogorov-Smirnov test for
two independent samples is employed to evaluate the same set of data.
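The computation summarized in Table 13.2 can be sketched in code. The following minimal Python illustration (using hypothetical scores, since the data of Example 11.2 are not reproduced here) evaluates the cumulative proportion of each sample at every score observed in either sample and takes M as the largest absolute difference between the two proportions:

```python
def ks_two_sample_m(sample1, sample2):
    """Compute the Kolmogorov-Smirnov two-sample statistic M: the largest
    absolute difference between the cumulative proportions (empirical
    cumulative frequency distributions) of the two samples, evaluated at
    every score observed in either sample."""
    n1, n2 = len(sample1), len(sample2)
    m = signed = 0.0
    for x in sorted(set(sample1) | set(sample2)):
        p1 = sum(v <= x for v in sample1) / n1  # cumulative proportion, Sample 1
        p2 = sum(v <= x for v in sample2) / n2  # cumulative proportion, Sample 2
        if abs(p1 - p2) > m:
            m, signed = abs(p1 - p2), p1 - p2
    return m, signed  # the sign of `signed` indicates direction


# Hypothetical data with n1 = n2 = 5 (not the scores of Example 11.2)
group1 = [1, 2, 3, 9, 10]
group2 = [4, 5, 6, 7, 8]
m, signed = ks_two_sample_m(group1, group2)
print(m)  # 0.6: significant one-tailed at .05 (critical value .600),
          # not significant two-tailed at .05 (critical value .800)
```

Note the signed value: a negative sign (Sample 2's cumulative proportion larger) points toward the alternative F1(X) < F2(X), while a positive sign points toward F1(X) > F2(X), mirroring the directional reasoning above.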

References
Conover, W. J. (1980). Practical nonparametric statistics (2nd ed.). New York: John Wiley
& Sons.
Conover, W. J. (1999). Practical nonparametric statistics (3rd ed.). New York: John Wiley
& Sons.
Daniel, W. W. (1990). Applied nonparametric statistics (2nd ed.). Boston: PWS-Kent Publishing Company.
Drion, E. F. (1952). Some distribution-free tests for the difference between two empirical
cumulative distribution functions. Annals of Mathematical Statistics, 23, 563-574.
Goodman, L. A. (1954). Kolmogorov-Smirnov tests for psychological research. Psychological
Bulletin, 51, 160-168.
Hodges, J. L., Jr. (1958). The significance probability of the Smirnov two-sample test. Arkiv
för Matematik, 3, 469-486.
Hollander, M. and Wolfe, D. A. (1999). Nonparametric statistical methods. New York: John
Wiley & Sons.
Khamis, H. J. (1990). The δ corrected Kolmogorov-Smirnov test for goodness-of-fit. Journal
of Statistical Planning and Inference, 24, 317-355.
Kolmogorov, A. N. (1933). Sulla determinazione empirica di una legge di distribuzione. Giorn.
dell'Inst. Ital. degli Att., 4, 83-91.
Marascuilo, L. A. and McSweeney, M. (1977). Nonparametric and distribution-free methods
for the social sciences. Monterey, CA: Brooks/Cole Publishing Company.
Massey, F. J., Jr. (1952). Distribution tables for the deviation between two sample cumulatives.
Annals of Mathematical Statistics, 23, 435-441.
Noether, G. E. (1963). Note on the Kolmogorov statistic in the discrete case. Metrika, 7,
115-116.
Noether, G. E. (1967). Elements of nonparametric statistics. New York: John Wiley & Sons.
Quade, D. (1973). The pair chart. Statistica Neerlandica, 27, 29-45.
Siegel, S. and Castellan, N. J., Jr. (1988). Nonparametric statistics for the behavioral
sciences (2nd ed.). New York: McGraw-Hill Book Company.
Smirnov, N. V. (1936). Sur la distribution de ω² (critérium de M. R. v. Mises). Comptes
Rendus (Paris), 202, 449-452.
Smirnov, N. V. (1939). Estimate of deviation between empirical distribution functions in two
independent samples (Russian). Bull. Moscow Univ., 2, 3-16.
Sprent, P. (1993). Applied nonparametric statistics (2nd ed.). London: Chapman & Hall.
Zar, J. H. (1999). Biostatistical analysis (4th ed.). Upper Saddle River, NJ: Prentice Hall.

Endnotes
1. Marascuilo and McSweeney (1977) employ a modified protocol that can result in a larger
absolute value for M in Column E than the one obtained in Table 13.1. The latter protocol
employs a separate row for the score of each subject when the same score occurs more than
once within a group. If the latter protocol is employed in Table 13.1, the first two rows of
the table will have the score of 0 in Column A for the two subjects in Group 1 who obtain
that score. The first 0 will be in the first row, and have a cumulative proportion in Column
B of 1/5 = .20. The second 0 will be in the second row, and have a cumulative proportion
in Column B of 2/5 = .40. In the same respect the first of the two scores of 11 (obtained by
two subjects in Group 2) will be in a separate row in Column C, and have a cumulative
proportion in Column D of 4/5 = .80. The second score of 11 will be in the last row of the
table, and have a cumulative proportion in Column D of 5/5 = 1. In the case of Example
13.1, the outcome of the analysis will not be affected if the aforementioned protocol is
employed. In some instances, however, it can result in a larger M value. The protocol
employed by Marascuilo and McSweeney (1977) is used by sources who argue that when
there are ties present in the data (i.e., the same score occurs more than once within a group),
the protocol described in this chapter (which is used in most sources) results in an overly
conservative test (i.e., makes it more difficult to reject a false null hypothesis).
2. When the values of n1 and n2 are small, some of the .05 and .01 critical values listed in
Table A23 are identical to one another.
3. The last row in Table A23 can also be employed to compute a critical M value for large
sample sizes.
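The large-sample computation mentioned in Endnote 3 can be sketched as follows. A commonly used approximation (an assumption here, since Table A23 is not reproduced; verify against the table before relying on it) gives the two-tailed critical value of M as c·√((n1 + n2)/(n1·n2)), with c ≈ 1.36 at the .05 level and c ≈ 1.63 at the .01 level:

```python
import math

def critical_m_large_sample(n1, n2, c=1.36):
    """Large-sample approximation to the critical value of M.
    c = 1.36 corresponds to a two-tailed .05 test, c = 1.63 to .01."""
    return c * math.sqrt((n1 + n2) / (n1 * n2))

# With n1 = n2 = 100, an observed M must reach roughly .19 (.05 level)
# or .23 (.01 level) to be significant.
print(round(critical_m_large_sample(100, 100), 3))
print(round(critical_m_large_sample(100, 100, c=1.63), 3))
```

As the formula shows, the critical value shrinks in proportion to the square root of the combined sample size, which is why modest distributional differences that are undetectable with n1 = n2 = 5 can attain significance in large samples.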
