
DIFFERENCES BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS PLUS THEIR ADVANTAGES AND LIMITATIONS

BY:
AMIR ABDULAZEEZ

GEO 8304: QUALITATIVE AND QUANTITATIVE TECHNIQUES

FEBRUARY, 2014
1.0 INTRODUCTION
Data can be continuous, discrete, binary, or categorical. Continuous, or interval, data have units that can be measured, with a value anywhere between the lowest and the highest value; an example is platelet count. Discrete, or ordinal, data have a rank order, but the scale is not necessarily linear. A pain scale from 1 to 10 is a good example: a pain score of 8 is not necessarily twice as bad as a score of 4. Binary data are simply yes/no data, such as alive or dead. Examples of categorical, or nominal, data are colour and shape; the values are different, but no rank order exists. The test chosen to analyse the data is based on the type of data collected and some key properties of that data (Hoskin, undated).
In More Good Reasons to Look at the Data, we looked at data distributions to assess centre, shape and spread, and described how the validity of many statistical procedures relies on an assumption of approximate normality (Neideen and Brasel, 2003). But what do we do if our data are not normal? Here, we cover the difference between parametric and nonparametric procedures. Nonparametric procedures are one possible solution for handling non-normal data.
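One way to check the normality assumption before choosing a procedure is a formal normality test. Below is a minimal Python sketch using SciPy's Shapiro-Wilk test; the simulated platelet counts and the 0.05 cut-off are illustrative assumptions, not a prescription:

    import numpy as np
    from scipy import stats

    # Hypothetical platelet counts, simulated for illustration only
    rng = np.random.default_rng(42)
    platelets = rng.normal(loc=250.0, scale=50.0, size=40)

    # Shapiro-Wilk test: the null hypothesis is that the sample is normal
    w_stat, p_value = stats.shapiro(platelets)
    if p_value < 0.05:
        print("Normality rejected; a nonparametric procedure may be safer")
    else:
        print("No evidence against normality; parametric procedures are reasonable")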

2.0 PARAMETRIC TESTS


According to Robson (1994), a parametric statistical test is a test whose model specifies certain conditions about the parameters of the population from which the research sample was drawn. He stated these conditions to be: the observations must be independent; the observations must be drawn from normally distributed populations; these populations must have the same variances; and the variables involved must have been measured on at least an interval scale.
Commonly used parametric tests are described below; a short Python sketch illustrating all four follows the list.

 Pearson Product Correlation Coefficient: The correlation coefficient (r) is a value that tells us how well 2 continuous variables from the same subject correlate to each other. An r value of 1.0 means the data are completely positively correlated, and 1 variable can be used to compute the other. An r of zero means the 2 variables are completely uncorrelated, and an r of -1.0 means they are completely negatively correlated. The important thing to remember is that r measures only an association and does not imply a cause-and-effect relationship.
 Student t-Test: The Student t-test is probably the most widely used parametric test. It was developed by a statistician working at the Guinness brewery and is called the Student t-test because of proprietary rights. A single-sample t-test is used to determine whether the mean of a sample is different from a known average. A 2-sample t-test is used to establish whether a difference occurs between the means of 2 similar data sets. The t-test uses the mean, standard deviation, and number of samples to calculate the test statistic. In a data set with a large number of samples, the critical value for the Student t-test is 1.96 for an alpha of 0.05, obtained from a t-table. The calculation of the t-value is relatively simple and can be found easily online or in any elementary statistics book.
 The z-Test: The z-test is very similar to the Student t-test. However, with the z-test, the variance of the standard population, rather than the standard deviation of the study groups, is used to obtain the test statistic. Using the z-chart, like the t-table, we see what percentage of the standard population lies outside the mean of the sample population. If, as with the t-test, greater than 95% of the standard population is on one side of the mean, the p-value is less than 0.05 and statistical significance is achieved. Because an assumption about sample size enters the calculation of the z-test, it should not be used if the sample size is less than 30. If both the n and the standard deviation of both groups are known, a 2-sample t-test is best.
 ANOVA: Analysis of variance (ANOVA) is a test that incorporates means and variances to determine the test statistic. The test statistic is then used to determine whether groups of data are the same or different. When hypothesis testing is performed with ANOVA, the null hypothesis is that all groups are the same. The test statistic for ANOVA is called the F-ratio. As with the t- and z-statistics, the F-statistic is compared with a table to determine whether it is greater than the critical value. Interpreting the F-statistic requires the degrees of freedom for both the numerator and the denominator: the degrees of freedom in the numerator are the number of groups minus 1, and the degrees of freedom in the denominator are the number of data points minus the number of groups.
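To make the four tests above concrete, here is a minimal Python sketch using SciPy; the groups, the known average of 21.0 and the assumed population standard deviation are all hypothetical:

    import numpy as np
    from scipy import stats

    # Hypothetical data, simulated for illustration only
    rng = np.random.default_rng(0)
    group_a = rng.normal(loc=20.0, scale=2.0, size=30)
    group_b = rng.normal(loc=22.0, scale=2.0, size=30)
    group_c = rng.normal(loc=21.0, scale=2.0, size=30)

    # Pearson product correlation between two paired continuous variables
    r, p_r = stats.pearsonr(group_a, group_b)

    # Single-sample t-test: does the mean of group_a differ from a known
    # average (21.0 here, an assumed value)?
    t1, p_t1 = stats.ttest_1samp(group_a, popmean=21.0)

    # 2-sample t-test: do the means of group_a and group_b differ?
    t2, p_t2 = stats.ttest_ind(group_a, group_b)

    # z-test computed by hand, assuming the population standard deviation
    # (sigma) is known rather than estimated from the sample
    sigma = 2.0
    z = (group_a.mean() - 21.0) / (sigma / np.sqrt(len(group_a)))
    p_z = 2 * stats.norm.sf(abs(z))      # two-sided p-value

    # One-way ANOVA: F-ratio across three groups; degrees of freedom are
    # (3 - 1) for the numerator and (90 - 3) for the denominator
    f_stat, p_f = stats.f_oneway(group_a, group_b, group_c)

    print(f"r={r:.2f}, t={t2:.2f}, z={z:.2f}, F={f_stat:.2f}")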

3.0 NON-PARAMETRIC TESTS


Nonparametric statistics (also called "distribution-free statistics") are those that can describe some attribute of a population, test hypotheses about that attribute, its relationship with some other attribute, or differences on that attribute across populations, across time or across related constructs, while requiring no assumptions about the form of the population data distribution(s) and no interval-level measurement (Hebel, 2002).
According to Robson (1994), non-parametric tests should be used when testing nominal or ordinal variables and when the assumptions of parametric tests have not been met.
A non-parametric statistical test is thus a test whose model does NOT specify conditions about the parameters of the population from which the sample was drawn. It does not require measurement as strong as that required for parametric tests: most non-parametric tests apply to data on an ordinal scale, and some apply to data on a nominal scale.
Commonly used non-parametric tests are described below; a short Python sketch illustrating all four follows the list.

 Chi-Squared: The chi-squared test is usually used to compare multiple groups where both the input variable and the output variable are binary. The test helps to decide whether a frequency distribution could be the result of a definite cause or just chance. It does this by comparing the actual distribution with the distribution that would be expected if chance were the only factor operating. If the difference between the observed and expected results is small, then perhaps chance is the only factor; if, on the other hand, the difference is large, then the difference is said to be significant and we suspect that something is causing it.
 Spearman Rank Coefficient: Like the Pearson product correlation coefficient, the Spearman rank coefficient is calculated to determine how well 2 variables for individual data points can predict each other. The difference is that the relationship between the variables need not be linear. To start, it is easiest to graph all the data points and find the x and y values, then rank each x and y value in order of occurrence. As with the Pearson correlation coefficient, the test statistic runs from -1 to 1, with -1 being a perfect negative correlation and 1 a perfect positive correlation.
 Mann-Whitney U Test: This test, sometimes referred to as the Wilcoxon rank-sum test, uses ranks just as the previous test does. It is analogous to the t-test for continuous variables but can be used for ordinal data. It compares 2 independent populations to determine whether they are different. The sample values from both sets of data are ranked together. Once the 2 test statistics are calculated, the smaller one is used to determine significance. Unlike the previous tests, the null hypothesis is rejected if the test statistic is less than the critical value. The U-value table is not as widely available as the previous tables, but most statistics software will give a p-value and state whether a statistical difference exists.
 Kruskal-Wallis Test: The Kruskal-Wallis test uses ranks of ordinal data to perform an analysis of variance that determines whether multiple groups are similar to each other. Like the previous test, it ranks all data from the groups into 1 rank order and individually sums the ranks from the individual groups. These values are then placed into a larger formula that computes an H-value for the test statistic. The degrees of freedom used to find the critical value are the number of groups minus 1.
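As with the parametric tests, here is a minimal Python sketch of the four procedures above using SciPy; the samples and the contingency-table counts are hypothetical:

    import numpy as np
    from scipy import stats

    # Hypothetical data, simulated for illustration only
    rng = np.random.default_rng(1)
    sample_a = rng.normal(loc=20.0, scale=2.0, size=25)
    sample_b = rng.normal(loc=22.0, scale=2.0, size=25)
    sample_c = rng.normal(loc=21.0, scale=2.0, size=25)

    # Chi-squared test on a 2x2 table of observed frequencies
    # (rows = groups, columns = binary outcome; the counts are invented)
    observed = np.array([[30, 10],
                         [20, 25]])
    chi2, p_chi, dof, expected = stats.chi2_contingency(observed)

    # Spearman rank coefficient between two paired variables
    rho, p_rho = stats.spearmanr(sample_a, sample_b)

    # Mann-Whitney U test comparing 2 independent samples
    u_stat, p_u = stats.mannwhitneyu(sample_a, sample_b, alternative="two-sided")

    # Kruskal-Wallis H test comparing 3 independent groups
    # (degrees of freedom = 3 - 1 = 2)
    h_stat, p_h = stats.kruskal(sample_a, sample_b, sample_c)

    print(f"chi2={chi2:.2f}, rho={rho:.2f}, U={u_stat:.1f}, H={h_stat:.2f}")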

Analysis Type | Example | Parametric Procedure | Nonparametric Procedure
Compare means between two distinct/independent groups | Is the mean annual temperature of extreme southern Kano different from the mean annual temperature of extreme northern Kano? | Two-sample t-test | Wilcoxon rank-sum test
Compare two quantitative measurements taken from the same individual | Was there a significant change in soil fertility between a soil to which inorganic fertilizer was applied and the same soil to which organic manure was applied, after one year? | Paired t-test | Wilcoxon signed-rank test
Compare means between three or more distinct/independent groups | If our experiment had three rock types (e.g., igneous, sedimentary and metamorphic), we might want to know whether the mean mineral content at baseline differed among the three groups. | Analysis of variance (ANOVA) | Kruskal-Wallis test
Estimate the degree of association between two quantitative variables | Is excessive deforestation associated with soil erosion? | Pearson coefficient of correlation | Spearman's rank correlation
Source: Adapted and modified from Tanya Hoskin (undated)

4.0 DIFFERENCES BETWEEN PARAMETRIC AND NON-PARAMETRIC TESTS

PARAMETRIC TESTS | NON-PARAMETRIC TESTS
They make numerous or stringent assumptions about parameters. | They do not make numerous or stringent assumptions about parameters.
A parametric test focuses on the mean difference; the equivalent non-parametric test focuses on the difference between medians. | Non-parametric tests focus on order or ranking; data are changed from scores to ranks or signs.
The populations must have the same variances. | The variable under study needs only an underlying continuity.
Parametric statistical procedures rely on assumptions about the shape of the distribution (i.e., they assume a normal distribution) in the underlying population and about the form or parameters (i.e., means and standard deviations) of the assumed distribution. | Nonparametric statistical procedures rely on no or few assumptions about the shape or parameters of the population distribution from which the sample was drawn.
Source: Adapted and modified from Tanya Hoskin (undated), Angela Hebel (2002) and Robson (1994)

5.0 ADVANTAGES OF PARAMETRIC TESTS


Although non-parametric tests require fewer assumptions and can be used on a wider range of data types, parametric tests are preferred because non-parametric tests tend to be less sensitive at detecting differences between samples or an effect of the independent variable on the dependent variable. In other words, the power efficiency of a nonparametric test is lower than that of its parametric counterpart: to detect any given effect at a specified significance level, a larger sample size is required for the non-parametric test than for the parametric test (Robson, 1994). Some people also argue that non-parametric methods are most appropriate when sample sizes are small. However, when the data set is large (e.g., n > 100), the central limit theorem can be applied, so it often makes little sense to use non-parametric statistics.
There are other assumptions specific to individual tests. For example, when comparing the means of two independent samples, the variances of the two distributions should be approximately equal; this is known as the assumption of homogeneity of variance.
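One common way to check homogeneity of variance in practice is Levene's test. The sketch below, using SciPy, is illustrative only; the two samples are hypothetical:

    import numpy as np
    from scipy import stats

    # Two hypothetical samples; sample_b is simulated with a larger spread
    rng = np.random.default_rng(2)
    sample_a = rng.normal(loc=10.0, scale=2.0, size=40)
    sample_b = rng.normal(loc=10.0, scale=4.0, size=40)

    # Levene's test: the null hypothesis is that the variances are equal
    stat, p_value = stats.levene(sample_a, sample_b)
    if p_value < 0.05:
        # Welch's t-test drops the equal-variance assumption
        print("Equal variances doubtful; consider stats.ttest_ind(..., equal_var=False)")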
When the data under analysis meet those assumptions, we should choose parametric tests because they are more powerful than non-parametric tests.

6.0 LIMITATIONS OF PARAMETRIC TESTS


If the data deviate strongly from the assumptions of a parametric procedure, using the parametric
procedure could lead to incorrect conclusions.
One must be aware of the assumptions associated with a parametric procedure and should learn
methods to evaluate the validity of those assumptions.
The parametric assumption of normality is particularly worrisome for small sample sizes (n < 30).
Nonparametric tests are often a good option for these data.

7.0 ADVANTAGES OF NON-PARAMETRIC TESTS


Non-parametric tests do not require the data to be normally distributed. Most psychological data are measured "somewhere between" ordinal and interval levels of measurement; rank-based statistics are well suited to such data, since the rank-order information is the most influential (especially for correlation-type analyses).
Probability statements obtained from most nonparametric statistics are exact probabilities, regardless
of the shape of the population distribution from which the random sample was drawn.
Non-parametric tests accommodate very small samples; with sample sizes as small as N = 6, there is no alternative to using a nonparametric test. Non-parametric tests can treat samples made up of observations from several different populations, and they can treat data which are inherently in ranks as well as data whose seemingly numerical scores have only the strength of ranks. They are also available to treat data which are purely classificatory, and they are easier to learn and apply than parametric tests.

8.0 LIMITATIONS OF NON-PARAMETRIC TESTS


Non-parametric tests lead to a loss of precision and a wasteful use of data. They have lower power and can give a false sense of security, and software for quick, large-scale non-parametric analysis is less widely available. Non-parametric tests are used for testing distributions only, and higher-order interactions are not dealt with.
They are generally less statistically powerful than the analogous parametric procedure when the data
truly are approximately normal. “Less powerful” means that there is a smaller probability that the
procedure will tell us that two variables are associated with each other when they in fact truly are
associated. If you are planning a study and trying to determine how many patients to include, a
nonparametric test will require a slightly larger sample size to have the same power as the
corresponding parametric test.
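This power gap can be made concrete with a small simulation. The sketch below (sample size, effect size and trial count are arbitrary choices for illustration) repeatedly draws two normal samples with a true mean difference and counts how often each test detects it at alpha = 0.05:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(3)
    n, shift, trials, alpha = 20, 1.0, 2000, 0.05
    t_hits = u_hits = 0

    for _ in range(trials):
        a = rng.normal(0.0, 1.0, size=n)
        b = rng.normal(shift, 1.0, size=n)   # true difference in means
        if stats.ttest_ind(a, b).pvalue < alpha:
            t_hits += 1
        if stats.mannwhitneyu(a, b, alternative="two-sided").pvalue < alpha:
            u_hits += 1

    # On truly normal data the t-test should detect the difference slightly
    # more often; the gap is the price paid for dropping assumptions
    print(f"t-test power ~ {t_hits / trials:.2f}")
    print(f"Mann-Whitney power ~ {u_hits / trials:.2f}")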
Another drawback associated with nonparametric tests is that their results are often less easy to
interpret than the results of parametric tests. Many nonparametric tests use rankings of the values in
the data rather than using the actual data. Knowing that the difference in mean ranks between two
groups is five does not really help our intuitive understanding of the data.
In short, nonparametric procedures are useful in many cases and necessary in some, but they are not
a perfect solution.

REFERENCES
Hebel, A., 2002. Parametric Versus Nonparametric Statistics: When to Use Them and Which Is More Powerful? Department of Natural Sciences, University of Maryland Eastern Shore.

Hoskin, T., Undated. Parametric and Nonparametric: Demystifying the Terms. Mayo Clinic Department of Health Sciences Research, Mayo Clinic CTSA BERD Resource.

Neideen, T. and Brasel, K., 2003. Understanding Statistical Tests. Division of Trauma and Critical Care, Department of Surgery, Medical College of Wisconsin, Milwaukee, Wisconsin.

Robson, C., 1994. Experiment, Design and Statistics in Psychology, Chapter 7: Parametric and Nonparametric Tests. [PDF] Available at: http://www.blackwellpublishing.com/robson/pdfs/EDAC07.pdf.
