
The Journal of Foot & Ankle Surgery 49 (2010) 471–474


A 5-year Review of Statistical Methods Presented in The Journal of Foot & Ankle Surgery
Andrew J. Meyr, DPM¹
¹ Assistant Professor, Department of Podiatric Surgery, Temple University School of Podiatric Medicine, Philadelphia, PA

Article info

Level of Evidence: 4
Keywords: analysis, evidence-based medicine, primer, statistics

Abstract

This article presents a review of the statistical analyses used by authors and published in The Journal of Foot & Ankle Surgery from January 2004 to December 2008. Of the 215 articles reviewed, descriptive statistics were used in 84% and comparative statistics in 68%. The most commonly used comparative statistical tests were Student t test (30%), analysis of variance (14%), the Mann-Whitney U (Wilcoxon rank sum) test (13%), chi-squared analyses (11%), and Fisher's exact test (10%). The aim of this investigation was to review the prevalences of various statistical methods used by foot and ankle surgeons, as reported in The Journal of Foot & Ankle Surgery, and to serve as a primer of the most common methods used in the analysis of surgical data and the development of a scientific basis for the practice of foot and ankle surgery.
© 2010 by the American College of Foot and Ankle Surgeons. All rights reserved.

Financial disclosure: None reported.
Conflict of interest: None reported.
Address correspondence to: Andy Meyr, DPM, Temple University School of Podiatric Medicine, Department of Surgery, 8th at Race Street, Philadelphia, PA 19107.
E-mail address: ajmeyr@gmail.com

A wide range of statistical tests is available for the evaluation of the data that are used to determine results, and to formulate conclusions, related to the outcomes of foot and ankle surgical interventions. These results, in many cases, end up being published in full-text, peer-reviewed biomedical literature, as well as in textbooks and abstracts, which have the potential to influence clinical and surgical practice. Unfortunately, the sheer number and variety of statistical methods can be overwhelming to physicians expected to incorporate evidence-based medical and surgical techniques into their practices. Using evidence-based practices not only involves decision making based on the critical selection of appropriate information, but also the critical analysis of statistical procedures and results presented in the medical literature (1–5). This can be a difficult task because many foot and ankle surgeons do not receive even basic training in the use of statistical methods. For better or for worse, many readers assume that authors have correctly used appropriate statistical methods, and they (the readers) rely on peer reviewers and editors to have already critically appraised the reports that ultimately get published. In an effort to further inform surgeons about the appropriate use of statistical methods as presented in the biomedical literature, a review of the reported use of statistical methods as presented in The Journal of Foot & Ankle Surgery was undertaken. The primary aim of this retrospective investigation was to report the prevalences with which various statistical methods were used to report foot and ankle surgical results, and the secondary aim was to provide practitioners with a primer that describes the use of these statistical methods.

Materials and Methods

All of the articles published in The Journal of Foot & Ankle Surgery over the 5-year period from January 2004 to December 2008 were reviewed, with attention paid to the use of statistical methods in the reports. Inclusion criteria entailed all articles designated by the journal as Original Research, Case Report, or Tips, Quips, and Pearls, wherein original data were statistically analyzed. Care was taken to record what was explicitly stated within the text and tables of the published articles, and no inferences were made in regard to the appropriateness or utility of the specific statistical methods that were identified. In other words, no judgment was made as to whether or not the use of a specific statistical test, as described in the article, was suitable for the study design, data type, and distribution. Descriptive statistics were defined as numerical methods used to describe the characteristics of a set of data, but not to distinguish differences between or among groups. Comparative statistics were defined as numerical methods used to evaluate differences between or among groups. The statistical plan for this particular investigation was descriptive, focused solely on describing the prevalences of the different statistical methods identified in the published articles.

Results

During the 5-year observation period, The Journal of Foot & Ankle Surgery published a total of 215 articles that met the criteria for inclusion in this study. In the 215 published articles, a total of 54 different statistical methods were used to analyze data and provide results for clinical interpretation. A summary of these methods is depicted in Table 1. Descriptive statistics were reported in 180 (84%) articles, and focused on the following parameters: average (mean, median, and mode), dispersion (range, standard deviation), prevalence and frequency counts, proportions, incidence rates, and confidence intervals about point estimates. Comparative statistics were

1067-2516/$ - see front matter © 2010 by the American College of Foot and Ankle Surgeons. All rights reserved.
doi:10.1053/j.jfas.2010.05.005
Table 1
Comparative statistical methods used to obtain results reported in The Journal of Foot & Ankle Surgery from January 2004 to December 2008 (n = 215 articles, some reporting multiple different statistical tests)

| Statistical Test | Prevalence |
| --- | --- |
| Student t test | 29.8% (64/215) |
| Analysis of variance | 16.3% (35/215) |
| Mann-Whitney U (Wilcoxon rank sum) | 12.6% (27/215) |
| Chi-square tests | 11.2% (24/215) |
| Fisher's exact test | 10.2% (22/215) |
| Wilcoxon signed-rank test | 8.4% (18/215) |
| Pearson's correlation coefficient | 6.0% (13/215) |
| Kruskal-Wallis test, power analysis | 2.8% (6/215) |
| Spearman rank correlation coefficient | 2.3% (5/215) |
| Intra-class correlation coefficient, odds ratio analysis | 1.9% (4/215) |
| Post hoc Newman-Keuls test | 1.4% (3/215) |
| Binomial test, Bonferroni correction post hoc analysis, Cronbach's alpha reliability score, logistic regression model, Scheffé's multiple comparisons test, Tukey post hoc test | 0.9% (2/215) |
| Anderson-Darling test, backward elimination method, Cochran-Mantel-Haenszel procedure, discriminate functional analysis, Dunn post hoc analysis, Fisher protected least significance difference post hoc test, Hochberg correction, Kappa ranges, Kendall coefficient of concordance, Kolmogorov-Smirnov test, least significant difference multiple comparison method post hoc test, likelihood ratio test, Mantel-Haenszel test, Greenland sensitivity analysis, multiple regression analysis, nonparametric Friedman test, number needed to treat, typical error of measurement, unconditional exact Rohmet-Mansmann test, Wald-Wolfowitz run test | 0.5% (1/215) |
| No comparative statistical test | 31.6% (68/215) |
| Undefined comparative statistical test | 3.2% (7/215) |
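
The prevalences in Table 1 are simple proportions of the 215 reviewed articles. As a minimal illustrative sketch in Python, the percentages for the five most common tests can be reproduced from the counts listed in the table. The names `N_ARTICLES`, `table1_counts`, and `prevalence_percent` are hypothetical and are not part of the original report.

```python
# Counts copied from Table 1; all names here are illustrative assumptions.
N_ARTICLES = 215  # number of articles that met the inclusion criteria

table1_counts = {
    "Student t test": 64,
    "Analysis of variance": 35,
    "Mann-Whitney U (Wilcoxon rank sum)": 27,
    "Chi-square tests": 24,
    "Fisher's exact test": 22,
}


def prevalence_percent(count, total=N_ARTICLES):
    """Return count/total as a percentage rounded to 1 decimal place."""
    return round(100.0 * count / total, 1)


# Prevalence of each test, expressed as a percentage of reviewed articles.
prevalences = {test: prevalence_percent(c) for test, c in table1_counts.items()}

for test, pct in prevalences.items():
    print(f"{test}: {pct}% ({table1_counts[test]}/{N_ARTICLES})")
```

Because a single article could report more than one statistical test, these percentages are not expected to sum to 100%.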

reported in 147 (68%) articles. The most common comparative statistical methods included Student t test in 64 (30%) reports, analysis of variance in 29 (14%) reports, the Mann-Whitney U (Wilcoxon rank sum) test in 27 (13%) reports, chi-squared analyses in 24 (11%) reports, and Fisher's exact test in 22 (10%) reports.

Discussion

The use of statistics can help us answer the most basic question at the heart of every clinical study, namely: Is the therapy responsible for the observed outcome, or is it just a chance observation? Unfortunately, many clinicians view statistical analyses as more of an obstacle to understanding, rather than a useful tool for answering the aforementioned question (Is it the treatment, or just chance, that led to the observed outcome?). Although authors, reviewers, and editors do not expect every reader to understand every statistical method that may be presented in a published report, it is incumbent upon authors, reviewers, and editors to present results clearly and without extraneous terminology, symbolism, and jargon. It is also important to note that it may not be necessary to use statistical methods, or to report the results of statistical analyses, to get a manuscript published. This is particularly true with regard to case reports, or small case series, where useful and interesting information can be conveyed without the need for statistical analyses. For example, in the May–June 2006 issue, Glasoe and Coughlin (6) provided a comprehensive review of the literature and current biomechanical theories of hypermobility and pronation. Their clinical review required neither original data nor statistical analyses. Over the 5-year observation period described in this report, 35 (16%) of the journal articles did not report descriptive or comparative statistical analyses. It is, however, necessary to use statistical methods to properly describe a group of patients, or findings related to patients, and to make comparisons between groups of patients and different treatments. It is also imperative that statistical methods be used to correlate one group of findings or patients with another. To better understand these concepts, let us now look more closely at descriptive and comparative statistical methods.

Descriptive Statistics

Most (84%) of the articles in our review reported descriptive statistical results, most often to describe the characteristics of a patient cohort or series of patients under study. Some specific examples of the descriptive statistical parameters that were mentioned included the familiar mean and median averages, as well as the mode, standard deviation, range (minimum to maximum, as well as interquartile), prevalences and other frequency counts, proportions, incidence rates, and confidence intervals about point estimates. Critical readers are encouraged to appraise descriptive statistics in regard to several points. For example, if descriptive statistics are used to describe the characteristics of a group of patients, such as a series (n < 30 [typical cutoff for single-focus surgical journal]) or a cohort (n ≥ 30) undergoing a particular intervention or diagnostic test, then the reader should use this information to determine whether the results can be generalized to one's own patients (7). Would your patients meet the inclusion criteria used to define the study group? Are the patients that you treat demographically similar to those described in the published article, in regard to independent variables such as age, gender, body mass index, comorbidities, activity level, preoperative radiographic and functional measurements, and any other variable that a reasonable clinician, familiar with the condition in question, would consider important? By the same token, have the authors used reliable methods to achieve valid results that measure important subjective outcomes, such as pain, foot-related quality of life, and function; and, would these measures be important in regard to the patients that you treat?

Critical readers should also understand that the distribution of the data being analyzed plays an important role in deciding which statistical methods are most suitable for analysis of the data in question. The distribution of the data is determined, in the most basic sense, by visualizing a plot of the data points and seeing whether or not the display is symmetrically bell shaped. If so, then the distribution of the data can be considered normal (Gaussian). A number of computations, such as the Kolmogorov-Smirnov test, and other tests of normality that are based on skewness and kurtosis, can also be used to quantitatively determine whether a set of data is distributed in a normal or skewed fashion. Skewed distributions are said to be nonparametric. An appreciation of the distribution of the data, specifically, an understanding of whether the data are distributed in a normal or nonparametric fashion, is necessary to determine which comparative statistical techniques are most appropriate for the analyses. A normally distributed population is one in which approximately 68% of the population is found within 1 standard deviation of the mean, and approximately 95% of the population is found within 2 standard deviations of the mean (8–11). And, a distribution that is
normally distributed is most appropriately reported in terms of the mean and standard deviation, and the 95% confidence interval about the point estimate can be used to further describe the precision of the estimate of the mean. A non-normally distributed population is best reported in terms of the median and range (minimum to maximum, or interquartile) (12).

Although there is not necessarily a functional mathematical formula to determine whether or not a population renders a normal distribution, analysis of the descriptive statistics reported by the authors can provide critical readers with clues as to whether or not they are dealing with a normally distributed population. For example, if a statistical description of a population reports the mean and standard deviation, and the standard deviation is larger than 50% of the mean, then the data are probably not normally distributed and nonparametric methods are likely to be most appropriate. Similarly, when the mean and minimum and maximum range are reported, and 2 standard deviations above or below the mean estimate fall outside of the reported range, the data are not likely to be normally distributed and nonparametric methods are likely to be most appropriate. This is often the case when dealing with small (n < 30) samples, and when analyzing categorical (see the following paragraph) data.

The issue of data type is also important, and has bearing on the statistical methods that are most appropriate for the analyses. Readers need to consider whether the statistical results were obtained from the analysis of interval, ordinal, or nominal data (8–11), as data type is another piece of information used to determine which descriptive and comparative statistical methods are most suitable for the data in question. Interval data describe variables that have equal, constant, and ordered intervals. Common examples in foot and ankle surgery include age, radiographic measurements (degrees, millimeters, densities), item counts (feet, patients, events), time (hours, days, weeks, and so forth) to an event, height, weight, body mass index, force and pressure, and income. Ordinal data describe variables that have ordered intervals, although the intervals need not necessarily be equal or constant. Common examples in foot and ankle surgery include frequencies and ranks, such as never, sometimes, or always, and mild, moderate, and severe. Interval data and ordinal data are also known as quantitative, or numeric data, because one can assign a number to the variable. Interval data that range from zero to infinity, or from negative infinity to positive infinity, are continuous numeric data, and generally enable investigators to show statistically significant differences even with small effect and sample sizes. Nominal data describe variables in which there is no arithmetic relationship between the categories, and common examples of such categorical data in foot and ankle surgery include race, gender, the presence or absence of disease, or outcomes such as success or failure. In comparison with the analysis of continuous numeric data, it requires a larger sample size to show a statistically significant difference when categorical data are analyzed.

Comparative Statistics

Although descriptive statistical methods often serve as the foundation for comparative analyses, it is important to note that use of comparative statistical methods is not always a requirement for biomedical publication. In fact, in the study described in this article, 68 (32%) of the 215 articles published during the observation period did not use comparative statistical methods. In other words, the authors did not attempt to find a statistically significant difference between or among groups. Put another way, they did not attempt to determine a P value, that is, the probability of observing results at least as extreme as those obtained if the null hypothesis (no statistically significant difference between or within the groups) in the study were true. For example, in the March–April 2008 issue, Bouche and Heit (13) published preliminary observations concerning combined plantar plate and hammertoe repair with flexor digitorum longus tendon transfer for sagittal plane instability of the lesser metatarsophalangeal joints. They presented descriptive statistics related to their patient cohort and outcomes, but did not attempt to determine whether or not statistically significant differences existed between the preoperative and postoperative measurements. The decision to use comparative analyses depends on the aims of the particular investigation. As another example, the results that we present in this particular article are descriptive, and not comparative. We could have attempted to determine if the prevalence of the use of Student t test was statistically significantly different from the prevalence of the use of, say, reports of associations in terms of odds ratios; however, we did not feel that such a comparison would be of any clinical use to surgeons, hence it was not an aim of our investigation. Statistical significance does not always imply practical clinical significance, and part of a reader's critical appraisal of published literature should entail identification of information that is clinically useful (14).

Although there are many comparative statistical tests that can be used to evaluate data in medical research, most published articles use just a small sampling of such techniques. In fact, the comparative

Table 2
Data description, study design, and appropriate statistical test

| Description of the Data | Interval Data | Ordinal Data | Nominal Data |
| --- | --- | --- | --- |
| Definition | Quantitative data that have equal, constant, and ordered intervals. This table assumes that interval data describe a normally distributed population.* | Quantitative data that have ordered intervals, but intervals that are not equal. Also includes interval data with a nonparametric distribution.* | Categorical data with arbitrary or nonarithmetic intervals. |
| Tests of the null hypothesis | | | |
| Before and after a single intervention in 1 group of the same individuals | Paired t test | Wilcoxon signed-rank test | McNemar's test |
| Before and after multiple interventions in 1 group of the same individuals | Repeated measures analysis of variance | Friedman statistic | Cochran's Q |
| Before and after intervention in 2 groups consisting of different individuals | Unpaired t test | Mann-Whitney U (Wilcoxon rank-sum test) | Chi-square or Fisher's exact test† |
| Before and after intervention in 3 or more groups consisting of different individuals | Analysis of variance | Kruskal-Wallis test | Chi-square or Fisher's exact test† |
| Correlation, or another association (statistical dependence), between variables | Pearson correlation, linear regression | Spearman rank correlation | Odds ratio, relative risk |

Adapted from Glantz SA, ed. Primer of Biostatistics, ed 6, p 446, McGraw-Hill, New York, 2005; with permission.
* Interval data in this table assume a normally distributed population. If interval data are not normally distributed, then the data should be ranked and calculated on an ordinal scale.
† The Fisher's exact test is typically used in place of the chi-square test when the expected frequency of a group is small (<10).
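
The selection logic summarized in Table 2, together with the two rules of thumb given in the text for spotting non-normal data from reported summary statistics, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the design labels used as dictionary keys, and the example numbers are hypothetical, and the screening rules are rough clues rather than a formal test of normality.

```python
# Rows of Table 2, keyed by hypothetical design labels chosen for this sketch.
TABLE_2 = {
    "one_group_single_intervention": {
        "interval": "Paired t test",
        "ordinal": "Wilcoxon signed-rank test",
        "nominal": "McNemar's test",
    },
    "one_group_multiple_interventions": {
        "interval": "Repeated measures analysis of variance",
        "ordinal": "Friedman statistic",
        "nominal": "Cochran's Q",
    },
    "two_groups": {
        "interval": "Unpaired t test",
        "ordinal": "Mann-Whitney U (Wilcoxon rank-sum test)",
        "nominal": "Chi-square or Fisher's exact test",
    },
    "three_or_more_groups": {
        "interval": "Analysis of variance",
        "ordinal": "Kruskal-Wallis test",
        "nominal": "Chi-square or Fisher's exact test",
    },
    "association": {
        "interval": "Pearson correlation, linear regression",
        "ordinal": "Spearman rank correlation",
        "nominal": "Odds ratio, relative risk",
    },
}


def likely_non_normal(mean, sd, minimum=None, maximum=None):
    """Apply the two rules of thumb described in the text.

    Rule 1: the standard deviation is larger than 50% of the mean.
    Rule 2: mean +/- 2 SD falls outside the reported minimum-maximum range.
    """
    if sd > 0.5 * abs(mean):
        return True
    if minimum is not None and maximum is not None:
        if mean - 2 * sd < minimum or mean + 2 * sd > maximum:
            return True
    return False


def suggest_test(design, data_type, normal=True):
    """Look up the Table 2 cell for a study design and data type."""
    # Per the table footnote, non-normal interval data are ranked and
    # handled on an ordinal scale.
    if data_type == "interval" and not normal:
        data_type = "ordinal"
    return TABLE_2[design][data_type]


# Made-up summary statistics: mean 12, SD 9, so the SD exceeds 50% of the mean.
skewed = likely_non_normal(mean=12.0, sd=9.0)
choice = suggest_test("two_groups", "interval", normal=not skewed)
print(choice)
```

Keeping the table as a plain lookup mirrors how a reader would use it: classify the data type, characterize the distribution, identify the design, and then read off the corresponding cell.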
Table 3
Recommended reading for foot and ankle surgeons interested in further understanding statistical methods used in patient-oriented research

Textbooks
Riegelman RK, Hirsch RP. Studying a Study and Testing a Test: How to Read the Health Science Literature, ed 3, Boston, Little, Brown, and Company, Inc., 1996.
Guyatt G, Rennie D, Meade MO, Cook DJ, eds. Users' Guide to the Medical Literature: A Manual for Evidence-Based Clinical Practice, ed 2, New York, McGraw-Hill, 2008.
Glantz SA, ed. Primer of Biostatistics, McGraw-Hill, New York, 2005.
Friedland DJ, Go AS, Davoren JB, Shlipak MG, Bent SW, Subak LL, Mendelson T, eds. Evidence-Based Medicine. A Framework for Clinical Practice, New York, Lange Medical Books/McGraw-Hill, New York, 1998.
Lang TA, Secic M, eds. How to Report Statistics in Medicine. Annotated Guidelines for Authors, Editors, and Reviewers, ed 2, Philadelphia, American College of Physicians, 2006.
Garb JL, ed. Understanding Medical Research: a Practitioner's Guide, Boston, Little, Brown, and Company, Inc., 1996.
Daniel WW, ed. Biostatistics: A Foundation for Analysis in the Health Sciences, ed 8, Hoboken (NJ), John Wiley & Sons, Inc., 2005.
Bailar JC, Mosteller F, eds. Medical Uses of Statistics, ed 2, Boston, NEJM Books, 1992.

Peer-reviewed articles
Bhandari M, Guyatt GH, Swiontkowski MF. User's guide to the orthopaedic literature: how to use an article about a surgical therapy. J Bone Joint Surg Am 83-A(6):916–926, 2001.
Bhandari M, Guyatt GH, Swiontkowski MF. User's guide to the orthopaedic literature: how to use an article about prognosis. J Bone Joint Surg Am 83-A(10):1555–1564, 2001.
Bhandari M, Guyatt GH, Montori V, Devereaux PJ, Swiontkowski MF. User's guide to the orthopaedic literature: how to use a systemic literature review. J Bone Joint Surg Am 84-A(9):1672–1682, 2002.
Bhandari M, Morrow F, Kulkarni AV, Tornetta P 3rd. Meta-analyses in orthopedic surgery. A systemic review of their methodologies. J Bone Joint Surg Am 83-A(1):15–24, 2001.
Bhandari M, Giannoudis PV. Evidence-based medicine: what it is and what it is not. Injury 37(4):302–306, 2006.
Szabo RM. Principles of epidemiology for the orthopaedic surgeon. J Bone Joint Surg Am 80(1):111–120, 1998.
Kocher MS, Zurakowski D. Clinical epidemiology and biostatistics: a primer for orthopedic surgeons. J Bone Joint Surg Am 86-A(3):607–620, 2004.
Boutron I, Ravaud P, Nizard R. The design and assessment of prospective randomized, controlled trials in orthopaedic surgery. J Bone Joint Surg Br 89(7):858–863, 2007.

statistical methods that we identified with a prevalence of more than 2% in the publications included in this investigation are depicted in Table 2. This table is meant to summarize the basic question underlying the aim of this report, namely, which comparative statistical test best suits the data and design of the study in question? Three variables are used to determine the most suitable statistical test, as depicted in Table 2. First, the data need to be classified as interval, ordinal, or nominal. Second, the distribution of the data needs to be characterized as normal (parametric) or nonparametric (non-normal). To be most appropriately analyzed, non-normally distributed interval data should be ranked and calculated on an ordinal scale. Third, the aim of the statistical test, as befits the design of the investigation, needs to be characterized, and 5 of the most common designs used in foot and ankle surgical research are represented in Table 2. These include the following comparative statistical tests: (1) before and after an intervention in a single group (paired, or linked data), (2) before and after multiple interventions in a single group, (3) before and after an intervention in 2 different groups, (4) before and after an intervention in 3 or more different groups, and (5) when examining correlations and other associations (statistical dependence) between 2 measured variables. The first 4 categories of tests depicted in Table 2 are tests of the null hypothesis, and the last category includes correlation and other analyses, such as regression.

In conclusion, the intention of this report was to provide basic information about the use of statistical analyses in the foot and ankle surgery literature, as published in The Journal of Foot & Ankle Surgery. The narrow scope of this investigation was intentional, and limits the ability to make any conclusions beyond simply reporting the prevalences of the various statistical methods identified in the group of articles that were reviewed. In addition to reporting the prevalences, it is hoped that this article will raise awareness of the importance of understanding data type and distribution, and of using statistical tests that suit the data in question. As our profession transitions to an era of evidence-based interventions, it will be increasingly important for clinicians to hone skills in areas, such as clinical investigation, where they may not have received formal training. Interested readers are referred to the textbooks and articles listed in Table 3, which are recommended for foot and ankle surgeons interested in further understanding the statistical methods used in patient-oriented research.

Acknowledgment

The author thanks Dr Michael Sheridan, consulting epidemiologist and biostatistician at the Inova Fairfax Hospital in Falls Church, Virginia, for help in preparation of this manuscript.

References

1. Friedland DJ, Go AS, Davoren JB, Shlipak MG, Bent SW, Subak LL, Mendelson T, eds. Evidence-Based Medicine. A Framework for Clinical Practice, Lange Medical Books/McGraw-Hill, New York, New York, 1998.
2. Guyatt G, Rennie D, Meade MO, Cook DJ, eds. Users' Guide to the Medical Literature: A Manual for Evidence-Based Clinical Practice, ed 2, McGraw-Hill, New York, 2008.
3. Riegelman RK, Hirsch RP. Studying a Study and Testing a Test: How to Read the Health Science Literature, ed 3, Little, Brown, and Company, Inc., Boston, 1996.
4. Garb JL, ed. Understanding Medical Research: a Practitioner's Guide, Little, Brown, and Company, Inc., Boston, 1996.
5. Bailar JC, Mosteller F, eds. Medical Uses of Statistics, ed 2, NEJM Books, Boston, 1992.
6. Glasoe WM, Coughlin MJ. A critical analysis of Dudley Morton's concept of disordered foot function. J Foot Ankle Surg 45(3):147–155, 2006.
7. Bhandari M, Guyatt GH, Swiontkowski MF. User's guide to the orthopaedic literature: how to use an article about a surgical therapy. J Bone Joint Surg Am 83-A(6):916–926, 2001.
8. Glantz SA, ed. Primer of Biostatistics, McGraw-Hill, New York, 2005.
9. Daniel WW, ed. Biostatistics: A Foundation for Analysis in the Health Sciences, ed 8, John Wiley & Sons, Inc., Hoboken (NJ), 2005.
10. Szabo RM. Principles of epidemiology for the orthopaedic surgeon. J Bone Joint Surg Am 80-A(1):111–120, 1998.
11. Kocher MS, Zurakowski D. Clinical epidemiology and biostatistics: a primer for orthopedic surgeons. J Bone Joint Surg Am 86-A(3):607–620, 2004.
12. Lang TA, Secic M, eds. How to Report Statistics in Medicine. Annotated Guidelines for Authors, Editors, and Reviewers, ed 2, American College of Physicians, Philadelphia, 2006.
13. Bouche RT, Heit EJ. Combined plantar plate and hammertoe repair with flexor digitorum longus tendon transfer for chronic, severe sagittal plane instability of the lesser metatarsophalangeal joints: preliminary observations. J Foot Ankle Surg 47(2):125–137, 2008.
14. Malay DS. Some thoughts about data type, distribution, and statistical significance. J Foot Ankle Surg 45(6):357–359, 2006.
