
ERRORS IN STATISTICS

ERROR
CLASSIFICATION
TYPES OF ERRORS - BMJ, 1977
LITERATURE REVIEW
DAHLBERG'S FORMULA
STRATEGIES TO OVERCOME ERRORS
ERROR occurs when results tend to differ from the true values. A study with small error is
said to have high accuracy.

Elenhaas et al stated that inappropriate application of statistical methods to medical research
data is a common error found in the literature, which calls into question the validity of the
conclusions reached.

The most common types of error in scientific methods are the casual and the systematic error
(Houston WJ. Am J Orthod. 1983).
The casual error, also known as random error, arises from difficulty and/or
inaccuracy in identifying or defining certain points.
The systematic error, also known as non-random error, occurs when a given
measurement is consistently under- or overestimated.
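
The distinction between the two error types can be illustrated with a small sketch. It assumes, purely for illustration, that the true value of a measurement is known and that the same quantity is measured several times; the readings and the function name `describe_errors` are hypothetical and do not come from Houston's paper. The mean offset from the true value stands in for the systematic error, and the scatter of the readings stands in for the casual (random) error.

```python
from statistics import mean, stdev


def describe_errors(measurements, true_value):
    """Return (bias, spread) for repeated measurements of one quantity.

    bias   -> systematic error: how far the readings sit, on average,
              above or below the true value
    spread -> random error: how much the readings scatter around
              their own mean (sample standard deviation)
    """
    bias = mean(measurements) - true_value
    spread = stdev(measurements)
    return bias, spread


# Hypothetical repeated readings of an angle whose true value is taken to be 80.0
readings = [82.0, 81.0, 83.0, 82.0, 82.0]
bias, spread = describe_errors(readings, true_value=80.0)
print(f"systematic error (bias): {bias:.2f}")
print(f"random error (spread):   {spread:.2f}")
```

In this made-up example the readings overestimate the true value by about 2 units on average, while the scatter between readings is well under 1 unit, so the systematic component dominates.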

Errors are introduced into data by factors that can be controlled.
These can be broadly grouped into 3 categories:
1. OBSERVER ERRORS: These can be subjective or objective.
Subjective error arises from faulty questioning of individuals by an untrained
investigator.
Objective error arises from faulty recording of data without following a standard
recording procedure.
2. INSTRUMENTAL ERROR: Arises from the use of faulty instruments for recording
the data.
3. ERRORS IN SAMPLING: 2 types of error occur:
a) Sampling errors
i. Due to faulty sampling design
ii. Due to small sample size
b) Non-sampling errors
i. Coverage error: Due to non-response or non-cooperation of
individuals
ii. Observational error: Due to interviewer bias, imperfect
experimental technique, or both
iii. Processing error: Due to errors in statistical analysis

Errors related to limitations of tests of significance:

There are 4 possibilities in testing a hypothesis:

1. Hypothesis true and accepted: correct decision
2. Hypothesis true but rejected: TYPE-I ERROR (α-error / rejection error); related
to the level of significance, which is predetermined (5%)
3. Hypothesis false but accepted: TYPE-II ERROR (β-error); not predetermined
4. Hypothesis false and rejected: correct decision
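
The two error types listed above are usually written as conditional probabilities. The notation below is the standard textbook form rather than something taken from the source; H_0 denotes the hypothesis being tested, and the power of the test is included only because it follows directly from the Type II error.

```latex
\begin{aligned}
\alpha &= P(\text{reject } H_0 \mid H_0 \text{ is true})  && \text{(Type I error)} \\
\beta  &= P(\text{accept } H_0 \mid H_0 \text{ is false}) && \text{(Type II error)} \\
\text{power} &= 1 - \beta
\end{aligned}
```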

As early as 1979, the British Medical Journal introduced a system for statistical review of
submitted papers, the aim of which was to assess the statistical acceptability of papers already
denoted as acceptable by referees.
The work of Gardner et al (Gardner MJ et al, British Medical Journal, 1983) suggests
that the assessment scheme made an important contribution to excluding articles of poor
quality from appearing in the British Medical Journal. At the time much concern arose
regarding the statistical content of published papers because about half of all published
papers contained statistical errors.

A review of articles in the British Medical Journal (Gore, Jones, Rytter, British Medical
Journal, 1977) revealed five types of errors or abuses of statistics:
1. Inadequate description of data such as reporting a mean without the associated standard
deviation
2. A disregard for statistical independence, which results when multiple observations on one
subject are treated as though they represented single observations from distinct subjects
3. Errors related to randomization, in which failure to randomize is not justified or explained
in such a way that the reader can assume that biased allocation has occurred
4. Errors with Student's t-test, which is often used without examining the assumptions of this
statistical model, including normality of the distribution, independent observations, and equal
variances in the two samples
5. Errors with χ² tests, such as failure to use a continuity correction or Fisher's exact test
when cell sizes are small, or applying the test to matched data

Out of 77 papers studied, 15 included no statistical analysis; of the remaining 62 reports,
52% included at least one error from the five categories.
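
Item 5 in the list above names Fisher's exact test as the alternative when cell sizes are small. The following is a minimal sketch of a two-sided Fisher's exact test for a 2x2 table, written with the standard library only. The function name and the example counts are hypothetical, and the two-sided rule used here (summing every table that is no more probable than the observed one) is one common convention, not necessarily the one the reviewed papers should have followed.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)   # smallest feasible top-left count
    highest = min(row1, col1)      # largest feasible top-left count
    # Add up all tables that are no more probable than the observed one;
    # the small tolerance guards against floating-point ties
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical small table: 3 of 4 respond in one group, 1 of 4 in the other
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"two-sided p = {p_value:.4f}")  # 34/70, about 0.4857
```

With counts this small the expected cell frequencies are all 2, which is the kind of situation in which an uncorrected χ² test is unreliable.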

Avram et al (Avram MJ et al, Anesth Analg, 1985) conducted a study to assess the statistical
analyses used in two anesthesia journals, Anesthesia and Analgesia and Anesthesiology.
Descriptive statistical errors included failure of the article to identify the statistics
used, the use of interval statistics for ordinal data, and the use of the standard error rather than
the standard deviation to describe dispersion.
With respect to inferential statistical analysis, specific tests were unidentified;
parametric tests were used on ordinal data; the frequencies in the cells of the contingency
table were inadequate for χ² analysis; inappropriate or no tests were used as a follow-up to
analysis of variance; or tests for independent samples were used on related data, or vice
versa.

MacArthur and Jackson (MacArthur RD, Jackson GG, Journal of Infectious Diseases,
1984) evaluated 114 articles published in the Journal of Infectious Diseases for the
occurrence of eight common statistical errors.

These were the following:


1. Failure to include a control group: When no control group is included, experimental effects
cannot adequately be assessed because no basis for comparison is present.
2. Inadequate information or improper methodology concerning randomization or
assignment to groups.
3. Failure to list the statistical tests used: To interpret results, the reader must know which
methods were used. An indication of significance alone is not sufficient for interpretation.
4. Failure to completely summarize statistical results: The meaning of a p value can be
interpreted correctly only when the name of the test is indicated, along with the values of the
test statistic and degrees of freedom or sample size.
5. Use of the standard error (SE) instead of the standard deviation (SD): The SD describes
the variability in the sample, whereas the SE quantifies the certainty with which
a sample mean estimates the true population mean. The SE is smaller than the SD by a factor
of the square root of the sample size and may erroneously make the data look more uniform,
which may be part of the reason for its use.
6. The inappropriate use of parametric statistical tests: Data for parametric tests must satisfy
certain assumptions such as independence of the samples, random selection of samples,
equal variances among the samples, and data measured on at least an interval scale. When
any of these criteria are not met, nonparametric tests must be used to adequately describe
the population.
7. Failure to include a multiple comparison correction: When a significant difference is found
among more than two groups, post-hoc comparisons using t-tests are commonly
applied after analysis of variance to locate mean differences among pairs within the groups.
Without appropriate correction, there is a greatly increased probability of erroneously
concluding that a difference among pairs exists when one does not (Type I error).
The Bonferroni correction (Miller RG Jr, Simultaneous statistical inference, Ed-2,
New York, 1981) may be used to correct this type of error.
Other, less conservative tests that may be used are the Newman-Keuls (Miller RG Jr,
Simultaneous statistical inference, Ed-2, New York, 1981), Dunnett (Dunnett CW, J Am
Stat Assoc, 1955) or Duncan (Duncan DB, t tests and intervals for comparisons suggested by
the data, Biometrics, 1975) tests.
8. The inappropriate use of statistics to test for differences other than those that the
experiment was designed to detect; specifically, after-the-fact "shopping" for statistically
significant differences: One hazard of shopping is that even if the original groups were
randomly assigned, that randomization is lost once the groups are split into subgroups.
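
The inflation of the Type I error described in item 7 can be made concrete with a short calculation. The sketch below assumes, as a simplification, that the pairwise comparisons are independent (pairwise t-tests among the same groups are not strictly independent, so the figures are approximate). The function names, the choice of four groups, and the 5% level are illustrative assumptions.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive in k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k


def bonferroni_alpha(alpha, k):
    """Per-comparison significance level under the Bonferroni correction."""
    return alpha / k


groups = 4
k = groups * (groups - 1) // 2  # 6 pairwise comparisons among 4 groups

uncorrected = familywise_error_rate(0.05, k)
corrected = familywise_error_rate(bonferroni_alpha(0.05, k), k)

print(f"pairwise comparisons:        {k}")
print(f"uncorrected family-wise rate: {uncorrected:.3f}")  # about 0.265
print(f"Bonferroni per-test level:    {bonferroni_alpha(0.05, k):.4f}")
print(f"corrected family-wise rate:   {corrected:.3f}")    # just under 0.05
```

Under these assumptions, six uncorrected comparisons at the 5% level carry roughly a one-in-four chance of at least one spurious "significant" pair, whereas testing each pair at 0.05/6 keeps the overall rate below 5%.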

The most frequently encountered error in the surveyed literature was the failure to
completely summarize the statistical results. Of the articles surveyed, 95% failed to
mention the exact values of the test statistics or associated degrees of freedom.
Approximately 25% of the articles did not name the tests used; 90% of applicable articles
included no correction for multiple comparisons after the analysis of variance; and
about 30% misused the SE.
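
Because misuse of the SE was so common in the surveyed articles, a small numerical comparison of the two statistics may help. The sketch below uses the sample SD (n - 1 denominator) and the usual SE of the mean, SD divided by the square root of n; the data values and the function name `sd_and_se` are hypothetical.

```python
from math import sqrt
from statistics import stdev


def sd_and_se(values):
    """Return (SD, SE) for a sample: SD describes spread, SE the precision of the mean."""
    sd = stdev(values)            # sample standard deviation (n - 1 denominator)
    se = sd / sqrt(len(values))   # standard error of the mean
    return sd, se


# Hypothetical sample of 9 measurements
sample = [4.0, 6.0, 5.0, 7.0, 3.0, 5.0, 6.0, 4.0, 5.0]
sd, se = sd_and_se(sample)
print(f"SD = {sd:.3f}")
print(f"SE = {se:.3f}")
```

With n = 9 the SE is one third of the SD, so reporting "mean ± SE" makes the same data appear three times less variable than "mean ± SD" would.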

In cephalometry, one of the methods most commonly used to estimate the magnitude of
random errors is Dahlberg's formula (Dahlberg G, New York: Interscience; 1940), which he
used to calculate the method error.

$$\text{Method error} = \sqrt{\frac{\sum d^2}{2n}}$$

where d = difference between the two measurements of a pair
n = number of subjects
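
A minimal sketch of the calculation follows, taking Dahlberg's formula in its usual form: the square root of the sum of squared differences between paired measurements, divided by twice the number of subjects. The function name and the two sets of tracings are hypothetical values chosen only to show the arithmetic.

```python
from math import sqrt


def dahlberg_error(first, second):
    """Method error: sqrt(sum(d^2) / (2n)) for n paired repeat measurements."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty series of equal length")
    sum_sq_diff = sum((x - y) ** 2 for x, y in zip(first, second))
    return sqrt(sum_sq_diff / (2 * len(first)))


# Hypothetical angle measured twice on the same 4 subjects
first_tracing = [80.0, 82.0, 79.0, 85.0]
second_tracing = [81.0, 82.0, 78.0, 83.0]

# Differences are -1, 0, 1, 2, so sum(d^2) = 6 and 2n = 8
method_error = dahlberg_error(first_tracing, second_tracing)
print(f"method error = {method_error:.3f}")  # sqrt(0.75), about 0.866
```

The result is in the same units as the measurements, which makes it easy to judge against the size of the differences a study is trying to detect.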

STRATEGIES TO ELIMINATE ERRORS:

1. Controls
2. Randomization or random allocation
3. Crossover design
4. Placebo
5. Blinding technique: single or double blinding

CONCLUSION:

Accuracy of statistics might not be of much significance in clinical practice, but when it
comes to pursuing newer techniques and principles, scientific substantiation can be obtained
only through statistical verification and interpretation. Hence, statistics with minimal or no
error are of prime importance.
