
STATISTICAL ASSOCIATION
AND CAUSALITY

CAUSALITY AT DIFFERENT
LEVELS

Molecular cause
Physiological cause
Personal cause
Social cause, etc.
We will discuss cause from the
perspective of what aspect (or aspects) of
the environment, broadly defined, if
removed or controlled, would reduce the
burden of disease.

CAUSAL INFERENCE
1. DETERMINISTIC CAUSALITY

Many expect a cause to be very
closely related to an effect, as in
necessary and sufficient causes:

Necessary cause: The cause must be
present for the outcome to happen.
However, the cause can be present
without the outcome happening.

Sufficient cause: If the cause is
present the outcome must occur.
However, the outcome can occur
without the cause being present.

EXAMPLE OF NECESSARY
CAUSE
If outcomes are defined in terms of causes,
the cause is necessary by definition. For
example, the tubercle bacillus is necessary
for tuberculosis by the definition of
tuberculosis. Etiologic (as contrasted to
manifestational) classification of diseases
often produces necessary causes. Hepatitis
B once looked to be a necessary cause of
hepatocellular carcinoma. But now we see
that Hepatitis C may produce it too.

EXAMPLE OF SUFFICIENT CAUSE


Sufficient causes are very rare in
medicine, because it is exceptional
that one exposure is by itself enough
to cause disease. Usually exposures
are much more common than the
diseases they cause. Only about 5% of
people who smoke get lung cancer.
There are exceptions: the measles virus
virtually always causes people to get
clinical measles, and rabies infection
is always fatal.

EXAMPLE OF NECESSARY
AND SUFFICIENT CAUSE
HIV could once be classified as both
the necessary and sufficient cause of
AIDS.
Now, however, it may be that one can
be infected with HIV and never get
AIDS, either because of rare genetic
protection, or because of treatment
of the virus.

NECESSARY CAUSE
(e.g. the tubercle bacillus and tuberculosis)

                          HAS DISEASE   FREE OF DISEASE
HAS EXPOSURE              YES           YES
DOES NOT HAVE EXPOSURE    NO            YES

SUFFICIENT CAUSE
(e.g. rabies infection and death)

                          HAS DISEASE   FREE OF DISEASE
HAS EXPOSURE              YES           NO
DOES NOT HAVE EXPOSURE    YES           YES

BOTH NECESSARY AND SUFFICIENT
(e.g. HIV and AIDS in the past)

                          HAS DISEASE   FREE OF DISEASE
HAS EXPOSURE              ALL           NONE
DOES NOT HAVE EXPOSURE    NONE          ALL
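
The three tables above can be read as rules on the cells of a 2x2 table: a cause is necessary when no unexposed person has the disease, and sufficient when no exposed person is free of it. The following is a minimal sketch of that reading, not a standard epidemiological routine; the function name `classify_cause` and the counts used with it are hypothetical.

```python
def classify_cause(exposed_cases, exposed_healthy, unexposed_cases, unexposed_healthy):
    """Classify an exposure from the four cells of a 2x2 table.

    Hypothetical helper for illustration only. Real data rarely contain
    exact zeros, which is why deterministic causes are rare in practice.
    """
    # Necessary: the disease never occurs without the exposure.
    necessary = unexposed_cases == 0
    # Sufficient: the exposure never occurs without the disease.
    sufficient = exposed_healthy == 0
    # unexposed_healthy is accepted only to complete the table;
    # it does not affect the classification.
    if necessary and sufficient:
        return "necessary and sufficient"
    if necessary:
        return "necessary only"
    if sufficient:
        return "sufficient only"
    return "neither necessary nor sufficient"


# Made-up counts matching the pattern of each table above.
print(classify_cause(30, 70, 0, 100))   # necessary-cause pattern
print(classify_cause(40, 0, 10, 150))   # sufficient-cause pattern
print(classify_cause(50, 0, 0, 150))    # both
```

Most exposures studied in epidemiology fall into the fourth category, which is what motivates the probabilistic view of causality.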

Koch's postulates were an example of
deterministic causality. To prove that an
organism causes a disease, Koch required
that:
1. The organism must be isolated in every
case of the disease (i.e. be necessary)
2. The organism must be grown in pure
culture
3. The organism must always cause the
disease when inoculated into an
experimental animal (i.e. be sufficient)
4. The organism must then be recovered from
the experimental animal and identified.

2. PROBABILISTIC CAUSALITY
In epidemiology, most causes have
much weaker relationships to effects.
For example, high cholesterol may lead
to heart disease, but it need not
(insufficient), and heart disease does not
require a high cholesterol
(unnecessary). The emphasis on
multiple causes in probabilistic causality
leads to expressions such as the "web of
causation" or the "chain of causation".

The measures of association (odds
ratio, risk ratio, or correlation
coefficient) and of public health
impact (e.g. population attributable
risk, PAR) are related to the strength
of the causal relationship. The higher
the odds ratio, the closer the cause
is to being necessary and sufficient.
A PAR of 100% means that the cause
is necessary: all cases would be
prevented if the cause were removed.
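
As a rough illustration of how these measures come out of a 2x2 table, the sketch below computes the risk ratio, the odds ratio, and the PAR as a percentage. The slides give no formula, so the PAR definition used here, (overall risk minus risk in the unexposed) divided by overall risk, is an assumption, as are the function name and the counts.

```python
def association_measures(a, b, c, d):
    """Return (risk_ratio, odds_ratio, par_percent) for a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    Assumes cohort-style data, so that risks can be computed directly.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    risk_total = (a + c) / (a + b + c + d)

    # A zero denominator means the measure is unbounded (infinite).
    risk_ratio = risk_exposed / risk_unexposed if risk_unexposed > 0 else float("inf")
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")

    # Share of all cases that would not occur if everyone had the
    # risk of the unexposed group (assumed definition of PAR%).
    par_percent = 100 * (risk_total - risk_unexposed) / risk_total
    return risk_ratio, odds_ratio, par_percent


# Hypothetical counts: a moderate association.
print(association_measures(30, 70, 10, 90))
# Hypothetical counts: a necessary cause (no unexposed cases), PAR is 100%.
print(association_measures(30, 70, 0, 100))
```

In the second call the unexposed group has no cases, so the PAR reaches 100% and both ratios are unbounded, which matches the statement that a PAR of 100% marks a necessary cause.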

One pragmatic definition of a cause
(or a determinant) of a disease is an
exposure which produces a regular
and predictable change in the risk of
the disease.
Thus the increase of lung cancer in
women, and its magnitude, were
predicted based on information on
their cigarette smoking habits.

ASSOCIATION VS CAUSATION
To decide whether exposure A
causes disease B, we must first
find out whether the two
variables are associated, i.e.
whether one is found more
commonly in the presence of the
other.

Almost all of statistics is an attempt to
discover whether two variables are
associated, and if so, how strongly, and
whether chance can explain the
observed association. Statistical tests are
primarily designed to assess the role of
chance in that association. A p value
tells us only how likely an association
at least as strong as the one observed
would be if chance alone were at work.
Therefore, statistical analysis alone
cannot constitute proof of a causal
relationship.
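
To make the role of the p value concrete, here is a minimal sketch of a Pearson chi-square test (without continuity correction) on a 2x2 table, using only the Python standard library. The choice of test, the function name, and the counts are assumptions made for illustration; the slides do not prescribe a particular test.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic and p value for a 2x2 table.

    a: exposed with disease      b: exposed without disease
    c: unexposed with disease    d: unexposed without disease
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, the chi-square upper tail probability
    # equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: disease in 30 of 100 exposed vs 10 of 100 unexposed.
chi2, p = chi_square_2x2(30, 70, 10, 90)
print(f"chi-square = {chi2:.2f}, p = {p:.5f}")
```

A small p value here says only that chance is an unlikely explanation for the association. It says nothing about confounding, bias, or the direction of cause and effect.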

MAKING CAUSAL
INFERENCES
The use of causal criteria in
making inferences from data.

The process of weighing
evidence at the level of the
individual is clinical judgment
(e.g. should this patient with a
urinary tract infection be treated
with Ampicillin or
Sulfisoxazole?)

The process of weighing
evidence at the level of the
population is epidemiological
judgment (e.g. should middle-aged
men take aspirin daily to
prevent heart attacks?)

When looking at data from
epidemiological studies, we often use
causal criteria to assist in weighing the
evidence. The most commonly used are
the following criteria, derived initially
from the work of the British statistician
Austin Bradford Hill, and later further
developed by the U.S. Surgeon General's
Office in its 1964 report on smoking and
cancer.

Causal criteria are usually applied
to a group of articles on a topic,
though, in modified form, they can
be applied to an individual paper.

CAUSAL CRITERIA
Epidemiologists have for many years
used five criteria to assess causality
in exposure-outcome relationships:

1. STRENGTH (including dose-response)
2. TIME-ORDER
3. SPECIFICITY
4. COHERENCE
5. CONSISTENCY

STRENGTH
Is the association strong? Heavy
smoking is associated with a twentyfold higher rate of lung cancer, and a
doubled rate of coronary heart
disease. The association of smoking
with lung cancer is therefore
stronger than its association with
heart disease. The stronger the
association the more likely it is to be
truly causal.

STRENGTH
One reason strength matters is that a
confounding variable can fully explain
an association only if it is even more
strongly associated with the outcome
than the exposure appears to be. The
larger the relative risk observed, the
less likely it is that a confounder with
an even larger relative risk is lurking
in the background.
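
This argument can be checked with a little arithmetic. Suppose the exposure has no effect at all, and a single binary confounder multiplies disease risk by some factor. The sketch below computes the risk ratio that would then be observed for the exposure. The single-binary-confounder model, the function name, and the numbers are simplifying assumptions for illustration.

```python
def spurious_risk_ratio(confounder_rr, prev_exposed, prev_unexposed):
    """Risk ratio produced purely by confounding.

    confounder_rr:  risk ratio of the confounder for the disease
    prev_exposed:   proportion of exposed people who have the confounder
    prev_unexposed: proportion of unexposed people who have the confounder
    Assumes the exposure itself has no effect on the disease.
    """
    risk_exposed = 1 + prev_exposed * (confounder_rr - 1)
    risk_unexposed = 1 + prev_unexposed * (confounder_rr - 1)
    return risk_exposed / risk_unexposed


# A confounder that doubles risk, very unevenly distributed (80% vs 20%),
# yields only a modest spurious risk ratio.
print(spurious_risk_ratio(2, 0.8, 0.2))

# Even in the most extreme imbalance (100% vs 0%), the spurious risk
# ratio can only equal the confounder's own risk ratio, never exceed it.
print(spurious_risk_ratio(20, 1.0, 0.0))
```

Under these assumptions, explaining away a twentyfold association would take a confounder with at least a twentyfold effect of its own, which is rarely plausible.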

EXAMPLE:
The strength of the association
was the key evidence that folic acid
supplements protect against neural
tube defects, in spite of
less-than-ideal study design.

Dose-response relationship
If a regular gradient of disease risk is
found to parallel a gradient in
exposure (e.g. light smokers get lung
cancer at a rate intermediate between
non-smokers and heavy smokers) the
likelihood of a causal relationship is
enhanced. Dose-response is generally
thought of as a sub-category of
strength.

Dose-response relationship
However, dose-response is not
relevant to all exposure-disease
relationships, because disease
sometimes only occurs above a fixed
threshold of exposure, and thus a
dose-response relationship need not
be seen. (Remember also that
misclassification between adjacent
exposure classes can easily produce an
apparent dose-response relationship.)

EXAMPLE:
For each increase in amount
of cigarettes smoked, the risk
of lung cancer rises.
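
A simple way to look for such a gradient is to compute the disease rate in each ordered exposure category and check that it rises at every step. The sketch below does this; the function name and the counts are invented for illustration and are not taken from any study.

```python
def dose_response_gradient(cases, persons):
    """Return per-category rates and whether they rise monotonically.

    cases and persons are lists ordered from lowest to highest exposure.
    """
    rates = [c / n for c, n in zip(cases, persons)]
    rising = all(low < high for low, high in zip(rates, rates[1:]))
    return rates, rising


# Hypothetical counts for non-smokers, light smokers, heavy smokers.
rates, rising = dose_response_gradient([5, 20, 60], [1000, 1000, 1000])
print(rates, rising)
```

A strictly rising pattern supports a dose-response relationship, but a flat pattern below a threshold does not rule out causation.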

TIME ORDER
This very important criterion simply
states that one must know for sure
that the cause preceded the effect in
time. Sometimes this is hard to
know, especially in cross-sectional
studies.

EXAMPLE 1:
Studies have found an inverse
relationship between a person's
blood pressure and a person's serum
calcium. But which is the cause and
which the effect?

Time-order can also be uncertain
when disease has a long latent
period, and when the exposure
itself acts over a long period.

EXAMPLE 2:
Low serum cholesterol has been
linked to increased risk of colon
cancer in prospective cohort studies.
But is a low serum cholesterol a
cause of colon cancer, or does an
early phase of colon cancer cause
low cholesterol levels?

SPECIFICITY
The case for causality is
strengthened if an exposure is
associated with a specific disease,
and not with a whole variety of
diseases.

EXAMPLE 1.
Asbestos causes a specific lung
disease, asbestosis, distinguishable
from many other lung diseases. But
low-level lead exposure is associated with
lower IQ rather than a distinguishable
brain syndrome. Thus lead is more
uncertain as a cause because of
possible confounding with other causes
of this rather non-specific effect, low IQ
(e.g. socioeconomic status, SES).

The case for causality is also
strengthened if a disease is
associated with a specific exposure,
and not with a whole variety of
exposures.

EXAMPLE 2: Which disease is benzene
more likely to be a cause of?

Significant adjusted ORs for the association
of two diseases with five exposures

Exposure                 Disease X   Disease Y
1. smoking               2.1         1.1
2. low SES               4.2         0.9
3. male gender           2.3         1.2
4. works with benzene    3.0         3.0
5. factory employee      1.5         0.8

IMPORTANT PRINCIPLE:
Specificity is enhanced by
hypothesis formulation.
Pre-specification is our major
protection against chance
findings.

COHERENCE
Does the association fit with
other biological knowledge?
One must look for support in the
laboratory, or from other aspects
of the biology of the condition.

EXAMPLE:
Presence of a serological marker of
hepatitis B infection is associated
(in Asia at least) with greatly elevated
rates of liver cancer. That Hepatitis B
infection is a true cause of liver
cancer is also supported by the
finding of the viral genome in many
liver cancers.

By contrast, reserpine (an
antihypertensive drug) was thought to be a
cause of breast cancer based on some
studies done in the early 1970s. But
there was no other supporting biological
information, nor any truly plausible
biological mechanism. Subsequent
larger studies failed to support this
association. The same applies to
electromagnetic fields (EMF) and
carcinogenesis.

CONSISTENCY
Is the same association found in
many studies? Hundreds of studies
have shown that smoking and lung
cancer are associated, and no
serious study has failed to show this
association. But whether oral
contraceptives are associated with
breast cancer is uncertain because
some studies show an association,
but others do not.

Meta-analysis is a formal method
to assess the consistency of the
measure of association across
many studies.
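
As one hedged illustration of what such a method can look like, the sketch below pools odds ratios from several studies with fixed-effect inverse-variance weights and computes Cochran's Q, a common statistic for how much the studies disagree. This is only one of several meta-analytic approaches; the function name and the study values are hypothetical.

```python
import math


def pool_fixed_effect(odds_ratios, std_errors):
    """Inverse-variance pooled odds ratio and Cochran's Q.

    odds_ratios: one odds ratio per study
    std_errors:  standard error of each study's log odds ratio
    """
    log_ors = [math.log(o) for o in odds_ratios]
    # More precise studies (smaller standard error) get more weight.
    weights = [1 / se ** 2 for se in std_errors]
    pooled_log = sum(w * y for w, y in zip(weights, log_ors)) / sum(weights)
    # Q grows as individual studies stray from the pooled estimate.
    q = sum(w * (y - pooled_log) ** 2 for w, y in zip(weights, log_ors))
    return math.exp(pooled_log), q


# Hypothetical consistent studies: similar odds ratios, small Q.
print(pool_fixed_effect([2.0, 2.2, 1.9], [0.2, 0.3, 0.25]))
# Hypothetical inconsistent studies: opposite directions, large Q.
print(pool_fixed_effect([0.5, 2.0], [0.1, 0.1]))
```

A Q value that is large relative to the number of studies minus one suggests that the studies are not estimating the same association, which weakens the consistency criterion.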

CONSISTENCY
Consistency can mean either:
1. Exact replication, as in the
laboratory sciences, or
2. Replication under many different
circumstances.

In epidemiology, exact
replication is impossible.

WHEN TO APPLY CAUSAL
CRITERIA?
Causal criteria are principally
designed to deal with the problem of
confounding. By applying the
criteria, we reduce the possibility of
falsely assigning cause to the wrong
exposure. Causal criteria do not
work well in the case of bias.

FOR EXAMPLE
Prenatal care was widely believed to
prevent low birthweight. However,
women not getting prenatal care tended
to have all sorts of problems associated
with low birthweight. Because studies
of prenatal care assembled biased
samples, it was often impossible to
remove the bias by adjustment.
Moreover, the biased association was
very consistent, and the effect size was
strong! (It lacked coherence, however.)
