Observational Studies
Rex and Jane Galbraith

Introduction
  Observational Studies
  Observational studies versus experiments
  Why do we need observational studies?
  Fluoridation and cancer mortality
  The Connecticut crackdown on speeding
  Connecticut traffic fatalities 1951–1959
  Road fatality rates for Connecticut and “control” states
Study plans
  Cross-sectional, prospective and retrospective studies
  What can we estimate?
  How to interpret?
Odds ratios and risk ratios
  Probabilities and odds
  Odds ratios
  Odds ratios from prospective and retrospective studies
  For example:
  Risk ratios (or relative risks)
  Numerical values of p1 for given p0 and ψ
  Comparing proportions or probabilities
  Comparing proportions (continued)
  Exercise
Introduction
Observational Studies
No Experimental Manipulation
sample survey
administrative records
convenience or “happenstance” sample
Purpose?
Observational studies versus experiments
Pure observational study: From hospital records, information is assembled on the same response
variable for two groups of patients, one group having received treatment A and the other treatment
B.
Suppose again that there is a clear difference in responses between groups. What can we conclude?
Although the difference is unlikely to be due to “chance”, there are now many other possible explanations of it, in
addition to a treatment effect.
Note that the data here might “look” exactly the same as that for the experiment.
Why do we need observational studies?
Impossible to do an experiment
Economic reasons
Limitations of experiments
– Restricted conditions and factor levels (results may not generalise to “field” conditions)
Fluoridation and cancer mortality
Oldham and Newell (1977) re-analysed the data, taking into account the differing age-sex-race
compositions of the different cities (for it is known that cancer mortality depends on these
factors). They concluded that there was no evidence of a link between fluoridation and cancer
mortality.
In fact, the excess cancer rate (over the national average) had increased by 4% in the
non-fluoridated cities and by only 1% in the fluoridated cities. (Applied Statistics, 26, 125-135.)
The Connecticut crackdown on speeding
In the next year, 1956, the road death toll was only 284. The crackdown was hailed as a
success; the governor stated
“With the saving of 40 lives in 1956, a reduction of 12.3% from the 1955 motor vehicle
death toll, we can say that the program is definitely worthwhile.”
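The governor's figures can be checked with a line of arithmetic. The sketch below assumes the 1955 toll equals the 1956 toll plus the 40 lives said to be saved; the variable names are illustrative only.

```python
deaths_1956 = 284
lives_saved = 40

# Assumption: the "saving" is measured against the 1955 toll
deaths_1955 = deaths_1956 + lives_saved
reduction_pct = 100 * lives_saved / deaths_1955

print(deaths_1955, round(reduction_pct, 1))
```

The numbers are internally consistent, but a single year-to-year comparison cannot by itself show that the crackdown caused the fall.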
Connecticut traffic fatalities 1951–1959
Design and interpretation
Regression to the mean
If a variable is measured for several individuals on two occasions then, other things being equal,
individuals with a high first value will tend to have a lower second value, and individuals with a
low first value will tend to have a higher second value.
This phenomenon (discovered by Francis Galton) of regression to the mean is powerful,
pervasive and widely misunderstood.
It is a consequence of natural or statistical variation.
The essential phenomenon is that a value may be high partly because of inherent size but
also partly because “random” effects have conspired to produce a higher value than might
otherwise have occurred. On the second occasion the latter effect is not repeated. One can
also regard this as a form of selection bias — the first value is selected precisely because it is
high.
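A small simulation can make the mechanism concrete. This is a minimal sketch, assuming each measurement is a stable underlying level plus independent normal noise of the same size; the function names, sample size and seed are all illustrative choices, not part of the original notes.

```python
import random
import statistics

def simulate(n=10_000, seed=1):
    """Two noisy measurements of the same underlying level for n individuals."""
    rng = random.Random(seed)
    level = [rng.gauss(0, 1) for _ in range(n)]    # "inherent size"
    first = [x + rng.gauss(0, 1) for x in level]   # occasion 1: level + noise
    second = [x + rng.gauss(0, 1) for x in level]  # occasion 2: same level, fresh noise
    return first, second

def top_decile_means(first, second):
    """Select individuals on a high FIRST value; return their mean on each occasion."""
    cutoff = sorted(first)[int(0.9 * len(first))]
    chosen = [i for i, x in enumerate(first) if x >= cutoff]
    return (statistics.mean(first[i] for i in chosen),
            statistics.mean(second[i] for i in chosen))

first, second = simulate()
top_first, top_second = top_decile_means(first, second)
print(f"selected group: occasion 1 mean {top_first:.2f}, occasion 2 mean {top_second:.2f}")
print(f"everyone:       occasion 2 mean {statistics.mean(second):.2f}")
```

No treatment is applied between the two occasions, yet the selected group's mean moves back towards the overall mean while staying above it.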
Examples
Screening selection. If patients are selected for treatment because of some extreme value (e.g., high
blood pressure) then, even if the treatment has no effect, their average blood pressure is likely to be
lower after treatment.
Educational tests. Army recruits are given a test. Those with high marks are praised, those with low
marks are threatened with failure. In a subsequent test, those who were initially praised mostly get
lower marks and those who were threatened get higher marks. Is it counter-productive to give praise
and helpful to threaten?
Batting averages. Look at the batsman with the highest average mid-way through the cricket season.
Now look at his average for the second half of the season — this is likely to be lower. Why?
Stature of men. Sons of tall fathers are on average tall, but (on average) not as tall as their fathers. Sons
of short fathers are on average short, but (on average) not as short as their fathers. (Does this imply
that, after several generations, men will all be of similar stature?)
Lothian road accident data — a curious paradox?
Frequency distribution of road accident sites, cross-classified by the numbers of accidents at each in two
time periods:
Quoted in Senn and Collie (1988) Road traffic Engineering and Control, 168–169.
If you classify the sites by the number of accidents they had in 79/80 it seems that the worst
sites have got better. (Lower accident rate in the second period.)
Evidence that the worst sites are deteriorating?
On the other hand, if you classify the sites by the number of accidents they had in 81/82 it seems
that the worst sites have got worse. (Lower accident rate in the first period.)
In fact, looking at the data as a whole shows that the accident rates are very stable: mean
accidents per site is 0.976 in 79/80 and 0.964 in 81/82. This is another case of regression to the mean,
and the unnatural graphical presentation adds to the confusion.
Confounding – does smoking prolong life?
Here are some mortality rates from a 20-year study (1974–1994) of 1314 British women (a
sample drawn from the 1974 electoral roll in Whickham, UK), classified by smoking status, where
n is the number in each group and d is the number who died (from any cause) within 20 years:

                  n      d     d/n
All women       1314    369   28.1%
Smokers          582    139   23.9%
Non-Smokers      732    230   31.4%

                             n      d     d/n
Age 18–34   Smokers         179      5    2.8%
            Non-Smokers     219      6    2.7%
Age 35–64   Smokers         354     92   26.0%
            Non-Smokers     320     59   18.4%
Age 65+     Smokers          49     42   85.7%
            Non-Smokers     193    165   85.5%

A different story emerges — age is a confounding factor. (There are other interesting patterns
too.)
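The crude and age-specific rates can be recomputed from the counts, together with the age mix of each group. This is a minimal sketch using only the figures in the tables above; the dictionary layout and function names are illustrative.

```python
# (n, d) pairs: n women in the group, d of whom died within 20 years
data = {
    "18-34": {"smokers": (179, 5),  "non_smokers": (219, 6)},
    "35-64": {"smokers": (354, 92), "non_smokers": (320, 59)},
    "65+":   {"smokers": (49, 42),  "non_smokers": (193, 165)},
}

def crude_rate(group):
    """Death rate ignoring age: total deaths over total women."""
    n = sum(data[age][group][0] for age in data)
    d = sum(data[age][group][1] for age in data)
    return d / n

def share_aged_65_plus(group):
    """Fraction of the group that is in the oldest age band."""
    n_all = sum(data[age][group][0] for age in data)
    return data["65+"][group][0] / n_all

for group in ("smokers", "non_smokers"):
    print(f"{group}: crude rate {crude_rate(group):.1%}, "
          f"aged 65+ {share_aged_65_plus(group):.1%}")
    for age in data:
        n, d = data[age][group]
        print(f"    age {age}: {d / n:.1%}")
```

The non-smokers include a much larger share of women aged 65+, which is what pulls their crude rate above that of the smokers.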
Remedies
• restriction
• matching
• stratification (adjustment by sub-classification)
• regression
Simpson’s Paradox
Hypothetical data for 400 men and 400 women:

                          Recover?
Males:                  yes     no   total   Rate
    Drug?    yes         70     30    100    70%
             no         180    120    300    60%

Females:                yes     no   total   Rate
    Drug?    yes         90    210    300    30%
             no          20     80    100    20%

All (males and females pooled):
                        yes     no   total   Rate
    Drug?    yes        160    240    400    40%
             no         200    200    400    50%

So the drug is beneficial for men and for women, but harmful for patients of unknown sex!
Explanation?

                  Drug?
                yes     no   total   Rate
Males           100    300    400    25%
Females         300    100    400    75%

                  Recover?
                yes     no   total   Rate
Males           250    150    400    62.5%
Females         110    290    400    27.5%

The treated (drug) group is biased against men, but men fare better than women whether on the
drug or not.
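The reversal can be reproduced by pooling the counts. A minimal sketch, using the hypothetical data above; the names `counts`, `rate` and `pooled` are illustrative.

```python
# (recovered, total) for each sex and treatment, from the hypothetical tables
counts = {
    "male":   {"drug": (70, 100), "no_drug": (180, 300)},
    "female": {"drug": (90, 300), "no_drug": (20, 100)},
}

def rate(recovered, total):
    return recovered / total

def pooled(treatment):
    """Add the two sexes together, ignoring sex altogether."""
    recovered = sum(counts[sex][treatment][0] for sex in counts)
    total = sum(counts[sex][treatment][1] for sex in counts)
    return recovered, total

for sex in counts:
    print(sex, {t: f"{rate(*counts[sex][t]):.0%}" for t in counts[sex]})
print("pooled", {t: f"{rate(*pooled(t)):.0%}" for t in ("drug", "no_drug")})
```

Within each sex the drug row has the higher recovery rate, while in the pooled counts it has the lower one, because most of the treated patients are women.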
Another situation
A similar phenomenon can arise with group means:
ȳ = mean score on cognitive test
n = number of people
prob(recover | no drug) = 1/2 × 0.60 + 1/2 × 0.20 = 0.40
prob(recover | drug) = 1/2 × 0.70 + 1/2 × 0.30 = 0.50
This is standardisation.
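The standardised rates are weighted averages of the sex-specific recovery rates. The sketch below assumes a reference population that is half male and half female, matching the 400 men and 400 women in the Simpson's paradox tables; the names are illustrative.

```python
# Recovery rates within each sex, taken from the Simpson's paradox tables
recovery = {
    "male":   {"drug": 0.70, "no_drug": 0.60},
    "female": {"drug": 0.30, "no_drug": 0.20},
}
# Assumed reference population: equal numbers of men and women
weights = {"male": 0.5, "female": 0.5}

def standardised_rate(treatment):
    """Average the sex-specific rates using the SAME weights for both treatments."""
    return sum(weights[sex] * recovery[sex][treatment] for sex in weights)

for treatment in ("no_drug", "drug"):
    print(f"{treatment}: {standardised_rate(treatment):.2f}")
```

Because both treatments are weighted by the same sex distribution, the comparison is no longer distorted by women being over-represented among the treated.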
Causal diagrams
All:
    Smoking?   yes    160    240    400    40%
               no     200    200    400    50%
Here it is not appropriate to stratify by heart disease, which may itself be affected by smoking
and is relevant to survival (figure).
In this case, the All table is more relevant — but does this compare like with like? Are there
pre-disposing genetic factors?
Difficulty
There may be further unmeasured individual characteristics (age, location, general health,
genetic factors, . . .) that are associated with outcome and confounded with the treatment (or
factor of interest).
Study plans
Cross-sectional, prospective and retrospective studies

              L      L̄    Total
S            40    100     140
S̄            30    300     330
Total        70    400     470

Cross-sectional: Random sample of 470 people, classified by both S/S̄ and L/L̄.
Prospective: Random samples of 140 smokers and 330 non-smokers, classified by L/L̄.
Retrospective: Random samples of 70 people with lung cancer and 400 without, classified by
S/S̄.
What can we estimate?
In either the cross-sectional or prospective study, we can estimate the probability of lung cancer among
smokers:
    estimated prob(L|S) = 40/140 = 29%
and among non-smokers:
    estimated prob(L|S̄) = 30/330 = 9%
This is direct information.
These are not available in the retrospective study. Instead, we have
    estimated prob(S|L) = 40/70 = 57%
    estimated prob(S|L̄) = 100/400 = 25%
This is indirect information (values depending on the study design).
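The four estimates are row and column proportions of the same 2×2 table. A minimal sketch using the counts given for the smoking (S) and lung cancer (L) table; the keys `notS` and `notL` stand for S̄ and L̄ and, like the function names, are illustrative.

```python
# 2x2 table: rows S / not S (smoking), columns L / not L (lung cancer)
table = {
    ("S", "L"): 40,    ("S", "notL"): 100,
    ("notS", "L"): 30, ("notS", "notL"): 300,
}

def prob_L_given(smoking):
    """Row proportion: estimable from a cross-sectional or prospective study."""
    row_total = table[(smoking, "L")] + table[(smoking, "notL")]
    return table[(smoking, "L")] / row_total

def prob_S_given(disease):
    """Column proportion: what a retrospective study estimates."""
    col_total = table[("S", disease)] + table[("notS", disease)]
    return table[("S", disease)] / col_total

print(f"prob(L|S)     = {prob_L_given('S'):.0%}")
print(f"prob(L|not S) = {prob_L_given('notS'):.0%}")
print(f"prob(S|L)     = {prob_S_given('L'):.0%}")
print(f"prob(S|not L) = {prob_S_given('notL'):.0%}")
```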
How to interpret?
It can be shown that
prob(S|L) > prob(S|L̄)
if, and only if,
prob(L|S) > prob(L|S̄)
So we can get some qualitative knowledge — evidence of the existence of an association, and
its sign
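
One way to see the equivalence is to write the four joint probabilities as a = prob(S and L), b = prob(S and L̄), c = prob(S̄ and L), d = prob(S̄ and L̄). These symbols are introduced here for illustration, and all four are assumed to be positive. Both inequalities then reduce to the same condition:

```latex
\begin{align*}
\Pr(S \mid L) > \Pr(S \mid \bar{L})
  &\iff \frac{a}{a+c} > \frac{b}{b+d}
  \iff a(b+d) > b(a+c)
  \iff ad > bc, \\
\Pr(L \mid S) > \Pr(L \mid \bar{S})
  &\iff \frac{a}{a+b} > \frac{c}{c+d}
  \iff a(c+d) > c(a+b)
  \iff ad > bc.
\end{align*}
```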
Odds ratios and risk ratios
Probabilities and odds
In terms of the odds, the probability is p = odds/(1 + odds)
e.g., if the odds is 5 to 1 then the probability is p = 5/(1 + 5) = 5/6
if the odds is 5 to 2 then the probability is p = (5/2)/(1 + 5/2) = 5/(2 + 5) = 5/7
Note that odds = 1 corresponds to probability = 1/2,
odds greater than 1 correspond to probabilities greater than 1/2, and
odds less than 1 correspond to probabilities less than 1/2.
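The conversions can be checked with exact fractions. A minimal sketch: `odds_to_prob` follows the formula above, while `prob_to_odds` uses the usual inverse, odds = p/(1 − p); both function names are illustrative.

```python
from fractions import Fraction

def odds_to_prob(odds):
    """p = odds / (1 + odds)"""
    return odds / (1 + odds)

def prob_to_odds(p):
    """odds = p / (1 - p), the inverse of odds_to_prob"""
    return p / (1 - p)

print(odds_to_prob(Fraction(5, 1)))  # odds of 5 to 1
print(odds_to_prob(Fraction(5, 2)))  # odds of 5 to 2
print(prob_to_odds(Fraction(1, 2)))  # an even chance
```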
Odds ratios
For example, suppose we want to compare the risks of lung cancer (L) for smokers (S) and non-smokers (S̄). Let
p1 = prob(L|S) and p0 = prob(L|S̄).
The two odds are p1/(1 − p1) and p0/(1 − p0), and the odds ratio is
    ψ = [p1/(1 − p1)] / [p0/(1 − p0)] = (p1/p0) × (1 − p0)/(1 − p1).
An odds ratio by itself carries only a little information. We also need to know the “baseline risk” p0.
Odds ratios from prospective and retrospective studies
To compare the risks of lung cancer for smokers and non-smokers, the odds ratio is
    [prob(L|S)/prob(L̄|S)] / [prob(L|S̄)/prob(L̄|S̄)]
From a retrospective study we can estimate only
    [prob(S|L)/prob(S̄|L)] / [prob(S|L̄)/prob(S̄|L̄)]
i.e., the ratio of the odds of being a smoker amongst those with lung cancer to the odds of being
a smoker amongst those without lung cancer.
However, it can be shown that these two odds ratios are equal! So by estimating the latter we are
also estimating the former.
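
A short derivation of the equality, as a sketch: write the joint probabilities as a = prob(S and L), b = prob(S and L̄), c = prob(S̄ and L), d = prob(S̄ and L̄). These symbols are introduced here for illustration and are assumed positive. In each conditional odds the marginal probability cancels, and both odds ratios reduce to the cross-product ratio:

```latex
\begin{align*}
\frac{\Pr(L \mid S)/\Pr(\bar{L} \mid S)}{\Pr(L \mid \bar{S})/\Pr(\bar{L} \mid \bar{S})}
  &= \frac{\dfrac{a}{a+b} \Big/ \dfrac{b}{a+b}}{\dfrac{c}{c+d} \Big/ \dfrac{d}{c+d}}
   = \frac{a/b}{c/d} = \frac{ad}{bc}, \\[1ex]
\frac{\Pr(S \mid L)/\Pr(\bar{S} \mid L)}{\Pr(S \mid \bar{L})/\Pr(\bar{S} \mid \bar{L})}
  &= \frac{\dfrac{a}{a+c} \Big/ \dfrac{c}{a+c}}{\dfrac{b}{b+d} \Big/ \dfrac{d}{b+d}}
   = \frac{a/c}{b/d} = \frac{ad}{bc}.
\end{align*}
```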
For example:
Consider the original data on smoking and lung cancer:

              L      L̄    Total
S            40    100     140
S̄            30    300     330
Total        70    400     470

For a prospective study: odds ratio = (40/100)/(30/300) = 4
For a retrospective study: odds ratio = (40/30)/(100/300) = 4
The result is the same either way.
But we still can’t estimate the baseline risk from a retrospective study.
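The sketch below illustrates why the baseline risk is out of reach. It assumes a hypothetical retrospective study that keeps the 70 cases but samples k times as many people without lung cancer, with the same smoking split; k and the function names are illustrative and not part of the original example.

```python
from fractions import Fraction

def odds_ratio(a, b, c, d):
    """Cross-product ratio for the table [[a, b], [c, d]]."""
    return Fraction(a * d, b * c)

def apparent_risk_in_smokers(a, b):
    """Naive 'prob(L|S)' read off the table rows."""
    return Fraction(a, a + b)

for k in (1, 2, 10):  # hypothetical multiple of the 400 controls
    a, b, c, d = 40, 100 * k, 30, 300 * k
    print(k, odds_ratio(a, b, c, d), apparent_risk_in_smokers(a, b))
```

The odds ratio is unaffected by how many controls are sampled, whereas the apparent risk among smokers changes with k and so reflects the design rather than the population.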
Risk ratios (or relative risks)
The ratio p1 /p0 is called the risk ratio or relative risk. It has a more direct interpretation than an
odds ratio, but again we really need to know p0 (the baseline risk) also in order to appreciate it.
If p0 and p1 are both small, then the odds ratio and risk ratio are numerically very similar:
e.g., if p0 = 0.01 and p1 = 0.02 the risk ratio = 2 and the odds ratio = 2.02.
Note: it is a common mistake to interpret an odds ratio as a risk ratio. They are different — often
very different — and are only similar in value when p0 and p1 are small.
Numerical values of p1 for given p0 and ψ
For small values of p0 (and moderate ψ) you can see that p1 approximately equals ψp0, but for
large p0 this does not hold.
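Values of p1 can be generated by inverting the odds ratio: the odds for p1 are ψ times the odds for p0, so p1 = ψp0/(1 − p0 + ψp0). This is a minimal sketch; the particular values of p0 and ψ are illustrative choices and are not taken from the original table.

```python
def p1_from(p0, psi):
    """Solve psi = [p1/(1-p1)] / [p0/(1-p0)] for p1."""
    odds1 = psi * p0 / (1 - p0)  # odds for p1 = psi * odds for p0
    return odds1 / (1 + odds1)

for p0 in (0.01, 0.1, 0.5):
    for psi in (2, 4):
        print(f"p0={p0}, psi={psi}: p1={p1_from(p0, psi):.3f}, psi*p0={psi * p0:.3f}")
```

For p0 = 0.01 the two columns nearly agree, while for p0 = 0.5 the product ψp0 is no longer even a valid probability.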
Comparing proportions or probabilities
Proportions of deaths (within one year) from cancer and from heart disease, for cigarette smokers and
non-smokers (from Doll and Peto, 1976, BMJ, 2, 1525–1536):

                                proportion       RR      OR      RD
from cancer    smokers         pS = .00140     14.0    14.0     130
               non-smokers     pN = .00010

RR = pS/pN = risk ratio
OR = (pS/pN) × (1 − pN)/(1 − pS) = odds ratio
RD = (pS − pN) × 10^5 = risk difference per 100,000 per year
Comparing proportions (continued)
The risk ratio is much higher for cancer than for heart disease.
A smoker is 14 times as likely as a non-smoker to die of cancer within one year, but only
1.62 times as likely to die of heart disease.
But the actual risks (pS and pN) are higher for heart disease.
And the risk difference is higher for heart disease (256 per 100,000 person-years) than for
cancer (130 per 100,000 person-years).
In 100,000 smokers and 100,000 non-smokers, 256 more smokers than non-smokers die within one year of heart disease,
but only 130 more from cancer.
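The three measures for cancer can be recomputed from the two proportions quoted earlier (pS = .00140, pN = .00010). A minimal sketch with illustrative function names; the heart disease proportions are not reproduced here, so only the cancer figures are computed.

```python
def risk_ratio(p_s, p_n):
    return p_s / p_n

def odds_ratio(p_s, p_n):
    return (p_s / p_n) * (1 - p_n) / (1 - p_s)

def risk_difference_per_100k(p_s, p_n):
    return (p_s - p_n) * 1e5

# Proportions dying of cancer within one year: smokers, non-smokers
p_s, p_n = 0.00140, 0.00010

print(f"RR = {risk_ratio(p_s, p_n):.1f}")
print(f"OR = {odds_ratio(p_s, p_n):.2f}")
print(f"RD = {risk_difference_per_100k(p_s, p_n):.0f} per 100,000 per year")
```

With risks this small the odds ratio and risk ratio agree to one decimal place, which is why the RR and OR columns show the same value.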
Exercise
The data below refer to a study of 262 young and middle-aged women who were admitted to 30 coronary
care units in Northern Italy with acute MI during the period 1983–1988 [a]. Each case was matched with two
control patients admitted to the same hospitals with other acute disorders. All patients were classified
according to whether they had ever been smokers. Here are the numbers:

                Ever smoker
                Yes     No
MI cases        172     90
Controls        173    346

[a] Source: J. Epidemiol. and Commun. Health, 43, 214–217 (1989)