
Data Analysis & Interpretation

Why do we see conflicting results in studies?
Results and inferences depend on:

the study design,

the method of selection of subjects for the study,

the size of the sample used for the study,

the conduct of the study.
Study Design
The aim of study design is:

to maximise attribution (inferences),

to minimise all sources of error,

to be practical.
Factors to be aware of:
Bias
source: systematic error,

Confounding
source: a variable which is associated with both
exposure (intervention) and outcome,

Chance
source: random error.
Some common types of
studies conducted
Experimental designs
Randomised controlled trials (RCT),
Parallel group
Crossover
Block designs,

Observational studies
Cohort (prospective),
Case-control (retrospective).
Outline
Focus first on the data structures obtained from
RCTs (parallel and crossover) and observational
designs,

Limit this to dichotomous outcomes,

Consider experiments with repeated samples or
replications across all sources of variation,

Sample size issues.
Research questions:
RCT and observational studies

Hormone replacement treatment (HRT) and Breast
cancer.

Trial of Vitamin D and calcium supplementation
and hip fracture.

Use of statins for the prevention of Myocardial
Infarction.

Use of the oral contraceptive pill and deep vein
thrombosis (DVT).
Measures of effect size
(dichotomous outcome)
Absolute risk reduction (ARR)
The difference in risk of a given event between
two groups

Number Needed to Treat (NNT)
The number of patients who need to be treated in
order to prevent one additional adverse event
(e.g. death)

Relative risk (RR)
Is the ratio of the risk of a given event in one
group of subjects compared to another group
(strength of association)
Relative risk reduction
(1-RR) x 100%
The proportion of the initial or baseline risk
which was eliminated by a given
treatment/intervention or by avoidance of
exposure to a risk factor

Odds ratio (OR)
Is the ratio of the odds of a given event in one
group of subjects compared to another group
Measures of effect size
(dichotomous outcome)
Over 17000 patients suspected of having acute
myocardial infarction were randomised to receive
treatment (1) oral aspirin or (2) no aspirin. The table
below presents results for vascular mortality at five
weeks:
              died    not died    total
aspirin        804      7783       8587
no aspirin    1016      7584       8600
total         1820     15367      17187
ARR (for dying) = |(804/8587) − (1016/8600)| = 0.024
(24 deaths may possibly be prevented for each 1000 patients)

RR (for dying) = (804/8587) / (1016/8600) = 0.79
(those taking aspirin are at a lower risk of dying)

RRR (for dying) = (1 − RR) x 100% = 21%
(aspirin reduced the risk of death at 5 weeks by 21%)

NNT = 1/ARR = 42

OR = odds(no aspirin) / odds(aspirin) = 1.3
Source:
A-Z of medical statistics
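These calculations can be reproduced in a few lines of Python. The sketch below is illustrative (the variable names are ours, not from the slides); small differences from the figures above arise from rounding.

```python
# Illustrative sketch: effect measures for the aspirin trial 2x2 table.
died_asp, total_asp = 804, 8587
died_no, total_no = 1016, 8600

risk_asp = died_asp / total_asp          # risk of death with aspirin
risk_no = died_no / total_no             # risk of death without aspirin

arr = abs(risk_asp - risk_no)            # absolute risk reduction
rr = risk_asp / risk_no                  # relative risk
rrr = (1 - rr) * 100                     # relative risk reduction (%)
nnt = 1 / arr                            # number needed to treat

odds_asp = died_asp / (total_asp - died_asp)
odds_no = died_no / (total_no - died_no)
odds_ratio = odds_no / odds_asp          # odds ratio as quoted on the slide

print(f"ARR={arr:.3f}, RR={rr:.2f}, RRR={rrr:.0f}%, "
      f"NNT={nnt:.0f}, OR={odds_ratio:.1f}")
```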
Interpretation of effect sizes
Consider the null hypothesis:

ARR (difference in risk):
Ho: Risk(A) − Risk(B) = 0

RR / OR (ratio):
Ho: Risk(A) / Risk(B) = 1 or Odds(A) / Odds(B) = 1
Risk vs. Odds
The risk (or rate) of an event occurring is
the number with the event / the total number of
people exposed,

The odds of an event occurring is
the number with the event / the number without the event,

Example: Out of 10 people

2 have headache: rate = 0.2 and odds = 2/8 = 0.25

4 have headache: rate = 0.4 and odds = 4/6 = 0.67
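As a quick illustration (not part of the original material), the risk–odds conversion can be written as two one-line functions:

```python
# Minimal sketch: converting between risk and odds.
def risk_to_odds(risk: float) -> float:
    return risk / (1 - risk)

def odds_to_risk(odds: float) -> float:
    return odds / (1 + odds)

print(risk_to_odds(0.2))   # 0.25
print(risk_to_odds(0.4))   # ~0.67
```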
Randomised controlled trials (I)
There are two main types of trial design:

Parallel groups design: each patient receives
only one treatment.
[Diagram: Group 1 receives Treatment A; Group 2 receives Treatment B.]
Crossover design: each patient receives all/both
treatments in random order, often with a washout
period between treatments
[Diagram: 1st treatment period (Treatment A or B), then a washout
period, then the 2nd treatment period with the other treatment.]
Randomised controlled trials (II)
Data structure
Parallel group trials give independent samples

Crossover trials give paired samples

Parallel groups with two treatments A and B
(independent samples of sizes nA and nB)

                        Outcome
              Failure   Success   Total
treatment A      a         c       nA
treatment B      b         d       nB

Risk of failure with Treatment A = a/(a+c) = a/nA

Risk of failure with Treatment B = b/(b+d) = b/nB
Example
A multi-centre, randomised placebo-controlled trial
of the beta blocking drug Timolol, reported the
number of deaths in 18 months of follow-up among
patients who had recently suffered a myocardial
infarction.


(New England Journal of Medicine 1981;304:801-7).
Example
1) What is the risk of death in the Timolol group?
2) What is the risk of death in the placebo group?
3) What is the difference in risk of mortality (ARR)?
4) What is the relative risk of mortality?
5) What statistical test would you apply to test
whether there is a difference in mortality between
the two groups?
6) What is the no. needed to treat with Timolol to
prevent one additional person dying?
                     Outcome
Treatment     Died   Survived   Total
Timolol        98       847      945
Placebo       152       787      939
Calculations

1) Risk (Timolol) = 98/945 = 0.104 (10.4%)
2) Risk (placebo) = 152/939 = 0.162 (16.2%)

3) ARR = 0.162 − 0.104 = 0.058; 95% CI (0.028, 0.089)

4) RR = 0.104/0.162 = 0.64; 95% CI (0.51, 0.81)

6) NNT = 1/ARR = 1/0.058 (= 100/5.8) ≈ 17 people
5) What statistical test would you apply to test whether
there is a difference in mortality between the two
groups?
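A chi-squared test on the 2x2 table is the usual choice here (Fisher's exact test is an alternative for small counts). A minimal sketch using scipy, assuming the table above:

```python
# Chi-squared test of independence on the Timolol vs placebo 2x2 table.
from scipy.stats import chi2_contingency

table = [[98, 847],    # Timolol: died, survived
         [152, 787]]   # Placebo: died, survived

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, p = {p:.4f}")
```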
Crossover (Paired samples of size n)

                            Outcome on treatment A
                            Failure   Success   Total
Outcome on     Failure         w         y       w+y
treatment B    Success         x         z       x+z
               Total          w+x       y+z   w+x+y+z = n

Rate of failure with treatment A = (w+x)/n

Rate of failure with treatment B = (w+y)/n

Particular interest focuses on the number of
patients with discordant findings (x and y).
Observational studies
A question which is often posed in epidemiology is:
Does exposure A cause disease B?


It is unethical to randomise subjects to these
exposures, so instead we have to use the information
that is available,

We do not have the experimental design, i.e.
randomisation,

Therefore, causal relationships are harder to prove,

Differences may exist between the exposure groups
that could have an impact on the outcome.
Confounding
An important part of an observational study
investigating a relationship between an exposure
and a disease is to check for possible confounding
factors.

Such factors are associated with both the exposure
and the disease (e.g. a study of whether or not
smoking is a cause of liver cirrhosis would need to
take account of the confounding influence of
alcohol consumption).
Cohort Study
[Diagram: a group of subjects, disease-free at the start of the study,
is divided into exposed and not exposed; each group is followed up and
classified as diseased or non-diseased, and the two groups are compared.]
Relative Risk
For cohort studies the measure of association
between exposure and disease is the relative risk.

The relative risk is the risk of disease in the
exposed group relative to the risk of disease in the
unexposed group.

Example
The Caerphilly cohort study followed up
approximately 2,500 middle-aged Welsh men to
examine the association between several risk
factors (measured at entry to the study) and the
subsequent risk of ischaemic heart disease in a
five-year period.
Example

Risk of disease in exposed = 101/1387 = 0.073
Risk of disease in unexposed = 50/1114 = 0.045

The RR associated with smoking is obtained from the
risk ratio = 0.073/0.045 = 1.62; 95% CI = [1.17, 2.26]

Smokers have an increased risk of IHD compared with
non-smokers. Smokers are 1.6 times more likely than
non-smokers to have IHD.
                     Ischaemic HD during follow-up
Exposure status         Yes       No      Total
Smoker                  101      1286      1387
Non-smoker               50      1064      1114
Total                   151      2350      2501
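As an illustrative sketch (not from the slides), the RR and an approximate 95% CI can be computed on the log scale; this reproduces the figures above up to rounding.

```python
# RR and approximate 95% CI (log method) for the smoking/IHD table.
from math import exp, log, sqrt

a, n1 = 101, 1387    # IHD cases / total among smokers
b, n0 = 50, 1114     # IHD cases / total among non-smokers

rr = (a / n1) / (b / n0)
se_log_rr = sqrt(1/a - 1/n1 + 1/b - 1/n0)
lower = exp(log(rr) - 1.96 * se_log_rr)
upper = exp(log(rr) + 1.96 * se_log_rr)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```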
Case-Control study
[Diagram: cases with the disease under study and controls without the
disease are compared with respect to previous exposures / risk factors.]
Case-control study
For case-control data the RR cannot be determined
directly, so the measure of association is the
odds ratio.

The OR is the ratio of the odds of exposure in the
diseased group (cases) to the odds of exposure in
the non-diseased group (controls).
Unmatched case-control study

                     Disease status
Exposure status      Case        Control
Exposed                a            c
Non-exposed            b            d
Total               n(CASE)     n(CONTROL)

Odds of exposure for the cases = a/b

Odds of exposure for the controls = c/d

OR = (a/b) / (c/d) = (a x d) / (b x c)
Case-control example
The ECTIM study was a case-control study of 610
men who had suffered a myocardial infarction and
733 controls.

One of the factors assessed in these men was the
gene encoding for angiotensin-converting enzyme
(ACE), and each man was classified as Yes or No for
a particular ACE genotype.
Example
ACE            Disease status
genotype       Case    Control
Yes             197       200
No              413       533
Total           610       733
Odds of ACE genotype in case group = 197/413

Odds of ACE genotype in control group = 200/533
Example

The relative risk of myocardial infarction associated
with the ACE genotype is estimated by the odds ratio:

OR = (197/413) / (200/533) = 1.27;
95% CI = [1.00, 1.62]


Cases are more likely than controls to carry the ACE
genotype: the odds of the ACE genotype are greater
in the cases.
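An illustrative sketch (not from the slides) computing the OR and an approximate 95% CI by the log (Woolf) method, which reproduces the figures above up to rounding:

```python
# OR and approximate 95% CI for the ECTIM table, using the a/b/c/d layout above.
from math import exp, log, sqrt

a, b = 197, 413   # cases: exposed, non-exposed
c, d = 200, 533   # controls: exposed, non-exposed

odds_ratio = (a * d) / (b * c)
se_log_or = sqrt(1/a + 1/b + 1/c + 1/d)
ci = (exp(log(odds_ratio) - 1.96 * se_log_or),
      exp(log(odds_ratio) + 1.96 * se_log_or))
print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")
```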
Individually matched case-control study

                               Exposure status among cases
                               Exposed    Unexposed   Total
Exposure status    Exposed        w           y        w+y
among controls     Unexposed      x           z        x+z
                   Total         w+x         y+z    w+x+y+z = n

Interest usually focuses on the number of discordant
pairs of each type (x and y),

Odds ratio = x/y

What would be the appropriate statistical test?
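For paired or individually matched binary data such as this (and for the crossover table earlier), McNemar's test on the discordant pairs is the usual choice. A minimal sketch using statsmodels; the counts in the table are placeholders, not data from the slides:

```python
# McNemar's test for paired/matched binary data (placeholder counts).
from statsmodels.stats.contingency_tables import mcnemar

table = [[20, 15],   # w, y  (concordant, discordant)
         [5,  60]]   # x, z  (discordant, concordant)

result = mcnemar(table, exact=True)   # exact binomial test on the discordant counts
print(result.statistic, result.pvalue)
```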
Observational studies
Statistical adjustment may be required for
confounding factors,

Multiple regression models
Logistic regression (previous lecture),

Stratification
Mantel-Haenszel methods.

Further reading: Kirkwood and Sterne, Chapter 18.
Randomised Block experiments
Many sources of variation
Time, temperature, resting/following exercise,
observers, etc.,

Replication is required per combination of
experimental conditions,

Replicates must be independent of one another,

This will give greater precision,

There will often be non-experimental conditions,
e.g. the age of the patient,

Consider what varies across observations and what
varies between subjects.
Statistical Analysis
The same number of replications per combination of
experimental conditions makes analysis easier: the
design is said to be balanced,

Multiple regression or ANOVA commonly adopted,

The number of experimental factors determines
whether a one-way, two-way, etc. ANOVA is used.
Examples
A study was conducted to examine the effect of three
diets and of the timing of measurement (first thing in
the morning / after the midday meal),

Subjects were allocated to a diet and two
measurements were taken on each patient,

For each diet/timing combination 4 subjects were
measured,

It is important to distinguish between-subject and
within-subject comparisons:

Between subjects: diet,

Within subjects: time of assessment.
Diagram

           Diet 1        Diet 2        Diet 3
         Fast  Food    Fast  Food    Fast  Food
           x     x       x     x       x     x
           x     x       x     x       x     x
           x     x       x     x       x     x
           x     x       x     x       x     x

(4 subjects per diet; each subject is measured under both timings)
Statistical Analysis
Ho: No effect of diet on outcome, i.e.,
mean(diet1) = mean(diet2) = mean(diet3)

Ho: No effect of timing on outcome, i.e.,
mean(fast) = mean(food)
A two way analysis of variance (ANOVA) could be
conducted to partition variation into diet, timing,
residual.
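As an illustration, a two-way ANOVA of this kind can be run with statsmodels. The data frame below uses hypothetical numbers, and for simplicity the sketch ignores the within-subject pairing (a repeated-measures analysis would also model subject):

```python
# Two-way ANOVA partitioning variation into diet, timing and residual.
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

# Hypothetical data: 3 diets x 2 timings, 4 subjects per diet,
# each subject measured at both timings (24 observations in total).
df = pd.DataFrame({
    "diet":    ["1"] * 8 + ["2"] * 8 + ["3"] * 8,
    "timing":  (["fast"] * 4 + ["food"] * 4) * 3,
    "outcome": [5.1, 4.8, 5.3, 5.0, 5.9, 6.1, 5.8, 6.0,
                4.2, 4.5, 4.1, 4.4, 5.0, 5.2, 4.9, 5.1,
                6.0, 6.2, 5.9, 6.1, 6.8, 7.0, 6.7, 6.9],
})

model = smf.ols("outcome ~ C(diet) + C(timing)", data=df).fit()
print(anova_lm(model))   # sums of squares for diet, timing and residual
```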
Why consider sample size?
Recall: random error (chance),

It is possible to determine what sample size should
be taken if we wish to achieve a given level of
precision,

This is because precision can be increased by
reducing the size of the standard error,

The size of the standard error is based on the size of
the sample,

The larger the sample size the smaller the standard
error.
Sample size to estimate a
population parameter
Initial estimate of population parameter (e.g.
from a pilot study),

What degree of accuracy is required (e.g. to
within 5%).
Sample size for population
proportions
True value   Precision (%)   95% CI        Sample size
5%               0.5         4% to 6%          1900
5%               1.5         2% to 8%           212
20%              0.5         19% to 21%        6400
20%              1.5         17% to 23%         712
20%              2.5         15% to 25%         256
50%              0.5         49% to 51%       10000
50%              1.5         47% to 53%        1112
50%              2.5         45% to 55%         400
50%              5.0         40% to 60%         100

adapted from Crombie IK (1996)
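The sample sizes in the table appear consistent with treating "precision" as the required standard error of the estimated proportion (in percentage points), with the 95% CI roughly plus or minus two standard errors. Under that assumption they can be reproduced as follows (an illustrative sketch, not taken from the source):

```python
# Sample size to estimate a single proportion p with a given standard error:
# n = p(1-p) / SE^2, with the 95% CI roughly p +/- 2 SE.
from math import ceil

def n_for_proportion(p: float, se: float) -> int:
    return ceil(p * (1 - p) / se**2)

print(n_for_proportion(0.05, 0.005))   # 1900  (5%, precision 0.5)
print(n_for_proportion(0.50, 0.015))   # 1112  (50%, precision 1.5)
print(n_for_proportion(0.50, 0.050))   # 100   (50%, precision 5.0)
```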
Factors important in
calculating sample size
Study design,

Outcome measures,

Statistical test,

Minimum clinical effect,

Statistical power (Type II error),

Significance level (Type I error).
Significance level and power
Significance level
The probability that the statistical test returns a
significant result when there is no difference
between the treatments

Power
The probability that a study of a given size will
detect as statistically significant a real difference
of a given magnitude
The Decision Matrix

                              In reality
What we conclude              Null true                  Null false
                              (alternative false)        (alternative true)

Accept null /                 1 − α                      β
reject alternative            THE CONFIDENCE LEVEL       TYPE II ERROR
We say: "There is no real     (correct decision)
program effect; there is no
difference or gain; our
theory is wrong."

Reject null /                 α                          1 − β
accept alternative            TYPE I ERROR               POWER
We say: "There is a real                                 (correct decision)
program effect; there is a
difference or gain; our
theory is correct."

In reality, "null true" means there is no real program effect, no
difference or gain, and our theory is wrong; "null false" means there is
a real program effect, a difference or gain, and our theory is correct.

1 − α (the confidence level): the odds of saying there is no effect or
gain when in fact there is none; the number of times out of 100, when
there is no effect, that we'll say there is none.

α (Type I error): the odds of saying there is an effect or gain when in
fact there is none; the number of times out of 100, when there is no
effect, that we'll say there is one.

β (Type II error): the odds of saying there is no effect or gain when in
fact there is one; the number of times out of 100, when there is an
effect, that we'll say there is none.

1 − β (power): the odds of saying there is an effect or gain when in fact
there is one; the number of times out of 100, when there is an effect,
that we'll say there is one.

If you try to increase power, you increase the chance of winding up in
the bottom row and of a Type I error.

If you try to decrease the Type I error, you increase the chance of
winding up in the top row and of a Type II error.
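The trade-off between the Type I error rate and power can be illustrated with a small simulation. This sketch is not from the slides: it runs repeated two-sample t-tests under the null and under a true difference and counts how often the null is rejected; the sample size, effect size and number of simulations are arbitrary choices.

```python
# Simulation: rejection rate under the null (~alpha) and under a real effect (~power).
import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(0)

def rejection_rate(true_diff: float, n: int = 50, sims: int = 2000) -> float:
    rejections = 0
    for _ in range(sims):
        a = rng.normal(0.0, 1.0, n)
        b = rng.normal(true_diff, 1.0, n)
        if ttest_ind(a, b).pvalue < 0.05:
            rejections += 1
    return rejections / sims

print("Type I error rate:", rejection_rate(0.0))   # ~0.05 (alpha)
print("Power:            ", rejection_rate(0.5))   # ~0.70 for this n and effect size
```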
Sample size
for a comparative study
The proportion with the feature in the control group
(binary outcome)

Measure of variability (continuous outcome)

Minimum clinical difference
The smallest difference in outcome between
the two treatments that would be deemed to
be clinically relevant

Significance level

Power
Example 1
A randomised controlled trial to assess the
effectiveness of laparoscopic versus open hernia
repair

Primary outcome measure is proportion of patients
who have returned to normal activities at 2 weeks
post op
Sample size calculation
Study design = RCT

Outcome = Proportion of patients returned to usual
activities at 2 weeks following open hernia repair

Statistical test = Chi squared test

Estimate of level of outcome in control group
(standard care) = 30%

Minimum clinical difference = 10%

Type I error = 0.05 (5% significance level)

Type II error = 0.1 (90% power)

500 patients required in each group
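The calculation can be reproduced approximately with statsmodels (an illustrative sketch, not the method used for the slide; the slide's figure of 500 per group is somewhat larger, consistent with a different approximation or rounding up):

```python
# Sample size per group for comparing 30% vs 40%, 5% two-sided alpha, 90% power.
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

effect = proportion_effectsize(0.40, 0.30)     # Cohen's h for the two proportions
n_per_group = NormalIndPower().solve_power(effect_size=effect,
                                           alpha=0.05, power=0.90)
print(round(n_per_group))   # roughly 475-480 per group
```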
Example 2
A study is to be conducted to evaluate a new
drug for hypertension compared with the
standard drug.

The outcome will be systolic blood pressure at
one month after treatment starts.
Sample size calculation
Study design = RCT

Outcome = SBP at one month

Statistical test = independent groups t-test

Minimum clinical difference = 10 mm Hg

Estimate of variability (SD) of SBP = 30 mm Hg

Standardised difference = 10/30 = 1/3

Power = 80%

Significance level = 5%
Example 2
From statistical formulae, the required sample size
is 300 patients in total.

150 patients are required in each group to yield
80% power of detecting a difference of 10 mm Hg
(one third of a standard deviation) in systolic blood
pressure at the 5% significance level.

For 90% power, 200 patients would be required in each group.
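The same calculation with statsmodels (an illustrative sketch; the slides' figures of 150 and 200 per group are rounded up from similar values):

```python
# Sample size per group for a two-sample t-test, standardised difference 1/3.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_80 = analysis.solve_power(effect_size=1/3, alpha=0.05, power=0.80)
n_90 = analysis.solve_power(effect_size=1/3, alpha=0.05, power=0.90)
print(round(n_80), round(n_90))   # roughly 140-145 and 190-195 per group
```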
