Results and inferences depend on the study design, the method of selecting subjects for the study, the size of the sample used and the conduct of the study. The aim of study design is to maximise attribution (inferences), to minimise all sources of error, and to be practical.
results in studies?
Results and inferences are dependent on:
Study design,
Method of selection of subjects for the study,
Size of the sample used for the study,
Conduct of study.

Study Design
The aim of study design is:
to maximise attribution (inferences),
to minimise all sources of error,
to be practical.

Factors to be aware of
Bias (source: systematic error),
Confounding (source: a variable which is associated with both exposure (intervention) and outcome),
Chance (source: random error).

Some common types of studies conducted
Experimental designs: randomised controlled trials (RCT), with parallel group, crossover and block designs,
Observational studies: cohort (prospective), case-control (retrospective).

Outline
Focus first on the data structures obtained from RCTs (parallel and crossover) and observational designs,
Limit this to dichotomous outcomes,
Consider experiments with repeated samples or replications for all sources of variation,
Sample size issues.

Research questions: RCT and observational studies
Hormone replacement treatment (HRT) and Breast cancer.
Trial of Vitamin D and calcium supplementation and hip fracture.
Use of statins for the prevention of Myocardial Infarction.
Use of the oral contraceptive pill and deep vein thrombosis (DVT).

Measures of effect size (dichotomous outcome)
Absolute risk reduction (ARR): the difference in risk of a given event between two groups,
Number Needed to Treat (NNT): the number of patients who need to be treated in order to prevent one additional adverse event (e.g. death),
Relative risk (RR): the ratio of the risk of a given event in one group of subjects compared to another group (strength of association),
Relative risk reduction (RRR) = (1 - RR) x 100%: the proportion of the initial or baseline risk which was eliminated by a given treatment/intervention or by avoidance of exposure to a risk factor,
Odds ratio (OR): the ratio of the odds of a given event in one group of subjects compared to another group (strength of association).

Measures of effect size (dichotomous outcome)
Over 17000 patients suspected of having acute myocardial infarction were randomised to receive treatment (1) oral aspirin or (2) no aspirin. The table below presents results for vascular mortality at five weeks:

              died   not died   total
aspirin        804     7783      8587
no aspirin    1016     7584      8600
total         1820    15367     17187

ARR (for dying) = |(804/8587) - (1016/8600)| = 0.024 (about 24 deaths may be prevented for each 1000 patients treated)
RR (for dying) = (804/8587) / (1016/8600) = 0.79 (those taking aspirin are at a lower risk of dying)
RRR (for dying) = (1 - RR) x 100% = 21% (aspirin reduced the risk of death at 5 weeks by 21%)
NNT = 1/ARR = 42 and OR = odds (no aspirin) / odds (aspirin) = 1.3
Source: A-Z of medical statistics

Interpretation of effect sizes
Consider the null hypothesis:
ARR: difference in risk
Ho: Risk A - Risk B = 0
RR/OR: ratio
Ho: Risk A / Risk B = 1 or Odds A / Odds B = 1

Risk vs. Odds
The risk (or rate) of an event occurring is the number with the event / total number of people exposed,
The odds of an event is the number with the event / number without the event,
Example: Out of 10 people
2 have headache: rate = 0.2 & odds = 0.25
4 have headache: rate = 0.4 & odds = 0.67

Randomised controlled trials (I)
There are two main types of trial design:
Parallel groups design: each patient receives only one treatment.
[Diagram: Group 1 and Group 2, one receiving Treat. A and the other Treat. B]
Crossover design: each patient receives all/both treatments in random order, often with a washout period between treatments.
[Diagram: Treat. A or Treat. B in the 1st treatment period, a washout period, then the other treatment in the 2nd treatment period]

Randomised controlled trials (II)
Data structure
Parallel group trials give independent samples,
Crossover trials give paired samples.
Parallel groups with two treatments A and B (independent samples of sizes nA and nB)

                    Outcome
              Failure   Success   Total
treatment A      a         c        nA
treatment B      b         d        nB

Risk of failure with Treatment A = a/(a+c) = a/nA
Risk of failure with Treatment B = b/(b+d) = b/nB

Example
A multi-centre, randomised placebo-controlled trial of the beta blocking drug Timolol reported the number of deaths in 18 months of follow-up among patients who had recently suffered a myocardial infarction (New England Journal of Medicine. 1981;304: 801-7).

                 Outcome
Treatment    Died   Survived   Total
Timolol       98      847       945
Placebo      152      787       939

1) What is the risk of death in the Timolol group?
2) What is the risk of death in the placebo group?
3) What is the difference in risk of mortality (ARR)?
4) What is the relative risk of mortality?
5) What statistical test would you apply to test whether there is a difference in mortality between the two groups?
6) What is the no. needed to treat with Timolol to prevent one additional person dying?

Calculations
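The arithmetic behind questions 1 to 4 and 6 can be sketched in a few lines of Python using the Timolol table. The function name `two_by_two_summary` and its argument names are illustrative assumptions, not part of the original slides. For question 5 the sketch computes a Pearson chi-squared statistic without continuity correction; the slides do not state the intended answer at this point, so treat that choice as an assumption.

```python
def two_by_two_summary(events_trt, n_trt, events_ctl, n_ctl):
    """Effect sizes for a dichotomous outcome in two independent groups.

    Hypothetical helper for illustration: 'trt' is the treated group,
    'ctl' is the control (placebo) group.
    """
    risk_trt = events_trt / n_trt
    risk_ctl = events_ctl / n_ctl
    arr = abs(risk_trt - risk_ctl)      # absolute risk reduction
    rr = risk_trt / risk_ctl            # relative risk

    # Pearson chi-squared statistic for a 2x2 table (1 degree of freedom,
    # no continuity correction); an assumed choice of test.
    a, b = events_trt, n_trt - events_trt
    c, d = events_ctl, n_ctl - events_ctl
    n = n_trt + n_ctl
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    return {
        "risk_trt": risk_trt,
        "risk_ctl": risk_ctl,
        "ARR": arr,
        "RR": rr,
        "RRR_percent": (1 - rr) * 100,  # relative risk reduction
        "NNT": 1 / arr,                 # number needed to treat (unrounded)
        "chi2": chi2,
    }


timolol = two_by_two_summary(98, 945, 152, 939)
for name, value in timolol.items():
    print(f"{name}: {value:.3f}")
```

A chi-squared statistic above 3.84 corresponds to p < 0.05 on 1 degree of freedom, so the statistic printed here can be compared with that threshold.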
6) NNT = 1/0.058 [100/5.8] = 17 people

Crossover (paired samples of size n)

                              Outcome on treatment A
Outcome on treatment B     Failure   Success   Total
Failure                       w         y       w+y
Success                       x         z       x+z
Total                        w+x       y+z      w+y+x+z = n

Rate of failure with treatment A = (w+x)/n
Rate of failure with treatment B = (w+y)/n
Particular interest focuses on the number of patients with discordant findings (x and y).

Observational studies
A question which is often posed in epidemiology is: Does exposure A cause disease B?
It is often unethical to randomise subjects to such exposures, so instead we have to use the information which is available,
We do not have the experimental design, i.e. randomisation,
Therefore, causal relationships are harder to prove,
Differences may exist between the exposure groups that could have an impact on the outcome.

Confounding
An important part of an observational study investigating a relationship between an exposure and a disease is to check for possible confounding factors.
Such factors are associated with both the exposure and the disease (e.g. a study of whether or not smoking is a cause of liver cirrhosis would need to take account of the confounding influence of alcohol consumption).

Cohort Study
[Diagram: a group of subjects, disease free at the start of the study, is split into exposed and not exposed; each group is followed up and classified as diseased or non-diseased, and the two groups are compared]

Relative Risk
For cohort studies the measure of association between exposure and disease is the relative risk.
The relative risk is the risk of disease in the exposed group relative to the risk of disease in the unexposed group.
Example
The Caerphilly cohort study followed up approximately 2,500 middle-aged Welsh men to examine the association between several risk factors (measured at entry to the study) and the subsequent risk of ischaemic heart disease in a five-year period.
Risk of disease in exposed (smokers) = 101/1387 = 0.073
Risk of disease in unexposed (non-smokers) = 50/1114 = 0.045
The RR associated with smoking is obtained by the risk ratio = 0.073/0.045 = 1.62; 95% CI = [1.17, 2.26]
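As a check on the relative risk and its confidence interval, the sketch below recomputes both from the Caerphilly counts. It assumes the usual log-scale standard error for a risk ratio, which the slides do not name, and the function name `relative_risk_ci` is illustrative only. Under that assumption the result agrees with the quoted interval to two decimal places.

```python
import math


def relative_risk_ci(events_exp, n_exp, events_unexp, n_unexp, z=1.96):
    """Relative risk with an approximate 95% CI on the log scale.

    Hypothetical helper; the log-scale standard error is an assumed method.
    """
    rr = (events_exp / n_exp) / (events_unexp / n_unexp)
    se_log_rr = math.sqrt(
        1 / events_exp - 1 / n_exp + 1 / events_unexp - 1 / n_unexp
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Smokers: 101 events out of 1387; non-smokers: 50 events out of 1114.
rr, lower, upper = relative_risk_ci(101, 1387, 50, 1114)
print(f"RR = {rr:.2f}, 95% CI = [{lower:.2f}, {upper:.2f}]")
```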
Smokers have an increased risk of IHD compared with non-smokers. Smokers are 1.6 times as likely as non-smokers to have IHD.

                   Ischaemic HD during follow-up
Exposure status      Yes      No     Total
Smoker               101    1286     1387
Non-smoker            50    1064     1114
Total                151    2350     2501

Case-Control study
[Diagram: cases with the disease under study and controls without the disease under study are compared on previous exposures to risk factors]

Case-control study
For data from case-control studies the RR cannot be determined, so the measure of association is the odds ratio.
The OR is the ratio of the odds of exposure in the diseased group (cases) to the odds of exposure in the non-diseased group (controls).

Unmatched case-control study

                     Disease status
Exposure status     Case      Control
Exposed              a          c
Non-exposed          b          d
Total              nCASE     nCONTROL

Odds of exposure for the cases = a/b
Odds of exposure for the controls = c/d
OR = (a/b) / (c/d) = (a x d) / (b x c)
Case-control example
The ECTIM study was a case-control study of 610 men who had suffered a myocardial infarction and 733 controls.
One of the factors assessed in these men was the gene encoding for angiotensin-converting enzyme (ACE), and each man was classified as Yes or No for a particular ACE genotype.

                  Disease status
ACE genotype     Case    Control
Yes              197      200
No               413      533
Total            610      733

Odds of ACE genotype in case group = 197/413
Odds of ACE genotype in control group = 200/533
The relative risk of myocardial infarction associated with the ACE genotype is estimated by the odds ratio:
(197/413)/(200/533) = 1.27; 95% CI = [1.00, 1.62]
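The odds ratio and an approximate confidence interval can be recomputed from the ECTIM counts as follows. This is a minimal sketch that assumes the log-scale (Woolf) standard error, a method the slides do not specify; `odds_ratio_ci` is a hypothetical name. Under this assumption the interval is close to the one quoted, with the upper limit near 1.61, so the original figure may reflect rounding or a slightly different method.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio for an unmatched case-control table, with approximate 95% CI.

    a, b: exposed and non-exposed cases
    c, d: exposed and non-exposed controls
    The Woolf (log-scale) interval is an assumed method.
    """
    odds_ratio = (a / b) / (c / d)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# ACE genotype: 197 of 610 cases and 200 of 733 controls carry it.
odds_ratio, or_lower, or_upper = odds_ratio_ci(197, 413, 200, 533)
print(f"OR = {odds_ratio:.2f}, 95% CI = [{or_lower:.2f}, {or_upper:.2f}]")
```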
Cases are more likely to be exposed to the ACE genotype than controls. The odds of being exposed to the ACE genotype are greater in the cases.

Individually matched case-control study

                                    Exposure status among cases
Exposure status among controls    Exposed   Unexposed   Total
Exposed                              w          y        w+y
Unexposed                            x          z        x+z
Total                               w+x        y+z       w+y+x+z = n

Interest usually focuses on the number of discordant pairs of each type (x and y),
Odds ratio = x/y
What would be the appropriate statistical test?

Observational studies
Statistical adjustment may be required for confounding factors,
Further reading: Kirkwood and Sterne, Ch 18

Randomised Block experiments
Many sources of variation: time, temperature, resting/following exercise, observers, etc.,
Replication is required per combination of experimental conditions,
Replicates must be independent of one another,
This will give greater precision,
There will often be non-experimental conditions, e.g. age of patient,
Consider what varies across observations and what varies between subjects.

Statistical Analysis
The same number of replications per combination of experimental conditions makes analysis easier; the design is then said to be balanced,
Multiple regression or ANOVA commonly adopted,
The number of experimental conditions determines whether a one-way, two-way etc. ANOVA is used.

Examples
A study was conducted to examine the effect of three diets and the timing of measurement (first thing am/after midday meal),
Subjects were allocated to a diet and two measurements were taken on each patient,
For each diet/timing combination 4 subjects were measured,
Important to distinguish between-subject and within-subject comparisons:
Between subjects: diets,
Within subjects: time of assessment.

Diagram
    Diet 1          Diet 2          Diet 3
 Fast   Food     Fast   Food     Fast   Food
  x      x        x      x        x      x
  x      x        x      x        x      x
  x      x        x      x        x      x
  x      x        x      x        x      x

Statistical Analysis
Ho: No effect of diet on outcome, i.e. mean(diet 1) = mean(diet 2) = mean(diet 3)
Ho: No effect of timing on outcome, i.e. mean(fast) = mean(food)
A two-way analysis of variance (ANOVA) could be conducted to partition variation into diet, timing and residual.

Why consider sample size?
Recall random error (chance),
It is possible to determine what sample size should be taken if we wish to achieve a given level of precision,
This is because precision can be increased by reducing the size of the standard error,
The size of the standard error depends on the size of the sample,
The larger the sample size the smaller the standard error.

Sample size to estimate a population parameter
Initial estimate of population parameter (e.g. from a pilot study),
What degree of accuracy is required (e.g. to within 5%).

Sample size for population proportions

True value   Precision   95% CI        Sample size
5%           0.5         4% to 6%       1900
5%           1.5         2% to 8%        212
20%          0.5         19% to 21%     6400
20%          1.5         17% to 23%      712
20%          2.5         15% to 25%      256
50%          0.5         49% to 51%    10000
50%          1.5         47% to 53%     1112
50%          2.5         45% to 55%      400
50%          5.0         40% to 60%      100
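The sample sizes in the table appear consistent with reading the "Precision" column as the standard error of the proportion in percentage points, with the 95% CI taken as roughly plus or minus two standard errors. That reading is an inference from the numbers, not something the slides state. Under it, the required sample size is n = p(100 - p) / SE^2 with p and SE in percent, rounded up. The helper name `sample_size_for_proportion` is illustrative.

```python
import math


def sample_size_for_proportion(p_percent, se_percent):
    """Smallest n whose standard error for a proportion is at most se_percent.

    Assumes SE = sqrt(p(100 - p) / n), with p and SE in percentage points.
    """
    return math.ceil(p_percent * (100 - p_percent) / se_percent ** 2)


# (true value %, precision) pairs taken from the table
rows = [(5, 0.5), (5, 1.5), (20, 0.5), (20, 1.5), (20, 2.5),
        (50, 0.5), (50, 1.5), (50, 2.5), (50, 5.0)]

sizes = [sample_size_for_proportion(p, se) for p, se in rows]
for (p, se), n in zip(rows, sizes):
    # CI taken as estimate plus or minus two standard errors
    print(f"true value {p}%, precision {se}: "
          f"CI about {p - 2 * se:g}% to {p + 2 * se:g}%, n = {n}")
```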
Table adapted from Crombie IK (1996)

Factors important in calculating sample size
Study design,
Outcome measures,
Statistical test,
Minimum clinical effect,
Statistical power (Type II error),
Significance level (Type I error).

Significance level and power
Significance level: the probability that the statistical test returns a significant result when there is no difference between the treatments.
Power: the probability that a study of a given size will detect as statistically significant a real difference of a given magnitude.

The Decision Matrix

                                    In reality:                       In reality:
What we conclude                    Null true, Alternative false      Null false, Alternative true
Accept null, Reject alternative     1-α  THE CONFIDENCE LEVEL         β  TYPE II ERROR
(top row)                           (CORRECT)
Reject null, Accept alternative     α  TYPE I ERROR                   1-β  POWER
(bottom row)                                                          (CORRECT)

In reality, Null true and Alternative false means: There is no real program effect; There is no difference, gain; Our theory is wrong.
In reality, Null false and Alternative true means: There is a real program effect; There is a difference, gain; Our theory is correct.
Accept null, Reject alternative: We say there is no real program effect; There is no difference, gain; Our theory is wrong.
Reject null, Accept alternative: We say there is a real program effect; There is a difference, gain; Our theory is correct.

1-α, THE CONFIDENCE LEVEL: the probability of saying there is no effect or gain when in fact there is none (# of times out of 100 when there is no effect, we'll say there is none).
α, TYPE I ERROR: the probability of saying there is an effect or gain when in fact there is none (# of times out of 100 when there is no effect, we'll say there is one).
β, TYPE II ERROR: the probability of saying there is no effect or gain when in fact there is one (# of times out of 100 when there is an effect, we'll say there is none).
1-β, POWER: the probability of saying there is an effect or gain when in fact there is one (# of times out of 100 when there is an effect, we'll say there is one).

If you try to increase power, you increase the chance of winding up in the bottom row and of Type I Error.
If you try to decrease Type I Error, you increase the chance of winding up in the top row and of Type II Error.

Sample size for a comparative study
The proportion with the feature in the control group (binary outcome)
Measure of variability (continuous outcome)
Minimum clinical difference: the smallest difference in outcome between the two treatments that would be deemed to be clinically relevant
Significance level
Power

Example 1
A randomised controlled trial to assess the effectiveness of laparoscopic versus open hernia repair.
Primary outcome measure is the proportion of patients who have returned to normal activities at 2 weeks post op.

Sample size calculation
Study design = RCT
Outcome = Proportion of patients returned to usual activities at 2 weeks following open hernia repair
Statistical test = Chi squared test
Estimate of level of outcome in control group (standard care) = 30%
Minimum clinical difference = 10%
Type II error = 0.1 (90% power)
Type I error = 0.05 (5% significance)
500 patients required in each group

Example 2
A study is to be conducted to evaluate a new drug for hypertension compared with the standard drug.
The outcome will be systolic blood pressure at one month after treatment starts.

Sample size calculation
Study design = RCT
Outcome = SBP at one month
Statistical test = independent groups t-test
Minimum clinical difference = 10 mm Hg
Estimate of variability of SBP = 30mm Hg
Standardised difference = 10/30 = 1/3
Power = 80%
Significance level = 5%

Example 2
From statistical formulae, the required sample size is 300 patients in total.
150 patients are required in each group to yield 80% power of detecting a difference of 10 mm Hg (about 0.33 standard deviations) in systolic blood pressure at the 5% significance level.
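The two sample size examples can be approximated with the standard normal-approximation formulae, sketched below. The slides do not say which formulae were used, so these are assumptions, as are the function names. For Example 1 the sketch also assumes the 10% minimum clinical difference means 30% versus 40%. Under these assumptions the results come out at roughly 470 and 140 per group, somewhat below the quoted 500 and 150, which may reflect rounding up, a continuity correction or a different formula.

```python
import math
from statistics import NormalDist


def _z_sum(alpha, power):
    """z for a two-sided significance level plus z for the required power."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return z_alpha + z_beta


def n_per_group_proportions(p1, p2, alpha=0.05, power=0.90):
    """Approximate n per group to compare two proportions (assumed formula)."""
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil(_z_sum(alpha, power) ** 2 * variance_sum / (p1 - p2) ** 2)


def n_per_group_means(difference, sd, alpha=0.05, power=0.80):
    """Approximate n per group to compare two means (assumed formula)."""
    standardised = difference / sd  # standardised difference, e.g. 10/30
    return math.ceil(2 * _z_sum(alpha, power) ** 2 / standardised ** 2)


# Example 1: 30% in the control group, assumed 40% in the new treatment group
n_hernia = n_per_group_proportions(0.30, 0.40, alpha=0.05, power=0.90)

# Example 2: difference of 10 mm Hg, standard deviation 30 mm Hg
n_sbp = n_per_group_means(10, 30, alpha=0.05, power=0.80)

print(f"Example 1: about {n_hernia} patients per group")
print(f"Example 2: about {n_sbp} patients per group")
```

Raising the power or lowering the significance level in either call increases the required sample size, which is the trade-off described in the decision matrix.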