
Measures of Association

Robert Heimer Yale University School of Public Health

April 2009

Overview for Today


Purpose of measuring associations

The 2x2 table


Ratio comparisons (relative measures)
Relative risk, risk/rate ratio, odds ratio

Difference comparison (absolute measures)


Risk/rate difference, population risk/rate difference, attributable proportion among exposed and total pop.

Formula, examples, interpretations
Put it all together with an example

Why Estimate Comparisons?


Quantitative comparison is an essential element of epidemiology to identify disease determinants.

Summarize relationship between exposure and disease by comparing at least two frequencies in a single summary estimate. Overall rate of disease in an exposed group says nothing about whether exposure is a risk factor for or causes a disease.
This can only be evaluated by comparing disease occurrence in an exposed group to another group that is usually not exposed. The latter group is usually called the comparison or reference group.

Two Main Options for Comparison


Calculate ratio of two measures of disease frequency
1. Relative comparison
2. Strength of relationship between exposure and outcome
3. Magnitude of association between exposure and outcome
4. How much more likely one group is to develop outcome

Calculate difference between two measures of disease frequency


1. Absolute comparison (attributable)
2. Public health impact of exposure
3. How much greater is the frequency of outcome in one group

Comparing Ratios

The 2-by-2 Table


                      Outcome
                  Yes     No      Total
Exposure  Yes     a       b       a+b
          No      c       d       c+d
          Total   a+c     b+d     a+b+c+d

Prevalence or cumulative incidence
= a/(a+b) among exposed
= c/(c+d) among unexposed
= (a+c)/(a+b+c+d) among total population

Example #1
Data from a fixed cohort study of oral contraceptive (OC) use and myocardial infarction (MI) in pre-menopausal women followed for 5 years (adapted from Rosenberg et al AJE 1980).

                     MI
                 Yes     No      Total
OC use   Yes     23      304     327
         No      133     2816    2949
         Total   156     3120    3276

5-year CI of myocardial infarction
= 23/327 = 7.0% among OC users
= 133/2949 = 4.5% among non-OC users
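The frequencies in Example #1 can be reproduced with a few lines of Python. This is a minimal sketch; the function name `cumulative_incidence` is an illustrative choice and not part of the original slides.

```python
# Cells of the 2-by-2 table from Example #1 (OC use by MI).
a, b = 23, 304      # exposed (OC users): MI yes, MI no
c, d = 133, 2816    # unexposed (non-users): MI yes, MI no


def cumulative_incidence(cases, non_cases):
    """Proportion of a fixed group that developed the outcome."""
    return cases / (cases + non_cases)


ci_exposed = cumulative_incidence(a, b)      # a / (a+b)
ci_unexposed = cumulative_incidence(c, d)    # c / (c+d)
ci_total = (a + c) / (a + b + c + d)         # whole cohort

print(f"OC users:     {ci_exposed:.1%}")
print(f"Non-OC users: {ci_unexposed:.1%}")
print(f"Total:        {ci_total:.1%}")
```

The same function applies to prevalence data, since the formula a/(a+b) is identical; only the interpretation changes.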

The 2-by-2 Table for Person-Time Data


                      Outcome
                  Yes     No      Total
Exposure  Yes     a       -       PTexp
          No      c       -       PTunexp
          Total   a+c     -       Total PT

Incidence rates
= a/(PTexp) among exposed
= c/(PTunexp) among unexposed
= (a+c)/(Total PT) among total population

Example #2 -- Using Person-Time Data


Data from a dynamic cohort study of postmenopausal hormone use (PH use) and coronary heart disease (CHD) among women (Stampfer et al NEJM 1985).

                     CHD
                 Yes     No      Total
PH use   Yes     30      --      54,308 PY
         No      60      --      51,477 PY
         Total   90      --      105,786 PY

Incidence rate of CHD
= 30/54,308 PY = 55.2/100,000 PY among PH users
= 60/51,477 PY = 116.6/100,000 PY among non-PH users
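A short Python sketch of the person-time calculation in Example #2. The helper name `incidence_rate` and its `per` argument are assumptions made for illustration; the person-year totals are those shown in the table.

```python
def incidence_rate(cases, person_years, per=100_000):
    """Cases per `per` person-years of observation."""
    return cases / person_years * per


ir_exposed = incidence_rate(30, 54_308)     # PH users
ir_unexposed = incidence_rate(60, 51_477)   # non-PH users

print(f"PH users:     {ir_exposed:.1f} per 100,000 PY")
print(f"Non-PH users: {ir_unexposed:.1f} per 100,000 PY")
```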

Relative Comparisons
Based on ratio of 2 measures of frequency
Often referred to as relative risk
Dimensionless and ranges from 0 to infinity

Prevalence ratio (PR) = Pexp / Punexp = [a / (a+b)] / [c / (c+d)]
Cumulative incidence ratio (CIR) = CIexp / CIunexp = [a / (a+b)] / [c / (c+d)]
Incidence rate ratio (IRR) = IRexp / IRunexp = [a / PTexp] / [c / PTunexp]

Interpretations of Risk Ratios


Gives information on the relative effect of the exposure on the disease. Tells you how many times higher or lower the disease risk is among the exposed as compared to the unexposed.

Therefore, commonly used in etiologic research as a measure of risk


Note:
Some people consider CIR and IRR better measures of relative risk than PR because CIR and IRR include measures of time

Example #1 (continued)
                     MI
                 Yes     No      Total
OC use   Yes     23      304     327
         No      133     2816    2949
         Total   156     3120    3276

5-yr CIR of MI among OC users compared to non-OC users:
5-yr CIR = CIe/CIu = [a/(a+b)]/[c/(c+d)] = (23/327)/(133/2949) = 1.6

Interpreted as relative risk:
1. Women who used OCs had 1.6 times the risk of having MI over 5 years compared to non-OC users (1.6-fold increased risk).
2. There is a 60% increased risk of MI among OC users over a 5-yr period compared to non-users (60% more likely to have MI).
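The CIR above can be checked with a small Python sketch. The function name `risk_ratio` is hypothetical; the cell labels follow the 2-by-2 table convention used in these slides.

```python
def risk_ratio(a, b, c, d):
    """[a/(a+b)] / [c/(c+d)]: risk in exposed over risk in unexposed."""
    return (a / (a + b)) / (c / (c + d))


# Example #1: OC use and MI over 5 years.
cir = risk_ratio(a=23, b=304, c=133, d=2816)
print(f"5-yr CIR = {cir:.2f}")
```

The unrounded ratio is about 1.56, which the slide reports as 1.6; the "60% increase" wording is based on that rounded value.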

Example #2 (continued)
                     CHD
                 Yes     No      Total
PH use   Yes     30      --      54,308 PY
         No      60      --      51,477 PY
         Total   90      --      105,786 PY

Incidence rate ratio of CHD among PH users compared to non-PH users:
IRR = IRe/IRu = [a/PTe]/[c/PTu] = 0.5

Interpreted as a relative risk:
Women who used PH had 0.5 times, or half, the risk of developing CHD compared to non-users.
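The same calculation for person-time data, as a hedged Python sketch. The function name `rate_ratio` is an illustrative assumption; the counts and person-years come from the table above.

```python
def rate_ratio(cases_exp, pt_exp, cases_unexp, pt_unexp):
    """[a/PTexp] / [c/PTunexp]: incidence rate ratio."""
    return (cases_exp / pt_exp) / (cases_unexp / pt_unexp)


# Example #2: PH use and CHD.
irr = rate_ratio(30, 54_308, 60, 51_477)
print(f"IRR = {irr:.2f}")
```

Because the units of person-time cancel, the ratio is the same whether rates are expressed per 1 PY or per 100,000 PY.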

Interpretation of Relative Risk


RR=1
Risk in exposed = risk in non-exposed
No association

RR>1
Risk in exposed > risk in non-exposed
Positive association, factor is associated with disease
Larger RR, stronger association

RR<1
Risk in exposed < risk in non-exposed
Negative association, factor is protective

RR for exposed = 1/RR for non-exposed

Need to pay attention to referent group
Risk for whom compared to whom?

Example #1: Extension Beyond the 2 x 2 Table


                        MI
                    Yes     No      Total
OC use  < 1 year    4       31      35
        1-4 years   5       107     112
        5-9 years   7       127     134
        10+ years   7       39      46
        Never       133     2816    2949
        Total       156     3120    3276

Any Two Groups Can Be Compared Applying Formulas on Appropriate Cells


                        MI
                    Yes     No      Total
OC use  < 1 year    4       31      35
        1-4 years   5       107     112
        5-9 years   7       127     134
        10+ years   7       39      46
        Never       133     2816    2949
        Total       156     3120    3276

The 5-yr CIR (relative risk) of MI among users of OC for 10 or more years compared to never users is (7/46) / (133/2949) = 3.4.
The 5-yr CIR (relative risk) of MI among users of OC for 10 or more years compared to users of 5-9 years is (7/46) / (7/134) = 2.9.
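Choosing the referent group amounts to choosing which row goes in the denominator. The sketch below stores each row of the table as (MI yes, MI no) and compares any two rows; the dictionary layout and the function name `cir` are illustrative assumptions. It uses the row total of 46 for the 10+ years group, as in the table.

```python
# (MI yes, MI no) by duration of OC use, from the table above.
table = {
    "<1 year": (4, 31),
    "1-4 years": (5, 107),
    "5-9 years": (7, 127),
    "10+ years": (7, 39),
    "Never": (133, 2816),
}


def cir(table, group, reference):
    """Cumulative incidence ratio of `group` relative to `reference`."""
    def risk(name):
        cases, non_cases = table[name]
        return cases / (cases + non_cases)
    return risk(group) / risk(reference)


for ref in ("Never", "5-9 years"):
    print(f"10+ years vs {ref}: CIR = {cir(table, '10+ years', ref):.1f}")
```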

Case-Control Studies
Participants are selected for study participation on the basis of pre-existing disease status
Cannot estimate prevalence, CI, or IR
Do not know the population at risk

Cannot use previous formulas
Relative risk can be estimated by odds ratio (OR)
Ratio of odds of exposure among cases to odds of exposure among controls

OR = (a/c)/(b/d) = ad/bc

Case-control: Looking Back


[Diagram: starting from disease status (Yes/No), look back at exposure status (Yes/No) within each disease group.]

Example #3: Odds Ratio


Example adapted from Kehrberg et al 1981 AJE

                                  Toxic shock syndrome
                                  Cases    Controls   Total
Tampon brand used      Rely       15       14         (29)
during month of        Other      9        45         (54)
illness                Total      24       59         (83)

OR = (15*45)/(14*9) = 5.4

Interpretations:
The odds of using Rely tampons among women with TSS were 5.4 times the odds of using Rely tampons among those without TSS (technical).
Women who used Rely tampons were 5.4 times more likely to develop toxic shock syndrome than women who used other brands (loosely).
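A minimal Python sketch of the cross-product calculation in Example #3. The function name `odds_ratio` is an assumption for illustration; cells are labelled as in the case-control 2-by-2 table (a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls).

```python
def odds_ratio(a, b, c, d):
    """Exposure odds ratio: (a/c) / (b/d) = ad / bc."""
    return (a * d) / (b * c)


# Example #3: Rely tampon use among TSS cases and controls.
or_tss = odds_ratio(a=15, b=14, c=9, d=45)
print(f"OR = {or_tss:.1f}")
```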

Notes on Odds Ratios


Can be thought of as exposure odds ratio

These data provide an estimate of risk in some situations.


They do not allow one to estimate incidence of TSS in the population of women at risk.

To do this, one would need to know the number of all cases of TSS among women (numerator) and the number of all women who are at risk (denominator).
Usually, denominator data are not available and numerator data may be incomplete in case-control studies.

Odds Ratio Approximates Relative Risk


When disease is rare
Proportion of cases in exposed and unexposed groups is low
a << b, so a+b ≈ b, and c << d, so c+d ≈ d
RR = [a/(a+b)] / [c/(c+d)] ≈ (a/b) / (c/d) = ad/bc

If disease is not rare:


When cases are newly diagnosed
When prevalent cases are excluded, making it more like a cohort/incidence study
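The rare-disease approximation can be seen numerically by computing RR and OR from the same 2-by-2 table. In this sketch the first table is the cohort data from Example #1 (outcome fairly rare); the second table is entirely hypothetical and was made up here only to show what happens when the outcome is common.

```python
def risk_ratio(a, b, c, d):
    """[a/(a+b)] / [c/(c+d)]"""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """ad / bc"""
    return (a * d) / (b * c)


rare = (23, 304, 133, 2816)   # Example #1: risks of about 7% and 4.5%
common = (60, 40, 30, 70)     # hypothetical: risks of 60% and 30%

rr_rare, or_rare = risk_ratio(*rare), odds_ratio(*rare)
rr_common, or_common = risk_ratio(*common), odds_ratio(*common)

print(f"Rare outcome:   RR = {rr_rare:.2f}, OR = {or_rare:.2f}")
print(f"Common outcome: RR = {rr_common:.2f}, OR = {or_common:.2f}")
```

With the rare outcome the two measures nearly agree (about 1.56 versus 1.60); with the common outcome the OR (3.5) clearly overstates the RR (2.0).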

Comparing Differences

Difference (Absolute) Comparisons


Based on difference between 2 measures of frequency
Comparing disease occurrence among the exposed with the disease occurrence among the unexposed comparison group by subtracting one from the other.
Gives information on:
the absolute effect of exposure on disease occurrence
the excess disease risk, or disease burden, in the exposed group compared to the unexposed group
the public health impact of an exposure, that is, how much disease would be prevented if the exposure were removed

Note: this assumes that the exposure causes the disease

Risk Difference
Risk difference (RD) = Rexp - Runexp
For Prev*: RD = Pexp - Punexp = a / (a+b) - c / (c+d)
For CI: RD = CIexp - CIunexp = a / (a+b) - c / (c+d)
For IR: RD = IRexp - IRunexp = a / PTexp - c / PTunexp

* Some epidemiologists hesitate to use "risk" when referring to prevalence estimates
More precisely called prevalence difference, cumulative incidence difference, and incidence rate difference
Also called attributable risk, rate difference, attributable rate
Note: "attributable" implies causality
RD = 0 when there is no association between exposure and disease

What we can hope to accomplish in reducing risk of disease among exposed if exposure were eliminated

Example #4
(adapted from Boice 1977 JNCI)

                         Breast cancer cases   PY        Rates per 10,000 PY
Radiation exposure       41                    28,010    14.6
No radiation exposure    15                    19,017    7.9
Total                    56                    47,027    11.9

IRD = 14.6 - 7.9 = 6.7 / 10,000 PY

Interpretation:
Broad: there are 6.7 excess cases of breast cancer for every 10,000 PY among those exposed to radiation compared to those not exposed.
Narrow: Eliminating this radiation would prevent 6.7 cases of breast cancer for every 10,000 PY (attribution).
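A brief Python sketch of the incidence rate difference in Example #4, starting from the case counts and person-years rather than the rounded rates. The function names are illustrative assumptions.

```python
def incidence_rate(cases, person_years, per=10_000):
    """Cases per `per` person-years."""
    return cases / person_years * per


def rate_difference(cases_exp, pt_exp, cases_unexp, pt_unexp, per=10_000):
    """IRexp - IRunexp, expressed per `per` person-years."""
    return (incidence_rate(cases_exp, pt_exp, per)
            - incidence_rate(cases_unexp, pt_unexp, per))


ird = rate_difference(41, 28_010, 15, 19_017)
print(f"IRD = {ird:.2f} per 10,000 PY")
```

Computed from unrounded rates the difference is about 6.75 per 10,000 PY; subtracting the rounded rates (14.6 - 7.9) gives the 6.7 shown on the slide.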

Example #5: Comparison of RR and RD


                      Annual Mortality Rate Per 100,000
                      Lung Cancer         Coronary Heart Disease
Cigarette Smoker      140                 669
Non Smoker            10                  413
RR                    14.0                1.6
RD                    130/100,000/YR      256/100,000/YR

Conclusion: Cigarette smoking is a much stronger risk factor for lung cancer but (assuming smoking is causally related to both diseases) the elimination of cigarette smoking would prevent far more deaths from coronary heart disease. Why is this so? Death from CHD is much more common.
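The contrast between the two measures in Example #5 can be reproduced directly from the four mortality rates. This is a small sketch; the dictionary layout is an illustrative choice.

```python
# Annual mortality per 100,000: (cigarette smokers, non-smokers).
rates = {
    "Lung cancer": (140, 10),
    "Coronary heart disease": (669, 413),
}

results = {}
for disease, (smokers, non_smokers) in rates.items():
    rr = smokers / non_smokers   # relative measure: strength of association
    rd = smokers - non_smokers   # absolute measure: excess deaths per 100,000/yr
    results[disease] = (rr, rd)
    print(f"{disease}: RR = {rr:.1f}, RD = {rd} per 100,000 per year")
```

The RR is far larger for lung cancer, while the RD is larger for coronary heart disease because its baseline rate among non-smokers is so much higher.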

In Other Words
Relative risk is a measure of strength of association between exposure and disease and is useful in analytical studies (risk)

Risk difference is a measure of how much disease incidence is attributable to exposure, and is useful in assessing an exposure's public health importance (burden)

Population Risk Difference (PRD)


Measures excess disease occurrence among the total population that is associated with the exposure.
Helps to evaluate which exposures are most relevant to the health of a target population.
Describes impact of exposure on total population
Number of cases that would be eliminated in total population if exposure were removed (assuming causality)

Depends on prevalence of exposure in population
For example, PRD will be low if exposure is rare, even if RR is high

Also called population attributable risk

Calculating Population Risk Difference


Two formulas for PRD:
PRD = Rt - Ru, where Rt = risk in total population and Ru = risk among unexposed
PRD = (RD)*(PPexp), where PPexp is the proportion of the population that is exposed

Note (as always) that risk may refer to IR or CI (more accurately) or prevalence (less accurately)

Example #4 Revisited
                         Breast cancer cases   PY        Rates/10,000 PY
Radiation exposure       41                    28,010    14.6
No radiation exposure    15                    19,017    7.9
Total                    56                    47,027    11.9

PRD = 11.9 - 7.9 = 4.0 / 10,000 PY

Interpretation:
4 excess breast cancer cases for every 10,000 PY of observation can be attributed to radiation exposure.
If radiation causes breast cancer, then 4 cases of breast cancer for every 10,000 person-years of observation could be prevented if the radiation exposure were removed.
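Both PRD formulas give the same answer on the Example #4 data, which the sketch below checks. One assumption is made explicit here: because these are person-time data, the "proportion exposed" is taken to be the exposed share of total person-years. Variable names are illustrative.

```python
PER = 10_000  # express rates per 10,000 person-years

cases_exp, pt_exp = 41, 28_010       # radiation exposure
cases_unexp, pt_unexp = 15, 19_017   # no radiation exposure

r_exp = cases_exp / pt_exp * PER
r_unexp = cases_unexp / pt_unexp * PER
r_total = (cases_exp + cases_unexp) / (pt_exp + pt_unexp) * PER

# Formula 1: PRD = Rt - Ru
prd_from_total = r_total - r_unexp

# Formula 2: PRD = RD * proportion exposed
# (assumption: proportion of person-time contributed by the exposed)
prop_exposed = pt_exp / (pt_exp + pt_unexp)
prd_from_rd = (r_exp - r_unexp) * prop_exposed

print(f"PRD = Rt - Ru    : {prd_from_total:.2f} per 10,000 PY")
print(f"PRD = RD * PPexp : {prd_from_rd:.2f} per 10,000 PY")
```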

Attributable Proportion among Exposed


APe describes the proportion of disease among exposed that is due (attributable) to exposure or that would be prevented if exposure were eliminated
Risk in non-exposed group can be considered background incidence that would occur regardless of exposure
Interpretation may assume causal relationship

APe = [(Re - Ru)/Re] * 100 = RD / Re * 100

Re = risk (IR, CI, P) among exposed
Ru = risk (IR, CI, P) among unexposed
Also called etiologic fraction, attributable risk percent, attributable risk among exposed

Alternative formula
Again, we are interested in the difference between the risk in the exposed and risk in the unexposed:

APe = [(Re - Ru)/Re] * 100
Divide numerator and denominator by Ru:
APe = [(RR-1)/RR] * 100
(Different formulas may be helpful depending on what information you have.)

Note on alternative formula:


Especially useful for case-control data
You cannot estimate risk directly
Can use OR to estimate RR under certain conditions. In which case:

APe = [(OR-1)/OR] * 100


Example
(continued)

                         Breast cancer cases   PY        Rates/10,000 PY
Radiation exposure       41                    28,010    14.6
No radiation exposure    15                    19,017    7.9
Total                    56                    47,027    11.9

APe = [(14.6 - 7.9)/14.6] * 100 = 46%

Interpretation:
46% of cases of breast cancer among those exposed to radiation may be attributed to radiation exposure and could be eliminated if exposure were removed.
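A short Python sketch showing that the two APe formulas, [(Re - Ru)/Re] * 100 and [(RR-1)/RR] * 100, agree on the Example #4 data. The function names are hypothetical and chosen only for this illustration.

```python
def ape_from_risks(r_exp, r_unexp):
    """APe = [(Re - Ru) / Re] * 100"""
    return (r_exp - r_unexp) / r_exp * 100


def ape_from_rr(rr):
    """APe = [(RR - 1) / RR] * 100; an OR may stand in for RR in case-control data."""
    return (rr - 1) / rr * 100


r_exp = 41 / 28_010 * 10_000     # rate per 10,000 PY, radiation exposure
r_unexp = 15 / 19_017 * 10_000   # rate per 10,000 PY, no radiation exposure

ape_direct = ape_from_risks(r_exp, r_unexp)
ape_via_rr = ape_from_rr(r_exp / r_unexp)

print(f"APe from risks: {ape_direct:.0f}%")
print(f"APe from RR:    {ape_via_rr:.0f}%")
```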

Attributable proportion among total population


APt describes the proportion of disease among total population that would be eliminated if exposure were eliminated
What percent of disease in total population is due to exposure
Useful for setting priorities for public health action
What impact would eliminating the exposure have on the population?

Assumes causal relationship

APt = [(Rt - Ru)/Rt] * 100 = PRD / Rt * 100
Rt = risk (IR, CI, P) among total population
Ru = risk (IR, CI, P) among unexposed

Also known as population attributable risk percent

Alternative formulas
APt = [1 - (Ru / Rt)] * 100
APt = [(Pe)(RR-1)]/[(Pe)(RR-1)+1] * 100

where Pe = proportion of the total population that is exposed

For case-control studies:
APt = [(Pe)(OR-1)]/[(Pe)(OR-1)+1] * 100

where Pe = proportion of controls who are exposed

(Convince yourself with some algebra if you like.)
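For those who want the algebra, here is one way to get from the definition of APt to the formula in terms of Pe and RR. It is a sketch under one stated assumption: the risk in the total population is the exposure-weighted average of the risks in the exposed and unexposed groups, with Pe the proportion of the total population (or total person-time) that is exposed.

```latex
\begin{align*}
R_t &= P_e R_e + (1 - P_e) R_u
  && \text{(assumption: weighted average)} \\
\frac{R_t}{R_u} &= P_e \, RR + (1 - P_e) = P_e (RR - 1) + 1
  && \text{(divide by } R_u \text{, with } RR = R_e / R_u \text{)} \\
AP_t &= \frac{R_t - R_u}{R_t} \times 100
      = \frac{R_t / R_u - 1}{R_t / R_u} \times 100
      = \frac{P_e (RR - 1)}{P_e (RR - 1) + 1} \times 100
\end{align*}
```

The case-control version follows by substituting OR for RR and estimating Pe from the controls.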



Example
(continued)

                         Breast cancer cases   PY        Rates/10,000 PY
Radiation exposure       41                    28,010    14.6
No radiation exposure    15                    19,017    7.9
Total                    56                    47,027    11.9

APt = [(11.9 - 7.9)/11.9] * 100 = 34%

Interpretation:
34% of breast cancer cases in total study population may be attributed to radiation exposure and could be eliminated if exposure were removed.
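The APt definition and the alternative formula in terms of Pe and RR can be compared on the same data. In this sketch Pe is assumed to be the exposed share of total person-years, since the example uses person-time; variable names are illustrative.

```python
cases_exp, pt_exp = 41, 28_010       # radiation exposure
cases_unexp, pt_unexp = 15, 19_017   # no radiation exposure

r_exp = cases_exp / pt_exp
r_unexp = cases_unexp / pt_unexp
r_total = (cases_exp + cases_unexp) / (pt_exp + pt_unexp)

# Definition: APt = [(Rt - Ru) / Rt] * 100
apt_direct = (r_total - r_unexp) / r_total * 100

# Alternative: APt = Pe(RR - 1) / [Pe(RR - 1) + 1] * 100
# (assumption: Pe is the proportion of person-time that is exposed)
pe = pt_exp / (pt_exp + pt_unexp)
rr = r_exp / r_unexp
apt_alt = pe * (rr - 1) / (pe * (rr - 1) + 1) * 100

print(f"APt (definition):  {apt_direct:.0f}%")
print(f"APt (Pe and RR):   {apt_alt:.0f}%")
```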

Putting It All Together


The 58th annual convention of the American Legion was held in Philadelphia from July 21 until July 24, 1976. People at the convention included American Legion delegates, their families, and other legionnaires who were not official delegates. Between July 20th and August 30th, some of those who had been present became ill with a type of pneumonia subsequently named Legionnaire's Disease. No one attending the convention developed the disease after August 30th.


Exercise to Practice Measures of Comparison


Below are the numbers of delegates and non-delegates who developed Legionnaire's Disease during the period July 20 to August 30 (41 day period).
                             Developed Legionnaires Disease
                             Yes     No      Total
Convention   Delegate        125     1724    1849
Status       Non-Delegate    3       759     762

What is the risk of Legionnaires disease among delegates and non-delegates?


First: what type of measure is this?

Cumulative incidence among delegates:

Cumulative incidence among non-delegates:

What is the relative risk of Legionnaires disease among delegates compared to non-delegates?
First: what type of measure is this?

RR =

Interpretation:

What is the risk difference for Legionnaires disease among delegates vs. non-delegates?
First, what type of measure is this?

RD =

Interpretation:

What is the attributable proportion of disease among delegates?


APe =

Interpretation:

Further Analysis of Convention Delegates


                Developed Legionnaires Disease
                Yes     No       Total
Hotel A         62      628      690
Other Hotel     63      1,098    1,161

Cumulative incidence among Hotel A residents =

Cumulative incidence among other hotel residents =

CIR =

Interpretation:

Key points
Relative and absolute measures of comparison tell us different things
Relative risk is a measure of strength of association; often of interest to epidemiologists who do etiologic research
Absolute risk then becomes important (assuming causality) to public health planners, policy makers, etc. for estimating public health impact of exposures on communities

Measures go by different names and multiple formulas are sometimes available. It is important to
know what you are interested in measuring,
know what data are available and in what form, and
if data are not available, know how to collect appropriate data.
