
Explanation of statistical methods*

* Version 5.1 of 16 July 2004

1. Introduction
This paper sets out the indirect standardisation methodology and statistical methods
used in the construction of the clinical indicators. The paper is aimed at information
staff, but a knowledge of statistics is required to understand the method of dealing
with over-dispersion.

2. Indirect standardisation methodology


Why standardise?
Health outcomes are related to a wide range of factors, including the age and sex of
the patient. Different trusts treat different populations of patients. Standardisation is
used to adjust the indicator values to take into account differences between the
populations treated by individual trusts. The methodology used is termed indirect
standardisation, which involves calculating the ratio of a trust's observed number of
readmissions or deaths to the number that would be expected if it had experienced
the average rates of patients in England, given its mix of age, sex and other factors
(as specified in the indicator constructions). The ratios
(and their confidence intervals) are then converted into standardised readmission or
death rates, which are the values of the indicators. The standardised rate for a trust
will not usually be equal to the indicator numerator divided by the denominator.
All of the indicators are standardised for age and sex. Some indicators are also
standardised for other factors such as method of admission, surgical procedure and
diagnosis (see indicator construction documents for details).

Standardisation methodology
The standardisation methodology is illustrated using the worked example below. The
example considers data for emergency readmissions following treatment for hip
fracture for trust X, but is equally applicable to the other acute and mental health
clinical indicators.
Expected hip fracture readmissions are calculated for each sex and age group using
the formula:
Expected number of readmissions = n(j) * xs(j) / ns(j)
where
n(j) = number of spells relevant to the indicator in age group j in trust X
xs(j) = number of readmissions in age group j in the England data
ns(j) = number of spells relevant to the indicator in age group j in the England data
So, referring to the data in Table 1, for age group 80-84 the expected number of male
readmissions is

15 * (155/1510) = 1.54

and the expected number of female readmissions is

40 * (588/6837) = 3.44

Table 1  Example of data for emergency readmissions

Trust X:  x(j)  = numerator (denominator spells where the patient is readmitted <28 days);
          n(j)  = denominator (spells with emergency admission for hip fracture)
England:  xs(j) = numerator (denominator spells where the patient is readmitted <28 days);
          ns(j) = denominator (spells with emergency admission for hip fracture)

             Trust X                       England
Age range    x(j)          n(j)            xs(j)           ns(j)
j            Male  Female  Male  Female    Male   Female   Male   Female
0            0     0       0     0         0      1        5      10
1-4          0     0       1     0         3      0        27     20
5-9          0     0       0     0         1      0        23     17
10-14        0     0       1     0         2      1        74     31
15-19        0     0       0     0         2      0        51     14
20-24        0     0       1     0         1      0        35     13
25-29        0     0       0     0         1      0        60     12
30-34        0     0       0     0         6      1        117    19
35-39        0     0       0     0         7      1        112    38
40-44        1     0       1     2         7      2        140    49
45-49        0     0       0     0         11     7        155    103
50-54        0     0       1     4         28     13       270    269
55-59        0     1       0     3         28     22       332    403
60-64        0     0       1     2         34     51       399    682
65-69        0     1       4     6         48     89       580    1261
70-74        0     3       6     14        95     194      969    2597
75-79        2     4       7     29        164    409      1453   5036
80-84        1     3       15    40        155    588      1510   6837
85+          3     9       9     73        239    1124     1964   12394
Total        7     21      47    173       832    2503     8276   29805
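
For illustration only, the arithmetic of this step can be sketched in a few lines of Python. This is not the production calculation; it simply restates the formula above using the age 80-84 figures from Table 1 (the function and variable names are chosen here for clarity):

```python
# Sketch of the expected-readmissions formula of section 2, applied to the
# age 80-84 rows of Table 1. Names are illustrative, not from any official code.

def expected_readmissions(n_j, xs_j, ns_j):
    """Expected readmissions for one age/sex group:
    trust spells n(j) multiplied by the England rate xs(j) / ns(j)."""
    return n_j * xs_j / ns_j

# Males aged 80-84: n(j) = 15, England xs(j) = 155, ns(j) = 1510
print(round(expected_readmissions(15, 155, 1510), 2))   # 1.54

# Females aged 80-84: n(j) = 40, England xs(j) = 588, ns(j) = 6837
print(round(expected_readmissions(40, 588, 6837), 2))   # 3.44
```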

The full set of expected readmissions is shown in Table 2.


The standardised ratio (SR) for persons is then calculated as:

SR = sum of all observed values / sum of all expected values = Σ x(j) / Σ e(j)

where
x(j) = number of observed readmissions in age group j for PCT / Trust X
e(j) = number of expected readmissions in age group j for PCT / Trust X

So in the example,

SR = (7 + 21) / (4.76 + 14.47) = 1.46


Table 2  Calculated expected readmissions for emergency readmissions example

Trust X
Age range j   Expected readmissions e(j)
              Male    Female
0             0.00    0.00
1-4           0.11    0.00
5-9           0.00    0.00
10-14         0.03    0.00
15-19         0.00    0.00
20-24         0.03    0.00
25-29         0.00    0.00
30-34         0.00    0.00
35-39         0.00    0.00
40-44         0.05    0.08
45-49         0.00    0.00
50-54         0.10    0.19
55-59         0.00    0.16
60-64         0.09    0.15
65-69         0.33    0.42
70-74         0.59    1.05
75-79         0.79    2.36
80-84         1.54    3.44
85+           1.10    6.62
Total         4.76    14.47

The final indicator value is produced by multiplying through by the England crude rate
per 100,000, i.e.
Indicator value = SR * England crude rate
So in the example,
England crude rate = ((832 + 2503) / (8276 + 29805)) * 100000 = 8757.65
Hence, the indicator value per 100,000 = 1.46 * 8757.65 = 12786.17
This crude rate multiplication is used in both the Scottish and Welsh clinical
indicators and is done in order to produce a more useable and interpretable final
value.
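
Again purely as an illustrative sketch (not the official implementation), the standardised ratio and final indicator value can be reproduced from the totals in Tables 1 and 2:

```python
# Sketch reproducing the worked example: standardised ratio (SR), England
# crude rate and final indicator value. Figures are the totals from Tables 1 and 2.

observed = [7, 21]          # total observed readmissions x(j): male, female
expected = [4.76, 14.47]    # total expected readmissions e(j): male, female

sr = sum(observed) / sum(expected)
print(round(sr, 2))                                   # 1.46

# England crude rate per 100,000 = total England readmissions / total England spells
crude_rate = (832 + 2503) / (8276 + 29805) * 100_000
print(round(crude_rate, 2))                           # 8757.65

# Final indicator value per 100,000 (rounded figures used, as in the text)
print(round(1.46 * 8757.65, 2))                       # 12786.17
```
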
Where data for more than one calendar year are pooled for the calculation of the
indicator, the data are standardised using the England averages for the pooled data.
Where data for a single year are used for the calculation of the indicator, data for the
most recent three calendar years are analysed, with the data for the middle of the
three years being used to provide the England averages for standardisation. Data for
the most recent calendar year are used for calculating the published indicator values.


3. Comparing a trust's value with a "standard"


Confidence intervals
The outcome of hospital spells will vary from one group of patients to another, so for
any particular group the underlying value attributable to a hospital can only be
estimated. Confidence intervals are used to reflect this uncertainty. The probability
that the 95% confidence interval contains the true value of the rate is 95%.

Hypothesis testing
Tests of statistical significance are used to compare the observed rates of a trust with
a standard such as the national average rate. For example, P<0.001 indicates that
there is less than a one in 1000 chance of getting such a high trust result by chance
alone, were the rate of that trust truly at the national standard level. Five bands can
be used to grade trust values: A1 and A5 denote trust values above the national average
with P<0.001 or P<0.025 respectively; B1 and B5 denote trust values below the national
average with P<0.001 or P<0.025 respectively; and W denotes trust values not considered
to be statistically significantly different from the national average. P<0.001 and
P<0.025 will generally, but not always exactly, correspond to whether the respective
99.8% and 95% confidence intervals overlap with those of the average.
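
As a minimal sketch of how the banding rule above could be applied, assuming a one-sided P-value against the national average has already been calculated for each trust (the function below simply restates the thresholds in this paragraph and is not the official implementation):

```python
# Illustrative banding of a trust from a one-sided P-value and the direction
# of its difference from the national average. Thresholds follow the text:
# P < 0.001 gives A1/B1, P < 0.025 gives A5/B5, otherwise W.

def band(p_value, above_average):
    """Return the performance band for one trust."""
    if p_value < 0.001:
        return "A1" if above_average else "B1"
    if p_value < 0.025:
        return "A5" if above_average else "B5"
    return "W"

print(band(0.0004, above_average=True))    # A1
print(band(0.010, above_average=False))    # B5
print(band(0.300, above_average=True))     # W
```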

Over-dispersion
Indicators based on large numbers of cases are measured with a precision that can result
in statistically significant differences that are not of practical importance. For the
clinical indicators it is reasonable to accept as inevitable a degree of between-trust
variability in performance and to seek to identify trusts that deviate from this
distribution of performance, rather than from a single standard. This underlying
distribution is estimated using techniques that avoid undue influence of outlying
trusts, in which a proportion of the top and bottom values are Winsorised (i.e. shrunk
in). The significance of observed deviations then takes into account both the
precision with which the indicator is measured within each trust (i.e. the sample size)
and the estimated between-trust variability. This statistical technique is used for the
clinical indicators to reduce the possibility of inappropriately classifying trusts as
abnormal.

Interpretation
Both the 95% and 99% confidence intervals are calculated for each trust, and can be
used by trusts to compare their performance with the national figure. The upper and
lower confidence limits are provided to give a plausible range for the true underlying
rate. If trust and national confidence intervals do not overlap, this generally indicates
that differences in rates are statistically significant. This approach has been used for
star ratings purposes in the past and is straightforward for trusts to use. However,
the star ratings model itself has evolved to use formal statistical tests using
probability (P) values rather than overlapping confidence intervals. Trust performance
is indicated by one of five bands A1, A5, W, B5, B1 (defined above) assigned using
these statistical tests.

4. Technical description of the over-dispersion technique


The following section gives technical details of how the confidence limits are
calculated. This section is aimed at people with a specialist knowledge of statistics as
the detailed calculations are difficult for trusts to reproduce. There is an example of
the technique applied to emergency readmissions from acute hospitals at the end of
this document to illustrate its practical effects.


We assume an indicator Y with a target $\theta_0$ which specifies the desired expectation, so
that $E(Y \mid \theta_0) = \theta_0$. The target is assumed known and measured without error.

For each observation $y_i$ we calculate a standard P-value

$$p_i = P(Y \le y_i \mid \theta_0, \xi_i)$$

where $\xi_i$ is a measure of measurement precision, such as the sample size.


These P-values are then used to test the hypothesis that a trust is "on-standard", i.e.
$E(Y) = \theta_0$.

P-values can be converted to standardised Z-scores by $z_i = \Phi^{-1}(p_i)$, where $\Phi^{-1}$ is the
inverse standard normal cumulative distribution function. For indicators with
suspected over-dispersion, sample sizes will generally be large enough that the
indicator can be reported as $y_i$ and $s_i$ ($s_i$ = estimated standard error of $y_i$). The general
definition of a Z-statistic is

$$z_i = \frac{y_i - \theta_0}{s_{i0}} \qquad (1)$$

where $s_{i0}$ is the standard error of $y_i$ given the trust is on target: hence
$s_{i0} = \sqrt{\mathrm{Var}(y_i \mid \theta_0, \xi_i)}$. $z_i$ is referred to as the unadjusted Z-score in the published data.
It is important to note that $s_{i0}$ may not necessarily be the same as the reported $s_i$, and
hence some care is required in calculating the Z-scores. For example, if $y_i$ is an
observed proportion between 0 and 1, then

$$s_i = \sqrt{y_i (1 - y_i) / n_i}$$

where $n_i$ is the effective sample size. $s_{i0}$ can then be estimated to be

$$s_{i0} = \sqrt{\theta_0 (1 - \theta_0) / n_i}$$


For normal approximations, funnel plot limits $\theta_0 \pm z_p s_0$ are defined as a function of the
standard error $s_0$, where $z_p$ is the appropriate standard normal deviate. The
difference between these two standard errors explains why a confidence interval may
just include the standard, whereas the P-value may indicate the standard is not being
met.
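
A minimal sketch of the unadjusted Z-score and the funnel limits for a proportion, under the normal approximation described above. The target theta0 and the volumes below are illustrative only (theta0 is set to 5.9% simply to echo the example in the next section):

```python
# Unadjusted Z-score (equation 1) and funnel plot limits for an observed
# proportion y_i with effective sample size n_i, against a target theta0.
# All numerical values are illustrative.
from math import sqrt
from statistics import NormalDist

theta0 = 0.059   # illustrative target, e.g. a national readmission rate of 5.9%

def unadjusted_z(y_i, n_i, theta0=theta0):
    """z_i = (y_i - theta0) / s_i0, with s_i0 = sqrt(theta0 * (1 - theta0) / n_i)."""
    s_i0 = sqrt(theta0 * (1 - theta0) / n_i)
    return (y_i - theta0) / s_i0

def funnel_limits(n_i, p=0.975, theta0=theta0):
    """Funnel limits theta0 +/- z_p * s_0; p = 0.975 gives the 95% limits,
    p = 0.999 the 99.8% limits."""
    z_p = NormalDist().inv_cdf(p)
    s_0 = sqrt(theta0 * (1 - theta0) / n_i)
    return theta0 - z_p * s_0, theta0 + z_p * s_0

print(round(unadjusted_z(0.068, 20_000), 2))          # Z-score for a 6.8% observed rate
print([round(v, 4) for v in funnel_limits(20_000)])   # 95% funnel limits at this volume
```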

Explanation of statistical methods

Winsorising Z-scores
Winsorising consists of shrinking in the extreme Z-scores to some selected
percentile, using the following method.
1. Rank cases according to their naive Z-scores.
2. Identify $z_q$ and $z_{1-q}$, the 100q% most extreme top and bottom naive Z-scores,
   where q might, for example, be 0.1.
3. Set the lowest 100q% of Z-scores to $z_q$, and the highest 100q% of Z-scores to $z_{1-q}$.
   These are the Winsorised statistics.

This retains the same number of Z-scores but discounts the influence of outliers.
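
A minimal sketch of this Winsorising procedure, applied to a list of illustrative Z-scores (not production code):

```python
# Winsorise a list of naive Z-scores at the 100q% level, following the three
# steps above. The Z-scores are illustrative.

def winsorise(z_scores, q=0.1):
    """Shrink the lowest/highest 100q% of Z-scores in to the z_q and z_{1-q} values."""
    ranked = sorted(z_scores)
    n = len(ranked)
    k = int(n * q)                             # number of scores shrunk at each end
    if k == 0:
        return list(z_scores)
    z_q, z_1q = ranked[k], ranked[n - k - 1]   # the z_q and z_{1-q} values
    return [min(max(z, z_q), z_1q) for z in z_scores]

z = [-4.2, -1.1, -0.3, 0.0, 0.4, 0.9, 1.2, 1.5, 2.1, 5.8]
print(winsorise(z, q=0.1))   # -4.2 is shrunk in to -1.1 and 5.8 to 2.1
```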

Estimation of over-dispersion
Following the standard approach of generalised linear modelling (McCullagh and
Nelder, 1989), an over-dispersion factor $\phi$ is introduced that will inflate the null
variance, so that

$$\mathrm{Var}(Y \mid \theta_0, \xi, \phi) = \phi \, \mathrm{Var}(Y \mid \theta_0, \xi)$$

Suppose we have a sample of I units that we shall assume (for the present) all to be
on-standard. $\phi$ may be estimated as follows:

$$\hat{\phi} = \frac{1}{I} \sum_i z_i^2 \qquad (2)$$

where $z_i$ is the standardised Pearson residual defined in (1). $I\hat{\phi}$ is a standard test
of heterogeneity, and is distributed as $\chi^2_{I-1}$ under the null hypothesis that
all are hitting the target. Over-dispersion might only be assumed if $\hat{\phi}$ is significantly
greater than one. Winsorised Z-scores can be used in estimating $\hat{\phi}$.
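
A sketch of the estimate in equation (2), reusing the illustrative Winsorised Z-scores from the previous sketch:

```python
# Over-dispersion factor phi-hat of equation (2): the mean of the squared
# (Winsorised) Z-scores. Values are illustrative.

def estimate_phi(z_scores):
    """phi_hat = (1/I) * sum of z_i squared."""
    return sum(z * z for z in z_scores) / len(z_scores)

z_winsorised = [-1.1, -1.1, -0.3, 0.0, 0.4, 0.9, 1.2, 1.5, 2.1, 2.1]
phi_hat = estimate_phi(z_winsorised)
print(round(phi_hat, 2))

# I * phi_hat is the heterogeneity statistic that can be compared with a
# chi-squared distribution on I - 1 degrees of freedom.
print(round(len(z_winsorised) * phi_hat, 2))
```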

An additive random effects model

This assumes that $E(Y_i) = \theta_i$, and that for on-standard trusts $\theta_i$ is distributed with
mean $\theta_0$ and standard deviation $\tau$. $\tau^2$ can be estimated using a standard method of
moments (DerSimonian and Laird, 1986):

$$\hat{\tau}^2 = \frac{I\hat{\phi} - (I - 1)}{\sum_i w_i - \sum_i w_i^2 \big/ \sum_i w_i} \qquad (3)$$

where $w_i = 1/s_i^2$ and $I\hat{\phi}$ is the test for heterogeneity: if $I\hat{\phi} < (I - 1)$, then $\hat{\tau}^2$ is
set to 0 and complete homogeneity is assumed. Otherwise the adjusted Z-scores are
given by

$$z_i^D = \frac{y_i - \theta_0}{\sqrt{s_i^2 + \hat{\tau}^2}}$$
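
A sketch of the method-of-moments estimate in equation (3) and the resulting adjusted Z-scores. The observed rates, standard errors, target and phi-hat below are invented for the example and are not taken from any published indicator:

```python
# Additive random-effects adjustment: tau-squared by the method of moments
# (equation 3) and the adjusted Z-scores z_i^D. Inputs are illustrative.
from math import sqrt

def tau_squared(phi_hat, s):
    """tau^2 = (I*phi - (I-1)) / (sum w - sum w^2 / sum w), with w_i = 1/s_i^2.
    Set to 0 if I*phi < I - 1 (complete homogeneity assumed)."""
    I = len(s)
    if I * phi_hat < I - 1:
        return 0.0
    w = [1.0 / (si * si) for si in s]
    return (I * phi_hat - (I - 1)) / (sum(w) - sum(wi * wi for wi in w) / sum(w))

def adjusted_z(y, s, theta0, tau2):
    """z_i^D = (y_i - theta0) / sqrt(s_i^2 + tau^2) for each unit."""
    return [(yi - theta0) / sqrt(si * si + tau2) for yi, si in zip(y, s)]

# Illustrative data: observed rates, their standard errors, target and phi-hat
y = [0.052, 0.061, 0.075, 0.058]
s = [0.004, 0.003, 0.005, 0.002]
theta0, phi_hat = 0.059, 3.2

tau2 = tau_squared(phi_hat, s)
print(tau2)
print([round(z, 2) for z in adjusted_z(y, s, theta0, tau2)])
```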


Example of the over-dispersion technique


This technique is illustrated using an example derived from Spiegelhalter (2004).
Figure 1 shows 'funnel plots' for readmission rates in 67 large hospitals, with and
without adjustment for over-dispersion. The 'control limits' show the area in which
we expect 95% and 99.8% of hospitals to lie, were they all to have the national
readmission rate of 5.9%. It is clear from Figure 1(a) that substantial numbers of
hospitals lie outside these bounds, suggesting that unmeasured factors are
influencing the results beyond chance alone. The adjustment for over-dispersion
expands the control limits to include the bulk of the hospitals, while still allowing
discrimination of those with a practically important difference from the standard.

[Figure 1: two funnel plots of % readmission against volume of cases. Panel (a),
emergency re-admission within 28 days of discharge, shows the 95% and 99.8% control
limits; panel (b), additive over-dispersion, shows the expanded limits, with a random
effects SD of 0.086 based on 10% Winsorised Z-scores.]

Figure 1. 'Funnel plots' of proportions of emergency re-admission within
28 days of discharge from 67 large acute or multi-service hospitals in
England, 2000-01, plotted against volume of cases. The standard is
the overall average rate of 5.9%. Panel (a) shows the over-dispersion around
the 99.8% and 95% limits predicted by the standard, while panel (b) shows the
expanded limits based on an additive random-effects model (adapted
from Spiegelhalter, 2004).

References

Breslow NE & Day NE. Statistical Methods in Cancer Research, Volume II: The Design
and Analysis of Cohort Studies. IARC, Lyon, 1987: 64, 91-95, 107-108.

DerSimonian R & Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials
1986; 7: 177-188.

McCullagh P & Nelder J. Generalised Linear Models, 2nd edition. Chapman and Hall,
London, 1989.

Spiegelhalter DJ. Funnel plots for institutional comparisons. Statistics in Medicine
2004 (to appear).
