Statistical variability and confidence intervals for planar dose QA pass rates

Daniel W. Baileya)
Department of Physics, State University of New York at Buffalo, Buffalo, New York 14260 and Department
of Radiation Medicine, Roswell Park Cancer Institute, Buffalo, New York 14263
Benjamin E. Nelms
Canis Lupus LLC, Merrimac, Wisconsin 53561
Kristopher Attwood
Department of Biostatistics, Roswell Park Cancer Institute, Buffalo, New York 14263
Lalith Kumaraswamy
Department of Radiation Medicine, Roswell Park Cancer Institute, Buffalo, New York 14263
Matthew B. Podgorsak
Department of Radiation Medicine, Roswell Park Cancer Institute, Buffalo, New York 14263; Department of
Molecular and Cellular Biophysics and Biochemistry, Roswell Park Cancer Institute, Buffalo, New York
14263; and Department of Physiology and Biophysics, State University of New York at Buffalo, Buffalo,
New York 14214
(Received 17 May 2011; revised 14 September 2011; accepted for publication 26 September 2011;
published 20 October 2011)
Purpose: The most common metric for comparing measured to calculated dose, such as for pretreatment quality assurance of intensity-modulated photon fields, is a pass rate (%) generated using percent difference (%Diff), distance-to-agreement (DTA), or some combination of the two (e.g., gamma evaluation). For many dosimeters, the grid of analyzed points corresponds to an array with a low areal density of point detectors. In these cases, the pass rates for any given comparison criteria are not absolute but exhibit statistical variability that is a function, in part, of the detector sampling geometry. In this work, the authors analyze the statistics of various methods commonly used to calculate pass rates and propose methods for establishing confidence intervals for pass rates obtained with low-density arrays.
Methods: Dose planes were acquired for 25 prostate and 79 head and neck intensity-modulated fields via diode array and electronic portal imaging device (EPID), and matching calculated dose planes were created via a commercial treatment planning system. Pass rates for each dose plane pair (both centered to the beam central axis) were calculated with several common comparison methods: %Diff/DTA composite analysis and gamma evaluation, using absolute dose comparison with both local and global normalization. Specialized software was designed to selectively sample the measured EPID response (very high data density) down to discrete points to simulate low-density measurements. The software was used to realign the simulated detector grid at many simulated positions with respect to the beam central axis, thereby altering the low-density sampled grid. Simulations were repeated with 100 positional iterations using a 1 detector/cm² uniform grid, a 2 detector/cm² uniform grid, and similar random detector grids. For each simulation, %/DTA composite pass rates were calculated with various %Diff/DTA criteria and for both local and global %Diff normalization techniques.
Results: For the prostate and head/neck cases studied, the pass rates obtained with gamma analysis of high density dose planes were 2%–5% higher than respective %/DTA composite analysis on average (ranging as high as 11%), depending on tolerances and normalization. Meanwhile, the pass rates obtained via local normalization were 2%–12% lower than with global maximum normalization on average (ranging as high as 27%), depending on tolerances and calculation method. Repositioning of simulated low-density sampled grids leads to a distribution of possible pass rates for each measured/calculated dose plane pair. These distributions can be predicted using a binomial distribution in order to establish confidence intervals that depend largely on the sampling density and the observed pass rate (i.e., the degree of difference between measured and calculated dose). These results can be extended to apply to 3D arrays of detectors, as well.
Conclusions: Dose plane QA analysis can be greatly affected by choice of calculation metric and user-defined parameters, and so all pass rates should be reported with a complete description of calculation method. Pass rates for low-density arrays are subject to statistical uncertainty (vs. the high-density pass rate), but these sampling errors can be modeled using statistical confidence intervals derived from the sampled pass rate and detector density. Thus, pass rates for low-density array measurements should be accompanied by a confidence interval indicating the uncertainty of each pass rate. © 2011 American Association of Physicists in Medicine. [DOI: 10.1118/1.3651695]

Key words: radiation therapy, IMRT QA, quality assurance, dosimetry, dose verification, diode array

6053 Med. Phys. 38 (11), November 2011 0094-2405/2011/38(11)/6053/12/$30.00 © 2011 Am. Assoc. Phys. Med. 6053
6054 Bailey et al.: QA pass rates 6054

I. INTRODUCTION

Quantitative comparison of measured and calculated dose planes is a useful tool in a broad spectrum of quality assurance (QA) and commissioning in radiotherapy. For example, patient-specific quality assurance for intensity-modulated radiation therapy (IMRT) or volumetric-modulated arc therapy (VMAT) is commonly performed by measuring dose planes via arrays of point detectors, such as ionization chambers [1] or diodes [2–4], and comparing to respective dose planes calculated by the treatment planning system (TPS). The most common metric for comparing measured to calculated dose planes is a pass rate indicating the percentage of measured points that match the calculated plane within certain criteria [5,6]. Methods used to calculate these pass rates include dose percent difference (%Diff), distance-to-agreement (DTA), or some combination of the two (for example, the gamma evaluation [7,8]), each yielding somewhat different results.

For many dosimeters, the grid of analyzed points corresponds to an array with low detector density. In our experience, as will be demonstrated during the course of this study, the position of the low-density grid with respect to the high-resolution dose map can alter the pass rate (even for fixed analysis method and criteria) simply by changing the points sampled. In these cases, pass rates are not absolute, as they exhibit statistical variability that is a function, in part, of the detector sampling geometry and positioning [5,9].

In this work, we analyze the statistics of various methods commonly used to calculate pass rates and propose methods for establishing confidence intervals for pass rates obtained for low-density array measurements.

II. METHOD AND MATERIALS

II.A. High-density dose plane acquisition

2D dose planes were measured for 25 prostate IMRT fields and 79 head and neck (H&N) IMRT fields via a Varian PortalVision aS1000 electronic portal imaging device (EPID) coupled to a Varian Trilogy linear accelerator (Varian Medical Systems, Palo Alto, CA), at 105 cm source-to-image distance (SID) with no additional buildup. These fields were planned via the Eclipse 8.6 treatment planning system, also by Varian, for the 6 MV photon beam using dynamic multileaf collimation via the Millennium 120 multileaf collimator. Field sizes ranged from approximately 7 × 7 to 20 × 20 cm², with radiation output ranging from approximately 100 to 300 MU, and the fields were delivered with gantry angle fixed at 0°. Dark field and flood field calibrations were acquired on each day of measurement. Subsequent DICOM RT images were converted into dose planes via the EPIDOSE algorithm (Sun Nuclear Corporation, Melbourne, FL), commissioned with this EPID and the MAPCHECK 1 (also by Sun Nuclear Corporation) diode array according to vendor specifications, and optimized to this particular LINAC/EPID system [10]. The EPIDOSE program uses a four-step algorithm to convert EPID integrated images into dose planes in water. These converted dose planes are then compared directly with TPS dose planes calculated in water, thereby auditing both the fluence and dose calculations in the TPS. TPS calculated dose planes for each IMRT field were created via the Varian Eclipse 8.6 treatment planning system, using reference conditions of 95 cm source-to-surface distance (SSD) and 5 cm depth in water. Though an EPID was used in this study as the source of high density (HD) data, the particular source is not germane; any high density data source (e.g., film) could have been used. The EPID of course is itself limited in resolution by pixel size, as compared with film or computed radiography, which are only limited in resolution by the digitization process.

II.B. Pass rate calculation methods

The pass rate for each measured vs calculated plane pair (both centered on the beam central axis, i.e., CAX) was calculated with two common comparison methods. %/DTA composite analysis [11,12] (e.g., “DTA” analysis with the MAPCHECK software [2,13]) tests each measured point via sequential analysis of %Diff and DTA: if a measured point fails the %Diff test, the area defined by the DTA criterion is scanned for a measured point that exactly matches the calculated point. In contrast, gamma evaluation tests each measured point via a search of the surrounding realm of %Diff and distance values, finding the point with the minimum gamma index [7]. By definition, the gamma pass rate will be greater than or equal to the %/DTA composite pass rate.

The percent difference analysis of each method is dependent on how the dose %Diff is normalized, whether locally or globally. Using local normalization, the %Diff between any measured and calculated dose point pair is normalized to the planned dose at that local point, according to the following equation:

$$\%\mathrm{Diff}_{i,j} = \frac{M_{i,j} - P_{i,j}}{P_{i,j}} \times 100\%, \qquad (1)$$

where $\%\mathrm{Diff}_{i,j}$ represents the percent difference between the measured point dose $M_{i,j}$ and planned point dose $P_{i,j}$. However, using global normalization, the difference between any measured/calculated dose point pair is normalized using an identical value for all point pairs [5,14], usually the maximum measured or planned dose, according to the following equation:

$$\%\mathrm{Diff}_{i,j} = \frac{M_{i,j} - P_{i,j}}{P_{\mathrm{norm}}} \times 100\%, \qquad (2)$$

where $P_{\mathrm{norm}}$ represents the global normalization dose, a constant for all dose point pairs (i,j) within the entire dose map. If $P_{\mathrm{norm}}$ is selected as the point of maximum dose, then $P_{\mathrm{norm}} \geq P_{i,j}$ for all dose points by definition. Thus, the %Diff calculated by global normalization to maximum dose will always be less than or equal to locally normalized %Diff for the same dose point, yielding a higher overall pass rate. However, the extent to which a globally normalized pass rate exceeds its respective locally normalized pass rate is not readily apparent from these definitions and may be a function of other parameters such as fluence complexity and/or %Diff/DTA criteria.

In this study, pass rates were calculated for each measured/calculated plane pair using both local normalization

Medical Physics, Vol. 38, No. 11, November 2011


and global normalization for both the %/DTA composite analysis and gamma evaluation techniques. Two sets of %Diff/DTA criteria were employed (2%, 2 mm and 3%, 3 mm), and a 10% dose threshold was used for each analysis (i.e., the percentage of the maximum measured dose below which no measured dose point factors into the calculation).

Since many different pass rate calculation methods are commonly employed, we suggest that a shorthand label be used in reporting pass rates, especially between institutions:

(method, %Diff, DTA, threshold, dose-comparison, normalization), (3)

where the method is either “c” for %/DTA composite analysis or “γ” for gamma evaluation, %Diff and DTA are the numeric values for each respective analysis criterion, threshold is the chosen dose threshold, dose-comparison indicates the choice of dose comparison method (either “A” for absolute dose or “R” for relative dose), and normalization indicates the %Diff normalization method (either “L” for local or “Gnorm” for global). In this final index, “norm” represents the selected normalization value, which in this study is the maximum measured dose for all globally normalized calculations, i.e., Gmax. Thus, for example, a gamma evaluation using the parameters of 3%, 3 mm, 10% threshold, absolute dose, and global normalization would be written:

(γ, 3, 3, 10, A, Gmax). (4)

In this study, many pass rates are reported for various situations, and so a shortened form of this label is used. The pass rates reported in this work are all calculated for absolute dose with a 10% dose threshold; thus the “threshold” and “dose-comparison” elements of Eq. (3) will be omitted for enhanced readability. For example, a gamma evaluation using the parameters of 3%, 3 mm, 10% threshold, absolute dose, and global normalization will be written (γ, 3, 3, Gmax).

II.C. Low-density sampling method

To test the variability of pass rates due to detector sampling geometry, specialized research software (PLANEPRO, a non-commercial research software for planar analysis developed by Benjamin Nelms, Ph.D., Canis Lupus LLC, Merrimac, WI, 2011) was designed to selectively sample the EPID response (very high data density) down to discrete points to simulate low-density measurements. Each EPID dose plane and calculated dose plane was divided into virtual detectors of 1 mm² area (bilinear interpolation, without simulated volume averaging). Respective high density grids were aligned such that the origins of the 2D coordinates were colocated. To produce simulated low-density grids, the measured high density grids from ten prostate fields and ten H&N fields (measured with EPID) were sampled to reduce the number of virtual detectors per unit area. For example, high density dose planes were sampled from 100 detectors/cm² down to 2 detectors/cm², rather than physically measuring the dose with a 2 detectors/cm² diode array (similar to commercially available diode arrays). The software was designed to sample grids uniformly (i.e., orthogonal grids) or randomly, thereby testing any potential biases introduced by a simple orthogonal low-density grid.

By discretely changing the sampled positions of the virtual detectors, the software was used to “realign” the low-density sampled grid at numerous simulated positions with respect to the CAX. Each new position alters the discrete locations audited in the delivered dose plane. This process is analogous to simply repositioning a physical diode array with respect to the CAX, such that the diodes reside in different positions, and remeasuring the same dose delivery. Simulations were repeated with 100 positional iterations for each field using a 1 detector/cm² (hereafter, det/cm²) uniform grid, a 2 det/cm² uniform grid, and similar random detector grids (i.e., the same detector densities, but at random positions). For each iteration, four pass rates were calculated with each of the following analysis criteria: (c,2,2,L); (c,3,3,L); (c,2,2,Gmax); and (c,3,3,Gmax).

II.D. Confidence interval calculations for low-density pass rates

Two statistical models, both based upon the binomial proportion, were used to calculate confidence intervals for low-density pass rates, based on the number of sampled points and the degree of conformity between the measured and calculated dose planes (i.e., the observed pass rate). First, the normal approximation of the binomial distribution [15,16] was employed to calculate a confidence interval for each sampled low-density pass rate, according to the following equation:

$$\mathrm{CI}_N = p \pm z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n}} \sqrt{\frac{N-n}{N-1}}, \qquad (5)$$

where $\mathrm{CI}_N$ represents the normal approximation confidence interval for the binomial proportion, p is the probability of achieving a pass, $z_{1-\alpha/2}$ is the 1 − (α/2) percentile of a standard normal distribution (for a 95% confidence interval, $z_{1-\alpha/2} = 1.96$), N is the population of measurable points (i.e., the number of points sampled via high data density measurement), and n is the population of sampled points for the low-density measurement.

As a second approach, the Wilson confidence interval [17,18] ($\mathrm{CI}_W$) was used. It is an alternative method of calculating confidence intervals for binomial proportions that tends to yield good results even for small samples and probabilities close to 0 or 1. This confidence interval is calculated according to the following equation:

$$\mathrm{CI}_W = \frac{p + \frac{1}{2n} z_{1-\alpha/2}^2 \pm z_{1-\alpha/2} \sqrt{\frac{p(1-p)}{n} + \frac{z_{1-\alpha/2}^2}{4n^2}}}{1 + \frac{1}{n} z_{1-\alpha/2}^2}, \qquad (6)$$

where $\mathrm{CI}_W$ represents the Wilson confidence interval for the binomial proportion and all other symbols are the same as in Eq. (5).

Both of these methods utilize two aspects of low-density planar dose QA: each dose point measurement has
only two possible outcomes, pass or fail; and the user knows both the observed pass rate and the number of sampled points (n) for each measurement. Taking N to be the number of high density data points that could be sampled for such QA measurements (e.g., commonly in the tens of thousands for EPID measurements), we can take the second fraction in Eq. (5) to be approximately unity, since N ≫ n. Also, we approximate that the probability (p) of any dose point achieving a pass in a given iteration is equal to the observed low-density pass rate, i.e., that the probability of any one point measurement being within the tolerance criteria is equal to the overall pass rate for that field. To use the binomial proportion, we must assume that each response is independent, i.e., that the response of any dose point is not affected by the responses of neighboring points. With these approximations, a confidence interval can be calculated for any one low-density pass rate given the observed pass rate and the number of sampled points.

II.E. Analytical confidence interval simulations

The statistical methods of the low-density sampling study are limited by the number of fields that can be efficiently measured and the number of low-density samples that can be computationally taken from each measurement. However, analytical statistics simulations were performed to model the performance of these confidence intervals for hundreds of thousands of low-density iterations. Using statistics software (SAS v. 9.2 by SAS Institute Inc.), two statistical simulations were performed to examine the behavior of the proposed confidence intervals as certain planar dose variables change, e.g., HD pass rate or number of sampled points.

First, to examine the behavior of the confidence intervals as the number of sampled points varies, confidence intervals were simulated for a fixed HD pass rate while increasing the number of sampled points. This simulation is analogous to performing the same QA measurement with multiple detectors of increasing detector density. For each iteration, the sample size was chosen from a Poisson distribution about the average sample size. Meanwhile, the number of dose points achieving a pass for each iteration (from which each sample pass rate is calculated) was chosen from a binomial distribution about the fixed HD pass rate. 95% confidence intervals were calculated using the $\mathrm{CI}_N$ and $\mathrm{CI}_W$ methods for each of 1 000 000 iterations per HD pass rate, with p equal to the sample pass rate for each iteration.

Lastly, to examine the behavior of the confidence intervals as both HD pass rate and number of sampled points vary independently, a simulation was performed analogous to measuring many different fields via a low-density array, each with its own HD pass rate and number of sampled points. The normal approximation and Wilson methods were used to calculate confidence intervals for pass rates distributed about HD pass rates that were varied over a range of values (determined from the EPID measurements used in the low-density sample study). At the same time, the sampling size corresponding to each HD pass rate was varied over a range of values similar to the number of points previously sampled via 1 det/cm² and 2 det/cm² densities in the categories of prostate and H&N IMRT. To simulate the distribution of prostate data (with 2%, 2 mm tolerances), HD pass rates were allowed to vary randomly between 70% and 95%, and sample size was allowed to vary randomly from 70 to 105 points (approximately the spread of the low-density sample sizes for prostate fields at 1 det/cm²) and from 140 to 210 points (similar to 2 det/cm² data). To simulate the distribution of H&N data (with 2%, 2 mm tolerances), HD pass rates varied randomly between 70% and 85%, and sample sizes varied randomly from 260 to 300 points (similar to 1 det/cm² data) and 525–600 points (similar to 2 det/cm² data). For each HD pass rate, the number of dose points achieving a pass was again chosen from a binomial distribution about the HD pass rate. The simulation was run for 1 000 000 iterations.

III. RESULTS AND DISCUSSION

III.A. Effect of pass rate calculation method

Table I shows a detailed breakdown of the resultant differences between gamma and %/DTA composite pass rates. For the 25 prostate fields, using global normalization and 3%, 3 mm criteria (10% threshold), pass rates calculated with gamma evaluation are on average 2.4% higher than calculated with %/DTA composite analysis, with a maximum difference for any one field of 6.1% (with gamma % > composite %). Meanwhile, for the 79 H&N fields (both spatially larger and more complex in modulation), gamma evaluation pass rates are on average 3.7% higher than respective %/DTA composite analysis (3%, 3 mm, 10% threshold), with a maximum difference of 9.3% (with gamma % > composite %). The maximum difference (i.e., 11.0%) between gamma and %/DTA composite pass rates occurred with a prostate field for which 95.8% of points passed the (γ,3,3,L) criteria while only 84.8% of points passed the (c,3,3,L) criteria (the (c,3,3,Gmax) pass rate was 98.6% for this field).

Calculated pass rates are even more dependent on the choice of %Diff normalization technique. Table II shows the difference in pass rates between global maximum normalization and local normalization for %/DTA composite analysis and gamma analysis (2%, 2 mm, and 3%, 3 mm, 10% threshold). The average pass rate difference between normalization techniques ranges from 1.8% to 6.3% for prostate cases, depending on calculation method and tolerances, with a maximum difference of 15.1% (global % > local %). For this maximum case, the pass rate spread is given in Table III, and ranges from 60.5% to 96.7%. The average pass rate difference between normalization techniques for H&N cases ranges from 7.7% to 11.7%, depending on calculation method and tolerances, with a maximum difference of 27.4%. For this particular maximum case, the pass rate varies from 48.1% to 93.5%, as seen in Table III. The low pass rates calculated for locally normalized fields result from the fact that these H&N fields are highly modulated, with many low dose points (relative to the maximum dose) for which a small difference in absolute dose inflates the percent difference.

For both the prostate and the H&N cases, a Wilcoxon paired significance test was used to compare the median


TABLE I. Pass rate (%) variation: %/DTA composite vs gamma analysis. Displayed percentages are calculated by subtracting the %/DTA composite pass rate from the gamma evaluation pass rate for each field, then finding the median, mean, minimum, maximum, and standard deviation of these differences. 95% confidence intervals about the median and respective p-values (5% significance level) are calculated via the Wilcoxon paired significance test (Ref. 22).

                                   Norm: global max                                      Norm: local
Type         Statistic             (γ,2,2,10) − (c,2,2,10)   (γ,3,3,10) − (c,3,3,10)   (γ,2,2,10) − (c,2,2,10)   (γ,3,3,10) − (c,3,3,10)

Prostate     Median                4.5 (95%: 3.7–5.3)        2.2 (95%: 1.5–2.6)        3.8 (95%: 3.1–4.6)        2.6 (95%: 2.2–3.4)
(25 fields)  p-value               <0.0001                   <0.0001                   <0.0001                   <0.0001
             Average               4.6                       2.4                       4.0                       3.2
             Minimum               1.3                       0.8                       1.2                       1.2
             Maximum               8.9                       6.1                       9.6                       11.0
             Standard deviation    1.6                       1.4                       2.1                       2.0

H&N          Median                5.0 (95%: 4.0–5.5)        3.2 (95%: 2.7–3.6)        2.3 (95%: 2.0–2.5)        2.9 (95%: 2.6–3.3)
(79 fields)  p-value               <0.0001                   <0.0001                   <0.0001                   <0.0001
             Average               4.9                       3.7                       2.4                       3.1
             Minimum               2.8                       1.2                       1.0                       1.8
             Maximum               7.7                       9.3                       4.2                       5.4
             Standard deviation    1.3                       1.8                       0.8                       0.9
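
The differences in Table I are never negative because any point that passes the %/DTA composite test also has a gamma index of at most 1, while the converse does not hold. The following one-dimensional Python sketch illustrates this for a single measured point. It is not the analysis software used in this study: the function names, the linear dose profile, the 0.01 mm search step, and the search window of three times the DTA criterion are all assumptions made for the example.

```python
import math

def sample_points(half_width_mm, step_mm=0.01):
    """Symmetric search grid around the measured point, which sits at x = 0."""
    n = round(half_width_mm / step_mm)
    return [k * step_mm for k in range(-n, n + 1)]

def composite_passes(calc_dose, meas, norm, pct, dta_mm):
    """Sequential %Diff-then-DTA test for one measured point at x = 0."""
    if abs(meas - calc_dose(0.0)) <= pct / 100.0 * norm:
        return True
    diffs = [calc_dose(x) - meas for x in sample_points(dta_mm)]
    # A sign change (or exact zero) between neighbours means the calculated
    # profile equals the measured dose somewhere inside the DTA radius.
    return any(a * b <= 0.0 for a, b in zip(diffs, diffs[1:]))

def gamma_index(calc_dose, meas, norm, pct, dta_mm):
    """Minimum combined distance/dose metric over the search window."""
    dose_tol = pct / 100.0 * norm
    return min(math.hypot(x / dta_mm, (calc_dose(x) - meas) / dose_tol)
               for x in sample_points(3.0 * dta_mm))

# Hypothetical calculated profile: 100 dose units at x = 0, rising 1 unit per mm.
profile = lambda x: 100.0 + 1.0 * x
meas = 102.5  # measured dose is 2.5% high; the matching calculated dose is 2.5 mm away

composite = composite_passes(profile, meas, norm=100.0, pct=2.0, dta_mm=2.0)
gamma = gamma_index(profile, meas, norm=100.0, pct=2.0, dta_mm=2.0)
print(composite, round(gamma, 3))
```

For this hypothetical point, both parts of the composite test fail at 2%, 2 mm, yet the gamma index is about 0.88, so the point counts toward the gamma pass rate only.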

difference in pass rates between gamma and %/DTA methods, and between global and local normalization methods. The null hypothesis was that the median difference is 0, versus the alternative that it is nonzero. The resulting p-values are included in Tables I and II, in each case indicating that there is a significant difference at the 95% confidence level (i.e., all p-values < 0.05). A 95% confidence interval for the median difference is also provided, corresponding to the hypotheses tested in the Wilcoxon paired significance test.

Though there is often a large difference in the pass rates calculated via local normalization and global (max) normalization, it is not always straightforward to determine which method is most appropriate. Using the maximum measured or planned dose to normalize the percent difference at every dose point makes concessions for very low local dose values: these points may exhibit very large differences between measured and calculated dose, yet the differences may be virtually irrelevant since the dose is so low compared with the maximum per-field dose or the prescription dose. However, in field-by-field, orthogonal IMRT QA, the measured or planned maximum dose per field is not necessarily any indication of the prescribed dose; thus, the global normalization method becomes less meaningful in relating percent differences to the plan prescription [5]. In this type of QA, using local normalization for every percent difference might then be chosen by default; however, Table II shows that this choice of normalization may result in significantly lower pass rates.

2D planar QA pass rates are often reported, both anecdotally and in the literature, without sufficient specification as to exactly how they were calculated. However, these results indicate that differences in calculation methods and analysis tolerances lead to high variability in resultant pass rates. Furthermore, with the increased use of 3D detector geometries, new methods are becoming available for calculating pass rates for 3D dose distributions (e.g., differences in dose error

TABLE II. Pass rate (%) variation: Global vs local normalization. Displayed percentages are calculated by subtracting the locally normalized pass rate from the globally normalized pass rate for each field, then finding the median, mean, minimum, maximum, and standard deviation of these differences. 95% confidence intervals about the median and respective p-values (5% significance level) are calculated via the Wilcoxon paired significance test (Ref. 22).

                                   Gmax norm. % − L norm. %
Type         Statistic             (c,2,2,10)             (c,3,3,10)             (γ,2,2,10)             (γ,3,3,10)

Prostate     Median                5.4 (95%: 3.4–6.5)     2.1 (95%: 1.5–2.3)     6.0 (95%: 5.1–6.9)     1.3 (95%: 0.9–2.3)
(25 fields)  p-value               <0.0001                <0.0001                <0.0001                <0.0001
             Average               5.7                    2.6                    6.3                    1.8
             Minimum               1.5                    0.5                    1.6                    0.2
             Maximum               15.1                   8.0                    14.1                   5.9
             Standard deviation    3.4                    1.9                    2.8                    1.3

H&N          Median                8.8 (95%: 7.3–10.5)    6.9 (95%: 5.9–8.5)     10.9 (95%: 9.2–13.1)   7.2 (95%: 5.7–8.9)
(79 fields)  p-value               <0.0001                <0.0001                <0.0001                <0.0001
             Average               9.2                    7.7                    11.7                   8.3
             Minimum               3.1                    2.4                    5.1                    2.1
             Maximum               22.1                   22.6                   27.4                   22.9
             Standard deviation    3.9                    3.8                    4.4                    4.5
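
The direction of the differences in Table II follows from Eqs. (1) and (2): dividing the same dose difference by the global maximum instead of the local planned dose can only shrink the percent difference. A minimal Python sketch of a %Diff-only pass rate (no DTA or gamma search) shows the effect on a handful of points. The function name, the six dose values, and the choice of the maximum planned dose as the global normalization value are assumptions for illustration only; they are not data from this study, and the sketch assumes every planned dose is greater than zero.

```python
def pct_diff_pass_rate(measured, planned, tol_pct, norm="local", threshold_pct=10.0):
    """Pass rate (%) of a %Diff-only test, following Eq. (1) or Eq. (2)."""
    p_norm = max(planned)                           # global value (assumed: max planned dose)
    cutoff = threshold_pct / 100.0 * max(measured)  # dose threshold on measured points
    results = []
    for m, p in zip(measured, planned):
        if m < cutoff:
            continue                                # point is below threshold, not analyzed
        denom = p if norm == "local" else p_norm
        results.append(abs(m - p) / denom * 100.0 <= tol_pct)
    return 100.0 * sum(results) / len(results)

# Hypothetical point doses (cGy): similar absolute errors at high and low dose.
planned = [200.0, 150.0, 100.0, 60.0, 40.0, 25.0]
measured = [203.0, 152.0, 102.0, 62.5, 42.0, 27.0]

local_rate = pct_diff_pass_rate(measured, planned, 3.0, norm="local")
global_rate = pct_diff_pass_rate(measured, planned, 3.0, norm="global")
print(local_rate, global_rate)
```

With these values, the three low-dose points fail the 3% test under local normalization but pass under global normalization, which is the mechanism behind the larger H&N differences discussed above.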


TABLE III. Pass rate (%) variation: Global vs local normalization. Pass rate spread for the individual prostate field and H&N field with the highest differences between global and local normalization (as shown in Table II).

Type       Criteria      Norm: Gmax   Norm: local

Prostate   (c,2,2,10)    75.6         60.5
           (c,3,3,10)    91.6         85.0
           (γ,2,2,10)    80.4         66.3
           (γ,3,3,10)    96.7         90.8

H&N        (c,2,2,10)    70.2         48.1
           (c,3,3,10)    87.8         65.2
           (γ,2,2,10)    77.9         50.5
           (γ,3,3,10)    93.5         70.6

normalization [19] or how exactly the DTA criterion is applied [4]) which by definition affect pass rate calculation. In both 2D and 3D dose distribution QA, every reported pass rate must include a sufficient statement of the calculation method, %Diff and DTA criteria, dose threshold, and normalization method, for example, by utilizing the labeling convention proposed in this work [Eq. (3)]. Though not specifically investigated in this work, pass rates calculated in terms of relative dose instead of absolute dose should bear some indication of that fact in the report, for example, as suggested in the discussion of Eq. (3).

III.B. Distribution of pass rates from low-density sampling

Moving a simulated low-density sampling grid over many possible positions results in a distribution of pass rates for any dose QA plane pair (measured and calculated), not a fixed pass rate. For each new positioning of the grid, the simulated diodes sample different positions in the dose plane, thereby changing the pass rate. Table IV shows the pass rate distributions resulting from uniform 1 det/cm² sampling of 10 H&N fields using the (c,2,2,L) method. The low-density (LD) pass rates range approximately ±5% from the high density pass rate, depending on the field. Meanwhile, the average pass rate from all low-density iterations is never more than 0.8% at variance with the high density pass rate for each of the 10 fields. That is, the distribution of pass rates for any given field is always centered around the high density (HD) pass rate resulting, in this study, from EPID measurement. However, taking multiple measurements per beam (or plan) to home in on the HD pass rate is not practical, as it would add substantial time to the QA process.

Figure 1 shows three distributions of pass rates for the same H&N field, sampled to 1 det/cm² with three analysis methods: uniform sampling with (c,2,2,L) in gray; uniform sampling with (c,2,2,Gmax) in black; and random sampling with (c,2,2,Gmax) in horizontal stripes. Figure 2 shows the same analysis with 2 det/cm² (also, here, the random grid is analyzed with local normalization for comparison). Further, Fig. 3 shows the same analysis with 1 det/cm², but with more lenient tolerances: uniform sampling with (c,3,3,L) in gray; uniform sampling with (c,3,3,Gmax) in black; and random sampling with (c,3,3,L) in horizontal stripes. The high density pass rates for this field are as follows: 78.42% (c,2,2,L), 87.66% (c,2,2,Gmax), 96.72% (c,3,3,L), and 98.93% (c,3,3,Gmax). These results are typical of an average H&N IMRT field: the results for all H&N fields tested are similar.

Several observations should be noted from these results, verified by all 20 cases similarly tested, whether prostate or H&N:

1. The local normalization pass rate distribution is shifted toward lower values than the global normalization distribution. This shift is due to the fact that global normalization always results in a pass rate greater than or equal to the respective local normalization pass rate.
2. As comparison criteria are relaxed, either by selecting global rather than local normalization or by loosening the %Diff or DTA tolerances, the spread of the low-density pass rates decreases. If tolerances are sufficiently loosened, as seen in Fig. 3 with the (c,3,3,Gmax) data, there is hardly a distribution of pass rates at all, but all pass rates are stacked toward 100%. However, choosing tolerances and calculation methods that inflate the pass rates does not change the fact that there may be substantial differences between calculated and delivered dose planes.
3. The random sample grid yields a pass rate distribution nearly indistinguishable from the respective uniform sampling grid, whether analyzed with local or global normalization. In fact, the average uniform grid pass rate was within 1% of its respective average random grid pass rate for every field tested. Furthermore, for every field, the standard deviation of the distribution from uniform sampling grids was within 1% of the similar standard deviation from random grids. This result indicates that there is no inherent bias introduced by using only uniform orthogonal grids of detectors to test modulated dose planes.
4. Low-density pass rate distributions are centered about the average low-density pass rate for each field, which in turn

TABLE IV. H&N low-density pass rate distributions: 10 H&N fields, sampled to 1 det/cm² with 100 positional iterations, and analyzed with (c,2,2,L) criteria. For each field, the high density (HD) pass rate is displayed along with the average, minimum, maximum, and standard deviation of the distribution of low-density (LD) pass rates.

Statistics                     H&N 1  H&N 2  H&N 3  H&N 4  H&N 5  H&N 6  H&N 7  H&N 8  H&N 9  H&N 10
HD pass %                       80.4   70.2   77.8   78.4   71.6   82.5   69.4   78.5   84.5   83.9
Mean LD pass %                  80.4   70.3   78.2   78.8   71.3   82.5   69.7   78.1   84.8   84.0
Minimum LD pass %               77.2   64.9   73.7   73.3   67.6   79.6   65.0   72.5   80.4   79.3
Maximum LD pass %               84.4   76.1   82.8   82.6   77.2   85.0   73.5   83.6   89.1   88.2
Standard deviation LD pass %     1.7    2.9    2.1    2.4    2.5    1.5    2.1    2.5    2.4    2.3
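
The grid-shifting procedure behind Table IV can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the pass/fail map is synthetic (smoothed noise thresholded so that about 80% of pixels pass), the 1 mm pixel pitch and 200 x 200 field size are hypothetical, and the 100 positions are simply all integer offsets of a 10-pixel grid. It is not the sampling code used in the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical high-density pass/fail map: 200 x 200 pixels at 1 mm pitch.
# Failures are spatially clustered by smoothing white noise before thresholding.
noise = rng.normal(size=(200, 200))
kernel = np.ones(9) / 9.0
smooth = np.apply_along_axis(lambda r: np.convolve(r, kernel, mode="same"), 1, noise)
smooth = np.apply_along_axis(lambda c: np.convolve(c, kernel, mode="same"), 0, smooth)
passed = smooth < np.quantile(smooth, 0.80)  # about 80% of pixels pass

hd_rate = 100.0 * passed.mean()

# A uniform 1 det/cm^2 grid is one detector every 10 pixels. Shift the grid
# over all 100 (row, column) offsets inside one grid cell and score each one.
pitch = 10
ld_rates = np.array([
    100.0 * passed[dy::pitch, dx::pitch].mean()
    for dy in range(pitch)
    for dx in range(pitch)
])

print(f"HD pass %: {hd_rate:.1f}")
print(f"LD mean / min / max / SD: {ld_rates.mean():.1f} / {ld_rates.min():.1f} / "
      f"{ld_rates.max():.1f} / {ld_rates.std(ddof=1):.1f}")
```

In this toy setup the 100 offsets partition the map into equal-sized subsets, so the mean LD pass rate equals the HD pass rate by construction; the quantity of interest is the spread between the minimum and maximum LD pass rates.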

Medical Physics, Vol. 38, No. 11, November 2011


6059 Bailey et al.: QA pass rates 6059

FIG. 1. 1 det/cm² pass rate distributions for one H&N field, 100 low-density positional iterations, with: uniform grid and (c,2,2,L) in gray; uniform grid and (c,2,2,Gmax) in black; random grid and (c,2,2,Gmax) in horizontal stripes.

corresponds closely to the matching high density pass rate for the same comparison criteria (cf. Table IV). If a grid of higher detector density is chosen, as demonstrated by Figs. 1 and 2, this average value does not change, but the resulting pass rate distribution has less variability about that value (i.e., narrower distribution). Thus, as detector density increases, variation in pass rates decreases until they finally converge on the HD pass rate. The data density at which sufficient measurement information allows convergence on the HD pass rate is not readily apparent from these results but might be of interest in future work.

These results are in good agreement with the distributions that result from multiple measurements of the same field using a common commercial diode array at different sampling positions. For example, Fig. 4 shows a brain IMRT field measured with the MAPCHECK device at two different positions (left two panels): once with the center of the MAPCHECK at the beam CAX (Fig. 4, top), and once with the center of the MAPCHECK shifted by 1.6 cm in the superior–inferior direction with respect to the CAX (Fig. 4, bottom). The measurements were aligned with and compared with the same calculated plane, and the results of those comparisons are shown in grayscale in the right two panels of Fig. 4. Hot spots are shown in white, while cold spots are shown in black, as calculated with the (c,3,3,Gmax) criteria. The dose information along the superior–inferior dose profile indicated by the solid black line in Fig. 4 is displayed in Fig. 5; here, the original and shifted diode positions are shown with round markers and square markers, respectively. Again, hot and cold spots are shown in white and black, respectively.

Changing the array position causes a change in the number of diodes that pass the chosen tolerances, thus changing the pass rate from 92.1% (at CAX) to 83.1% (when shifted) for (c,3,3,Gmax). Similar shifting to 55 different positions in lateral, superior–inferior, and major diagonal increments of 0.4 cm resulted in a distribution of pass rates, shown in Fig. 6 for (c,2,2,Gmax) and (c,3,3,Gmax). As with the simulations discussed above, the distributions become less spread out as pass rate increases (by relaxing the analysis tolerances). Also, the distributions are centered about the average pass rates of 86.8% (c,2,2,Gmax) with a standard deviation of

FIG. 2. 2 det/cm² pass rate distributions for the same H&N case as from Fig. 1 with: uniform grid and (c,2,2,L) in gray; uniform grid and (c,2,2,Gmax) in black; random grid and (c,2,2,L) in horizontal stripes.


FIG. 3. 1 det/cm² pass rate distributions for the same H&N case as from Figs. 1 and 2 with: uniform grid and (c,3,3,L) in gray; uniform grid and (c,3,3,Gmax) in black; random grid and (c,3,3,L) in horizontal stripes.

8.5% and 93.0% (c,3,3,Gmax) with a standard deviation of 4.8%. The corresponding high density pass rates are 89.3% and 95.6%, respectively. These experimental averages differ slightly more from their respective high density pass rates than in the simulated cases. This could be due to at least two factors: first, there is some amount of experimental uncertainty added when actually measuring these positional iterations that is not present in the simulation method; and second, only 55 iterations were measured, as opposed to 100 or more in the case of the simulations. Similar measurements with multiple fields of varying modulation complexity and detectors of different data densities might be of interest for a future investigation.

This study leads to several important questions. First, no simulated volume averaging was incorporated into the low-density sampling method, thus mimicking the response of high-resolution detectors such as diodes, film, or EPID pixels. One might validly ask: how would volume averaging affect the pass rate distributions in the low-density sampling study? This question is of importance if one desires to extend the methods of this study to incorporate low-resolution detectors such as ionization chamber arrays. Further, the varying results between gamma evaluation and %/DTA analysis raise an important issue: when substantial errors actually exist in treatment planning or fluence delivery, which calculation method best indicates those flaws? And lastly, since pass rate distributions become narrower as detector density increases, is there an optimum detector density, somewhere in between the common diode array and EPID, that gives the same results as those achieved with very-high density measurement (e.g., EPID or film)? Each of these important questions demands systematic examination in future work, based upon the findings of the current study.

III.C. Establishing confidence intervals for low-density pass rates

According to the above results, every pass rate achieved with a low-density detector is actually indicative of a spread

FIG. 4. A brain IMRT field as measured (left side) and compared (right side) via MAPCHECK at two different positions with respect to the delivered dose plane: (A) with the center of the MAPCHECK at the beam CAX; and (B) with the center of the MAPCHECK shifted 1.6 cm from CAX in the superior direction. Hot and cold spots are indicated by white and black marks, respectively, and the solid vertical line indicates the position of the dose profiles shown in Fig. 5.

FIG. 5. The co-located superior–inferior dose profiles illustrated in Fig. 4, with original diode positions indicated by round markers, shifted diode positions indicated by square markers, and hot and cold spots indicated by white and black marks, respectively.


FIG. 6. Pass rate distributions for the brain IMRT field referenced in Figs. 4 and 5, with (c,2,2,Gmax) analysis shown in gray and (c,3,3,Gmax) analysis shown in black. High density pass rates for these analyses were 83.1% and 95.6%, respectively.

of achievable pass rates within which the HD pass rate resides. Consequently, every such pass rate should be reported with some estimation of a confidence interval, though such confidence estimates are virtually never given in the literature or routine clinical reporting. One obstacle to adequate pass rate reporting is the lack of a straightforward, accessible method for calculating these confidence intervals: this study attempts both to increase understanding of pass rate distributions and to provide a method of estimating low-density pass rate confidence intervals.

Previous works have suggested that the normal distribution works fairly well in calculating confidence intervals for distributions similar to the low-density pass rate distributions found in this work.6,20 Those studies examine pass rate distributions resulting from measurements of many different fields, whereas the current study looks at pass rate distributions resulting from multiple measurements of the same field; however, both kinds of study result in similarly shaped pass rate distributions. It was concluded by Knill and Snyder20 that, even when the normal distribution is not the best fit, the actual distribution fit does not greatly affect resulting confidence intervals under ordinary circumstances. However, turning to the current study, confidence intervals based on the mean and standard deviation (i.e., the normal distribution) are inaccessible, since only one low-density measurement is acquired during conventional IMRT QA, not a sample of measurements for the same field, and thus only one pass rate is actually calculated.

Since each observed pass rate is a proportion of points that either pass or fail the comparison criteria, a reasonable approach is to use the binomial proportion to calculate a confidence interval for each low-density pass rate, according to either Eq. (5) or (6). To test whether the resulting confidence intervals fit our data, 95% confidence intervals were calculated using these equations for each low-density iteration of the 20 tested fields (2000 iterations using the (c,2,2,10) criteria). Then for each field, the percent of iterations was calculated for which the HD pass rate fell within the confidence intervals. For well-performing confidence intervals, the HD pass rate would fall within the intervals exactly 95% of the time. Using Eq. (5), the HD pass rates fit within the confidence intervals as shown in Table V. These percentages show some variation, ranging from 91.8% to 98.9%, though each of those values takes into account only 100 iterations per field (a statistically small sample). Taking all iterations together (6000 iterations total for prostate and 6000 for H&N), the percentage of iterations for which the HD pass rates fell within the confidence intervals was 94.1% for prostate cases and 95.1% for H&N cases.

When applied to the simulated low-density data, the Wilson confidence intervals (95%) included the HD pass rates according to Table VI. Again, these data represent the results for 100 iterations per field, so variability is expected. Taking all iterations together (6000 prostate iterations and 6000 H&N iterations), the Wilson confidence intervals included the HD pass rates for 96.9% and 96.2% of the iterations for prostate and H&N fields, respectively. Similar results were also achieved with the Agresti–Coull approximation15,21 (a simplified model similar to the Wilson confidence interval), with nearly identical coverage but slightly wider confidence intervals.

As an example of these confidence interval methods, consider a prostate IMRT field that was measured for this study for which the 2D diode array measurement sampled 182 dose points (above the dose threshold) and resulted in a pass rate of 87.9% (c,2,2,Gmax). The same field measured

TABLE V. Confidence interval coverage: Percentage of iterations for which the HD pass rate fell within the 95% confidence intervals calculated with the binomial distribution [normal approximation, Eq. (5)].

                        (c,2,2,L)   (c,2,2,Gmax)   (c,2,2,L)   All
                        uniform     uniform        random      iterations
Prostate   1 det/cm²    96.8        91.7           91.5        93.3
           2 det/cm²    95.3        97.2           91.8        94.8
H&N        1 det/cm²    96.2        95.0           92.6        94.6
           2 det/cm²    98.9        92.5           95.3        95.6
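
The prostate example (182 sampled points, observed pass rate 87.9%) can be checked numerically. The sketch below assumes that Eq. (5) and Eq. (6) are the standard normal-approximation (Wald) and Wilson score intervals for a binomial proportion with z = 1.96; the function names are hypothetical and are not taken from the paper.

```python
import math

def normal_ci(p, n, z=1.96):
    """Normal-approximation (Wald) interval, assumed to correspond to Eq. (5)."""
    half = z * math.sqrt(p * (1.0 - p) / n)
    return p - half, p + half

def wilson_ci(p, n, z=1.96):
    """Wilson score interval, assumed to correspond to Eq. (6)."""
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half

# Prostate field from the text: 182 points above threshold, 87.9% pass rate.
p_obs, n_pts = 0.879, 182
lo_n, hi_n = normal_ci(p_obs, n_pts)
lo_w, hi_w = wilson_ci(p_obs, n_pts)

print(f"normal: {p_obs:.3f} [95%: {lo_n:.3f}-{hi_n:.3f}]")  # 0.879 [95%: 0.832-0.926]
print(f"Wilson: {p_obs:.3f} [95%: {lo_w:.3f}-{hi_w:.3f}]")  # 0.879 [95%: 0.824-0.919]
```

Under these assumptions both intervals agree with the values the text reports for this field, and the Wilson interval is visibly shifted toward lower pass rates rather than centered on the observed value.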


TABLE VI. Confidence interval coverage: Percentage of iterations for which the HD pass rate fell within the 95% confidence intervals calculated with the binomial distribution [Wilson approximation, Eq. (6)].

                        (c,2,2,L)   (c,2,2,Gmax)   (c,2,2,L)   All
                        uniform     uniform        random      iterations
Prostate   1 det/cm²    98.2        98.2           95.2        97.2
           2 det/cm²    96.8        97.3           95.6        96.6
H&N        1 det/cm²    97.4        95.2           94.0        95.5
           2 det/cm²    99.4        96.1           95.3        96.9

with EPID/EPIDOSE and compared with the same TPS calculation plane resulted in a pass rate of 91.3% (same criteria). With the normal approximation, the low-density measurement pass rate should be reported as 87.9% ± 4.7%. Alternately, this pass rate could be written: 0.879 [95%: 0.832–0.926] (c,2,2,Gmax). For the Wilson confidence interval, which is not always centered on the observed pass rate, the result should be reported as 0.879 [95%: 0.824–0.919] (c,2,2,Gmax). In this case, the high density pass rate happens to fall within the confidence interval with both calculation methods.

III.D. Analytical confidence interval simulation results

The first statistical simulation tests the normal approximation and Wilson approximation confidence interval methods using a fixed pass rate and variability in the number of low-density sampled points. The performance of each confidence interval method, in terms of coverage and width, for fixed HD pass rates of 75%, 85%, and 95% is shown in Table VII. A 75% pass rate represents an atypically low value, possibly indicating delivery or planning error, while 85% and 95% represent roughly the average pass rates for 2%, 2 mm and 3%, 3 mm analysis, respectively. For the normal approximation, this simulation shows that CIN coverage tends to be too low (i.e., intervals that are too small, as also seen in the low-density simulated data): the coverage improves as sample size increases but decreases as pass rate increases from 85% to 95% (i.e., increases toward unity). Meanwhile, for the Wilson confidence interval, the simulation shows coverage close to 95% for both HD pass rate tiers, even with a low average sample size of only 50 points. This coverage is somewhat better than the results of the Wilson calculations for the sampled EPID measurement data, which indicated the CIW tends to be slightly too conservative. For both calculation models, these trends are identical all the way down to an HD pass rate of 75%.

The second statistical simulation tests both confidence interval methods while allowing both HD pass rate and sample size to vary randomly, within realistic limits decided from the prostate and H&N IMRT measurements from the low-density sampling study. The performance of each confidence interval method in this simulation, in terms of coverage and width, is displayed in Table VIII and shows trends similar to those found in the previous simulation and in the sampled EPID study: the normal approximation tends to yield confidence intervals that are too small, especially at low sample size and/or high pass rate (close to 1), while the Wilson confidence intervals tend to yield good coverage and smaller confidence intervals. Figure 7 displays the 95% confidence interval coverage for average sample sizes ranging from 50 to 1000 points, while HD pass rate varies from 75% to 95%, calculated with both the normal approximation and Wilson methods. The normal approximation yields fairly poor coverage for low sample size and very high pass rate. Meanwhile, the Wilson method provides good coverage which improves with increasing sample size but becomes slightly too conservative as HD pass rate approaches unity. In both cases, coverage comes increasingly closer to 95% as sample size increases.

In summary, conventional planar dose QA gives only two data values, the sample pass rate and sample size. However, the low-density pass rate distribution which corresponds to each field can be predicted with these known values using a binomial distribution. In turn, confidence intervals can be estimated that should accompany each analysis. Both the

TABLE VII. Fixed pass rate simulation summary: The coverage and width of normal approximation (CIN) and Wilson (CIW) confidence intervals (95%), with HD pass rate fixed at 75% (an atypically low value, possibly indicating error), 85% and 95% (roughly the average for 2%, 2 mm and 3%, 3 mm analysis, respectively), sample size selected by a Poisson distribution about the average sample size, and sample pass rate varied by the binomial distribution about the HD pass rate for each tier.

                              HD pass    Average       CIN        CIW        CIN     CIW
                              rate (%)   sample size   coverage   coverage   width   width
Low agreement                 75         50            0.935      0.951      0.237   0.233
                              75         250           0.947      0.951      0.107   0.107
                              75         500           0.948      0.950      0.075   0.076
                              75         1000          0.947      0.951      0.053   0.054
2%, 2 mm typical agreement    85         50            0.922      0.953      0.194   0.195
                              85         250           0.943      0.950      0.088   0.088
                              85         500           0.945      0.949      0.062   0.062
                              85         1000          0.947      0.949      0.044   0.044
3%, 3 mm typical agreement    95         50            0.898      0.961      0.102   0.130
                              95         250           0.931      0.949      0.053   0.055
                              95         500           0.939      0.949      0.038   0.038
                              95         1000          0.942      0.950      0.027   0.027
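
A simulation of the kind summarized in Table VII can be sketched as follows. This is a hedged illustration rather than the authors' code: the function name, the iteration count, the random seed, and the guard that forces at least one sampled point are assumptions, and the normal and Wilson intervals are taken to be the standard forms with z = 1.96. Exact table values should not be expected from it.

```python
import numpy as np

def simulate_coverage(hd_rate, mean_n, iterations=200_000, z=1.96, seed=0):
    """Estimate coverage and mean width of normal and Wilson 95% intervals."""
    rng = np.random.default_rng(seed)
    # Sample size drawn from a Poisson distribution about the average size.
    n = np.maximum(rng.poisson(mean_n, size=iterations), 1)
    # Sample pass rate drawn from a binomial distribution about the HD rate.
    p = rng.binomial(n, hd_rate) / n

    # Normal approximation: symmetric about the observed pass rate.
    half_n = z * np.sqrt(p * (1 - p) / n)
    cov_n = np.mean((p - half_n <= hd_rate) & (hd_rate <= p + half_n))

    # Wilson score interval: center is pulled toward 0.5.
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_w = z * np.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    cov_w = np.mean((center - half_w <= hd_rate) & (hd_rate <= center + half_w))

    return cov_n, cov_w, 2 * half_n.mean(), 2 * half_w.mean()

for mean_n in (50, 1000):
    cov_n, cov_w, wid_n, wid_w = simulate_coverage(0.95, mean_n)
    print(f"n~{mean_n}: CIN cov {cov_n:.3f}, CIW cov {cov_w:.3f}, "
          f"CIN width {wid_n:.3f}, CIW width {wid_w:.3f}")
```

With a 95% HD pass rate and an average of 50 points, the normal interval undercovers largely because samples in which every point passes produce a zero-width interval, which is one way to see why its coverage degrades as the pass rate approaches unity.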


TABLE VIII. Random pass rate simulation summary: The coverage and width of normal approximation (CIN) and Wilson (CIW) confidence intervals (95%), with HD pass rate and sample size allowed to vary randomly between values typical for prostate and H&N cases. The low and high average sample sizes represent approximate average sample sizes for 1 det/cm² and 2 det/cm² grids, respectively.

                     Average pass   Average       CIN        CIW        CIN      CIW
                     rate (%)       sample size   coverage   coverage   width    width
Prostate (typical)   88             80            0.886      0.953      0.127    0.134
                                    160           0.914      0.952      0.0917   0.0942
H&N (typical)        80             280           0.936      0.951      0.0862   0.0867
                                    570           0.941      0.950      0.0606   0.0609
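
Section III.C notes that the Agresti–Coull approximation gave nearly identical coverage to the Wilson interval but slightly wider intervals. A minimal sketch of that comparison is given below, evaluated at the average pass rates and sample sizes listed in Table VIII. The function names are hypothetical, the passing-point counts are rounded from the average pass rate (an assumption), and the widths are point evaluations, so they are not expected to match the simulation averages in the table.

```python
import math

def wilson_ci(x, n, z=1.96):
    """Wilson score interval for x passing points out of n."""
    p = x / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half

def agresti_coull_ci(x, n, z=1.96):
    """Agresti-Coull interval: a Wald interval on the adjusted proportion."""
    n_adj = n + z * z
    p_adj = (x + z * z / 2.0) / n_adj
    half = z * math.sqrt(p_adj * (1.0 - p_adj) / n_adj)
    return p_adj - half, p_adj + half

# (average pass rate, average sample size) pairs taken from Table VIII.
cases = [(0.88, 80), (0.88, 160), (0.80, 280), (0.80, 570)]
widths = []
for rate, n in cases:
    x = round(rate * n)  # assumed number of passing points
    lo_w, hi_w = wilson_ci(x, n)
    lo_a, hi_a = agresti_coull_ci(x, n)
    widths.append((hi_w - lo_w, hi_a - lo_a))
    print(f"n={n}: Wilson width {hi_w - lo_w:.4f}, Agresti-Coull width {hi_a - lo_a:.4f}")
```

Both intervals share the same center, so the Agresti–Coull interval simply encloses the Wilson interval, with the difference shrinking as the sample size grows.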

normal approximation of the binomial distribution and the Wilson confidence interval methods result in good estimates of these confidence intervals. The normal approximation confidence intervals may be somewhat too narrow, especially as sample size decreases and as the fraction of passing points approaches unity. The Wilson confidence interval yields confidence intervals that become more conservative, especially as the fraction of passing points approaches unity. However, a more conservative approach (i.e., wider confidence intervals) may be the preferred approach to these estimates, since any measurement errors or uncertainties (avoided via the simulation methods in this study) will also tend to widen the confidence limits for each measurement. Though each of these methods has some limitations, low-density pass rates are not absolute values. Any pass rate from a low-density measurement array should be accompanied by a confidence interval quantifying the statistical uncertainty of that pass rate, and the approximations to the binomial distribution explored in this study provide good estimates of that uncertainty.

These findings call for reflection on the accuracy of establishing fixed “action levels” for dose QA, especially if dose data are acquired with a detector of low data density. For these measurements, pass rates are not fixed but variable, so it is difficult to establish a fixed tolerance level for QA pass rates when the result for any individual analysis is blurred by the confidence interval. When considering action levels for dose QA, some consideration must be given to the fact that each dose plane measurement indicates a range of possible pass rate results, highly dependent on conformity between delivery and plan, sample size, and pass rate calculation metric.

IV. CONCLUSIONS

The calculation method, comparison tolerances, and detec-
tor density of the measurement array all greatly affect the
resulting pass rates for planar dose comparisons. %/DTA
composite analysis yields pass rates that are always lower
than or equal to similar pass rates calculated with the gamma
evaluation, and sometimes substantially lower, depending on
the calculation criteria. Local normalization yields pass rates
that are always lower than or equal to similar pass rates calcu-
lated with global normalization to the maximum dose, though
both methods have advantages in dose comparison informa-
tion reporting. Pass rates resulting from dose measurements
with low detector density indicate a distribution of possible
pass rates that can be predicted with a binomial distribution,
and confidence intervals can be calculated which quantify the
probability that the HD pass rate falls within the interval. The
most straightforward estimate of binomial confidence inter-
vals, the normal approximation, tends to yield intervals that
are too narrow when applied to simulated low-density data
with many iterations. The Wilson confidence interval fits
simulated data more closely, resulting in confidence intervals
that are sometimes too wide, but may be preferable since mea-
surement errors and uncertainties demand wider confidence
intervals. By extension, these results apply to 3D arrays, as
well. Every pass rate should be reported with a complete
description of how it was calculated [for example, using the
short-hand method utilized in this paper, Eq. (3)] and a quanti-
tative estimate of its statistical uncertainty.

FIG. 7. 95% confidence interval coverage for the normal and Wilson methods, with HD pass rate varying from 75% to 95%, sample pass rates following a binomial distribution about the HD pass rate per iteration (1 000 000), and average sample sizes of 50, 250, 500, and 1000.

a) Electronic mail: Daniel.Bailey@RoswellPark.org
1 J. Herzen, M. Todorovic, F. Cremers, V. Platz, D. Albers, A. Bartels, and R. Schmidt, “Dosimetric evaluation of a 2D pixel ionization chamber for implementation in clinical routine,” Phys. Med. Biol. 52, 1197 (2007).


2 P. Jursinic and B. Nelms, “A 2-D diode array and analysis software for verification of intensity modulated radiation therapy delivery,” Med. Phys. 30, 870–879 (2003).
3 V. Feygelman, K. Forster, D. Opp, and G. Nilsson, “Evaluation of a biplanar diode array dosimeter for quality assurance of step-and-shoot IMRT,” J. Appl. Clin. Med. Phys. 10, 64–78 (2009).
4 D. Létourneau, J. Publicover, J. Kozelka, D. Moseley, and D. Jaffray, “Novel dosimetric phantom for quality assurance of volumetric modulated arc therapy,” Med. Phys. 36, 1813–1821 (2009).
5 B. Nelms and J. Simon, “A survey on planar IMRT QA analysis,” J. Appl. Clin. Med. Phys. 8, 76–90 (2007).
6 G. Ezzell et al., “IMRT commissioning: multiple institution planning and dosimetry comparisons, a report from AAPM Task Group 119,” Med. Phys. 36, 5359–5373 (2009).
7 D. Low, W. Harms, S. Mutic, and J. Purdy, “A technique for the quantitative evaluation of dose distributions,” Med. Phys. 25, 656–661 (1998).
8 D. Low and J. Dempsey, “Evaluation of the gamma dose distribution comparison method,” Med. Phys. 30, 2455–2464 (2003).
9 B. Poppe, A. Djouguela, A. Blechschmidt, K. Willborn, A. Ruhmann, and D. Harder, “Spatial resolution of 2d ionization chamber arrays for IMRT dose verification: Single-detector size and sampling step width,” Phys. Med. Biol. 52, 2921–2935 (2007).
10 B. Nelms, K. Rasmussen, and W. Tome, “Evaluation of a fast method of EPID-based dosimetry for intensity-modulated radiation therapy,” J. Appl. Clin. Med. Phys. 11, 140–157 (2010).
11 W. Harms Sr, D. Low, J. Wong, and J. Purdy, “A software tool for the quantitative evaluation of 3D dose calculation algorithms,” Med. Phys. 25, 1830–1836 (1998).
12 D. Low, J. Moran, J. Dempsey, L. Dong, and M. Oldham, “Dosimetry tools and techniques for IMRT,” Med. Phys. 38, 1313–1338 (2011).
13 G. Yan, C. Liu, T. Simon, L. Peng, C. Fox, and J. Li, “On the sensitivity of patient-specific IMRT QA to MLC positioning errors,” J. Appl. Clin. Med. Phys. 10, 120–128 (2009).
14 J. van Dyk, R. Barnett, J. Cygler, and P. Shragge, “Commissioning and quality assurance of treatment planning computers,” Int. J. Radiat. Oncol. Biol. Phys. 26, 261–273 (1993).
15 A. Agresti and B. Coull, “Approximate is better than “exact” for interval estimation of binomial proportions,” Am. Stat. 52, 119–126 (1998).
16 A. Agresti, Categorical Data Analysis (John Wiley & Sons, Hoboken, NJ, 2002), Vol. 359, pp. 15–16.
17 R. Newcombe, “Two-sided confidence intervals for the single proportion: Comparison of seven methods,” Stat. Med. 17, 857–872 (1998).
18 L. Brown, T. Cai, and A. DasGupta, “Interval estimation for a binomial proportion,” Stat. Sci. 16, 101–117 (2001).
19 V. Feygelman, G. Zhang, C. Stevens, and B. Nelms, “Evaluation of a new VMAT QA device, or the “X” and “O” array geometries,” J. Appl. Clin. Med. Phys. 12, 146–168 (2011).
20 C. Knill and M. Snyder, “An analysis of confidence limit calculations used in AAPM Task Group No. 119,” Med. Phys. 38, 1779–1785 (2011).
21 A. Agresti and B. Coull, “Interval estimation for a binomial proportion: Comment,” Stat. Sci. 16, 117–120 (2001).
22 M. Desu and D. Raghavarao, Nonparametric Statistical Methods for Complete and Censored Data (CRC, Boca Raton, 2004), pp. 16–17, 188–191.
