
PEDIATRICS/ORIGINAL RESEARCH

What Are the Most Clinically Useful Cutoffs for the Alvarado
and Pediatric Appendicitis Scores? A Systematic Review
Mark H. Ebell, MD, MS*; JoAnna Shinholser, BSHP
*Corresponding Author. E-mail: ebell@uga.edu, Twitter: @markebell.

Study objective: The objective of this study is to systematically review the accuracy of the Alvarado score and Pediatric
Appendicitis Score and to identify optimal cutoffs for low- and high-risk populations.

Methods: We performed a systematic review of the literature and identified 26 studies of the accuracy of the Alvarado
score and Pediatric Appendicitis Score. Data were abstracted in parallel, and only prospective, cohort studies that
avoided verification bias were included. We calculated summary likelihood ratios for low-, moderate-, and high-risk
groups, using all possible cutoffs based on available data, even if not reported in the original study.

Results: The pretest probability of appendicitis was approximately 33% in studies of children and approximately 66% in
studies of adults. Likelihood ratios at different cutoffs for the Alvarado score in adults were as follows: 0.03
(<4 points), 0.42 (4 to 6 points), and 3.4 (≥7 points); and 0.01 (<5 points), 0.98 (5 to 8 points), and 6.7 (≥9 points).
Likelihood ratios for the Alvarado score in children were as follows: 0.02 (<4 points), 0.27 (4 to 6 points), and
4.2 (≥7 points); and 0.04 (<5 points), 1.2 (5 to 8 points), and 8.5 (≥9 points). For the Pediatric Appendicitis
Score, likelihood ratios were 0.13 (<4 points), 0.70 (4 to 7 points), and 8.1 (≥8 points).

Conclusion: For children with a pretest probability of acute appendicitis of 60% or less, an Alvarado score below
4 rules out the diagnosis; this is also true for a score less than 5 if the pretest probability is up to approximately 40%. In
adults with a pretest probability greater than or equal to 60%, an Alvarado score of 7 or higher rules in the diagnosis,
whereas one of 9 or higher rules in the diagnosis at pretest probabilities greater than or equal to 40%. The Pediatric
Appendicitis Score did not identify clinically useful low- or high-risk groups at typical pretest probabilities. [Ann Emerg
Med. 2014;64:365-372.]

Please see page 366 for the Editor’s Capsule Summary of this article.

A feedback survey is available with each research article published on the Web at www.annemergmed.com.
A podcast for this article is available at www.annemergmed.com.
0196-0644/$-see front matter
Copyright © 2014 by the American College of Emergency Physicians.
http://dx.doi.org/10.1016/j.annemergmed.2014.02.025

SEE EDITORIAL, P. 373.

INTRODUCTION
Background
Clinical decision rules integrate several findings from the medical history, physical examination, and simple
laboratory tests to predict the likelihood of a disease. Several clinical decision rules have been developed and
prospectively evaluated for their accuracy in diagnosis of appendicitis in both adults and children. Among the most
widely studied are the Alvarado score and the Pediatric Appendicitis Score.1,2 The Alvarado score can be used in adults
and children, whereas the Pediatric Appendicitis Score is used only in children and adolescents. The scores are
summarized in Table 1.
Pauker and Kassirer3 proposed the threshold model of diagnosis, which identifies test and treatment thresholds for
clinical decisionmaking. For example, according to the initial clinical examination results, patients with acute
abdominal pain and a probability of appendicitis that is below the test threshold may be discharged home without
additional diagnostic tests, whereas those with a high probability of disease that is above the treatment threshold may
be treated with immediate appendectomy. Patients who have an intermediate risk of appendicitis that is between the test
and treatment thresholds might undergo imaging or observation for further data gathering. A truly useful clinical
decision rule would classify patients into low-, moderate-, and high-risk groups that correspond to the zones below the
test threshold, between the test and treatment thresholds, and above the treatment threshold, respectively.4 On the
other hand, if a clinical decision rule for appendicitis creates low- and high-risk groups, but the low-risk group is
not low risk enough to rule out appendicitis and the high-risk group is not high risk enough to rule it in, then the
clinical decision rule does not have good clinical relevance.

Volume 64, no. 4 : October 2014 Annals of Emergency Medicine 365

Downloaded for Mahasiswa 1 FK UNPAD (mhs.clinicalkey1@fk.unpad.ac.id) at Universitas Padjadjaran from ClinicalKey.com by Elsevier on October 04, 2018.
For personal use only. No other uses without permission. Copyright ©2018. Elsevier Inc. All rights reserved.
Most Clinically Useful Cutoffs for Appendicitis Scores Ebell & Shinholser

Editor’s Capsule Summary

What is already known on this topic
It is unclear whether appendicitis scoring instruments are of any value.

What question this study addressed
This systematic review compared the 2 published and validated appendicitis scores and their ability to rule in or rule
out appendicitis at various pretest clinical suspicion levels.

What this study adds to our knowledge
In contrast to work recently published in this journal, this study suggests that scoring instruments may have a role in
specific situations. For example, in children, if the clinical suspicion for appendicitis is less than 60% and the
Alvarado score is less than 4, observation without imaging may be preferred. In adults, if the clinical suspicion for
appendicitis is greater than 50% and the Alvarado score is greater than 9, surgery without imaging may be best.

How this is relevant to clinical practice
These systematic review data suggest optimal ways in which the Alvarado score might be applied.

Goals of This Investigation
In this study, we will perform a diagnostic meta-analysis of high-quality studies of the Alvarado score and the
Pediatric Appendicitis Score and evaluate the clinical utility of the full range of calculable pairs of cutoffs to
define low-, moderate-, and high-risk groups. We will also determine the accuracy of decision thresholds that were not
explicitly reported by the original studies.

Table 1. The Alvarado score and the Pediatric Appendicitis Score.1,2

| Clinical Variable | Alvarado Score | PAS |
|---|---|---|
| Migration of pain | 1 | 1 |
| Anorexia | 1 | 1 |
| Nausea or vomiting | 1 | 1 |
| Right lower quadrant tenderness | 2 | 2 |
| Rebound pain | 1 | |
| Elevated temperature* | 1 | 1 |
| Leukocytosis (10,000/µL) | 2 | 1 |
| Shift of WBC count to the left (75% polymorphonucleocytes) | 1 | 1 |
| Cough/percussion/hopping cause pain in the RLQ | | 2 |
| Total | 10 | 10 |

PAS, Pediatric Appendicitis Score; WBC, white blood count; RLQ, right lower quadrant.
*Fever generally defined as greater than or equal to 37.3°C (99.2°F) for the Alvarado score and greater than or equal
to 37.3°C (99.2°F) or 38.0°C (100.4°F) for PAS.

MATERIALS AND METHODS

Study Design
We searched PubMed, using the following initial search strategy: (Clinical Prediction Guides/Broad[filter]) AND
(appendicitis) AND (sensitivity[tiab] OR specificity[tiab] OR receiver[tiab] OR “likelihood ratio”[tiab] OR “predictive
value”[tiab]) OR “alvarado score”[tiab].
We also searched the reference lists of previous systematic reviews to identify any additional studies not identified
by our initial search.

Selection of Participants
We included only original research studies that gathered data prospectively for a series of adults or children with
acute abdominal pain or clinically suspected appendicitis and also reported sufficient data to calculate the
sensitivity and specificity or likelihood ratio for a clinical decision rule. We excluded the derivation study used to
create each clinical decision rule, studies gathering data retrospectively or using chart review, studies that failed
to report how they followed up patients who did not undergo surgery, studies of mixed populations of adults and
children, and any studies using a case-control design. We also excluded studies that calculated any modification of the
Alvarado score. We included studies performed in any country and published in any language as long as they met our
inclusion criteria described above.
These inclusion criteria ensured selection of high-quality diagnostic test studies that used a prospective cohort
design and avoided verification and spectrum bias. Because we included only studies that enrolled patients and gathered
data prospectively, the persons performing the index test were blinded to the final diagnosis. To assess quality, we
used criteria recommended by the Cochrane Handbook for Diagnostic Test Accuracy Reviews
(http://srdta.cochrane.org/handbook-dta-reviews, chapter 9, Table 9.1) and adapted for this topic.

Data Collection and Processing
Each abstract was initially reviewed by both of the authors to determine whether it met our inclusion criteria, and
differences were resolved by discussion until consensus was achieved. We then abstracted study design characteristics
in parallel and again met to resolve any discrepancies. Finally, data about the accuracy of each clinical decision rule
were abstracted in parallel and discrepancies resolved by consensus discussion. Studies enrolling only persons aged
14 years and older were classified as involving adults, and those enrolling only persons aged 18 years and younger were
classified as involving children. We analyzed studies enrolling both adults and children separately from those
enrolling only children or only adults. For studies published in languages other than English, we relied on assistance
from native speakers or used Google Translate to translate key parts.
In addition to the risk categories originally described by the authors, where possible we also abstracted data for
other risk



Table 2. Characteristics of included studies.

| Study | Population | Reference Standard | n | Mean Age (SD), Range, Years | Male (%) | Percentage With Appendicitis (0%-100%) | Percentage Operated With Appendicitis (0%-100%) | Country |
|---|---|---|---|---|---|---|---|---|
| Adults | | | | | | | | |
| Al Qahtani, 2004¹¹ | Consecutive patients with suspected appendicitis | Surgery + clinical f/u by telephone 2–3 days after d/c | 211 | 32, 13–72 | 59.2 | 56.9 | 87.7 | Saudi Arabia |
| Baidya, 2007¹² | Clinically suspected appendicitis | Surgery + clinical f/u at hospital d/c | 231 | 26.3, 16–72 | 61 | 51.5 | 93.7 | India |
| Kim, 2008¹³ | Clinically suspected appendicitis | Surgery + clinical f/u at 3 mo | 157 | 37.1 (16.5), 15–84 | 40.1 | 57.3 | 96.8 | Korea |
| Limpawattanasiri, 2011¹⁴ | Clinically suspected appendicitis | Surgery + inpatient f/u at 24+ h | 1,000 | NR, 15–72 | 40.7 | 71.5 | 85.3 | Thailand |
| Pouget-Baudry, 2010¹⁵ | RLQ abdominal pain | Surgery + follow-up clinical f/u at 1 wk | 233 | 31.5, 15–88 | 48.1 | 73.4 | 98.3 | France |
| Pruekprasert, 2004¹⁶ | Clinically suspected appendicitis | Surgery + final f/u at hospital d/c | 231 | 27, 14–75 | 58 | 80.5 | 92.5 | Thailand |
| Sanabria, 2007¹⁷ | Patients >14 y with pain in the right lower quadrant | Surgery + follow-up telephone call at 30 days | 374 | 29.5 (10.8), 15–71 | 47.6 | 55.2 | 76.60 | Colombia |
| Memon, 2009¹⁸ | Clinically suspected appendicitis undergoing surgery | Surgical findings | 100 | 24.8 (9), 13–55 | 65 | 91 | 91 | Pakistan |
| Inci, 2011¹⁹ | Clinically suspected appendicitis undergoing surgery | Surgical findings | 66 | 26.5 (11.3), 14–72 | 52.9 | 86.4 | 86.4 | Turkey |
| Canavosso, 2008²⁰ | Clinically suspected appendicitis undergoing surgery | Surgical findings | 207 | 26.6, 13–82 | 52.2 | 91.3 | 91.3 | Argentina |
| Denizbasi, 2003²¹ | Clinically suspected appendicitis undergoing surgery | Surgical findings | 221 | 26.6, 14+ | 53.6 | 79.2 | 79.2 | Turkey |
| Kang, 1989⁴⁰ | Clinically suspected appendicitis | Surgery + clinical follow-up | 62 | 45.8, 18–78 | 66.1 | 67.7 | 85.7 | China |
| Sigdel, 2010²² | Clinically suspected appendicitis undergoing surgery | Surgical findings | 100 | 27.5 (9.8), 15–68 | 72 | 94 | 94 | Nepal |
| Children | | | | | | | | |
| Shreef, 2010²³ | Suspected acute appendicitis | Surgery + 24-h f/u or telephone call | 350 | 9.3, 8–14 | 56.1 | 37.7 | 72.5 | Egypt, Saudi Arabia |
| Bond, 1990²⁴ | Consecutive patients with abdominal pain <1 wk | Surgery + clinical f/u at 2 wk | 189 | NR, 2–17 | NR | 61.4 | 94.3 | US |
| Escribá, 2011²⁵ | Suspected appendicitis who had blood drawn in ED | Surgery + telephone call 10 days after discharge from ED | 99 | 11.2 (3.7), 4–17.8 | 62.6 | 42.4 | 95.5 | Spain |
| Mandeville, 2011²⁶ | Verbal children 4 to 17 y with clinical suspicion of appendicitis | Surgery + f/u telephone call 2 wk later | 287 | 9.8 (3.1), 4–16 | 52.6 | 54 | NR | USA |
| Borges, 2003²⁷ | Clinically suspected appendicitis | Surgery + clinical at 1 wk (visit or telephone) | 76 | NR, <18 | 52.6 | 71.1 | 96.4 | Brazil |
| Schneider, 2007²⁸ | Clinically suspected appendicitis | Surgery + f/u telephone call 2 wk later | 588 | 11.9, 3–21 | NR | 33.5 | 91.7 | USA |
| Bhatt, 2009²⁹ | Abdominal pain <3 days and appendicitis suspected | Surgery + f/u telephone call 1 mo later | 246 | 10.9 (3.4), 4–18 | 59.8 | 33.7 | 87.4 | Canada |
| Goldman, 2008³⁰ | Abdominal pain <7 days | Surgery + telephone call 5–7 days later | 849 | NR, 1–17 | NR | 14.5 | NR | Canada |
| Zuniga, 2012³¹ | Clinically suspected appendicitis <7 days | Surgery + clinical f/u 7 days later | 101 | 9.5 (2.8) | 54.5 | 27.7 | 93.3 | Spain |
| Wu, 2012³⁸ | Clinically suspected appendicitis | Surgery + telephone call 2 wk after discharge from ED | 1,395 | 11.1 (4.2), 3–18 | 46.2 | 63.2 | NR | Taiwan |
| Wu, 2012³⁹ | Clinically suspected appendicitis | Surgery + telephone call 2 wk after discharge from ED | 594 | 11.1 (4.2), 4–18 | 60.3 | 51.5 | 74.8 | Taiwan |

f/u, follow-up; d/c, discharge; h, hours; wk, week; mo, month.
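
As an arithmetic check on Table 2, the percentage of patients with appendicitis can be pooled across the adult studies that followed up nonoperated patients. The sketch below assumes a simple sample-size-weighted average, which the article does not state explicitly; the tuple layout and the function name `pooled_percentage` are illustrative only.

```python
# (study, n, percentage with appendicitis) for the adult studies in Table 2
# whose reference standard was surgery plus clinical follow-up.
ADULT_STUDIES_WITH_FOLLOW_UP = [
    ("Al Qahtani, 2004", 211, 56.9),
    ("Baidya, 2007", 231, 51.5),
    ("Kim, 2008", 157, 57.3),
    ("Limpawattanasiri, 2011", 1000, 71.5),
    ("Pouget-Baudry, 2010", 233, 73.4),
    ("Pruekprasert, 2004", 231, 80.5),
    ("Sanabria, 2007", 374, 55.2),
    ("Kang, 1989", 62, 67.7),
]


def pooled_percentage(studies):
    """Sample-size-weighted mean percentage (an assumed pooling method)."""
    total_n = sum(n for _, n, _ in studies)
    weighted_sum = sum(n * pct for _, n, pct in studies)
    return weighted_sum / total_n


if __name__ == "__main__":
    pooled = pooled_percentage(ADULT_STUDIES_WITH_FOLLOW_UP)
    print(f"Pooled percentage with appendicitis in adults: {pooled:.1f}%")
```

Under this assumption the adult rows pool to about 66%, in line with the approximate pretest probability quoted for adults in the abstract.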

categories. Many studies reported the number of patients with and without a final diagnosis of appendicitis for each
value of the Alvarado score or Pediatric Appendicitis Score. Even if the original study reported only the accuracy for a
single dichotomous cutoff or pair of cutoffs, this information allowed us to abstract data for a series of dichotomous
cutoffs and for commonly reported pairs of cutoffs such as less than 4, 4 to 6, and greater than or equal to 7 points,
and less than 5, 5 to 8, and greater than or equal to 9 points for the Alvarado score.
Data were recorded in parallel in a Google Docs spreadsheet. After discrepancies were reconciled as described above, the
final data set was imported into Stata (version 12.1; StataCorp, College Station, TX) for analysis.

Primary Data Analysis
We used the MIDAS library (Ben Dwamena, 2007. MIDAS: A program for Meta-analytical Integration of Diagnostic Accuracy
Studies in Stata. Division of Nuclear Medicine, Department of Radiology, University of Michigan Medical School, Ann
Arbor, Michigan) to calculate summary likelihood ratios for the final diagnosis of appendicitis for patients who scored
above and below each cutoff. MIDAS uses a bivariate mixed-effects regression model5,6 that has been modified for
meta-analysis of diagnostic accuracy studies.7
We took the approach of Ohle et al6 to calculate likelihood ratios for low-, moderate-, and high-risk groups. For
example, if low-, moderate-, and high-risk groups were defined as less than 4, 4 to 6, and greater than or equal to
7 points, respectively, we calculated the likelihood ratios for less than 4 versus greater than or equal to 4, less
than 7 versus greater than or equal to 7, and 4 to 6 versus all other scores. In some cases, studies contributed data
for the upper or lower cutoff, but not both. We also report 95% confidence intervals (CIs) for each summary likelihood
ratio. Bivariate summary receiver operating characteristic curves were created for the key lower and upper clinical
cutoff points with the MIDAS procedure where possible. Because they use a bivariate random-effects regression model,
they can be used to provide a valid estimate of the summary likelihood ratios.

RESULTS
Our initial PubMed search yielded 526 studies, and a search for “‘pediatric appendicitis score’[tiab]” yielded
14 studies. Review of the reference lists of previously published systematic reviews8-10 identified 10 additional
studies not found by the initial PubMed searches. The final total was 544 unique original research studies, of which a
total of 29 met our inclusion criteria. The remaining studies generally did not study accuracy or did not report
sufficient data to calculate accuracy, were retrospective, or used a case-control design.
The characteristics of included studies are summarized in Table 2. Of the 29 studies, 13 reported on use of the Alvarado
score in adults11-22,40 and 11 on use of the Alvarado score or Pediatric Appendicitis Score in children.23-31,38,39 Five
additional studies reported data for a mixed population of adults and children32-36 and were excluded from the analysis.
Excluding studies that enrolled only patients who underwent surgery, the pooled percentage of patients with appendicitis
was 66.0% in the adult studies (range 51.5% to 80.5%), 38.8% in the mixed-population studies (range 35.0% to 58.3%), and
33.4% in the studies of children (range 14.5% to 71.1%). The mean age ranged from 25 to 37 years in studies of adults,
23 to 30 years in studies with a mixed population, and 9.3 to 11.9 years in studies of children.
Because we used methodologically rigorous inclusion criteria, all included studies were prospective cohort studies of
unselected patients with abdominal pain or suspected appendicitis, had adequate follow-up, and reported sufficient data
to calculate Alvarado scores or Pediatric Appendicitis Scores. None of the studies blinded outcome assessors, and both
scores were always calculated before surgery or on admission. Five studies enrolled only patients undergoing
surgery,18-22 whereas the remainder were all judged to have adequate clinical follow-up in nonoperated patients to avoid
verification bias. Most studies specified that the decision to perform surgery was independent of the clinical decision
rule(s). In 3 studies, the decision to operate was guided by the score result,12,14,20 and for 5 studies it was not
clear whether the surgery decision was independent.11,19,24,30,31 A summary of the quality assessment is shown in
Figure 1.

Figure 1. Assessment of study quality. Green indicates low risk of bias, yellow unclear risk of bias, and red high risk
of bias for each aspect of study quality.

The accuracy of the Alvarado score in adults for the most commonly used sets of decision thresholds is summarized in
Table 3. The likelihood ratio for the low-risk groups in adults ranged from 0.01 to 0.38 and was less than 0.05 for 6 of
10 study populations. Two studies had significantly higher likelihood ratios for the low-risk group.13,15 One of these
2 studies found appendicitis in 23 of 55 patients with a score less than 4 points, but the score was calculated by an
intern on call, so lack of clinical experience may have contributed to a failure to detect clinical signs and symptoms
of appendicitis.15 The other was a Korean study that found appendicitis in 11 of 34 patients with a score less than
4 points; however, we could find no explanation for the higher rate of appendicitis compared with that in other
studies.13 In regard to the high-risk groups in adults, the likelihood ratio ranged from 1.76 to 13.7. The summary
likelihood ratio was 3.4 (95% CI 2.5 to 4.6) for a cutoff of greater than or equal to 7 points and 6.7 (95% CI 3.5 to
12.7) for a cutoff of greater than or equal to 9 points. Bivariate summary receiver operating characteristic curves for
cutoffs of less than 4, greater than or equal to 7, and greater than or equal to 9 in adults are shown in Figure E1A
through C (available online at http://www.annemergmed.com); receiver operating characteristic curves for cutoffs of less
than 4, less than 5, greater than or equal to 7, and greater than or equal to 9 in children are shown in Figure E2A
through D (available online at http://www.annemergmed.com).
The accuracy of the Alvarado score and Pediatric Appendicitis Score in children is summarized in Table 4. The likelihood
ratio for appendicitis was significantly higher in the low-risk group for the Pediatric Appendicitis Score (0.13; 95% CI
0.04 to 0.4) than for the Alvarado score (0.02 for a cutoff of <4 points; 0.04 for a cutoff of <5 points). On the other
hand, the high-risk group for the Pediatric Appendicitis Score had a higher likelihood ratio than for the Alvarado
scores, but with considerable heterogeneity. Two studies were outliers, with higher likelihood ratios for the low-risk
group than other studies.26,30 The study by Goldman et al30 enrolled all children with abdominal pain, not just children
with right lower quadrant pain or suspected appendicitis. We could find no reason why the other study26 might have been
an outlier. Seven studies reported data sufficient

Table 3. Performance of Alvarado score in adults.

| Study | n | Reference Standard | Low Risk, Likelihood Ratio (95% CI) | Moderate Risk, Likelihood Ratio (95% CI) | High Risk, Likelihood Ratio (95% CI) |
|---|---|---|---|---|---|
| Alvarado score cutoffs | | | <4 | 4–6 | ≥7 |
| Denizbasi, 2003²¹ | 221 | Surgery | | | 1.76 (1.34–2.29) |
| Canavosso, 2008²⁰ | 207 | Surgery | 0.10 (0–4.90) | 0.23 (0.13–0.40) | 1.96 (1.17–3.30) |
| Sigdel, 2010²² | 100 | Surgery | | | 2.46 (0.79–7.65) |
| Memon, 2009¹⁸ | 100 | Surgery | 0.04 (0–0.83) | 0.54 (0.35–0.82) | 5.24 (0.82–33.5) |
| Inci, 2011¹⁹ | 66 | Surgery | | | 2.53 (1.00–6.41) |
| Al Qahtani, 2004¹¹ | 211 | Surgery + f/u | 0.01 (0–0.22) | 0.09 (0.04–0.20) | 6.04 (3.73–9.79) |
| Pruekprasert, 2004¹⁶ | 231 | Surgery + f/u | | | 2.65 (1.63–3.95) |
| Sanabria, 2007¹⁷ | 374 | Surgery + f/u | 0.11 (0.04–0.29) | 0.45 (0.33–0.60) | 2.40 (1.90–3.04) |
| Baidya, 2007¹² | 231 | Surgery + f/u | | | 6.64 (4.05–10.9) |
| Pouget-Baudry, 2010¹⁵ | 233 | Surgery + f/u | 0.26 (0.17–0.41) | 1.07 (0.75–1.54) | 4.65 (2.14–10.1) |
| Limpawattanasiri, 2011¹⁴ | 1,000 | Surgery + f/u | | | 3.41 (2.79–4.17) |
| Kang, 1989⁴⁰ | 62 | Surgery + f/u | | | 3.57 (1.46–8.76) |
| Summary (all studies) | 3,568 | | 0.03 (0.01–0.33); I² = 71 | 0.42 (0.20–0.89); I² = 84 | 3.43 (2.53–4.65); I² = 61 |
| Summary (excluding surgery only studies)* | | | | | 4.29 (2.98–6.17); I² = 59 |
| Alvarado score cutoffs | | | <5 | 5–8 | ≥9 |
| Canavosso, 2008²⁰ | 207 | Surgery | 0.01 (0–0.27) | 0.78 (0.59–1.02) | 7.14 (1.05–48.4) |
| Memon, 2009¹⁸ | 100 | Surgery | 0.01 (0–0.21) | 1.31 (0.72–2.37) | 5.54 (0.36–84.3) |
| Al Qahtani, 2004¹¹ | 211 | Surgery + f/u | 0.01 (0–0.13) | 0.84 (0.61–1.15) | 13.7 (5.2–36.0) |
| Sanabria, 2007¹⁷ | 374 | Surgery + f/u | 0.12 (0.06–0.25) | 1.00 (0.85–1.18) | 4.96 (2.79–8.82) |
| Kim, 2008¹³ | 157 | Surgery + f/u | 0.38 (0.20–0.72) | | |
| Limpawattanasiri, 2011¹⁴ | 1,000 | Surgery + f/u | 0.01 (0–0.21) | | |
| Summary | 2,049 | | 0.01 (0–0.41); I² = 94 | 0.98 (0.84–1.13); I² = 29 | 6.69 (3.51–12.7); I² = 0 |

*Meta-analysis of likelihood ratios for low- and moderate-risk groups is not possible because of the small number of studies.
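
To make the individual entries in Table 3 concrete, the sketch below computes a stratum-specific likelihood ratio and an approximate 95% CI from counts. It uses the log-normal (delta-method) interval, a common textbook choice that is assumed here; the summary estimates in the table come from a bivariate model in MIDAS, which this sketch does not reproduce. The example counts are for the Pouget-Baudry study: 23 of 55 patients scoring below 4 had appendicitis (as stated in the Results), and the totals of 171 with and 62 without appendicitis are back-calculated from n = 233 and the 73.4% prevalence in Table 2, so they are approximate. The function name is hypothetical.

```python
import math


def stratum_likelihood_ratio(diseased_in_stratum, diseased_total,
                             healthy_in_stratum, healthy_total, z=1.96):
    """Likelihood ratio for one score stratum with a log-normal confidence interval.

    LR = P(score in stratum | appendicitis) / P(score in stratum | no appendicitis).
    All four counts must be positive for the interval to be defined.
    """
    lr = (diseased_in_stratum / diseased_total) / (healthy_in_stratum / healthy_total)
    # Standard error of log(LR) for a ratio of two independent proportions.
    se_log = math.sqrt(1 / diseased_in_stratum - 1 / diseased_total
                       + 1 / healthy_in_stratum - 1 / healthy_total)
    lower = math.exp(math.log(lr) - z * se_log)
    upper = math.exp(math.log(lr) + z * se_log)
    return lr, lower, upper


if __name__ == "__main__":
    # Back-calculated counts (assumption): 171 with and 62 without appendicitis;
    # 23 and 32 of them, respectively, had an Alvarado score below 4.
    lr, lower, upper = stratum_likelihood_ratio(23, 171, 32, 62)
    print(f"LR {lr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With these counts the sketch gives a likelihood ratio of about 0.26 with an interval of roughly 0.17 to 0.41, which agrees with the Pouget-Baudry low-risk entry in Table 3.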


Table 4. Performance of the Alvarado score and Pediatric Appendicitis Score in children.*

| Study | n | Low Risk, Likelihood Ratio (95% CI) | Moderate Risk, Likelihood Ratio (95% CI) | High Risk, Likelihood Ratio (95% CI) |
|---|---|---|---|---|
| Alvarado score cutoffs | | <4 | 4–6 | ≥7 |
| Bond, 1990²⁴ | 189 | 0.01 (0–0.24) | 0.24 (0.13–0.44) | 3.12 (2.16–4.50) |
| Schneider, 2007²⁸ | 588 | 0.06 (0.02–0.18) | 0.49 (0.38–0.63) | 3.76 (3.01–4.69) |
| Shreef, 2010²³ | 350 | 0.03 (0–0.47) | 0.19 (0.13–0.30) | 5.09 (3.76–6.88) |
| Escribá, 2011²⁵ | 99 | 0.03 (0–0.53) | 0.17 (0.06–0.44) | 10.3 (4.44–24.0) |
| Wu, 2012³⁹ | 594 | | | 4.38 (3.33–5.76) |
| Mandeville, 2011²⁶ | 287 | 0.38 (0.21–0.70) | 0.31 (0.21–0.46) | 2.72 (2.04–3.62) |
| Summary | 1,513 | 0.02 (0–0.36); I² = 83 | 0.27 (0.19–0.40); I² = 67 | 4.21 (3.33–5.32); I² = 49 |
| Alvarado score cutoffs | | <5 | 5–8 | ≥9 |
| Bond, 1990²⁴ | 189 | 0.01 (0–0.19) | 1.03 (0.80–1.33) | 6.17 (3.58–14.7) |
| Borges, 2003²⁷ | 76 | 0.12 (0.04–0.31) | | |
| Schneider, 2007²⁸ | 588 | 0.18 (0.11–0.29) | 1.25 (1.09–1.43) | 6.75 (3.89–11.7) |
| Shreef, 2010²³ | 350 | 0.01 (0–0.19) | 1.15 (1.01–1.32) | 10.6 (4.22–26.5) |
| Escribá, 2011²⁵ | 99 | 0.02 (0–0.33) | 1.25 (0.84–1.87) | 52.6 (3.27–847) |
| Mandeville, 2011²⁶ | 287 | 0.30 (0.19–0.48) | 1.07 (0.87–1.32) | 6.67 (2.95–15.1) |
| Summary | 1,589 | 0.04 (0–0.36); I² = 80 | 1.16 (1.06–1.27); I² = 0 | 8.47 (5.61–12.8); I² = 0 |
| PAS cutoffs | | <4 | 4–7 | ≥8 |
| Schneider, 2007²⁸ | 588 | 0.16 (0.08–0.32) | 0.71 (0.60–0.84) | 5.04 (3.63–7.00) |
| Goldman, 2008³⁰ | 849 | 0.67 (0.58–0.77) | 4.20 (3.05–5.80) | 1.69 (0.35–8.02) |
| Bhatt, 2009²⁹ | 246 | 0.04 (0.01–0.26) | 0.70 (0.53–0.92) | 11.3 (5.59–22.8) |
| Escribá, 2011²⁵ | 99 | 0.03 (0–0.48) | 0.50 (0.31–0.83) | 79.6 (5.00–1266) |
| Mandeville, 2011²⁶ | 287 | 0.43 (0.21–0.88) | 0.42 (0.31–0.55) | 3.37 (2.35–4.85) |
| Zuniga, 2012³¹ | 101 | 0.09 (0.01–1.43) | 0.58 (0.37–0.91) | 8.34 (3.38–20.6) |
| Summary | 2,170 | 0.13 (0.04–0.40); I² = 96 | 0.70 (0.45–1.11); I² = 95 | 8.10 (4.13–15.9); I² = 91 |

*All studies used surgery + clinical follow-up as the reference standard.
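
How the summary likelihood ratios in Tables 3 and 4 translate into clinical decisions depends on the pretest probability. The sketch below applies the odds form of Bayes' theorem and also inverts it to find the highest pretest probability at which a given likelihood ratio still yields a posttest probability at or below a chosen ceiling. The function names are illustrative, and the 3% ceiling is the rule-out target argued for in the Discussion rather than a fixed constant.

```python
def posttest_probability(pretest, likelihood_ratio):
    """Convert a pretest probability to a posttest probability via odds."""
    if not 0.0 < pretest < 1.0:
        raise ValueError("pretest probability must be strictly between 0 and 1")
    pretest_odds = pretest / (1.0 - pretest)
    posttest_odds = pretest_odds * likelihood_ratio
    return posttest_odds / (1.0 + posttest_odds)


def max_pretest_for_rule_out(likelihood_ratio, ceiling=0.03):
    """Pretest probability at which the posttest probability equals the ceiling."""
    ceiling_odds = ceiling / (1.0 - ceiling)
    pretest_odds = ceiling_odds / likelihood_ratio
    return pretest_odds / (1.0 + pretest_odds)


if __name__ == "__main__":
    # Summary low-risk likelihood ratios for children, taken from Table 4.
    for label, lr in [("Alvarado <4", 0.02), ("Alvarado <5", 0.04), ("PAS <4", 0.13)]:
        at_33 = posttest_probability(0.33, lr)
        limit = max_pretest_for_rule_out(lr)
        print(f"{label}: posttest at 33% pretest = {at_33:.1%}; "
              f"stays at or below 3% up to pretest = {limit:.0%}")
```

Under these assumptions the Alvarado cutoffs of <4 and <5 keep the posttest probability at or below 3% up to pretest probabilities of roughly 61% and 44%, close to the 60% and approximately 40% given in the abstract, whereas the Pediatric Appendicitis Score low-risk group leaves a posttest probability above 3% at a 33% pretest probability.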

to calculate a simple dichotomous cutoff of 7 or more points for the Pediatric Appendicitis Score.25,26,28-31,38 This
yielded a positive likelihood ratio of 5.2 (95% CI 3.1 to 8.6) and negative likelihood ratio of 0.38 (95% CI 0.20 to
0.74).

LIMITATIONS
One limitation of the current study is the presence of threats to validity in the included studies. There were also
differences in the level of training of the physicians gathering the clinical data. However, we limited our analysis to
high-quality, prospective studies that avoided verification bias. Another limitation is that pretest probabilities are
not well known to most physicians in their practice setting and that the test and treatment thresholds are generally
determined intuitively by each physician rather than by a formal assessment of the benefits or harms of testing,
treating, or neither. Finally, CIs were fairly large for some estimates, notably, the high-risk group for a cutoff of
9 for the Alvarado score in adults and most of the low-risk groups. The latter was due primarily to 2 studies that were
outliers.26,30

DISCUSSION
Previous systematic reviews on this topic have had limitations. For example, some analyzed results only for either a
single cutoff,5 a single pair of cutoffs (ie, <5, 5 to 6, or ≥7 points),6 or combined results from studies using
different cutoffs.7 Two previous systematic reviews included studies that gathered data retrospectively or had
incomplete follow-up of nonoperated patients, which might lead to partial verification bias.5,6 Finally, we identified
high-quality, primary studies that met our inclusion criteria but were not identified by previous studies because of
language exclusion criteria or publication date.5-7
In the current study, we limited our analysis to studies that avoided these important biases: they used prospective data
collection, performed follow-up to determine the outcome for nonoperated patients, and did not use a case-control
design. We also examined the accuracy for cutoffs not reported by the original study to identify all high-quality
studies that examined a particular cutoff for the test or treatment threshold.
We were interested in identifying optimal cutoffs for the test and treatment thresholds because this reflects modern
clinical decisionmaking about appendicitis. Patients with a high probability of appendicitis are generally taken
directly to the operating theater, those with a very low probability are either observed or discharged, and those with
an intermediate probability generally undergo imaging. The lower decision threshold corresponds to the “test threshold,”
whereas the upper cutoff corresponds to the “treatment threshold.”
Figure 2 summarizes the clinical implications of our study for the Alvarado score. We used likelihood ratios for test
and treatment thresholds calculated in Tables 3 and 4 and calculated posttest probabilities of appendicitis over a
plausible range of pretest probabilities. Ohmann et al37 proposed that clinical rules not miss appendicitis in more than
5% of patients, which corresponds to the test threshold to define a low-risk group. They also proposed that the negative
appendectomy rate be no more than 15%, which corresponds to a treatment threshold of 85%. We argue that a lower missed
appendicitis rate of perhaps 3% is more appropriate, particularly in children.

Figure 2. Clinical application of optimal test and treatment thresholds for the Alvarado score in adults and children
and the Pediatric Appendicitis Score in children. Green indicates probability of appendicitis below 3% and red a
probability of 85% or higher. Bold columns indicate typical pretest probability of a final diagnosis of appendicitis for
children (33%) and adults (66%) presenting with clinically suspected appendicitis from included studies. LR, Likelihood
ratio.

In children, when the pretest probability of acute appendicitis is 60% or less (true for 8 of 11 studies in our
analysis), an Alvarado score below 4 is associated with a probability of appendicitis of less than 3%. For a range of
pretest probabilities greater than 40%, typical for studies of consecutive adults presenting with clinically suspected
appendicitis, a score of 9 or higher is associated with an 85% or higher probability of appendicitis, a point at which
most surgeons would be comfortable ruling in the diagnosis. A score of 7 or higher rules in appendicitis at pretest
probabilities of at least 60%. However, at these pretest probabilities, the Alvarado score is not useful for ruling in
appendicitis in children or ruling it out in adults.
Nevertheless, thoughtful application of the Alvarado score has the potential to significantly reduce use of computed
tomography and other imaging in children who have a low likelihood of appendicitis. The positive likelihood ratio of the
Pediatric Appendicitis Score for ruling in appendicitis at 8 or more points was similar to that of the Alvarado score,
and the negative likelihood ratio was higher, making it worse at ruling out appendicitis. Thus, it has less clinical
value than the Alvarado score, according to our analysis (Figure 2).
Kharbanda et al41 recently published a multicenter validation of the Low Risk Appendicitis Score. Their original rule was
they could achieve somewhat better sensitivity (98.1%) and negative likelihood ratio (0.08) with the new rule. Of
course, this new rule would require prospective validation before it is applied.
Clearly, knowledge of the pretest probability is a critical element to intelligently making use of the Alvarado score
and similar clinical decision rules. However, it is challenging for the typical clinician in most settings to know what
this pretest probability is for common conditions in their practice or emergency department (ED). It is hoped that
increased use of electronic health records will allow emergency physicians and surgeons to have some idea of the
percentage of children and adults who present with acute abdominal pain and ultimately receive a diagnosis of
appendicitis.
It is also important to identify test and treatment thresholds to guide clinical practice and the development and
validation of clinical decision rules. However, few data are available in regard to these thresholds. There are
currently several approaches to determining thresholds (eg, cost-utility analysis, decision analysis, use of clinical
vignettes), and more work is needed to determine the best approach.
Finally, developers of clinical decision rules for the diagnosis of appendicitis and other conditions should explicitly
consider the typical range of pretest probabilities in different settings, as well as the test and treatment thresholds,
when deriving and validating clinical decision rules. We argue that the most useful rules identify patients whose scores
are below the test threshold, above the treatment threshold, or both.4
In summary, in children with a pretest probability of acute appendicitis of 60% or less, an Alvarado score below
4 points rules out the diagnosis satisfactorily. When the pretest probability is greater than or equal to 60% in adult
populations, an Alvarado score of 7 or higher rules in the diagnosis, whereas a score of 9 or higher rules in the
diagnosis at pretest probabilities greater than or equal to 40%.

Supervising editor: Kathy N. Shaw, MD, MSCE
Author affiliations: From the Department of Epidemiology and Biostatistics, College of Public Health, University of
Georgia, Athens, GA.
Author contributions: MHE conceived and designed the study, supervised the conduct of the systematic review and data
collection, performed the literature searches, conducted the analysis, and wrote the article. MHE and JS performed
selection of studies and data abstraction. JS approved the article. MHE takes responsibility for the paper as a whole.
Funding and support: By Annals policy, all authors are required to disclose any and all commercial, financial, and other
relationships in any way related to the subject of this article as per ICMJE conflict of interest guidelines (see
www.icmje.org). The authors have stated that no such relationships exist.
Publication dates: Received for publication July 14, 2013.
Revision received January 25, 2014. Accepted for publication
95.5% sensitive and 36.3% specific, with a negative likelihood February 28, 2014. Available online April 14, 2014.
ratio of 0.12. They changed the rule post hoc and found that
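The posttest probabilities discussed in this article follow from a pretest probability and a likelihood ratio through the odds form of Bayes' theorem. The short Python sketch below is an illustration only, not the authors' analysis code: the function name and the `TEST_THRESHOLD` constant are hypothetical, the likelihood ratios are the summary values for children reported in the Results, and the 3% threshold is the missed-appendicitis rate argued for in the text.

```python
def posttest_probability(pretest: float, likelihood_ratio: float) -> float:
    """Convert a pretest probability to a posttest probability via odds."""
    if not 0.0 < pretest < 1.0:
        raise ValueError("pretest must be strictly between 0 and 1")
    if likelihood_ratio < 0.0:
        raise ValueError("likelihood_ratio must be non-negative")
    pretest_odds = pretest / (1.0 - pretest)          # probability -> odds
    posttest_odds = pretest_odds * likelihood_ratio   # apply the likelihood ratio
    return posttest_odds / (1.0 + posttest_odds)      # odds -> probability


# Assumed test threshold: the 3% missed-appendicitis rate proposed in the text.
TEST_THRESHOLD = 0.03

# Alvarado score in children, summary likelihood ratios from the Results.
child_below_4 = posttest_probability(0.60, 0.02)  # score < 4, pretest 60%
child_below_5 = posttest_probability(0.40, 0.04)  # score < 5, pretest 40%

for label, p in [("score < 4 at 60% pretest", child_below_4),
                 ("score < 5 at 40% pretest", child_below_5)]:
    verdict = "below" if p < TEST_THRESHOLD else "not below"
    print(f"{label}: posttest {p:.1%} ({verdict} the 3% test threshold)")
```

Both examples fall under the 3% threshold, which is consistent with the rule-out statements for children. Because the inputs are rounded summary likelihood ratios, results near a threshold should be read as approximate.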

Volume 64, no. 4 : October 2014 Annals of Emergency Medicine 371


REFERENCES
1. Alvarado A. A practical score for the early diagnosis of acute appendicitis. Ann Emerg Med. 1986;15:557-564.
2. Samuel M. Pediatric appendicitis score. J Pediatr Surg. 2002;37:877-881.
3. Pauker SG, Kassirer JP. The threshold approach to clinical decision-making. N Engl J Med. 1980;302:1109-1117.
4. Ebell MH. AHRQ white paper: use of clinical decision rules for point-of-care decision support. Med Decis Mak. 2010;30:712-721.
5. van Houwelingen HC, Arends LR, Stijnen T. Advanced methods in meta-analysis: multivariate approach and meta-regression. Stat Med. 2002;21:589-624.
6. van Houwelingen HC, Zwinderman KH, Stijnen T. A bivariate approach to meta-analysis. Stat Med. 1993;12:2273-2284.
7. Reitsma JB, Glas AS, Rutjes AWS, et al. Bivariate analysis of sensitivity and specificity produces informative summary measures in diagnostic reviews. J Clin Epidemiol. 2005;58:982-990.
8. Kulik DM. Does this child have appendicitis? a systematic review of clinical prediction rules for children with acute abdominal pain. J Clin Epidemiol. 2013;66:95-104.
9. Ohle R, O’Reilly F, O’Brien KK, et al. The Alvarado score for predicting acute appendicitis: a systematic review. BMC Med. 2011;9:139.
10. Liu JLY, Wyatt JC, Deeks JJ, et al. Systematic reviews of clinical decision tools for acute abdominal pain. Health Technol Assess. 2006;10:1-167.
11. Al Qahtani HH, Muhammad AA. Alvarado score as an admission criterion for suspected appendicitis in adults. Saudi J Gastroenterol. 2004;10:86-91.
12. Baidya N, Rodrigues G, Rao A, et al. Evaluation of Alvarado score in acute appendicitis: a prospective study. Internet J Surg. 2007;9:1.
13. Kim K, Rhee JE, Lee CC, et al. Impact of helical computed tomography in clinically evident appendicitis. Emerg Med J. 2008;25:477-481.
14. Limpawattanasiri C. Alvarado score for the acute appendicitis in a provincial hospital. J Med Assoc Thai. 2011;94:441-449.
15. Pouget-Baudry Y, Mucci S, Eyssartier E, et al. The use of the Alvarado score in the management of right lower quadrant abdominal pain in the adult. J Visc Surg. 2010;147:e40-e44.
16. Pruekprasert P, Maipang T, Geater A, et al. Accuracy in diagnosis of acute appendicitis by comparing serum C-reactive protein measurements, Alvarado score and clinical impression of surgeons. J Med Assoc Thai. 2004;87:296-303.
17. Sanabria A, Domínguez LC, Bermúdez C, et al. [Evaluation of diagnostic scales for appendicitis in patients with lower abdominal pain]. Biomedica. 2007;27:419-428.
18. Memon AA, Vohra LM, Khaliq T, et al. Diagnostic accuracy of Alvarado score in the diagnosis of acute appendicitis. Pak J Med Sci. 2009;25:118-121.
19. Inci E, Hocaoglu E, Aydin SP, et al. Efficiency of unenhanced MRI in the diagnosis of acute appendicitis: comparison with Alvarado scoring system and histopathological results. Eur J Radiol. 2011;80:253-258.
20. Canavosso L, Carena P, Carbonell JM, et al. [Right iliac fossa pain and Alvarado score]. Cir Esp. 2008;83:247-251.
21. Denizbasi A, Unluer EE. The role of the emergency medicine resident using the Alvarado score in the diagnosis of acute appendicitis compared with the general surgery resident. Eur J Emerg Med. 2003;10:296-301.
22. Sigdel GS, Lakhey PJ, Mishra PR. Tzanakis score vs. Alvarado score in acute appendicitis. JNMA J Nepal Med Assoc. 2010;49:96-99.
23. Shreef KS, Waly AH, Abd-Elrahman S, et al. Alvarado score as an admission criterion in children with pain in right iliac fossa. Afr J Paediatr Surg. 2010;7:163-165.
24. Bond GR, Tully SB, Chan LS, et al. Use of the MANTRELS score in childhood appendicitis: a prospective study of 187 children with abdominal pain. Ann Emerg Med. 1990;19:1014-1018.
25. Escriba A, Gamell AM, Fernandez Y, et al. Prospective validation of two systems of classification for the diagnosis of acute appendicitis. Pediatr Emerg Care. 2011;27:165-169.
26. Mandeville K, Pottker T, Bulloch B, et al. Using appendicitis scores in the pediatric ED. Am J Emerg Med. 2011;29:972-977.
27. Borges PS, Lima M, Neto GH. The Alvarado score validation in diagnosing acute appendicitis in children and teenagers at the Instituto Materno Infantil de Pernambuco, IMIP. Rev Bras Saude Matern Infant. 2003;3:439-445.
28. Schneider C, Kharbanda A, Bachur R. Evaluating appendicitis scoring systems using a prospective pediatric cohort. Ann Emerg Med. 2007;49:778-784, 784.e1.
29. Bhatt M, Joseph L, Ducharme FM, et al. Prospective validation of the Pediatric Appendicitis Score in a Canadian pediatric emergency department. Acad Emerg Med. 2009;16:591-596.
30. Goldman RD, Carter S, Stephens D, et al. Prospective validation of the Pediatric Appendicitis Score. J Pediatr. 2008;153:278-282.
31. Zúñiga RV, Arribas JL, Montes SP, et al. Application of Pediatric Appendicitis Score on the emergency department of a secondary level hospital. Pediatr Emerg Care. 2012;28:489-492.
32. Andersson M, Andersson RE. The Appendicitis Inflammatory Response Score: a tool for the diagnosis of acute appendicitis that outperforms the Alvarado score. World J Surg. 2008;32:1843-1849.
33. de Castro SM, Unlu C, Steller EP, et al. Evaluation of the Appendicitis Inflammatory Response Score for patients with acute appendicitis. World J Surg. 2012;36:1540-1545.
34. Saidi RF, Ghasemi M. Role of Alvarado score in diagnosis and treatment of suspected acute appendicitis. Am J Emerg Med. 2000;18:230-231.
35. Chan MY, Tan C, Chiu MT, et al. Alvarado score: an admission criterion in patients with right iliac fossa pain. Surgeon. 2003;1:39-41.
36. Jang SO, Kim BS, Moon DJ. [Application of Alvarado score in patients with suspected appendicitis]. Korean J Gastroenterol. 2008;52:27-31.
37. Ohmann C, Yang Q, Franke C. Diagnostic scores for acute appendicitis. Eur J Surg. 1995;161:273-281.
38. Wu HP, Yang WC, Wu KH, et al. Diagnosing appendicitis at different time points in children with right lower quadrant pain: comparison between Pediatric Appendicitis Score and the Alvarado score. World J Surg. 2012;36:216-221.
39. Wu HP, Chen CY, Kuo IT, et al. Diagnostic values of a single serum biomarker at different time points compared with Alvarado score and imaging examinations in pediatric appendicitis. J Surg Res. 2012;174:272-277.
40. Kang W-M, Lee C-H, Chou Y-H, et al. A clinical evaluation of ultrasonography in the diagnosis of acute appendicitis. Surgery. 1989;105:154-159.
41. Kharbanda AB, Dudley NC, Bajaj L, et al. Validation and refinement of a prediction rule to identify children at low risk for acute appendicitis. Arch Pediatr Adolesc Med. 2012;166:738-744.


Figure E1. Bivariate receiver operating characteristic curves for different low and high cutoffs of the Alvarado score in adults.
A, A cutoff of less than 4 points. B, A cutoff of 7 or more points. C, A cutoff of 9 or more points. Because of the small number of
studies and events within studies, the MIDAS procedure in Stata was unable to create a receiver operating characteristic curve for
the cutoff of less than 5 points.


Figure E2. Bivariate receiver operating characteristic curves for different low and high cutoffs of the Alvarado score in children.
A, A cutoff of less than 4 points. B, A cutoff of less than 5 points. C, A cutoff of 7 or more points. D, A cutoff of 9 or more points.
