
CONFIDENCE LEVELS

We warn the reader that there is no universal convention for the term confidence level (The Review of Particle Properties, 1986)

Confidence levels

- Part of descriptive statistics
- Goal of an experiment: measure a theoretical parameter a
- The result usually involves giving some interval [a, b]
- Quoting such an interval with a confidence level:
  - Expresses the probability that the true value is in this interval
  - Allows the information consumer to draw conclusions from the result
  - Sets an upper / lower limit on the true value of a parameter

Confidence level definition

Let some measured quantity x be distributed according to some p.d.f. P(x). We can determine the probability C that x lies within some interval $[x_-, x_+]$:

$$\mathrm{Prob}(x_- \le x \le x_+) = \int_{x_-}^{x_+} P(x)\,dx = C$$

We say: x lies in the interval $[x_-, x_+]$ with confidence C.
Note: C is a probability in the sense of the frequency limit.

Gaussian confidence intervals

If P(x) is a Gaussian distribution with mean $\mu$ and variance $\sigma^2$, some examples of confidence intervals:

$x_\pm = \mu \pm 1\sigma$       C = 68%
$x_\pm = \mu \pm 2\sigma$       C = 95.4%
$x_\pm = \mu \pm 1.64\sigma$    C = 90%
$x_\pm = \mu \pm 1.96\sigma$    C = 95%
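The coverage of a Gaussian interval of $\pm k\sigma$ can be checked numerically. The following is a minimal sketch, assuming only the standard relation $C = \mathrm{erf}(k/\sqrt{2})$; the function name `gaussian_coverage` is illustrative and does not come from the slides.

```python
import math

def gaussian_coverage(k: float) -> float:
    """Probability that a Gaussian variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

# Print the coverage for the interval half-widths used in the examples.
for k in (1.0, 1.64, 1.96, 2.0):
    print(f"+-{k} sigma: C = {100.0 * gaussian_coverage(k):.1f}%")
```

The rounded percentages in the examples correspond to these values to the quoted precision.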

Types of confidence intervals

$$\mathrm{Prob}(x_- \le x \le x_+) = \int_{x_-}^{x_+} P(x)\,dx = C$$

3 conventional ways to choose an interval around the center:
1. Symmetric interval: $x_-$ and $x_+$ equidistant from the mean
2. Shortest interval: minimizes $(x_+ - x_-)$
3. Central interval: equal probability in each tail,

$$\int_{-\infty}^{x_-} P(x)\,dx = \int_{x_+}^{+\infty} P(x)\,dx = \frac{1-C}{2}$$

For a Gaussian (and any symmetric, single-peaked distribution) the 3 definitions are equivalent.
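For an asymmetric p.d.f. the definitions give different intervals. As an illustration (an assumed example, not taken from the slides), consider the exponential $P(x) = e^{-x}$ for $x \ge 0$. Its central interval follows from the tail condition, and because the density falls monotonically, its shortest interval starts at 0. The function names below are hypothetical.

```python
import math

def central_interval(c: float) -> tuple[float, float]:
    """Central interval for P(x) = exp(-x): probability (1 - c)/2 in each tail."""
    tail = (1.0 - c) / 2.0
    x_lo = -math.log(1.0 - tail)   # cdf(x_lo) = tail
    x_hi = -math.log(tail)         # 1 - cdf(x_hi) = tail
    return x_lo, x_hi

def shortest_interval(c: float) -> tuple[float, float]:
    """Shortest interval for P(x) = exp(-x): starts at 0 since the density is decreasing."""
    return 0.0, -math.log(1.0 - c)

for name, (lo, hi) in (("central", central_interval(0.90)),
                       ("shortest", shortest_interval(0.90))):
    print(f"{name}: [{lo:.3f}, {hi:.3f}], length {hi - lo:.3f}")
```

Both intervals contain 90% of the probability, but their end points and lengths differ.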

One-tailed confidence intervals

So far, we considered two-tailed intervals. Useful as well: one-tailed limits.

Upper limit: x lies below $x_+$ at confidence level C:

$$\int_{-\infty}^{x_+} P(x)\,dx = C$$

Lower limit: x lies above $x_-$ at confidence level C:

$$\int_{x_-}^{+\infty} P(x)\,dx = C$$

Confidence intervals in estimation

In a measurement two things are involved:
- Physical parameter(s) X: mass, lifetime, ...
- Measurement of this parameter: x

Given X, there is a p.d.f. for measuring x (resolution, QM, ...). But what you want to know:
- Given a measurement $x \pm \sigma_x$, what can I say about X?

Can I say that X lies within $[x - \sigma_x, x + \sigma_x]$ with 68% probability?
Not in the sense of a frequency: X is not a random variable!

MEANING OF CONFIDENCE INTERVALS

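Confidence belt construction

Neyman construction:
1. For each value of the parameter $\theta$, find an interval $D(\theta) = [x_1(\theta), x_2(\theta)]$ that contains the measurement x with probability C
2. The confidence interval for an observation $x_0$ includes all $\theta$ whose $D(\theta)$ contains $x_0$

NOTE: this is not a statement about the probability of $\theta$ but about the interval!

[Figure: confidence belt. Horizontal axis: possible experimental values x; vertical axis: parameter $\theta$. The belt is bounded by the curves $x_1(\theta), \theta_1(x)$ and $x_2(\theta), \theta_2(x)$; the horizontal segment D at $\theta_0$ runs from $x_1(\theta_0)$ to $x_2(\theta_0)$.]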
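The frequentist reading of the interval can be illustrated with a small simulation: the true value stays fixed, the measurement (and therefore the interval) fluctuates, and the confidence level is the fraction of repeated experiments whose interval contains the true value. This is only a sketch under assumed conditions (Gaussian resolution, a fixed seed, and the hypothetical function name `coverage`).

```python
import random

def coverage(true_x: float, sigma: float, n_experiments: int = 20000, seed: int = 1) -> float:
    """Fraction of simulated experiments whose interval [x - sigma, x + sigma] contains true_x."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_experiments):
        x = rng.gauss(true_x, sigma)           # one simulated measurement
        if x - sigma <= true_x <= x + sigma:   # the interval is random, true_x is not
            hits += 1
    return hits / n_experiments

print(f"coverage of the 1 sigma interval: {coverage(10.0, 20.0):.3f}")
```

The fraction comes out close to 68% whatever value is chosen for `true_x`, which is the sense in which the statement is about the interval and not about the parameter.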

Lower / upper limits using the confidence belt

Given a measurement $x_0$, find $X_-$ and $X_+$ from the confidence belt.

$X_+$ is the upper limit at C.L. $1-\alpha$:

$$\int_{x_0}^{+\infty} P(x|X_+)\,dx = 1-\alpha$$

i.e. if $X \ge X_+$: the probability to measure $x \le x_0$ is at most $\alpha$.

$X_-$ is the lower limit at C.L. $1-\alpha$:

$$\int_{-\infty}^{x_0} P(x|X_-)\,dx = 1-\alpha$$

i.e. if $X \le X_-$: the probability to measure $x \ge x_0$ is at most $\alpha$.

Gaussian confidence levels

P(x|X): Gaussian with standard deviation $\sigma$. Apply the method to determine a 90% C.L. interval for X given a measurement $x_0$:

$$\int_{-\infty}^{x_0} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-X_+)^2}{2\sigma^2}}\,dx = 0.05 = \int_{x_0}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-X_-)^2}{2\sigma^2}}\,dx$$

The equation for $X_-$ requires that $x_0$ lies some number of standard deviations above $X_-$, which is the same as saying that $X_-$ lies the same number of $\sigma$ below $x_0$. The confidence belt is limited by two straight lines:

$$X_\pm = x_0 \pm k\sigma$$

(k depends on the desired C.L.)
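The factor k for a given confidence level follows from the inverse of the Gaussian cumulative distribution. A minimal sketch using `statistics.NormalDist` from the Python standard library; the helper names are illustrative only.

```python
from statistics import NormalDist

def k_two_sided(cl: float) -> float:
    """k such that [x0 - k*sigma, x0 + k*sigma] is a central interval at confidence level cl."""
    return NormalDist().inv_cdf(0.5 + cl / 2.0)

def k_one_sided(cl: float) -> float:
    """k such that x0 + k*sigma is a one-sided upper limit at confidence level cl."""
    return NormalDist().inv_cdf(cl)

def gaussian_interval(x0: float, sigma: float, cl: float) -> tuple[float, float]:
    """Central interval for X given a Gaussian measurement x0 with resolution sigma."""
    k = k_two_sided(cl)
    return x0 - k * sigma, x0 + k * sigma

print(f"90% central interval: k = {k_two_sided(0.90):.3f}")
print(f"90% one-sided limit:  k = {k_one_sided(0.90):.3f}")
```

For the 90% central interval each tail holds 5%, so the same k also gives a one-sided limit at 95%.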

Confidence levels near a physical boundary

- Assume a mass measurement with resolution 20 MeV
- The true mass is 10 MeV
- Use a $2\sigma$ (95.4%) C.I. to quote the result: x ± 40 MeV

Consider cases:
- 2.3% probability that the measurement is > 50 MeV (the quoted interval then lies entirely above the true mass)
- Measurement in the range 40-50 MeV: limits will be true
- x = 0.2 ± 40 MeV: correct the lower limit to 0 and OK
- BUT what if x = -50 MeV ± 40 MeV: X < -10 MeV @ 95% C.L. !!!???

It is strictly speaking correct but ridiculous! Only means of escape: BAYES TO THE RESCUE!

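Bayesian confidence intervals

Bayes' theorem:

$$P(\text{theory}|\text{data}) = \frac{P(\text{data}|\text{theory})\,P(\text{theory})}{P(\text{data})}$$

P(theory): assume all positive masses equally likely:

$$P(\text{true mass}) = \begin{cases} 0, & m < 0 \\ \text{constant}, & m \ge 0 \end{cases}$$

Now apply Bayes' theorem:

$$P(X|x) = \frac{e^{-\frac{(x-X)^2}{2\sigma^2}}}{\int_0^{\infty} e^{-\frac{(x-X')^2}{2\sigma^2}}\,dX'} \qquad (X \ge 0)$$

For x = -50 MeV ± 20 MeV:
- The denominator is a one-sided $2.5\sigma$ Gaussian tail: 0.0062
- Look for the 90% C.L. upper limit: the integral of the numerator above the limit must be 10% of this, ~0.0006, which is a $3.24\sigma$ tail
- Result: mass < -50 MeV + 3.24 * 20 MeV = 15 MeV @ 90% C.L.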

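Bayesian confidence interval: example

Mass measurement:
- Measurement: x = -50 MeV ± 20 MeV
- Prior: assume all positive masses equally likely:

$$P(\text{true mass}) = \begin{cases} 0, & m < 0 \\ \text{constant}, & m \ge 0 \end{cases}$$

- The denominator is a one-sided $2.5\sigma$ Gaussian tail: 0.0062
- The integral of the numerator above the limit must be ~0.0006: $3.24\sigma$
- Result: mass < -50 MeV + 3.24 * 20 MeV = 15 MeV @ 90% C.L.
- For comparison: the frequentist upper limit from the $2\sigma$ interval is -10 MeV (a one-sided 90% limit, $x + 1.28\sigma$, would be about -24 MeV)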
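The numbers of this example can be reproduced with the Gaussian cumulative distribution. The sketch below assumes the flat prior on positive masses stated above; the function name `bayes_upper_limit` is hypothetical, and the inverse c.d.f. is evaluated in the lower tail for numerical stability.

```python
from statistics import NormalDist

def bayes_upper_limit(x: float, sigma: float, cl: float = 0.90) -> float:
    """Upper limit on X >= 0 for a Gaussian measurement x, with a flat prior on X >= 0."""
    norm = NormalDist()
    # Normalisation: fraction of the Gaussian (centred at x) that lies at X >= 0.
    denom = norm.cdf(x / sigma)
    # Posterior probability allowed above the limit, as an absolute Gaussian tail area.
    tail = (1.0 - cl) * denom
    # Distance of the limit from x in units of sigma (inv_cdf(tail) is negative).
    n_sigma = -norm.inv_cdf(tail)
    return x + n_sigma * sigma

limit = bayes_upper_limit(-50.0, 20.0, 0.90)
print(f"Bayesian 90% upper limit: {limit:.1f} MeV")
```

Far from the boundary the normalisation tends to 1 and the result approaches the one-sided Gaussian limit $x + 1.28\sigma$.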

Confidence intervals for discrete distributions

- Physical parameter: real
- Measured variable: discrete
  - Number of counts (Poisson)
  - Number of successes (Binomial)
- May be unable to select a region with exact confidence C (e.g. 90%)
- Play safe: choose the region so that the probability is at least C, which gives overcoverage

Binomial confidence intervals

- Observed value: number of successes (out of N trials) - DISCRETE
- True value: single-trial probability p - CONTINUOUS

If m successes are found in N trials, set limits on the individual probability p: find $p_+$ and $p_-$ such that (using the 95% central limit):

$$\sum_{r=m+1}^{N} P(r; p_+, N) = 0.975 \qquad \sum_{r=0}^{m-1} P(r; p_-, N) = 0.975$$

(Clopper-Pearson limits)

In words:
- Were $p \ge p_+$: the probability to get m counts or less is $\le$ 2.5%
- Were $p \le p_-$: the probability to get m counts or more is $\le$ 2.5%
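The two sums can be solved for $p_+$ and $p_-$ numerically. The following is a sketch using plain bisection on the binomial cumulative sum; the function names and the choice of bisection are assumptions made for illustration, not the method used to produce the original charts.

```python
import math

def binom_cdf(m: int, n: int, p: float) -> float:
    """P(r <= m) for a binomial distribution with n trials and success probability p."""
    return sum(math.comb(n, r) * p**r * (1.0 - p)**(n - r) for r in range(m + 1))

def clopper_pearson(m: int, n: int, cl: float = 0.95) -> tuple[float, float]:
    """Central limits (p_minus, p_plus) on p for m successes in n trials."""
    alpha = 1.0 - cl

    def solve(k: int, target: float) -> float:
        # For k < n, binom_cdf(k, n, p) falls monotonically from 1 to 0 as p goes 0 -> 1.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = 0.5 * (lo + hi)
            if binom_cdf(k, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    # p_minus: P(r <= m-1; p_minus) = 1 - alpha/2, i.e. P(r >= m) = alpha/2.
    p_minus = 0.0 if m == 0 else solve(m - 1, 1.0 - alpha / 2.0)
    # p_plus: P(r <= m; p_plus) = alpha/2.
    p_plus = 1.0 if m == n else solve(m, alpha / 2.0)
    return p_minus, p_plus

print(clopper_pearson(5, 10))
```

The edge cases m = 0 and m = N have no solution for one of the sums, so the corresponding limit is set to the boundary value 0 or 1.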

From the archives: [figure not reproduced]

Poisson confidence intervals

n events are observed from a Poisson process of unknown mean $\mu$. The 90% Poisson upper limit is the value $\mu_+$ such that:

$$\sum_{r=n+1}^{\infty} P(r; \mu_+) = 0.90, \quad \text{or equivalently} \quad \sum_{r=0}^{n} P(r; \mu_+) = 0.10$$

i.e. if the true value of $\mu$ is really $\mu_+$, the probability of getting a number of counts n or smaller is 10%.

Similarly, the 90% Poisson lower limit is the value $\mu_-$ such that:

$$\sum_{r=0}^{n-1} P(r; \mu_-) = 0.90, \quad \text{or equivalently} \quad \sum_{r=n}^{\infty} P(r; \mu_-) = 0.10$$

Some Poisson Limits
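Limits of this kind can be recomputed directly from the defining sums. A minimal sketch, assuming bisection on the Poisson cumulative sum and a search bracket that is adequate for small n; the function names are illustrative.

```python
import math

def poisson_cdf(n: int, mu: float) -> float:
    """P(r <= n) for a Poisson distribution with mean mu."""
    return sum(math.exp(-mu) * mu**r / math.factorial(r) for r in range(n + 1))

def _solve_mu(n: int, target: float) -> float:
    """Mean mu at which poisson_cdf(n, mu) equals target (the sum falls as mu grows)."""
    lo, hi = 0.0, 3.0 * n + 30.0   # assumed bracket, ample for n up to a few tens
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if poisson_cdf(n, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def poisson_upper(n: int, cl: float = 0.90) -> float:
    """Upper limit: P(r <= n; mu_plus) = 1 - cl."""
    return _solve_mu(n, 1.0 - cl)

def poisson_lower(n: int, cl: float = 0.90) -> float:
    """Lower limit: P(r <= n-1; mu_minus) = cl; zero when no events are observed."""
    return 0.0 if n == 0 else _solve_mu(n - 1, cl)

print(" n  lower  upper   (90% C.L.)")
for n in range(6):
    print(f"{n:2d}  {poisson_lower(n):5.2f}  {poisson_upper(n):5.2f}")
```

For n = 0 the upper limit reduces to $-\ln(0.10) \approx 2.30$, the familiar result for zero observed events.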


Confidence intervals using the likelihood function

For non-Gaussian estimators, it is still possible to determine a confidence interval with a simple approximate technique using the likelihood function, or equivalently the $\chi^2$ function.

For an ML estimator $\hat{a}$ of a parameter a, in the large sample approximation:
- The p.d.f. $g(\hat{a}; a)$ becomes Gaussian:

$$g(\hat{a}; a) = \frac{1}{\sqrt{2\pi\sigma_{\hat{a}}^2}} \exp\left(-\frac{(\hat{a} - a)^2}{2\sigma_{\hat{a}}^2}\right)$$

- The likelihood function itself becomes Gaussian with the same $\sigma_{\hat{a}}$:

$$L(a) = L_{max} \exp\left(-\frac{(a - \hat{a})^2}{2\sigma_{\hat{a}}^2}\right)$$

Can extract $\sigma_{\hat{a}}$ from a likelihood scan!

Prescription for setting confidence intervals using the likelihood

1. Extract $\sigma_{\hat{a}}$ from a log-likelihood scan using:

$$\log L(\hat{a} \pm N\sigma_{\hat{a}}) = \log L_{max} - \frac{N^2}{2}$$

2. Use the fact that $g(\hat{a}; a)$ is Gaussian to set confidence intervals, e.g. $[c, d] = [\hat{a} - \sigma_{\hat{a}}, \hat{a} + \sigma_{\hat{a}}]$: 68% central confidence interval

It can be shown that this procedure can be used even if the likelihood function is not Gaussian:
- Exact only in the large sample limit
- May need to use asymmetric intervals around $\hat{a}$

Example: lifetime $\tau$

Using the estimator $\hat{\tau} = \frac{1}{n}\sum_{i=1}^{n} t_i$, with true $\tau = 1$.

[Figure: three panels for samples of 5, 50 and 500 entries. Each shows the histogram h_tau (Entries 5, Mean 1.201, RMS 0.5881; Entries 50, Mean 0.815, RMS 0.6943; Entries 500, Mean 0.8455, RMS 0.7131) and the log-likelihood scan, with the interval $[\hat{\tau} - \sigma_{\hat{\tau}}^-, \hat{\tau} + \sigma_{\hat{\tau}}^+]$ read off where log L falls from log Lmax to log Lmax - 1/2.]
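A log-likelihood scan of this kind can be sketched in a few lines. The code assumes an exponential decay-time p.d.f. $(1/\tau)\,e^{-t/\tau}$, for which the sample mean is the ML estimator, and locates the two points where log L has dropped by 1/2 using bisection. The function names, the seed and the sample sizes are illustrative choices, not the ones behind the original plots.

```python
import math
import random

def log_likelihood(tau: float, times: list[float]) -> float:
    """Log-likelihood of an exponential p.d.f. (1/tau) exp(-t/tau) for the given decay times."""
    return -len(times) * math.log(tau) - sum(times) / tau

def likelihood_interval(times: list[float]) -> tuple[float, float, float]:
    """Return (tau_hat, sigma_minus, sigma_plus) from the log L = log Lmax - 1/2 crossings."""
    tau_hat = sum(times) / len(times)
    target = log_likelihood(tau_hat, times) - 0.5

    def crossing(inside: float, outside: float) -> float:
        # Bisection between a point with log L above the target and one below it.
        for _ in range(200):
            mid = 0.5 * (inside + outside)
            if log_likelihood(mid, times) > target:
                inside = mid
            else:
                outside = mid
        return 0.5 * (inside + outside)

    tau_lo = crossing(tau_hat, tau_hat * 1e-3)
    tau_hi = crossing(tau_hat, tau_hat * 1e3)
    return tau_hat, tau_hat - tau_lo, tau_hi - tau_hat

rng = random.Random(1)
for n in (5, 50, 500):
    sample = [rng.expovariate(1.0) for _ in range(n)]   # true tau = 1
    tau_hat, s_minus, s_plus = likelihood_interval(sample)
    print(f"n = {n:3d}: tau_hat = {tau_hat:.3f}  -{s_minus:.3f}  +{s_plus:.3f}")
```

For small samples the interval is visibly asymmetric; as n grows, both sides approach $\hat{\tau}/\sqrt{n}$, in line with the large sample limit.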

Several variables: confidence regions

- 1-D case: for a single parameter $\theta$, we constructed an interval [a, b] which contains the true value with some probability C
- N-D case: look for parameters $\theta = (\theta_1, ..., \theta_n)$
- In general, cannot find $a_i, b_i$ so that $a_i < \theta_i < b_i$ for all i with some probability C
- Instead: find a confidence region, approximately an n-dimensional hyper-ellipsoid

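Multidimensional confidence regions

Use the properties of the likelihood function in the large sample limit:

1. The ML estimator p.d.f. is Gaussian:

$$g(\hat{\mathbf{a}}; \mathbf{a}) = \frac{1}{(2\pi)^{n/2}\,|V|^{1/2}} \exp\left[-\tfrac{1}{2}\, Q(\hat{\mathbf{a}}; \mathbf{a})\right], \qquad Q(\hat{\mathbf{a}}; \mathbf{a}) = (\hat{\mathbf{a}} - \mathbf{a})^T V^{-1} (\hat{\mathbf{a}} - \mathbf{a})$$

$V^{-1}$: inverse covariance matrix

2. The likelihood function is Gaussian with the same V:

$$L(\mathbf{a}) = L_{max} \exp\left[-\tfrac{1}{2}\, Q(\mathbf{a}, \hat{\mathbf{a}})\right]$$

3. If the p.d.f. is described by an n-dim. Gaussian, $Q(\hat{\mathbf{a}}, \mathbf{a})$ is distributed according to a $\chi^2$ distribution with n d.o.f.:

$$\mathrm{Prob}(Q(\mathbf{a}, \hat{\mathbf{a}}) \le Q_\gamma) = \int_0^{Q_\gamma} f_{\chi^2}(z; n)\,dz$$

where $f_{\chi^2}(z; n)$ is the $\chi^2$ distribution for n d.o.f.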

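Constructing an n-dimensional confidence region

Prescription for finding the confidence region at C.L. $1-\gamma$:

1. Determine the value of $Q_\gamma$ (from tables or numerically) so that:

$$\int_0^{Q_\gamma} f_{\chi^2}(z; n)\,dz = 1-\gamma$$

2. Find the contour on which log L has dropped from its maximum by $Q_\gamma/2$:

$$\Delta \log L = \log L(\mathbf{a}) - \log L_{max} = -\frac{Q_\gamma}{2}$$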

$Q_\gamma$ values for some values of coverage and numbers of parameters: [table not reproduced]
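Values of $Q_\gamma$ can be recomputed by inverting the cumulative $\chi^2$ distribution. The sketch below uses the closed-form series for the $\chi^2$ c.d.f. with integer degrees of freedom and a bisection search, so that only the standard library is needed; the function names and the particular coverages printed are assumptions for illustration.

```python
import math

def chi2_cdf(q: float, n: int) -> float:
    """Cumulative chi-square distribution for n degrees of freedom (closed-form series)."""
    h = q / 2.0
    if n % 2 == 0:
        # Even n: 1 - exp(-h) * sum_{k=0}^{n/2-1} h^k / k!
        s = sum(h**k / math.factorial(k) for k in range(n // 2))
        return 1.0 - math.exp(-h) * s
    # Odd n: erf(sqrt(h)) - exp(-h) * sum_{k=1}^{(n-1)/2} h^(k-1/2) / Gamma(k+1/2)
    s = sum(h**(k - 0.5) / math.gamma(k + 0.5) for k in range(1, (n - 1) // 2 + 1))
    return math.erf(math.sqrt(h)) - math.exp(-h) * s

def q_gamma(coverage: float, n: int) -> float:
    """Value Q such that the chi-square c.d.f. for n d.o.f. equals the requested coverage."""
    lo, hi = 0.0, 1.0
    while chi2_cdf(hi, n) < coverage:   # expand the bracket until it contains the root
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if chi2_cdf(mid, n) < coverage:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print("coverage   n=1    n=2    n=3")
for cov in (0.683, 0.90, 0.95, 0.99):
    print(f"{cov:7.3f}  " + "  ".join(f"{q_gamma(cov, n):5.2f}" for n in (1, 2, 3)))
```

For n = 1 and 68.3% coverage this gives $Q_\gamma \approx 1$, which recovers the one-parameter rule $\Delta\log L = -1/2$.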

Upper limit on the mean of a Poisson variable with background

- The observed number of events is the sum of signal + background: $n = n_s + n_b$ (expectation values $\nu_s$, $\nu_b$)
- Goal: construct an upper limit for $\nu_s$
- n is Poisson distributed with mean $\nu_s + \nu_b$
- ML estimate for $\nu_s$: $\hat{\nu}_s = n - \nu_b$

Lower limit:

$$\alpha = P(\hat{\nu}_s \ge \hat{\nu}_s^{obs}; \nu_s^{lo}) = \sum_{n \ge n_{obs}} \frac{(\nu_s^{lo} + \nu_b)^n}{n!}\, e^{-(\nu_s^{lo} + \nu_b)}$$

Upper limit:

$$\beta = P(\hat{\nu}_s \le \hat{\nu}_s^{obs}; \nu_s^{up}) = \sum_{n \le n_{obs}} \frac{(\nu_s^{up} + \nu_b)^n}{n!}\, e^{-(\nu_s^{up} + \nu_b)}$$

Upper and lower limits are related to the limits without background:

$$\nu_s^{lo} = \nu_s^{lo}(\text{no background}) - \nu_b, \qquad \nu_s^{up} = \nu_s^{up}(\text{no background}) - \nu_b$$

Problem: if the number of counts is smaller than the expected background, the upper limit can be < 0! BAYES TO THE RESCUE!

Upper limit with background: Bayesian approach

- Bayesian approach (flat prior): [equation not reproduced]
- Solution: [equation not reproduced]
- Reduces to the classical solution for no background
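The equations of this slide did not survive extraction. As a hedged illustration, the sketch below assumes the standard result for a flat prior on $\nu_s \ge 0$, in which the posterior tail above the limit is the ratio of two Poisson cumulative sums, $\sum_{n \le n_{obs}} P(n; \nu_s^{up} + \nu_b) / \sum_{n \le n_{obs}} P(n; \nu_b) = 1 - \mathrm{C.L.}$, and compares it with the classical limit shifted by the background. All function names are hypothetical.

```python
import math

def poisson_cdf(n: int, mu: float) -> float:
    """P(r <= n) for a Poisson distribution with mean mu."""
    return sum(math.exp(-mu) * mu**r / math.factorial(r) for r in range(n + 1))

def _bisect_decreasing(f, target: float, lo: float, hi: float) -> float:
    """Solve f(x) = target for a function f that decreases monotonically on [lo, hi]."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def classical_upper(n_obs: int, nu_b: float, cl: float = 0.90) -> float:
    """Classical limit: no-background limit on the total mean minus nu_b (may be negative)."""
    bracket = 3.0 * n_obs + 30.0   # assumed search range, ample for small n_obs
    total = _bisect_decreasing(lambda mu: poisson_cdf(n_obs, mu), 1.0 - cl, 0.0, bracket)
    return total - nu_b

def bayes_upper(n_obs: int, nu_b: float, cl: float = 0.90) -> float:
    """Flat-prior Bayesian limit on nu_s >= 0 (assumed standard form, see lead-in)."""
    norm = poisson_cdf(n_obs, nu_b)
    bracket = 3.0 * n_obs + 30.0
    ratio = lambda nu_s: poisson_cdf(n_obs, nu_s + nu_b) / norm
    return _bisect_decreasing(ratio, 1.0 - cl, 0.0, bracket)

for n_obs, nu_b in ((3, 0.0), (3, 2.0), (0, 3.0)):
    print(n_obs, nu_b, round(classical_upper(n_obs, nu_b), 2), round(bayes_upper(n_obs, nu_b), 2))
```

With no background the two limits coincide, and with zero observed events and a large expected background the classical limit turns negative while the Bayesian one stays positive.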

Upper limit with background


Bayesian solution
