
Basic Counting Statistics
W. Udo Schröder, 2009

Stochastic Nuclear Observables

Two sources of stochastic observables x in nuclear science:

1) Nuclear phenomena are governed by quantal wave functions and inherent statistics.
2) Detection of processes occurs with imperfect efficiency (ε < 1) and finite resolution, distributing sharp events x₀ over a range in x.

Stochastic observables x have a range of values, with frequencies determined by a probability distribution P(x).
Characterize P by its set of moments:

\langle x^n \rangle = \int x^n P(x)\,dx, \quad n = 0, 1, 2, \dots


Normalization: \langle x^0 \rangle = 1.

First moment (expectation value) of P:
E(x) = \langle x \rangle = \int x\,P(x)\,dx

Second central moment = variance of P(x):
\sigma_x^2 = \langle (x - \langle x \rangle)^2 \rangle = \langle x^2 \rangle - \langle x \rangle^2
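As a quick numerical check (my own sketch, not from the slides; numpy/scipy assumed available, with illustrative values of the mean and width), the first few moments of a Gaussian P(x) follow by direct quadrature:

```python
import numpy as np
from scipy.integrate import quad

mu, sigma = 5.0, 1.0   # assumed example values
P = lambda x: np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

# Moments <x^n> = integral of x^n P(x) dx, for n = 0, 1, 2
norm, mean, second = (quad(lambda x, n=n: x**n * P(x), -np.inf, np.inf)[0]
                      for n in range(3))

print(norm)               # <x^0> = 1.0  (normalization)
print(mean)               # <x>   = 5.0  (expectation value)
print(second - mean**2)   # <x^2> - <x>^2 = 1.0 (variance)
```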

Uncertainty and Statistics

Nuclear systems: quantal wave functions \psi_i(x, \xi; t), where (x, \xi; t) = degrees of freedom of the system and time.

Probability density (e.g., for x; integrate over the other degrees of freedom):
\frac{dP_i(x,t)}{dx} = |\psi_i(x,t)|^2, \quad i = 1, 2, \dots

Normalization:
P_i(t) = \int_{-\infty}^{+\infty} \frac{dP_i(x,t)}{dx}\,dx = \int_{-\infty}^{+\infty} |\psi_i(x,t)|^2\,dx = 1

Transition between states 1 → 2 with rate \lambda_{12}:
\lambda_{12} = \frac{2\pi}{\hbar}\,|M_{12}|^2\,\rho(E)

\frac{dP_{12}(x,t)}{dx} = |\psi_1(x)|^2\, e^{-\lambda_{12} t} \quad \text{(state 1 disappears)}

Partial probability rates \lambda_{12} for disappearance (decay of state 1 into state 2) can vary over many orders of magnitude → no certainty → statistics.

The Normal Distribution

Continuous function or discrete distribution (over bins).

Normal (Gaussian) probability density:
G(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} \exp\left( -\frac{(x - \langle x \rangle)^2}{2 \sigma_X^2} \right)

Full width at half maximum:
\mathrm{FWHM} = 2\,\sigma_x \sqrt{2 \ln 2} = 2.35\,\sigma_x

Cumulative probability:
P(x \le x_1) = \int_{-\infty}^{x_1} \frac{1}{\sqrt{2\pi \sigma_X^2}} \exp\left( -\frac{(x - \langle x \rangle)^2}{2 \sigma_X^2} \right) dx

[Figures: continuous Gaussian P(x) centered at \langle x \rangle with the FWHM marked; normalized discrete probability histogram G(x_n) vs. x_n.]

Experimental Mean Counts and Variance

Measured by ensemble sampling → expectation values + uncertainties.

A sample (ensemble) is a finite draw from the population at a given instant; it is not the population itself.

Example: ²³⁶U (0.25 mg) source; count the α particles emitted during N = 10 time intervals (samples @ 1 min). \langle n \rangle = ?

Average count \bar{n} in a sample (the population mean \langle n \rangle is unknown):
\bar{n} = \frac{1}{N} \sum_{i=1}^{N} n_i

Variance of n in the individual samples:
s^2 = \frac{1}{N-1} \sum_{i=1}^{N} (n_i - \bar{n})^2

Variance ("error") of the sample average \bar{n}:
\sigma_{\bar{n}}^2 = \frac{s^2}{N} = \frac{1}{N(N-1)} \sum_{i=1}^{N} (n_i - \bar{n})^2

Std. deviation: \sigma_{\bar{n}} = \sqrt{\sigma_{\bar{n}}^2}

Result: \langle n \rangle_{pop} \approx \bar{n} \pm \sigma_{\bar{n}}

For the ²³⁶U data: \sigma_{\bar{n}}^2 \approx \langle n \rangle / N \Rightarrow \sigma_{\bar{n}} = 59, so \bar{n} = (35496 \pm 59)\ \mathrm{min}^{-1}.
The result differs slightly from sample to sample.
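A minimal sketch of this sampling procedure; the individual counts below are hypothetical values chosen to resemble the ²³⁶U example:

```python
import numpy as np

# Hypothetical counts from N = 10 one-minute intervals (illustrative values)
counts = np.array([35421, 35562, 35380, 35510, 35477,
                   35601, 35455, 35392, 35583, 35579])

N = len(counts)
n_bar = counts.mean()              # sample average
s2 = counts.var(ddof=1)            # sample variance, 1/(N-1) normalization
sigma_mean = np.sqrt(s2 / N)       # std. error of the average

# Poisson shortcut: the variance of a count is ~ the count itself
sigma_poisson = np.sqrt(n_bar / N)

print(f"n_bar = {n_bar:.0f} +/- {sigma_mean:.0f} per min (sample estimate)")
print(f"Poisson estimate of the error: {sigma_poisson:.0f} per min")
```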

Sample Statistics

Assume a true population distribution for the variable x:
P(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} \exp\left( -\frac{(x - \langle x \rangle_{pop})^2}{2 \sigma_X^2} \right)

with true (population) mean \langle x \rangle_{pop} = 5.0 and \sigma_x = 1.0.

Three independent sample measurements of normally distributed events (equivalent statistics) give:
\langle x \rangle_m = 5.11, \sigma = 1.11; \quad \langle x \rangle_m = 4.96, \sigma = 1.23; \quad \langle x \rangle_m = 4.96, \sigma = 0.94

[Figures: the three sample histograms, each with \langle x \rangle and \langle x \rangle \pm \sigma marked.]

Mean = arithmetic sample average:
\langle x \rangle = (5.11 + 4.96 + 4.96)/3 = 5.01

Variance of the sample averages:
s^2 = [(5.11 - 5.01)^2 + 2\,(4.96 - 5.01)^2]/2 = 0.0075

\sigma_{\langle x \rangle}^2 = s^2/3 = 0.0075/3 = 0.0025 \Rightarrow \sigma_{\langle x \rangle} = 0.05

Result: \langle x \rangle_{pop} \approx 5.01 \pm 0.05

Example of Gaussian Population

Sample size makes a difference (→ weighted average).

[Figures: Monte Carlo events drawn from the same Gaussian population, histogrammed in 10 x-bins, for sample sizes n = 10 and n = 50; x₀ and x₀ ± σ marked on each panel.]

The larger the sample, the narrower the distribution of x values, and the more it approaches the true Gaussian (normal) distribution.

Central-Limit Theorem

The means (averages) of the different samples in the previous example cluster together closely. This is a general property:

With increasing size n of the samples, the distribution of sample means → Gaussian (normal) distribution, regardless of the form of the original (population) distribution.

The average of a distribution does not contain information on the shape of the distribution.
The average of any truly random sample of a population is already somewhat close to the true population average.
Many or large samples narrow the choices: smaller Gaussian width.
The standard error of the mean decreases with increasing sample size, as the Monte Carlo sketch below illustrates.
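A short Monte Carlo illustration of the theorem (my own sketch, not from the slides): sample means drawn from a decidedly non-Gaussian population become Gaussian-distributed, with a width shrinking as 1/√n:

```python
import numpy as np

rng = np.random.default_rng(0)

# Strongly non-Gaussian population: exponential (mean = 1, sigma = 1)
population = lambda size: rng.exponential(scale=1.0, size=size)

for n in (2, 10, 100):                      # sample sizes
    # 10000 independent samples of size n; take the mean of each
    means = population((10_000, n)).mean(axis=1)
    # CLT prediction: spread of the means = sigma_pop / sqrt(n)
    print(f"n = {n:4d}: std of sample means = {means.std():.4f}, "
          f"CLT prediction = {1/np.sqrt(n):.4f}")
```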

Binomial Distribution

Integer random variable m = number of events, out of N total, of a given type, e.g., decay of m (from a sample of N) radioactive nuclei, or detection of m (out of N) photons arriving at a detector.

p = probability for a (one) success (decay of one nucleus, detection of one photon).

Choose an arbitrary sample of m trials out of N trials:
p^m = probability that those m trials all succeed
(1-p)^{N-m} = probability for N-m failures (survivals, escaping detection)

Probability for exactly m specified successes out of a total of N trials:
P(m) \propto p^m (1-p)^{N-m}

How many ways can m events be chosen out of N? The binomial coefficient:
\binom{N}{m} = \frac{N!}{m!\,(N-m)!} = \frac{N(N-1)\cdots(N-m+1)}{1 \cdot 2 \cdots m}

Total probability (success rate) for any sample of m events:
P_{binomial}(m) = \binom{N}{m}\, p^m\, (1-p)^{N-m}
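A direct transcription into code (a sketch; Python's math.comb supplies the binomial coefficient):

```python
from math import comb

def p_binomial(m: int, N: int, p: float) -> float:
    """Probability for exactly m successes in N trials."""
    return comb(N, m) * p**m * (1 - p)**(N - m)

N, p = 30, 0.1
pm = [p_binomial(m, N, p) for m in range(N + 1)]
print(sum(pm))   # normalization: sums to 1.0
```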

Moments and Limits

Probability for m successes out of N trials, individual probability p:
P_{binomial}(m) = \binom{N}{m}\, p^m\, (1-p)^{N-m}

Normalization:
\sum_{m=0}^{N} P_{binomial}(m) = \sum_{m=0}^{N} \binom{N}{m}\, p^m\, (1-p)^{N-m} = 1

Mean and variance:
\langle m \rangle = N p \quad \text{and} \quad \sigma_m^2 = N p\,(1-p)

[Figure: binomial distributions P_b(N, m, p) for N = 30 with p = 0.1 (peak value ≈ 0.236) and p = 0.3.]

Limits: Poisson for small p; Gaussian for N p (1-p) \gg 1:
\lim_{N \to \infty,\ Np(1-p) \gg 1} P_{binomial}(m) = \frac{1}{\sqrt{2\pi \sigma_m^2}} \exp\left( -\frac{(m - \langle m \rangle)^2}{2 \sigma_m^2} \right)
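A numerical check of the moments and both limits (a sketch; scipy.stats distribution objects assumed):

```python
import numpy as np
from scipy.stats import binom, norm, poisson

N, p = 30, 0.3
m = np.arange(N + 1)

b = binom.pmf(m, N, p)
mean, var = N * p, N * p * (1 - p)        # <m> = Np, sigma_m^2 = Np(1-p)
print(m @ b, (m**2) @ b - (m @ b)**2)     # moments computed from the pmf

# Gaussian limit (good when Np(1-p) >> 1) and Poisson limit (good for small p)
print(np.abs(b - norm.pdf(m, mean, np.sqrt(var))).max())
print(np.abs(b - poisson.pmf(m, mean)).max())
```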

Poisson Probability Distribution

Probability for observing m events when the average is \langle m \rangle = \lambda:
P_{Poisson}(\lambda, m) = \frac{\lambda^m\, e^{-\lambda}}{m!}, \quad m = 0, 1, 2, \dots

Results from the binomial distribution in the limit of small p and large N (with Np > 0 finite):
\lim_{p \to 0,\ N \to \infty} P_{binomial}(N, m) = P_{Poisson}(\lambda, m)

\lambda is the mean, the average number of successes in N trials:
\lambda = \langle m \rangle = N p \quad \text{and} \quad \sigma^2 = \lambda

Observe N counts (events) → the uncertainty is \sigma = \sqrt{N}.

Unlike the binomial distribution, the Poisson distribution does not depend explicitly on p or N!

[Figure: Poisson distributions P_p(\lambda, m) for \lambda = 0.5, 3, 5, 10; peak value 0.607 for \lambda = 0.5.]

For large N, p: Poisson → Gaussian (normal distribution).
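A quick simulated counting experiment (sketch, with an assumed mean rate) confirming that the spread of repeated counts is √λ:

```python
import numpy as np

rng = np.random.default_rng(1)
lam = 3500                                # assumed mean counts per interval
counts = rng.poisson(lam, size=100_000)   # repeated counting experiments
print(counts.mean(), counts.std())        # ~ lam and ~ sqrt(lam) = 59.2
```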

Moments of Transition Probabilities

Example: the ²³⁶U source. Number of nuclei in 0.25 mg:
N = 0.25\,\mathrm{mg} \cdot \frac{6.022 \times 10^{23}}{236\,\mathrm{g}} = 6.38 \times 10^{17}

Measured average count rate: \bar{n} = 3.5946 \times 10^4\ \mathrm{min}^{-1}

Probability for decay (decay rate per nucleus):
p = \lambda = \frac{\bar{n}}{N} = \frac{3.5946 \times 10^4}{6.38 \times 10^{17}} = 5.6362 \times 10^{-14}\ \mathrm{min}^{-1}

This corresponds to a half-life t_{1/2} = \ln 2 / \lambda \approx 2.34 \times 10^7\ \mathrm{a}.

Small probability per nucleus, but very many trials (n_0 = 6.38 \times 10^{17}), with 0 < n_0 p < \infty:
the statistical process follows a Poisson distribution; n = random variable.

Different statistical distributions: binomial, Poisson, Gaussian.
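The arithmetic on this slide, reproduced as a sketch:

```python
import numpy as np

N_A = 6.022e23
N = 0.25e-3 / 236 * N_A        # nuclei in 0.25 mg of 236U: ~6.38e17
n_bar = 3.5946e4               # measured rate, counts per minute
lam = n_bar / N                # decay rate per nucleus: ~5.64e-14 per minute
t_half = np.log(2) / lam / (60 * 24 * 365.25)   # half-life in years
print(N, lam, t_half)          # ~2.34e7 a
```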

Radioactive Decay as Poisson Process

Slow radioactive decay of a large sample: sample size N \gg 1, decay probability p \ll 1, with 0 < Np < \infty.

Example: decay of the unstable isotope ¹³⁷Cs,
t_{1/2} = 27\ \mathrm{a} \Rightarrow \lambda = \ln 2 / t_{1/2} = 0.026\ \mathrm{a}^{-1} = 8.2 \times 10^{-10}\ \mathrm{s}^{-1} \to 0

Sample of 1 µg: N = 10^{15} nuclei (= trials for decay).

How many will decay (= activity)?
\langle \dot{N} \rangle = \lambda N = 8.2 \times 10^{5}\ \mathrm{s}^{-1}

Count rate estimate:
\langle \dot{N} \rangle = d\langle N \rangle / dt = (8.2 \times 10^{5} \pm 905)\ \mathrm{s}^{-1}

Probability for m actual decays per second, with \mu = \lambda N as estimated above:
P_{Poisson}(\mu, m) = \frac{\mu^m\, e^{-\mu}}{m!} = \frac{(8.2 \times 10^{5})^m\, e^{-8.2 \times 10^{5}}}{m!}
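A sketch simulating the one-second count rates for this example (using the slide's rounded numbers):

```python
import numpy as np

rng = np.random.default_rng(2)
half_life_s = 27 * 3.156e7          # 137Cs half-life value used on the slide
lam = np.log(2) / half_life_s       # decay constant, per second
N = 1e15                            # nuclei in the ~1 microgram sample

mu = lam * N                        # expected decays per second
draws = rng.poisson(mu, size=10_000)   # simulated one-second counts
print(f"activity = {draws.mean():.3e} +/- {draws.std():.0f} per s "
      f"(expect {mu:.1e} +/- {np.sqrt(mu):.0f})")
```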

Functions of Stochastic Variables

Random independent variable sets \{N_1\}, \{N_2\}, \dots, \{N_n\} with corresponding variances \sigma_1^2, \sigma_2^2, \dots, \sigma_n^2.

Function f(N_1, N_2, \dots, N_n) defined for any tuple \{N_1, N_2, \dots, N_n\}.

Expectation value (mean): \bar{f} = f(\bar{N}_1, \dots, \bar{N}_n)

Gauss' law of error propagation:
\sigma_f^2 = \left( \frac{\partial f}{\partial N_1} \right)^2 \sigma_1^2 + \left( \frac{\partial f}{\partial N_2} \right)^2 \sigma_2^2 + \dots + \left( \frac{\partial f}{\partial N_n} \right)^2 \sigma_n^2

with each partial derivative \partial f / \partial N_i evaluated at fixed values of the other variables N_j (j \ne i).

Further terms arise if the N_i are not independent (→ correlations).
Otherwise, the individual component variances (\Delta f)_i^2 simply add.
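A sketch applying the propagation rule symbolically to a simple example function f = N1/N2 (sympy assumed; the ratio is my own illustrative choice, not from the slides):

```python
import sympy as sp

N1, N2, s1, s2 = sp.symbols("N1 N2 sigma1 sigma2", positive=True)
f = N1 / N2                       # example function of two counts

# Gauss error propagation: sigma_f^2 = sum over i of (df/dNi)^2 * sigma_i^2
var_f = (sp.diff(f, N1) * s1)**2 + (sp.diff(f, N2) * s2)**2
print(sp.simplify(var_f))         # (N1**2*sigma2**2 + N2**2*sigma1**2)/N2**4
```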

Example: Spectral Analysis

Adding or subtracting two Poisson-distributed numbers N_1 and N_2: the variances always add.
N = N_1 \pm N_2, \quad \bar{N} = \bar{N}_1 \pm \bar{N}_2, \quad \sigma^2(N_1 \pm N_2) = N_1 + N_2

Analyze a peak in the channel range c_1 \dots c_2, beginning at the background level left and right of the peak; n = c_2 - c_1 + 1 channels.

Total area in the range: N_{12} = A + B, with background samples N(c_1) = B_1 and N(c_2) = B_2.

Linear background: \langle B \rangle = n (B_1 + B_2)/2

[Figure: peak area A on a linear background B between channels c_1 and c_2, with edge counts B_1 and B_2.]

Peak area A and its variance:
A = N_{12} - n (B_1 + B_2)/2
\sigma_A^2 = N_{12} + n^2 (B_1 + B_2)/4
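A minimal sketch of this peak-area estimate (the spectrum values are hypothetical):

```python
import numpy as np

def peak_area(counts, c1, c2):
    """Net peak area and uncertainty for a peak on a linear background.

    counts : array of channel contents; c1, c2 : background edges (inclusive).
    """
    n = c2 - c1 + 1
    N12 = counts[c1:c2 + 1].sum()          # total area = peak + background
    B1, B2 = counts[c1], counts[c2]        # background samples at the edges
    A = N12 - n * (B1 + B2) / 2            # subtract linear background
    sigma_A = np.sqrt(N12 + n**2 * (B1 + B2) / 4)   # variances add
    return A, sigma_A

# Hypothetical spectrum: flat background of ~50 plus a peak near channel 6
spectrum = np.array([52, 48, 55, 60, 90, 160, 240, 170, 95, 58, 51, 49])
print(peak_area(spectrum, 1, 10))
```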

Confidence Level

Assume a normally distributed observable x:
P(x) = \frac{1}{\sqrt{2\pi \sigma_{pop}^2}} \exp\left( -\frac{(x - \langle x \rangle_{pop})^2}{2 \sigma_{pop}^2} \right)

The sample distribution of an observed data set, with average \langle x \rangle and standard error \sigma, approximates the population.

Confidence level CL (central confidence interval):
P(|\langle x_{pop} \rangle - \langle x \rangle| \le \delta) = \int_{\langle x \rangle - \delta}^{\langle x \rangle + \delta} \frac{1}{\sqrt{2\pi \sigma^2}} \exp\left( -\frac{(x - \langle x \rangle)^2}{2 \sigma^2} \right) dx = CL

With confidence level CL (probability in %), the true value \langle x_{pop} \rangle differs by less than \delta = n\sigma from the measured average:
CL(\delta = 1\sigma) = 68.3\%, \quad CL(\delta = 2\sigma) = 95.4\%, \quad CL(\delta = 3\sigma) = 99.7\%

Trustworthy experimental results quote 3\sigma error bars!

[Figure: measured Gaussian probability with the central interval \langle x \rangle \pm \delta shaded and the 2\sigma and 3\sigma limits marked.]
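These CL values follow from the Gaussian integral, CL(nσ) = erf(n/√2); a one-line check:

```python
from math import erf, sqrt

for n in (1, 2, 3):
    print(f"CL({n} sigma) = {erf(n / sqrt(2)) * 100:.1f}%")
# -> 68.3%, 95.4%, 99.7%
```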

Setting Confidence Limits

Example: search for a rare decay with decay rate \lambda; observe no counts within time \Delta t. Decay probability law:
dP/dt = -dN/(N\,dt) = \lambda\, e^{-\lambda t}

Probability to see no counts in \Delta t:
P(0\,|\,\lambda) = e^{-\lambda \Delta t}

P(\lambda, \Delta t) is symmetric in \lambda and \Delta t, so normalize it as a probability density for \lambda:
\int_0^{\infty} \Delta t\, e^{-\lambda \Delta t}\, d\lambda = 1

P(\lambda \le \lambda_0) = \int_0^{\lambda_0} \Delta t\, e^{-\lambda \Delta t}\, d\lambda = 1 - e^{-\lambda_0 \Delta t} := CL

\lambda_0 = -\frac{1}{\Delta t} \ln[1 - P(\lambda \le \lambda_0)] = -\frac{1}{\Delta t} \ln[1 - CL] > 0 \quad \text{(upper limit)}

Higher confidence levels CL (0 \le CL \le 1) → larger upper limits for a given inspection time \Delta t.
Reduce the limit by measuring for a longer period.
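A sketch of the upper-limit formula in code (the counting time is an assumed example value):

```python
from math import log

def upper_limit(CL: float, dt: float) -> float:
    """Upper limit on the decay rate after observing zero counts in time dt."""
    return -log(1.0 - CL) / dt

dt = 3.6e6                      # e.g., 1000 hours of counting, in seconds
for CL in (0.68, 0.90, 0.95):
    print(f"CL = {CL:.0%}: lambda_0 < {upper_limit(CL, dt):.2e} /s")
```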

Maximum Likelihood

Measurement of correlations between observables y and x: \{x_i, y_i \,|\, i = 1, \dots, N\}
Hypothesis: y(x) = f(c_1, \dots, c_m; x). Parameters defining f: \{c_1, \dots, c_m\}
n_{dof} = N - m degrees of freedom for a fit of the data with f.

Gaussian probability for every data point:
P_i(c_1, \dots, c_m; x) = \frac{1}{\sqrt{2\pi \sigma_i^2}} \exp\left( -\frac{(y_i - f(c_1, \dots, c_m; x_i))^2}{2 \sigma_i^2} \right)

Maximize the simultaneous probability:
P(c_1, \dots, c_m) = \prod_{i=1}^{N} P_i(c_1, \dots, c_m; x_i)

Maximizing P is equivalent to minimizing chi-squared (since -2 \ln P = \chi^2 + \mathrm{const}):
\chi^2(c_1, \dots, c_m) := \sum_{i=1}^{N} \frac{(y_i - f(c_1, \dots, c_m; x_i))^2}{\sigma_i^2}

Minimize chi-squared by varying \{c_1, \dots, c_m\}:
\partial \chi^2 / \partial c_j = 0, \quad j = 1, \dots, m

When is \chi^2 as good as can be?

Minimizing χ²

Example: linear fit f(a, b; x) = a + bx to a data set \{x_i, y_i, \sigma_i\}.

Minimize:
\chi^2(a, b) = \sum_{i=1}^{N} \frac{(y_i - a - b x_i)^2}{\sigma_i^2}

\frac{\partial \chi^2}{\partial a}(a, b) = 0 = -2 \sum_{i=1}^{N} \frac{y_i - a - b x_i}{\sigma_i^2}

\frac{\partial \chi^2}{\partial b}(a, b) = 0 = -2 \sum_{i=1}^{N} \frac{x_i (y_i - a - b x_i)}{\sigma_i^2}

Equivalent to solving a system of linear equations:
a \sum_i \frac{1}{\sigma_i^2} + b \sum_i \frac{x_i}{\sigma_i^2} = \sum_i \frac{y_i}{\sigma_i^2}
a \sum_i \frac{x_i}{\sigma_i^2} + b \sum_i \frac{x_i^2}{\sigma_i^2} = \sum_i \frac{x_i y_i}{\sigma_i^2}

In matrix form, a d_{11} + b d_{12} = c_1 and a d_{21} + b d_{22} = c_2, with
d_{11} = \sum_i \frac{1}{\sigma_i^2}, \quad d_{12} = d_{21} = \sum_i \frac{x_i}{\sigma_i^2}, \quad d_{22} = \sum_i \frac{x_i^2}{\sigma_i^2}, \quad c_1 = \sum_i \frac{y_i}{\sigma_i^2}, \quad c_2 = \sum_i \frac{x_i y_i}{\sigma_i^2}

and determinant D = d_{11} d_{22} - d_{12} d_{21}:

a = \frac{1}{D} \begin{vmatrix} c_1 & d_{12} \\ c_2 & d_{22} \end{vmatrix}, \quad
b = \frac{1}{D} \begin{vmatrix} d_{11} & c_1 \\ d_{21} & c_2 \end{vmatrix}

Parameter variances:
\sigma_a^2 = \frac{1}{D} \sum_i \frac{x_i^2}{\sigma_i^2}, \quad
\sigma_b^2 = \frac{1}{D} \sum_i \frac{1}{\sigma_i^2}
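A sketch implementing these normal equations directly (the data values are hypothetical):

```python
import numpy as np

def linear_fit(x, y, sigma):
    """Weighted least-squares fit y = a + b*x via the analytic normal equations."""
    w = 1.0 / sigma**2
    d11, d12, d22 = w.sum(), (w * x).sum(), (w * x * x).sum()
    c1, c2 = (w * y).sum(), (w * x * y).sum()
    D = d11 * d22 - d12 * d12
    a = (c1 * d22 - d12 * c2) / D
    b = (d11 * c2 - d12 * c1) / D
    sigma_a, sigma_b = np.sqrt(d22 / D), np.sqrt(d11 / D)
    chi2 = (w * (y - a - b * x)**2).sum()
    return a, b, sigma_a, sigma_b, chi2

# Hypothetical data with unit errors
x = np.array([0., 1., 2., 3., 4.])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])
print(linear_fit(x, y, np.ones_like(x)))
```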

Distribution of Chi-Squareds

Distribution of the possible \chi^2 values for data sets distributed normally about a theoretical expectation (function), with n_{dof} degrees of freedom:

P_{n_{dof}}(\chi^2)\, d\chi^2 = \frac{(\chi^2)^{n_{dof}/2 - 1}\, e^{-\chi^2/2}}{2^{n_{dof}/2}\, \Gamma(n_{dof}/2)}\, d\chi^2

Mean and variance: \langle \chi^2 \rangle = n_{dof}, \quad \sigma^2 = 2\, n_{dof}

Gamma function: \Gamma(n) = (n-1)! \approx 2.507\, e^{-n} n^{n - 1/2} (1 + 0.0833/n) (Stirling's formula)

[Figure: \chi^2 distributions P(u, n_{dof}) for n_{dof} = 1, \dots, 5; for n_{dof} = 5, \langle \chi^2 \rangle = 5.]

Significance of an observed \chi^2:
P(\chi^2, n_{dof}) = \int_{\chi^2}^{\infty} P_{n_{dof}}(x)\, dx

Should be P \approx 0.5 for a reasonable fit.

Reduced \chi^2 (using n_{dof} = N - m as defined above):
\chi_r^2 = \chi^2 / n_{dof}

For 0 \le \chi_r^2 \lesssim 1.5 → confidence \gtrsim 50\%.
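A sketch of the significance integral using scipy's χ² survival function (the observed value is an assumed example):

```python
from scipy.stats import chi2

chi2_obs, ndof = 9.34, 10                 # assumed example values
print(chi2.sf(chi2_obs, ndof))            # P(chi^2 >= chi2_obs); ~0.5 -> reasonable fit
print(chi2.mean(ndof), chi2.var(ndof))    # ndof and 2*ndof
```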

CL for χ²-Distributions

[Figure: 1-CL as a function of \chi^2 for various n_{dof}.]

Correlations in Data Sets

Correlations within a data set. Example: y_i is small whenever x_i is small.

Uncorrelated P(x, y):
P_{unc}(x, y) = P(x)\,P(y) = \frac{1}{2\pi \sigma_x \sigma_y} \exp\left( -\frac{(x - \langle x \rangle)^2}{2 \sigma_x^2} - \frac{(y - \langle y \rangle)^2}{2 \sigma_y^2} \right)

Correlated P(x, y):
P_{corr}(x, y) = \frac{1}{2\pi \sqrt{\sigma_x^2 \sigma_y^2 - \sigma_{xy}^2}} \exp\left( -\frac{(x - \langle x \rangle)^2 \sigma_y^2 + (y - \langle y \rangle)^2 \sigma_x^2 - 2 (x - \langle x \rangle)(y - \langle y \rangle)\, \sigma_{xy}}{2 (\sigma_x^2 \sigma_y^2 - \sigma_{xy}^2)} \right)

Covariance:
\sigma_{xy} = \int (x - \langle x \rangle)(y - \langle y \rangle)\, P(x, y)\, dx\, dy

Tilt angle of the correlation ellipse:
\cot 2\theta = \frac{\sigma_x^2 - \sigma_y^2}{2\, \sigma_{xy}}

Correlation coefficient:
r_{xy} = \frac{\sigma_{xy}}{\sigma_x \sigma_y}, \quad -1 \le r_{xy} \le 1

[Figures: scatter plots of uncorrelated and correlated P(x, y).]
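A sketch estimating σ_xy and r_xy from simulated correlated data (the covariance matrix values are assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
# Draw correlated (x, y) pairs from a bivariate normal
cov = np.array([[1.0, 0.8],
                [0.8, 2.0]])     # [[sigma_x^2, sigma_xy], [sigma_xy, sigma_y^2]]
x, y = rng.multivariate_normal([0, 0], cov, size=50_000).T

sigma_xy = np.mean((x - x.mean()) * (y - y.mean()))   # sample covariance
r_xy = sigma_xy / (x.std() * y.std())                 # correlation coefficient
print(sigma_xy, r_xy)    # ~0.8 and 0.8/sqrt(1*2) ~ 0.57
```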

Correlations in Data Sets

The uncertainties of the deduced most-likely parameters c_i (e.g., a, b for the linear fit) depend on the depth/shallowness and shape of the \chi^2 surface.

Uncorrelated: P_{unc}(\{c_i\}) = \prod_i P(c_i)
Correlated: P_{corr}(\{c_i\}) \ne \prod_i P(c_i)

[Figures: uncorrelated and correlated \chi^2 surfaces in the (c_i, c_j) plane.]

Covariance of the parameters (sample estimate over n fits):
\sigma_{ij} := \frac{1}{n - 1} \sum_{n} (c_i - \langle c_i \rangle)(c_j - \langle c_j \rangle)

Correlation coefficient:
r_{ij} := \frac{\sigma_{ij}}{\sigma_i \sigma_j}, \quad -1 \le r_{ij} \le +1

Multivariate Correlations

[Figure: smoothed \chi^2 surface in the (c_i, c_j) plane, with an initial guess and the search path toward the minimum.]

Different search strategies:
Steepest gradient; Newton method, with or without damping of oscillations;
biased MC: Metropolis MC algorithms;
simulated annealing (MC derived from metallurgy).

Various software packages: LINFIT, MINUIT, ...
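As an illustration of such a numerical search (a sketch using scipy's general-purpose minimizer as a stand-in for the packages named above; data reused from the earlier linear-fit example), the χ² surface can be descended from an initial guess:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical data and model, as in the linear-fit example
x = np.array([0., 1., 2., 3., 4.])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.8])
sigma = np.ones_like(x)

def chi2(params):
    a, b = params
    return np.sum(((y - a - b * x) / sigma) ** 2)

# Gradient-based search from an initial guess (BFGS, a quasi-Newton method)
result = minimize(chi2, x0=[0.0, 1.0], method="BFGS")
print(result.x, result.fun)      # best-fit (a, b) and the minimum chi^2
```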
