Reading:
Ch. 3 in Kay-II.
Ch. 10 in Wasserman.
Communications, biomedicine, etc.
$$\Theta_0 \cup \Theta_1 = \Theta, \qquad \Theta_0 \cap \Theta_1 = \emptyset.$$
$H_0:\ \theta \in \Theta_0$, null hypothesis
$H_1:\ \theta \in \Theta_1$, alternative hypothesis.
and, similarly,
$$\rho_1(x) = \int_{\Theta_0} L(1\,|\,0)\, p(\theta\,|\,x)\, d\theta \;\overset{L(1|0)\ \text{constant}}{=}\; L(1\,|\,0)\,\underbrace{\int_{\Theta_0} p(\theta\,|\,x)\, d\theta}_{P[\theta\,\in\,\Theta_0\,|\,x]}.$$
$$\mathcal{X}_1 = \{x : \rho_1(x) \le \rho_0(x)\}$$
where
$$\pi_0 = \pi(\theta_0), \qquad \pi_1 = \pi(\theta_1) = 1 - \pi_0$$
describe the prior probability mass function (pmf) of the binary random variable $\theta$ (recall that $\theta \in \{\theta_0, \theta_1\}$). Hence, for binary simple hypotheses, the prior pmf of $\theta$ is the Bernoulli pmf.
$$E_{x,\theta}[\mathrm{loss}] = \int_{\mathcal{X}_1}\int_{\Theta_0} L(1\,|\,0)\, p(x\,|\,\theta)\,\pi(\theta)\, d\theta\, dx + \int_{\mathcal{X}_0}\int_{\Theta_1} L(0\,|\,1)\, p(x\,|\,\theta)\,\pi(\theta)\, d\theta\, dx.$$
0-1 loss: For the 0-1 loss, i.e. $L(1\,|\,0) = L(0\,|\,1) = 1$, the preposterior (Bayes) risk of the decision rule is
$$E_{x,\theta}[\mathrm{loss}] = \int_{\mathcal{X}_1}\int_{\Theta_0} p(x\,|\,\theta)\,\pi(\theta)\, d\theta\, dx + \int_{\mathcal{X}_0}\int_{\Theta_1} p(x\,|\,\theta)\,\pi(\theta)\, d\theta\, dx \tag{7}$$
$$\underbrace{\Lambda(x)}_{\text{likelihood ratio}} = \frac{p(x\,|\,\theta_1)}{p(x\,|\,\theta_0)} \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \frac{\pi_0\, L(1\,|\,0)}{\pi_1\, L(0\,|\,1)} \tag{8}$$
see also Ch. 3.7 in Kay-II. (Recall that $\Lambda(x)$ is the sufficient statistic for the detection problem, see p. 37 in handout # 1.)
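As a quick numerical illustration of the likelihood-ratio test (8), here is a minimal Python sketch for two simple scalar Gaussian hypotheses; the means, variance, priors, and losses are illustrative assumptions, not values from these notes.

from scipy.stats import norm

def bayes_lrt(x, mu0=0.0, mu1=1.0, sigma=1.0, pi0=0.5, L10=1.0, L01=1.0):
    """Likelihood-ratio test (8) for two simple scalar Gaussian hypotheses.

    Decide H1 when Lambda(x) = p(x | theta1) / p(x | theta0) exceeds the
    Bayes threshold pi0 * L(1|0) / (pi1 * L(0|1)); returns 1 for H1, 0 for H0.
    """
    pi1 = 1.0 - pi0
    lam = norm.pdf(x, loc=mu1, scale=sigma) / norm.pdf(x, loc=mu0, scale=sigma)
    threshold = (pi0 * L10) / (pi1 * L01)
    return int(lam > threshold)

# A single observation close to mu1 = 1 is assigned to H1.
print(bayes_lrt(0.9))   # -> 1

For 0-1 loss and equal priors the threshold reduces to 1, consistent with the MAP/ML simplification discussed below.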
Equivalently,
$$\log\Lambda(x) = \log p(x\,|\,\theta_1) - \log p(x\,|\,\theta_0) \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \log\frac{\pi_0\, L(1\,|\,0)}{\pi_1\, L(0\,|\,1)}.$$
In this case, our Bayes decision rule simplifies to the MAP rule (as expected, see (5) and Ch. 3.6 in Kay-II):
$$\underbrace{\frac{p(\theta_1\,|\,x)}{p(\theta_0\,|\,x)}}_{\text{posterior-odds ratio}} \;\overset{H_1}{\underset{H_0}{\gtrless}}\; 1 \tag{9}$$
$$\underbrace{\frac{p(x\,|\,\theta_1)}{p(x\,|\,\theta_0)}}_{\text{likelihood ratio}} \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \frac{\pi_0}{\pi_1} \tag{10}$$
which is the same as (4), upon substituting the Bernoulli pmf as the prior pmf for $\theta$.

Consider now the composite hypotheses
$$H_0:\ \theta \in \Theta_0 \quad\text{versus}\quad H_1:\ \theta \in \Theta_1$$
in the presence of a nuisance parameter $\psi$.
The Bayes decision rule is then
$$\frac{\displaystyle\int_{\Theta_1}\overbrace{\int p_{\theta,\psi\,|\,x}(\theta,\psi\,|\,x)\, d\psi}^{p(\theta\,|\,x)}\, d\theta}{\displaystyle\int_{\Theta_0}\underbrace{\int p_{\theta,\psi\,|\,x}(\theta,\psi\,|\,x)\, d\psi}_{p(\theta\,|\,x)}\, d\theta} \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \frac{L(1\,|\,0)}{L(0\,|\,1)} \tag{11}$$
$$\pi_{\theta,\psi}(\theta,\psi) = \pi_{\theta}(\theta)\,\pi_{\psi}(\psi) \tag{13}$$
$$\frac{\displaystyle\int_{\Theta_1}\pi(\theta)\,\overbrace{\int p_{x\,|\,\theta,\psi}(x\,|\,\theta,\psi)\,\pi(\psi)\, d\psi}^{p(x\,|\,\theta)}\, d\theta}{\displaystyle\int_{\Theta_0}\pi(\theta)\,\underbrace{\int p_{x\,|\,\theta,\psi}(x\,|\,\theta,\psi)\,\pi(\psi)\, d\psi}_{p(x\,|\,\theta)}\, d\theta} \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \frac{L(1\,|\,0)}{L(0\,|\,1)}. \tag{14}$$
For simple hypotheses $\Theta_0 = \{\theta_0\}$ and $\Theta_1 = \{\theta_1\}$, this becomes
$$\underbrace{\frac{\displaystyle\int p_{x\,|\,\theta,\psi}(x\,|\,\theta_1,\psi)\,\pi(\psi)\, d\psi}{\displaystyle\int p_{x\,|\,\theta,\psi}(x\,|\,\theta_0,\psi)\,\pi(\psi)\, d\psi}}_{\text{integrated likelihood ratio}} = \frac{p(x\,|\,\theta_1)}{p(x\,|\,\theta_0)} \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \frac{\pi_0\, L(1\,|\,0)}{\pi_1\, L(0\,|\,1)} \qquad \text{(same as (6))} \tag{16}$$
where
$$\pi_0 = \pi(\theta_0), \qquad \pi_1 = \pi(\theta_1) = 1 - \pi_0.$$
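As a sketch of how the integrated likelihood ratio (16) can be evaluated in practice, the following Python fragment integrates out a hypothetical scalar nuisance parameter $\psi$ (here an unknown noise standard deviation with an assumed gamma prior) by numerical quadrature; the model and all parameter choices are illustrative assumptions.

import numpy as np
from scipy.integrate import quad
from scipy.stats import gamma, norm

# Assumed (illustrative) prior on the nuisance noise standard deviation psi.
psi_prior = gamma(a=2.0, scale=1.0)

def integrated_likelihood(x, theta):
    """p(x | theta) = integral of p(x | theta, psi) * pi(psi) d psi (quadrature)."""
    def integrand(psi):
        return np.prod(norm.pdf(x, loc=theta, scale=psi)) * psi_prior.pdf(psi)
    value, _ = quad(integrand, 1e-6, 20.0)
    return value

def integrated_lrt(x, theta0=0.0, theta1=1.0, pi0=0.5, L10=1.0, L01=1.0):
    """Integrated likelihood-ratio test in the spirit of (16)."""
    lam = integrated_likelihood(x, theta1) / integrated_likelihood(x, theta0)
    threshold = pi0 * L10 / ((1.0 - pi0) * L01)
    return int(lam > threshold)

x = np.array([1.2, 0.7, 1.5])   # data favoring theta1 = 1
print(integrated_lrt(x))        # -> 1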
$$\text{av. error probability} = \int_{\mathcal{X}_1}\int_{\Theta_0} p(x\,|\,\theta)\,\pi(\theta)\, d\theta\, dx + \int_{\mathcal{X}_0}\int_{\Theta_1} p(x\,|\,\theta)\,\pi(\theta)\, d\theta\, dx$$
$$\mathcal{X}_1^{\star} = \Big\{x : \int_{\Theta_0} p(x\,|\,\theta)\,\pi(\theta)\, d\theta - \int_{\Theta_1} p(x\,|\,\theta)\,\pi(\theta)\, d\theta < 0\Big\}.$$
$$\text{min av. error probability} \;\overset{\text{using the def. of }\mathcal{X}_1^{\star}}{=}\; \int_{\underbrace{\mathcal{X}}_{\text{data space}}} \min\Big\{\underbrace{\int_{\Theta_0} p(x\,|\,\theta)\,\pi(\theta)\, d\theta}_{\triangleq\, q_0(x)},\ \underbrace{\int_{\Theta_1} p(x\,|\,\theta)\,\pi(\theta)\, d\theta}_{\triangleq\, q_1(x)}\Big\}\, dx$$
$$\le \int_{\mathcal{X}} \Big[\overbrace{\int_{\Theta_0} p(x\,|\,\theta)\,\pi(\theta)\, d\theta}^{q_0(x)}\Big]^{\alpha}\,\Big[\overbrace{\int_{\Theta_1} p(x\,|\,\theta)\,\pi(\theta)\, d\theta}^{q_1(x)}\Big]^{1-\alpha}\, dx = \int_{\mathcal{X}} [q_0(x)]^{\alpha}\,[q_1(x)]^{1-\alpha}\, dx$$
(using $\min\{a,b\} \le a^{\alpha}\, b^{1-\alpha}$ for any $a, b \ge 0$ and $\alpha \in [0,1]$).
When
$$x = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_N \end{bmatrix}$$
then
$$q_0(x) = p(x\,|\,\theta_0)\,\pi_0 = \pi_0 \prod_{n=1}^{N} p(x_n\,|\,\theta_0)$$
$$q_1(x) = p(x\,|\,\theta_1)\,\pi_1 = \pi_1 \prod_{n=1}^{N} p(x_n\,|\,\theta_1)$$
yielding the Chernoff bound for $N$ conditionally i.i.d. measurements (given $\theta$) and simple hypotheses:
$$\int_{\mathcal{X}} [q_0(x)]^{\alpha}\,[q_1(x)]^{1-\alpha}\, dx = \int_{\mathcal{X}} \Big[\pi_0 \prod_{n=1}^{N} p(x_n\,|\,\theta_0)\Big]^{\alpha}\,\Big[\pi_1 \prod_{n=1}^{N} p(x_n\,|\,\theta_1)\Big]^{1-\alpha}\, dx$$
$$= \pi_0^{\alpha}\,\pi_1^{1-\alpha}\,\prod_{n=1}^{N}\Big\{\int [p(x_n\,|\,\theta_0)]^{\alpha}\,[p(x_n\,|\,\theta_1)]^{1-\alpha}\, dx_n\Big\} = \pi_0^{\alpha}\,\pi_1^{1-\alpha}\,\Big\{\int [p(x_1\,|\,\theta_0)]^{\alpha}\,[p(x_1\,|\,\theta_1)]^{1-\alpha}\, dx_1\Big\}^{N}.$$
$$\frac{1}{N}\log(\text{min av. error probability}) \le \frac{1}{N}\log\big(\pi_0^{\alpha}\,\pi_1^{1-\alpha}\big) + \log\int [p(x_1\,|\,\theta_0)]^{\alpha}\,[p(x_1\,|\,\theta_1)]^{1-\alpha}\, dx_1, \qquad \alpha \in [0,1].$$
$$\lim_{N\to\infty}\frac{\log f(N)}{N} = 0.$$
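The per-sample integral in the bound above can be computed numerically and minimized over $\alpha \in [0,1]$ to get the tightest Chernoff exponent. A minimal sketch for the scalar Gaussian pair $\mathcal{N}(0,\sigma^2)$ versus $\mathcal{N}(A,\sigma^2)$ (illustrative values) follows; for this symmetric pair the optimum is $\alpha = 1/2$ with value $\exp(-A^2/(8\sigma^2))$.

import numpy as np
from scipy.integrate import quad
from scipy.optimize import minimize_scalar
from scipy.stats import norm

A, sigma = 1.0, 1.0
p0 = norm(loc=0.0, scale=sigma).pdf    # p(x | theta_0)
p1 = norm(loc=A, scale=sigma).pdf      # p(x | theta_1)

def chernoff_factor(alpha):
    """Per-sample factor: integral of p0(x)^alpha * p1(x)^(1 - alpha) dx."""
    value, _ = quad(lambda x: p0(x) ** alpha * p1(x) ** (1.0 - alpha), -10.0, 10.0)
    return value

# Tightest bound: minimize the factor over alpha in [0, 1].
result = minimize_scalar(chernoff_factor, bounds=(0.0, 1.0), method="bounded")
print(result.x, result.fun)                 # about alpha = 0.5 by symmetry
print(np.exp(-A ** 2 / (8 * sigma ** 2)))   # closed form exp(-A^2 / (8 sigma^2))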
where, for Gaussian densities $\mathcal{N}(\mu_1, \Sigma_1)$ and $\mathcal{N}(\mu_2, \Sigma_2)$,
$$g(\alpha) = \frac{\alpha(1-\alpha)}{2}\,(\mu_2 - \mu_1)^T\,[\alpha\,\Sigma_1 + (1-\alpha)\,\Sigma_2]^{-1}\,(\mu_2 - \mu_1) + \tfrac{1}{2}\log\frac{|\alpha\,\Sigma_1 + (1-\alpha)\,\Sigma_2|}{|\Sigma_1|^{\alpha}\,|\Sigma_2|^{1-\alpha}}.$$
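A small sketch of evaluating $g(\alpha)$ for two multivariate Gaussian densities; the means and covariances below are made-up illustrations, and maximizing $g(\alpha)$ over $\alpha \in (0,1)$ gives the best (largest) exponent.

import numpy as np

def g(alpha, mu1, mu2, S1, S2):
    """Chernoff exponent g(alpha) between N(mu1, S1) and N(mu2, S2)."""
    dm = mu2 - mu1
    S = alpha * S1 + (1.0 - alpha) * S2
    quad_term = 0.5 * alpha * (1.0 - alpha) * dm @ np.linalg.solve(S, dm)
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_S1 = np.linalg.slogdet(S1)
    _, logdet_S2 = np.linalg.slogdet(S2)
    log_term = 0.5 * (logdet_S - alpha * logdet_S1 - (1.0 - alpha) * logdet_S2)
    return quad_term + log_term

# Illustrative parameters (assumptions, not taken from the notes):
mu1, mu2 = np.zeros(2), np.array([1.0, 0.5])
S1, S2 = np.eye(2), np.diag([2.0, 0.5])
alphas = np.linspace(0.01, 0.99, 99)
vals = [g(a, mu1, mu2, S1, S2) for a in alphas]
print(alphas[int(np.argmax(vals))], max(vals))   # alpha that maximizes the exponent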
[Figure: probability density functions of the test statistic under $H_0$ and $H_1$, with the areas corresponding to $P_D$ and $P_{FA}$ indicated. Horizontal axis: test statistic; vertical axis: probability density function.]
$$P_{FA} = P\big[\underbrace{\text{test statistic}}_{\Lambda(x)} > \gamma \,\big|\, \theta = \theta_0\big] = P[x \in \mathcal{X}_1\,|\,\theta = \theta_0]$$
Comments:
(iv) However, the perfect case where our rule is always right and never wrong ($P_D = 1$ and $P_{FA} = 0$) cannot occur when the conditional pdfs/pmfs $p(x\,|\,\theta_0)$ and $p(x\,|\,\theta_1)$ overlap.
Setup:
$$H_0:\ \theta = \theta_0 \quad\text{versus}\quad H_1:\ \theta = \theta_1.$$
$$P_D = P[X \in \mathcal{X}_1\,;\,\theta = \theta_1]$$
$$P_{FA} = P[X \in \mathcal{X}_1\,;\,\theta = \theta_0] = \alpha.$$
$$L = P_D + \lambda\,(P_{FA} - \alpha) = \int_{\mathcal{X}_1} p(x\,;\,\theta_1)\, dx + \lambda\Big[\int_{\mathcal{X}_1} p(x\,;\,\theta_0)\, dx - \alpha\Big] = \int_{\mathcal{X}_1} [\,p(x\,;\,\theta_1) + \lambda\, p(x\,;\,\theta_0)\,]\, dx - \lambda\,\alpha.$$
To maximize $L$, set
$$\mathcal{X}_1 = \{x : p(x\,;\,\theta_1) + \lambda\, p(x\,;\,\theta_0) > 0\} = \Big\{x : \frac{p(x\,;\,\theta_1)}{p(x\,;\,\theta_0)} > -\lambda\Big\}.$$
Here
$$\Lambda(x) = \frac{p(x\,;\,\theta_1)}{p(x\,;\,\theta_0)}$$
is the likelihood ratio, and the threshold $-\lambda$ is chosen so that $P_{FA} = \alpha$.
The complete proof of this lemma for the discrete (pmf) case
is given in
= N (PFA)
Consider the problem of deciding between $H_0:\ a = 0$ and $H_1:\ a = A$ (with $A > 0$), where
$$p(x\,;\,a) = \frac{1}{(2\pi\sigma^2)^{N/2}}\,\exp\Big[-\frac{1}{2\sigma^2}\sum_{n=1}^{N}(x[n]-a)^2\Big] \tag{17}$$
$$\Lambda(x) = \frac{p(x\,;\,a=A)}{p(x\,;\,a=0)} = \frac{\frac{1}{(2\pi\sigma^2)^{N/2}}\,\exp\big[-\frac{1}{2\sigma^2}\sum_{n=1}^{N}(x[n]-A)^2\big]}{\frac{1}{(2\pi\sigma^2)^{N/2}}\,\exp\big[-\frac{1}{2\sigma^2}\sum_{n=1}^{N}(x[n])^2\big]}.$$
$$T(x) = \overline{x} \triangleq \frac{1}{N}\sum_{n=1}^{N} x[n]$$
$$\log\Lambda(x) = -\frac{1}{2\sigma^2}\sum_{n=1}^{N}(x[n]-A)^2 + \frac{1}{2\sigma^2}\sum_{n=1}^{N}(x[n])^2 \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \log\frac{\pi_0}{\pi_1}$$
$$\frac{1}{2\sigma^2}\sum_{n=1}^{N}\underbrace{(x[n]-x[n]+A)}_{A}\,(x[n]+x[n]-A) \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \log\frac{\pi_0}{\pi_1}$$
$$2A\sum_{n=1}^{N}x[n] - A^2 N \;\overset{H_1}{\underset{H_0}{\gtrless}}\; 2\sigma^2\,\log\frac{\pi_0}{\pi_1}$$
$$\sum_{n=1}^{N}x[n] \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \frac{\sigma^2}{A}\,\log\frac{\pi_0}{\pi_1} + \frac{A N}{2} \qquad \text{since } A > 0$$
and, finally,
$$\overline{x} \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \underbrace{\frac{\sigma^2}{N A}\,\log(\pi_0/\pi_1) + \frac{A}{2}}_{\gamma}$$
which, for the practically most interesting case of equiprobable hypotheses
$$\pi_0 = \pi_1 = \tfrac{1}{2} \tag{18}$$
simplifies to
$$\overline{x} \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \underbrace{\frac{A}{2}}_{\gamma}$$
known as the maximum-likelihood test (i.e. the Bayes decision rule for 0-1 loss and a priori equiprobable hypotheses is defined as the maximum-likelihood test). This maximum-likelihood detector does not require knowledge of the noise variance $\sigma^2$.
$$\text{min av. error prob.} = \tfrac{1}{2}\,\underbrace{P[\overline{X} > \tfrac{A}{2}\,|\,a=0]}_{P_{FA}} + \tfrac{1}{2}\,\underbrace{P[\overline{X} < \tfrac{A}{2}\,|\,a=A]}_{P_M}$$
$$= \tfrac{1}{2}\,P\Big[\underbrace{\frac{\overline{X}}{\sqrt{\sigma^2/N}}}_{\substack{\text{standard normal}\\ \text{random var.}}} > \frac{A/2}{\sqrt{\sigma^2/N}}\;;\;a=0\Big] + \tfrac{1}{2}\,P\Big[\underbrace{\frac{\overline{X}-A}{\sqrt{\sigma^2/N}}}_{\substack{\text{standard normal}\\ \text{random var.}}} < \frac{A/2-A}{\sqrt{\sigma^2/N}}\;;\;a=A\Big]$$
$$= \tfrac{1}{2}\,\underbrace{Q\Big(\sqrt{\tfrac{N A^2}{4\sigma^2}}\Big)}_{\substack{\text{standard normal}\\ \text{complementary cdf}}} + \tfrac{1}{2}\,\underbrace{\Phi\Big(-\sqrt{\tfrac{N A^2}{4\sigma^2}}\Big)}_{\text{standard normal cdf}} = Q\Big(\tfrac{1}{2}\sqrt{\tfrac{N A^2}{\sigma^2}}\Big)$$
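The closed-form expression $Q\big(\tfrac{1}{2}\sqrt{N A^2/\sigma^2}\big)$ can be sanity-checked with a short Monte Carlo simulation of the sample-mean detector; the values of $A$, $\sigma$, and $N$ below are illustrative assumptions.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
A, sigma, N, trials = 1.0, 1.0, 25, 200_000

# Draw the true DC level (0 or A, equiprobable) and the noisy records.
a_true = rng.integers(0, 2, size=trials) * A
x = a_true[:, None] + sigma * rng.standard_normal((trials, N))

decide_H1 = x.mean(axis=1) > A / 2                             # sample-mean (ML) detector
empirical = np.mean(decide_H1 != (a_true > 0))                 # Monte Carlo error rate
analytical = norm.sf(0.5 * np.sqrt(N * A ** 2 / sigma ** 2))   # Q(0.5 sqrt(N A^2 / sigma^2))
print(empirical, analytical)                                   # both close to 0.0062 here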
$$\text{min av. error probability} \;\overset{N\to+\infty}{\approx}\; f(N)\,\exp\Big(-N\,\frac{A^2}{8\sigma^2}\Big) \tag{19}$$
where $f(N)$ varies slowly with $N$ when $N$ is large, compared with the exponential term:
$$\lim_{N\to\infty}\frac{\log f(N)}{N} = 0.$$
Indeed, in our example,
$$Q\Big(\sqrt{\frac{N A^2}{4\sigma^2}}\Big) \;\overset{N\to+\infty}{\approx}\; \underbrace{\frac{1}{\sqrt{2\pi\,\frac{N A^2}{4\sigma^2}}}}_{f(N)}\,\exp\Big(-\frac{N A^2}{8\sigma^2}\Big)$$
where we have used
$$Q(x) \;\overset{x\to+\infty}{\approx}\; \frac{1}{x\sqrt{2\pi}}\,\exp(-x^2/2). \tag{20}$$
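A quick check of the tail approximation (20), with arbitrarily chosen arguments:

import numpy as np
from scipy.stats import norm

for x in (2.0, 3.0, 4.0):
    approx = np.exp(-x ** 2 / 2) / (x * np.sqrt(2 * np.pi))
    print(x, norm.sf(x), approx)   # the ratio tends to 1 as x grows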
$$T(x) = \overline{x} \triangleq \frac{1}{N}\sum_{n=0}^{N-1} x[n]$$
with a threshold $\gamma$.
$$P_D = Q\Big(Q^{-1}(P_{FA}) - \sqrt{d^2}\Big).$$
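A sketch of this receiver operating characteristic for the DC-level-in-WGN example; here it is assumed that the deflection term is $d^2 = N A^2/\sigma^2$ (that definition is not reproduced in the notes above), and the parameter values are illustrative.

import numpy as np
from scipy.stats import norm

def roc_pd(pfa, N, A, sigma):
    """P_D = Q( Q^{-1}(P_FA) - sqrt(N A^2 / sigma^2) )  (assumed deflection term)."""
    d = np.sqrt(N * A ** 2 / sigma ** 2)
    return norm.sf(norm.isf(pfa) - d)

pfa = np.array([1e-4, 1e-3, 1e-2, 1e-1])
print(roc_pd(pfa, N=25, A=1.0, sigma=2.0))   # P_D grows with P_FA and with N A^2 / sigma^2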
Comments:
where
$$p(x_n\,|\,a=0) = \mathcal{N}(0, \sigma^2)$$
$$\log\Big[\frac{p(x_n\,|\,a=A)}{p(x_n\,|\,a=0)}\Big] = -\frac{(x_n-A)^2}{2\sigma^2} + \frac{x_n^2}{2\sigma^2} = \frac{1}{2\sigma^2}\,(2A\,x_n - A^2).$$
Therefore,
$$D\big(p(X_n\,|\,\theta_0)\,\|\,p(X_n\,|\,\theta_1)\big) = \frac{A^2}{2\sigma^2}$$
and the Chernoff-Stein lemma predicts the following behavior of the detection probability as $N \to \infty$ and $P_{FA} \to 0$:
$$P_D \approx 1 - \underbrace{f(N)\,\exp\Big(-N\,\frac{A^2}{2\sigma^2}\Big)}_{P_M}.$$
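The Kullback-Leibler divergence above can be verified by quadrature; the sketch below (illustrative values) compares the numerical $D\big(p(X_n\,|\,\theta_0)\,\|\,p(X_n\,|\,\theta_1)\big)$ with the closed form $A^2/(2\sigma^2)$.

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

A, sigma = 1.0, 1.0
p0 = norm(loc=0.0, scale=sigma).pdf
p1 = norm(loc=A, scale=sigma).pdf

# D(p0 || p1) = integral of p0(x) * log[p0(x) / p1(x)] dx
kl, _ = quad(lambda x: p0(x) * np.log(p0(x) / p1(x)), -10.0, 10.0)
print(kl, A ** 2 / (2 * sigma ** 2))   # both equal 0.5 for these values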
$$H_i:\ x_n \sim p(x_n\,|\,\theta_i)$$
$$p(d_n\,|\,\theta_1) = \underbrace{P_{D,n}^{\,d_n}\,(1-P_{D,n})^{1-d_n}}_{\text{Bernoulli pmf}}$$
$$\log\Lambda(d) = \sum_{n=1}^{N}\log\Big[\frac{p(d_n\,|\,\theta_1)}{p(d_n\,|\,\theta_0)}\Big] = \sum_{n=1}^{N}\log\Big[\frac{P_{D,n}^{\,d_n}\,(1-P_{D,n})^{1-d_n}}{P_{FA,n}^{\,d_n}\,(1-P_{FA,n})^{1-d_n}}\Big] \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \log\gamma.$$
$$u_1 = \sum_{n=1}^{N} d_n$$
$$\frac{P_D\,(1-P_{FA})}{P_{FA}\,(1-P_D)} > 1$$
$$u_1 \;\overset{H_1}{\underset{H_0}{\gtrless}}\; \gamma'.$$
$$P_{FA,\text{global}} = P[U_1 > \gamma'\,|\,\theta_0] = \sum_{u_1=\lceil\gamma'\rceil}^{N}\binom{N}{u_1}\,P_{FA}^{\,u_1}\,(1-P_{FA})^{N-u_1}.$$
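The global false-alarm probability of this counting (fusion) rule is just a binomial tail and can be computed directly; the local $P_{FA}$ and the threshold below are illustrative assumptions.

import numpy as np
from scipy.stats import binom

def pfa_global(N, pfa_local, gamma):
    """Sum of C(N, u1) * pfa^u1 * (1 - pfa)^(N - u1) for u1 = ceil(gamma), ..., N."""
    k = int(np.ceil(gamma))
    return binom.sf(k - 1, N, pfa_local)   # P[U1 >= k] under H0

print(pfa_global(N=10, pfa_local=0.05, gamma=2.5))   # P[3 or more local alarms out of 10]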
$$\Theta_0 \cup \Theta_1 \cup \cdots \cup \Theta_{M-1} = \Theta, \qquad \Theta_i \cap \Theta_j = \emptyset,\ i \ne j.$$
$$H_0:\ \theta \in \Theta_0 \quad\text{versus}\quad H_1:\ \theta \in \Theta_1 \quad\text{versus}\quad \cdots \quad\text{versus}\quad H_{M-1}:\ \theta \in \Theta_{M-1}$$
$$L(i\,|\,i) = 0, \qquad i = 0, 1, \ldots, M-1.$$
$$\rho_m(x) = \sum_{i=0}^{M-1}\int_{\Theta_i} L(m\,|\,i)\,p(\theta\,|\,x)\,d\theta, \qquad m = 0, 1, \ldots, M-1.$$
$$\mathcal{X}_m^{\star} = \{x : \rho_m(x) = \min_{0\le l\le M-1}\rho_l(x)\}, \qquad m = 0, 1, \ldots, M-1$$
$$\mathcal{X}_m^{\star} = \Big\{x : m = \arg\min_{0\le l\le M-1}\sum_{i=0}^{M-1}\int_{\Theta_i} L(l\,|\,i)\,p(x\,|\,\theta)\,\pi(\theta)\,d\theta\Big\}, \qquad m = 0, 1, \ldots, M-1.$$
$$E_{x,\theta}[\mathrm{loss}] = \sum_{m=0}^{M-1}\sum_{i=0}^{M-1}\int_{\mathcal{X}_m}\int_{\Theta_i} L(m\,|\,i)\,p(x\,|\,\theta)\,\pi(\theta)\,d\theta\,dx = \sum_{m=0}^{M-1}\int_{\mathcal{X}_m}\underbrace{\sum_{i=0}^{M-1}\int_{\Theta_i} L(m\,|\,i)\,p(x\,|\,\theta)\,\pi(\theta)\,d\theta}_{h_m(x)}\,dx.$$
$$\Big[\sum_{m=0}^{M-1}\int_{\mathcal{X}_m} h_m(x)\,dx\Big] - \Big[\sum_{m=0}^{M-1}\int_{\mathcal{X}_m^{\star}} h_m(x)\,dx\Big] \ge 0$$
For the 0-1 loss,
$$\rho_m(x) = \sum_{i=0,\ i\ne m}^{M-1}\int_{\Theta_i} p(\theta\,|\,x)\,d\theta = \underbrace{\mathrm{const}}_{\text{not a function of }m} - \int_{\Theta_m} p(\theta\,|\,x)\,d\theta$$
and
$$\mathcal{X}_m^{\star} = \Big\{x : m = \arg\max_{0\le l\le M-1}\int_{\Theta_l} p(\theta\,|\,x)\,d\theta\Big\} = \Big\{x : m = \arg\max_{0\le l\le M-1} P[\theta\in\Theta_l\,|\,x]\Big\} \tag{23}$$
or, equivalently,
$$\mathcal{X}_m^{\star} = \Big\{x : m = \arg\max_{0\le l\le M-1}\big[\pi_l\,p(x\,|\,\theta_l)\big]\Big\}, \qquad m = 0, 1, \ldots, M-1$$
For a priori equiprobable hypotheses,
$$\pi_0 = \pi_1 = \cdots = \pi_{M-1} = \frac{1}{M},$$
the MAP rule above reduces to the maximum-likelihood rule $m = \arg\max_{0\le l\le M-1} p(x\,|\,\theta_l)$.
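A minimal sketch of the M-ary MAP rule (23) for simple hypotheses with (assumed) Gaussian conditional densities; with equal priors it reduces to the maximum-likelihood classifier, as noted above.

import numpy as np
from scipy.stats import norm

def map_decide(x, means, sigma=1.0, priors=None):
    """Pick m = argmax_l  pi_l * p(x | theta_l)  for M simple Gaussian hypotheses."""
    M = len(means)
    if priors is None:                      # equiprobable priors 1/M -> ML rule
        priors = np.full(M, 1.0 / M)
    # Log prior plus joint log-likelihood of the whole record x under each hypothesis.
    scores = [np.log(p) + np.sum(norm.logpdf(x, loc=mu, scale=sigma))
              for p, mu in zip(priors, means)]
    return int(np.argmax(scores))

x = np.array([1.8, 2.3, 2.1])
print(map_decide(x, means=[0.0, 1.0, 2.0]))   # -> 2 (mean closest to the data)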
The false-alarm probability $P_{FA}$ is defined as follows:
$$P_{FA} = P[x \in \mathcal{X}_1\,|\,\theta = \theta_0] \qquad \text{(simple hypotheses)}.$$
(i) H0 is true or
$$P[H_0\,|\,\text{data}] = P[\theta \in \Theta_0\,|\,x]$$
bioinformatics and sensor networks.
and denote by $p_1, p_2, \ldots, p_M$ the p-values for these tests. Then, the Bonferroni method does the following: reject the $i$th null hypothesis if
$$p_i < \frac{\alpha}{M}.$$
Reject every test with
$$p_i < \text{threshold}$$
and let $R$ denote the number of rejections and $V$ the number of false rejections (true nulls that are rejected). The false discovery proportion (FDP) is
$$\mathrm{FDP} = \begin{cases} V/R, & R > 0,\\ 0, & R = 0\end{cases}$$
and the false discovery rate is
$$\mathrm{FDR} = E[\mathrm{FDP}].$$
(i) Denote the ordered p-values by $p_{(1)} < p_{(2)} < \cdots < p_{(m)}$.
(ii) Define
$$l_i = \frac{i\,\alpha}{C_m\,m} \qquad\text{and}\qquad R = \max\{i : p_{(i)} < l_i\}$$
where $C_m$ is defined to be 1 if the p-values are independent and $C_m = \sum_{i=1}^{m}(1/i)$ otherwise.
(iii) Reject every null hypothesis whose p-value is at most $p_{(R)}$.
$$\mathrm{FDR} = E[\mathrm{FDP}] \le \frac{m_0}{m}\,\alpha$$
where $m_0$ is the number of true null hypotheses.
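A minimal sketch of the two multiple-testing procedures discussed above, Bonferroni and Benjamini-Hochberg; the p-values in the example are made up for illustration.

import numpy as np

def bonferroni(pvals, alpha=0.05):
    """Reject test i when p_i < alpha / M."""
    pvals = np.asarray(pvals)
    return pvals < alpha / pvals.size

def benjamini_hochberg(pvals, alpha=0.05, independent=True):
    """Reject the R smallest p-values, R = max{ i : p_(i) < i * alpha / (C_m * m) }."""
    pvals = np.asarray(pvals)
    p_sorted = np.sort(pvals)
    m = p_sorted.size
    Cm = 1.0 if independent else np.sum(1.0 / np.arange(1, m + 1))
    thresholds = np.arange(1, m + 1) * alpha / (Cm * m)
    below = np.nonzero(p_sorted < thresholds)[0]
    R = 0 if below.size == 0 else int(below[-1]) + 1
    cutoff = p_sorted[R - 1] if R > 0 else -np.inf
    return pvals <= cutoff

pvals = [0.001, 0.008, 0.012, 0.041, 0.27, 0.64]
print(bonferroni(pvals))           # rejects only p-values below alpha / M = 0.0083
print(benjamini_hochberg(pvals))   # rejects the three smallest p-values here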