Lecture 11: Bayesian Networks
Stephan Dreiseitl
FH Hagenberg, Software Engineering & Interactive Media
Artificial Intelligence SS2010
Overview
Representation of uncertain knowledge
Constructing Bayesian networks
Using Bayesian networks for inference
Algorithmic aspects of inference
[Figures: introductory examples (Worms, Umbrellas)]
Example: the burglary network. Burglary and Earthquake are parents of Alarm; Alarm is the parent of JohnCalls and MaryCalls.

Burglary: P(b) = 0.001
Earthquake: P(e) = 0.002

Alarm:
  B   E   P(a | B, E)
  T   T   0.95
  T   F   0.94
  F   T   0.24
  F   F   0.001

JohnCalls:
  A   P(j | A)
  T   0.9
  F   0.05

MaryCalls:
  A   P(m | A)
  T   0.7
  F   0.01
By the chain rule, and using the conditional independences encoded in the network structure,

P(X_1, ..., X_n) = ∏_{i=1}^n P(X_i | X_1, ..., X_{i-1}) = ∏_{i=1}^n P(X_i | parents(X_i))
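The factored joint distribution can be evaluated directly as a product of local conditionals. A minimal Python sketch for the burglary network, using the CPT values from its tables (the variable and function names are mine, not from the lecture):

```python
# Sketch: evaluating one entry of the joint distribution of the burglary
# network via P(x1,...,xn) = prod_i P(xi | parents(xi)).
# CPT values are taken from the tables above.

P_b = 0.001                      # P(burglary)
P_e = 0.002                      # P(earthquake)
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.24, (False, False): 0.001}   # P(alarm | B, E)
P_j = {True: 0.9, False: 0.05}   # P(johncalls | A)
P_m = {True: 0.7, False: 0.01}   # P(marycalls | A)

def joint(b, e, a, j, m):
    """P(B=b, E=e, A=a, J=j, M=m) as a product of local conditionals."""
    pb = P_b if b else 1 - P_b
    pe = P_e if e else 1 - P_e
    pa = P_a[(b, e)] if a else 1 - P_a[(b, e)]
    pj = P_j[a] if j else 1 - P_j[a]
    pm = P_m[a] if m else 1 - P_m[a]
    return pb * pe * pa * pj * pm

# P(j, m, a, not-b, not-e) = 0.999 * 0.998 * 0.001 * 0.9 * 0.7 ≈ 0.000628
print(joint(False, False, True, True, True))
```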
Here P1, P2 denote the parents of a node X; C1, C2 its children; A, B the children's other parents; and D any remaining node.

Given its parents, X is conditionally independent of its other non-descendants:

P(X | P1, P2, A, B, D) = P(X | P1, P2)

Given its Markov blanket (parents, children, and children's parents), X is conditionally independent of all remaining nodes:

P(X | P1, P2, C1, C2, A, B, D) = P(X | P1, P2, C1, C2, A, B)
Noisy OR

For a Boolean node X with n Boolean parents, the conditional probability table has 2^n entries.

The noisy-OR assumption reduces this number to n: assume each parent may be inhibited independently.

[Figure: node Fever with parents Flu, Malaria, Cold]
Noisy OR (cont.)

Need only specify the first three entries of the table (the inhibition probabilities); the remaining entries follow as products:

  Flu   Malaria   Cold   P(¬fever)
  T     F         F      0.2
  F     T         F      0.1
  F     F         T      0.6
  F     F         F      1.0
  F     T         T      0.1 · 0.6 = 0.06
  T     F         T      0.2 · 0.6 = 0.12
  T     T         F      0.2 · 0.1 = 0.02
  T     T         T      0.2 · 0.1 · 0.6 = 0.012
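The table entries can be generated mechanically. A small Python sketch of the noisy-OR computation, with the inhibition probabilities from the table (names are illustrative):

```python
# Sketch of the noisy-OR model: each present cause independently fails to
# produce the effect with its inhibition probability q; the effect is absent
# only if every present cause is inhibited.

q = {"flu": 0.2, "malaria": 0.1, "cold": 0.6}   # inhibition probabilities

def p_fever(causes):
    """P(fever | the given set of present causes) under noisy OR."""
    p_not_fever = 1.0
    for c in causes:
        p_not_fever *= q[c]
    return 1.0 - p_not_fever

print(p_fever(["malaria", "cold"]))   # 1 - 0.1 * 0.6 = 0.94
```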
[Figures: the family-out network. FamilyOut is a parent of LightsOn and DogOut; BowelProblems is the other parent of DogOut; DogOut is the parent of HearDogBark]
FamilyOut: P(f) = 0.2
BowelProblems: P(b) = 0.05

LightsOn:
  F   P(l | F)
  T   0.99
  F   0.1

DogOut:
  F   B   P(d | F, B)
  T   T   0.99
  T   F   0.88
  F   T   0.96
  F   F   0.2

HearDogBark:
  D   P(h | D)
  T   0.6
  F   0.25
Types of inference

Causal reasoning: the query variable is downstream of the evidence:
P(heardogbark | familyout) = 0.56

Diagnostic reasoning: the query variable is upstream of the evidence:
P(familyout | heardogbark) = 0.296

Explaining away (intercausal reasoning): knowing the effect and one possible cause reduces the probability of the other possible causes:
P(familyout | bowelproblems, heardogbark) = 0.203 (down from 0.296)
P(bowelproblems | heardogbark) = 0.078
P(bowelproblems | familyout, heardogbark) = 0.053
Exact inference can be done
by enumeration
by variable elimination
Inference by enumeration

FamilyOut example (d' ∈ {d, ¬d}, b' ∈ {b, ¬b}):

P(F | l, h) = α P(F, l, h) = α Σ_{d'} Σ_{b'} P(F, l, h, d', b')

P(f | l, h) = α Σ_{d'} Σ_{b'} P(f) P(l | f) P(b') P(d' | f, b') P(h | d')
            = α P(f) P(l | f) Σ_{d'} P(h | d') Σ_{b'} P(b') P(d' | f, b')

where α is the normalization constant.
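Enumeration is easy to implement for a network this small. A Python sketch for the family-out network, using the CPTs above (helper names are mine); it reproduces the diagnostic-reasoning value P(familyout | heardogbark) = 0.296:

```python
from itertools import product

# Sketch: posterior P(F | evidence) in the family-out network by full
# enumeration over all variable assignments, using the CPTs listed above.

P_f, P_b = 0.2, 0.05
P_l = {True: 0.99, False: 0.1}                      # P(lightson | F)
P_d = {(True, True): 0.99, (True, False): 0.88,
       (False, True): 0.96, (False, False): 0.2}    # P(dogout | F, B)
P_h = {True: 0.6, False: 0.25}                      # P(heardogbark | D)

def joint(f, b, d, l, h):
    p = (P_f if f else 1 - P_f) * (P_b if b else 1 - P_b)
    p *= P_l[f] if l else 1 - P_l[f]
    p *= P_d[(f, b)] if d else 1 - P_d[(f, b)]
    p *= P_h[d] if h else 1 - P_h[d]
    return p

def posterior_f(evidence):
    """P(f | evidence), where evidence fixes some of b, d, l, h."""
    num = den = 0.0
    for f, b, d, l, h in product([True, False], repeat=5):
        assignment = {"f": f, "b": b, "d": d, "l": l, "h": h}
        if any(assignment[k] != v for k, v in evidence.items()):
            continue
        p = joint(f, b, d, l, h)
        den += p
        if f:
            num += p
    return num / den

print(round(posterior_f({"h": True}), 3))    # diagnostic reasoning: 0.296
print(round(posterior_f({"h": False}), 4))   # 0.1416
```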
For the burglary network:

P(B | j, m) = α P(B) Σ_{e'} P(e') Σ_{a'} P(a' | B, e') P(j | a') P(m | a')
Variable elimination

Eliminate repeated calculations by summing from the inside out, storing intermediate results (cf. dynamic programming).

Burglary example, different query:

P(J | b) = α P(b) Σ_{e'} P(e') Σ_{a'} P(a' | b, e') P(J | a') Σ_{m'} P(m' | a')

The last factor Σ_{m'} P(m' | a') = 1, so MaryCalls can be dropped from the computation entirely.
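The Σ_{m'} P(m' | a') = 1 observation can be checked numerically. The following is only a small sanity check against the burglary CPTs (not a full variable-elimination implementation with factor tables); names are mine:

```python
# Sketch: computing P(j | b) in the burglary network with and without the
# MaryCalls factor. Since sum_m' P(m' | a') = 1, both give the same answer,
# which is why the variable can be eliminated up front.

P_e = 0.002
P_a = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.24, (False, False): 0.001}
P_j = {True: 0.9, False: 0.05}
P_m = {True: 0.7, False: 0.01}

def p_j_given_b(include_marycalls):
    num = {True: 0.0, False: 0.0}        # unnormalized P(J = j | b)
    for j in (True, False):
        for e in (True, False):
            pe = P_e if e else 1 - P_e
            for a in (True, False):
                pa = P_a[(True, e)] if a else 1 - P_a[(True, e)]
                pj = P_j[a] if j else 1 - P_j[a]
                term = pe * pa * pj
                if include_marycalls:
                    # sum over both values of MaryCalls: always 1
                    term *= sum(P_m[a] if m else 1 - P_m[a]
                                for m in (True, False))
                num[j] += term
    return num[True] / (num[True] + num[False])

print(p_j_given_b(True), p_j_given_b(False))   # identical values
```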
P(f | ¬h) = 0.1416
Likelihood weighting

Fix the evidence variables and sample all other variables.

This overcomes the shortcoming of rejection sampling by generating only samples consistent with the evidence.

Problem: consider a situation with P(E = e | X = x) = 0.001 and P(X = x) = 0.9. Then 90% of samples will have X = x (and fixed E = e), but this combination is very unlikely, since P(E = e | X = x) = 0.001.

Solution: weight each sample by the product of the conditional probabilities of the evidence variables, given their parents.
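A minimal Python sketch of likelihood weighting for the family-out network, estimating P(f | h) with the CPTs above (names are mine; sample size chosen arbitrarily):

```python
import random

# Sketch: likelihood weighting for P(f | h). Non-evidence variables are
# sampled in topological order; each sample is weighted by P(h | d), the
# probability of the evidence given its sampled parent.

P_f, P_b = 0.2, 0.05
P_d = {(True, True): 0.99, (True, False): 0.88,
       (False, True): 0.96, (False, False): 0.2}    # P(dogout | F, B)
P_h = {True: 0.6, False: 0.25}                      # P(heardogbark | D)

def estimate_f_given_h(n_samples, rng):
    w_f = w_total = 0.0
    for _ in range(n_samples):
        f = rng.random() < P_f
        b = rng.random() < P_b
        d = rng.random() < P_d[(f, b)]
        # LightsOn has no children and is neither evidence nor query,
        # so sampling it would not change the weight; it is skipped here.
        w = P_h[d]               # weight: likelihood of the evidence h
        w_total += w
        if f:
            w_f += w
    return w_f / w_total

rng = random.Random(0)
print(estimate_f_given_h(100_000, rng))   # ≈ 0.296
```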
Markov chains

A sequence of discrete random variables X_0, X_1, ... is called a Markov chain with state space S iff

P(X_n = x_n | X_0 = x_0, ..., X_{n-1} = x_{n-1}) = P(X_n = x_n | X_{n-1} = x_{n-1})

for all x_0, ..., x_n ∈ S.

Thus, X_n is conditionally independent of all variables before it, given X_{n-1}.

Specify a state transition matrix P with P_ij = P(X_n = x_j | X_{n-1} = x_i).
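Iterating a distribution through the transition matrix illustrates convergence to a stationary distribution. A small Python sketch; the two-state chain and its transition probabilities are illustrative assumptions, not taken from the lecture:

```python
# Sketch: a two-state Markov chain. Row i of P holds P(X_n = j | X_{n-1} = i);
# repeatedly applying P to a start distribution converges to the stationary
# distribution pi satisfying pi = pi P.

P = [[0.9, 0.1],
     [0.4, 0.6]]

def step(dist, P):
    """One transition: new_dist[j] = sum_i dist[i] * P[i][j]."""
    n = len(P)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

dist = [1.0, 0.0]            # start deterministically in state 0
for _ in range(100):
    dist = step(dist, P)
print(dist)                  # converges to the stationary distribution (0.8, 0.2)
```

For this matrix the stationary distribution solves pi_0 · 0.1 = pi_1 · 0.4, giving pi = (0.8, 0.2).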
[Figures: a Markov chain example with four states, and plots of state probabilities converging over 1000 steps]
Gibbs sampling

Fix the evidence variables to e, and assign arbitrary values to the non-evidence variables X.

Recall: the Markov blanket of a variable consists of its parents, children, and children's parents.

Iterate the following:
  pick an arbitrary variable Xi from X
  sample from P(Xi | MarkovBlanket(Xi))
  new state = old state, with the new value of Xi
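The loop above can be sketched for the family-out network. Since P(Xi | all other variables) = P(Xi | MarkovBlanket(Xi)), the sketch below samples Xi from the full conditional, computed from the joint; all names and the step count are my own choices:

```python
import random

# Sketch: Gibbs sampling for P(f | h) in the family-out network (CPTs above).

P_f, P_b = 0.2, 0.05
P_l = {True: 0.99, False: 0.1}
P_d = {(True, True): 0.99, (True, False): 0.88,
       (False, True): 0.96, (False, False): 0.2}
P_h = {True: 0.6, False: 0.25}

def joint(s):
    p = (P_f if s["f"] else 1 - P_f) * (P_b if s["b"] else 1 - P_b)
    p *= P_l[s["f"]] if s["l"] else 1 - P_l[s["f"]]
    p *= P_d[(s["f"], s["b"])] if s["d"] else 1 - P_d[(s["f"], s["b"])]
    p *= P_h[s["d"]] if s["h"] else 1 - P_h[s["d"]]
    return p

def gibbs_f_given_h(n_steps, rng):
    # evidence h = True fixed; arbitrary start values for the rest
    s = {"f": False, "b": False, "d": False, "l": False, "h": True}
    hidden = ["f", "b", "d", "l"]
    count_f = 0
    for _ in range(n_steps):
        x = rng.choice(hidden)          # pick a non-evidence variable
        s[x] = True
        p_true = joint(s)               # conditional via the joint:
        s[x] = False                    # equals P(x | Markov blanket)
        p_false = joint(s)
        s[x] = rng.random() < p_true / (p_true + p_false)
        if s["f"]:
            count_f += 1
    return count_f / n_steps

rng = random.Random(0)
print(gibbs_f_given_h(200_000, rng))    # ≈ 0.296
```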
Summary

Bayesian networks are graphical representations of causal influence among random variables.

The network structure graphically specifies conditional independence assumptions.

We need the conditional distributions of the nodes, given their parents.

Use noisy OR to reduce the number of parameters in the tables.

Reasoning types in Bayesian networks: causal, diagnostic, and explaining away.

There are exact and approximate inference algorithms.