
Random Fields and Maximum Entropy


A Brief Tutorial on the FRAME Model and Gibbs Learning

Julian Antolin Camarena


Department of Physics and Astronomy

Wednesday, November 20, 2013


Coming Up

Markov Random Fields and Gibbs Measures


The Maximum Entropy Method
The FRAME Model
Maximum Satellite Likelihood Estimation


Random Fields

A stochastic process is a set of random variables $\{X_t : t \in T\}$, with $X_t$ taking values in a finite set $S_t$.
The joint probability distribution of the variables is
\[ p(x) = P(X_t = x_t,\; t \in T), \qquad x = (x_1, x_2, \ldots, x_n). \]
Let $T$ be the set of nodes of a graph $G$, and $N_t$ the neighborhood of $t$, i.e. the set of nodes $s$ for which $(t, s)$ share an edge in $G$. The process is said to be a Markov random field (MRF) if
i. $p(x) > 0$ for all $x$;
ii. for each $t$ and $x$,
\[ P(x_t \mid \{x_s,\; s \in G \setminus t\}) = P(x_t \mid \{x_s,\; s \in N_t\}). \]

The neighborhood $N_t$ of a node $t$ must satisfy the following properties:
i. A site is not its own neighbor: $t \notin N_t$.
ii. The neighborhood relation is reciprocal: $t \in N_s \iff s \in N_t$.


A clique is an ordered subset of nodes of the graph: $C \subset G$. Examples are
Single-site: $C_1 = \{t \mid t \in G\}$
Pair-site: $C_2 = \{\{t, s\} \mid s \in N_t,\; t \in G\}$
Triple-site: $C_3 = \{\{t, s, r\} \mid t, s, r \in G \text{ are neighbors of one another}\}$


In statistical physics the Boltzmann distribution is given by
\[ p(x) = \frac{1}{Z} e^{-H(x)}; \qquad Z = \sum_x e^{-H(x)}. \]
In the MRF literature the Boltzmann distribution is called the Gibbs measure or distribution.
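For concreteness, here is a minimal Python sketch (not from the talk) that computes the Gibbs measure of a three-node binary chain by brute-force enumeration; the pairwise energy and the coupling $J = 1$ are illustrative assumptions.

    import itertools
    import numpy as np

    J = 1.0  # coupling strength; an arbitrary illustrative choice

    def H(x):
        # Pairwise energy on a 3-node chain with neighbor pairs (0,1) and (1,2).
        return -J * (x[0] * x[1] + x[1] * x[2])

    states = list(itertools.product([-1, 1], repeat=3))
    weights = np.array([np.exp(-H(x)) for x in states])
    Z = weights.sum()            # partition function
    p = weights / Z              # the Gibbs measure over all 2^3 states
    for x, px in zip(states, p):
        print(x, round(px, 4))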


Hammersley-Clifford Theorem

Theorem
X is a Markov random field on G with respect to N if and only if
X is a Gibbs random field on G with respect to N .
The proof is omitted.
In plain English: every MRF distribution can be written as a Gibbs distribution.


A Gibbs distribution has the clique factorization property:
\[ H(x) = \sum_{c \in C} h_c(x); \]
that is, the sum is over the local energy functions of each clique.
A GRF is said to be homogeneous if $h_c(x)$ is independent of the relative position of the clique $c$, and isotropic if $h_c(x)$ is independent of the orientation of $c$.


Sometimes it is convenient to write $H(x)$ as a sum over cliques of equal size. For example, for cliques up to size two:
\[ H(x) = \sum_{t \in G} h_1(x_t) + \sum_{t \in G} \sum_{s \in N_t} h_2(x_t, x_s), \]
which is the form of a much celebrated model in statistical physics.


Ising Model

The Ising model of magnetism is a prototypical example of a Gibbs random field. The Ising Hamiltonian is
\[ H_I = -\sum_{\langle i, j \rangle} J_{ij}\, \sigma_i \sigma_j - \sum_j h_j \sigma_j, \]
where $\langle i, j \rangle$ denotes pairs $i, j$ in the same neighborhood.
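A minimal Metropolis sketch of the zero-field Ising model ($J = 1$, $h_j = 0$), included here for illustration; the lattice size, inverse temperature, and step count are arbitrary, and this is not the sampler used in the talk.

    import numpy as np

    rng = np.random.default_rng(0)
    N, beta, J = 64, 0.5, 1.0          # lattice size, inverse temperature, coupling (arbitrary)
    spins = rng.choice([-1, 1], size=(N, N))

    def local_field(s, i, j):
        # Sum of the four nearest neighbors with periodic boundaries.
        return (s[(i + 1) % N, j] + s[(i - 1) % N, j] +
                s[i, (j + 1) % N] + s[i, (j - 1) % N])

    for _ in range(200_000):
        i, j = rng.integers(N, size=2)
        dE = 2.0 * J * spins[i, j] * local_field(spins, i, j)   # energy change of a flip
        if dE <= 0 or rng.random() < np.exp(-beta * dE):
            spins[i, j] *= -1                                   # accept the flip

    print("magnetization per spin:", spins.mean())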


Ising Model Sample

Source: http://pages.physics.cornell.edu/sethna/StatMech/ComputerExercises/Fig/CoarsenedBy0.gif



MRF model textures

Source: Statistical Image Processing and Multidimensional Modeling by Paul Fieguth, Springer 2012

Maximum Entropy Method

The ME distribution is maximally noncommittal with respect to missing information and depends solely on the available data.
The resulting distribution is in the exponential family; more specifically, it is a Gibbs distribution.
Remember, it is not the true underlying distribution; it is simply the best distribution that can be obtained from the data and will, on average, yield the same statistics as the data.


To construct it:
i. The data are assumed to be a good estimate of the average value of the measured functions: measurement of $\phi_i(x)$ yields
\[ \langle \phi_i(x) \rangle = \sum_x \phi_i(x)\, p(x). \]
ii. Solve the optimization problem via Lagrange multipliers:
\[ \max_{p(x)} \left\{ -\sum_x p(x) \log p(x) \right\} \quad \text{subject to} \quad \sum_x p(x) = 1, \qquad \langle \phi_i(x) \rangle = \sum_x \phi_i(x)\, p(x). \]
iii. Solving, one has the ME distribution:
\[ p(x; \Lambda) \equiv p_\Lambda(x) = \frac{1}{Z}\, e^{-\sum_i \lambda_i^T \phi_i(x)}, \]
where $\Lambda = (\lambda_1, \lambda_2, \ldots, \lambda_N)$.

Z satisfies
\[ \frac{\partial \log Z}{\partial \lambda_i} = -\langle \phi_i(x) \rangle_p, \qquad \frac{\partial^2 \log Z}{\partial \lambda_i\, \partial \lambda_j} = \mathrm{cov}\{\phi_i(x), \phi_j(x)\}. \]
The second property says that the Hessian of $\log Z$ is positive semidefinite, so $\log Z$ is convex in $\Lambda$ and the log-likelihood is concave. Thus, given a set of consistent constraints, the Lagrange multipliers are unique.
The maximum likelihood estimate of the Lagrange multipliers satisfies
\[ \frac{d\lambda_n}{dt} = \langle \phi_n(x) \rangle_p - \bar{\phi}_n, \qquad n = 1, 2, \ldots, N. \]
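A minimal sketch of these multiplier dynamics on a toy enumerable state space; the two features, the target averages, and the step size are all made up for illustration.

    import numpy as np

    X = np.arange(8)                              # toy state space of 8 states
    Phi = np.stack([X / 7.0, (X / 7.0) ** 2])     # two features phi_1, phi_2 (made up)
    phi_bar = np.array([0.4, 0.25])               # "measured" feature averages (made up)

    lam = np.zeros(2)
    eta = 0.5                                     # step size for the discretized flow (arbitrary)
    for _ in range(2000):
        logits = -lam @ Phi                       # log of unnormalized p(x; Lambda)
        p = np.exp(logits - logits.max())
        p /= p.sum()                              # the ME distribution
        lam += eta * (Phi @ p - phi_bar)          # d lambda_n / dt = <phi_n>_p - phibar_n

    print("lambda:", lam)
    print("model feature means:", Phi @ p, "targets:", phi_bar)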


Overview

We now discuss the paper Filters, Random Fields and Maximum Entropy (FRAME): Towards a Unified Theory for Texture Modeling [International Journal of Computer Vision 27(2), 107-126 (1998)] by Zhu, Wu, and Mumford.
Given an input texture image,
a set of filters is selected from a general set of filters;
histograms of the filtered images are calculated, as they approximate the marginals of the true underlying distribution $f(I)$;
a maximum entropy distribution $p(I)$ is constructed, constrained by the marginal distributions of $f(I)$.


Filters

A filter is a system that performs mathematical operations on an input signal to enhance or reduce desired features of the input.
Linear space-invariant (LSI) filters are popular because they can be implemented with a convolution operation. Let $h$ be an LSI filter's impulse response (filter window/Green function) and $x$ an input signal; then the filtered signal is given by their convolution
\[ y(z) = \int h(z')\, x(z - z')\, dz' \]
or
\[ y_n = \sum_{k=-\infty}^{\infty} x_{n-k}\, h_k. \]
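A quick illustration of the discrete convolution above using scipy; the input image and the 3x3 averaging window are made up.

    import numpy as np
    from scipy.signal import convolve2d

    rng = np.random.default_rng(0)
    x = rng.random((16, 16))            # a made-up input "image"
    h = np.ones((3, 3)) / 9.0           # impulse response of a 3x3 averaging filter

    # y = x * h, the two-dimensional form of y_n = sum_k x_{n-k} h_k
    y = convolve2d(x, h, mode="same", boundary="symm")
    print(x.shape, "->", y.shape)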

Laplacian filter

Lena filtered with a Laplacian filter. Source: http://asura.iaigiri.com/OpenGL/Image/LaplacianFilter/LaplacianFilter.png
\[ L(x, y) = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} \]


Gaussian filter

Source: Wikipedia
\[ G(x, y; x_0, y_0, \sigma_x, \sigma_y) = \frac{1}{2\pi \sigma_x \sigma_y}\, e^{-\left( (x - x_0)^2 / 2\sigma_x^2 + (y - y_0)^2 / 2\sigma_y^2 \right)} \]


Laplacian of Gaussian

Source: http://www.aishack.in/wp-content/uploads/2010/08/conv-laplacian-of-gaussian-result.jpg
\[ L_G(x, y; x_0, y_0, \sigma_x, \sigma_y) = L(x, y)\, G(x, y; x_0, y_0, \sigma_x, \sigma_y) \]
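A minimal sketch of how these kernels might be constructed discretely; the kernel size and sigma are arbitrary choices, and this is not the filter code of the FRAME paper.

    import numpy as np
    from scipy.signal import convolve2d

    lap = np.array([[0.0,  1.0, 0.0],
                    [1.0, -4.0, 1.0],
                    [0.0,  1.0, 0.0]])      # standard discrete 2D Laplacian stencil

    def gaussian_kernel(size=9, sigma=1.5):
        # size and sigma are arbitrary illustrative choices
        ax = np.arange(size) - size // 2
        xx, yy = np.meshgrid(ax, ax)
        g = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
        return g / g.sum()

    # L_G: apply the Laplacian stencil to the Gaussian kernel
    log_kernel = convolve2d(gaussian_kernel(), lap, mode="same")
    print(log_kernel.shape)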


Model Assumptions and Definitions

The image $I$ is a random field on a discrete lattice and is a stationary process.
$I$ contains sufficiently many pixels for statistical analysis.
Filters are denoted by $F^{(k)}$, $k = 1, \ldots, K$, and the filtered image by $I^{(k)} = I \ast F^{(k)}$.
Further, since $I$ is stationary and the $F^{(k)}$ are LSI, the filtering $I^{(k)} = I \ast F^{(k)}$ is a convolution.


The histograms of $I^{(k)}$ are good approximations to the marginals $f^{(k)}(I)$. They are vectors and are denoted $H^{(k)}$.
Knowing a sufficient number of marginals, we can build the distribution.
The observed (input) image is denoted $I_{\mathrm{obs}}$. The observed filtered (by $F^{(k)}$) images are denoted by $I^{(k)}_{\mathrm{obs}}$ and the corresponding histograms by $H^{(k)}_{\mathrm{obs}}$. Similar notation is used for the synthesized quantities.
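A sketch of how a histogram $H^{(k)}$ might be estimated from a filtered image; the helper name filter_histogram, the bin count, and the value range are illustrative assumptions.

    import numpy as np
    from scipy.signal import convolve2d

    def filter_histogram(I, F, bins=15, vrange=(-3.0, 3.0)):
        # I^(k) = I * F^(k); its normalized histogram approximates the
        # marginal f^(k)(I). Bin count and value range are arbitrary choices.
        Ik = convolve2d(I, F, mode="same", boundary="symm")
        H, _ = np.histogram(Ik, bins=bins, range=vrange)
        return H / max(H.sum(), 1)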


The ME distribution depends upon the selected filter set $S_K$ and the Lagrange multipliers $\Lambda_K$:
\[ p(I; S_K, \Lambda_K) = \frac{1}{Z_K}\, e^{-\sum_{n=1}^{K} \langle \lambda^{(n)},\, H^{(n)} \rangle}. \]
We look for
\[ \Lambda_K = \operatorname*{argmax}_{\Lambda_K} \{\log p(I_{\mathrm{obs}}; S_K, \Lambda_K)\} = \operatorname*{argmax}_{\Lambda_K} \left\{ -\log Z_K - \sum_{n=1}^{K} \langle \lambda^{(n)},\, H^{(n)}_{\mathrm{obs}} \rangle \right\}, \]
which is equivalent to
\[ \frac{d\lambda^{(n)}}{dt} = \langle H^{(n)}_{\mathrm{syn}} \rangle_{p(I; S_K, \Lambda_K)} - H^{(n)}_{\mathrm{obs}}. \]

FRAME Algorithm

Input a texture image $I_{\mathrm{obs}}$.
Select a set of $K$ filters, $S_K = \{F^{(1)}, F^{(2)}, \ldots, F^{(K)}\}$.
Compute $H^{(k)}_{\mathrm{obs}}$, $k = 1, 2, \ldots, K$.
Initialize $\lambda^{(k)} \leftarrow 0$, $k = 1, 2, \ldots, K$.
Initialize $I_{\mathrm{syn}} \leftarrow$ white Gaussian noise texture.
While $\frac{1}{2} \| \langle H^{(k)}_{\mathrm{syn}} \rangle_p - H^{(k)}_{\mathrm{obs}} \|_{\ell_1} > \epsilon$ for some $k = 1, 2, \ldots, K$:
  Calculate $H^{(k)}_{\mathrm{syn}}$ from $I_{\mathrm{syn}}$; use it for $\langle H^{(k)}_{\mathrm{syn}} \rangle_p$.
  Update $\lambda^{(k)}$ by $\Delta\lambda^{(k)} = \langle H^{(k)}_{\mathrm{syn}} \rangle_p - H^{(k)}_{\mathrm{obs}}$. This updates $p$.
  Sample $p(I; S_K, \Lambda_K)$ (Gibbs, MCMC, etc.) to update $I_{\mathrm{syn}}$.
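A schematic Python sketch of this loop, reusing the filter_histogram helper above. The single-site Metropolis routine is a crude stand-in for the Gibbs/MCMC sampler of the paper, a single synthesized image stands in for the expectation $\langle H_{\mathrm{syn}} \rangle_p$, and all step sizes and tolerances are arbitrary.

    import numpy as np

    def energy(I, filters, lam):
        # U(I) = sum_k <lambda^(k), H^(k)(I)>, so p(I) is proportional to exp(-U(I)).
        return sum(l @ filter_histogram(I, F) for F, l in zip(filters, lam))

    def sample_p(I, filters, lam, n_steps=200, rng=None):
        # Crude single-site Metropolis stand-in for "sample p(I; S_K, Lambda_K)".
        rng = rng or np.random.default_rng()
        E = energy(I, filters, lam)
        for _ in range(n_steps):
            i, j = rng.integers(I.shape[0]), rng.integers(I.shape[1])
            old = I[i, j]
            I[i, j] = rng.normal()                  # propose a new pixel value
            E_new = energy(I, filters, lam)
            if np.log(rng.random()) < E - E_new:
                E = E_new                           # accept
            else:
                I[i, j] = old                       # reject and restore
        return I

    def frame(I_obs, filters, n_iter=50, eta=0.1, eps=0.01):
        # Sketch of the FRAME learning loop; eta and eps are arbitrary.
        H_obs = [filter_histogram(I_obs, F) for F in filters]
        lam = [np.zeros_like(h) for h in H_obs]
        rng = np.random.default_rng(0)
        I_syn = rng.normal(size=I_obs.shape)        # white Gaussian noise initialization
        for _ in range(n_iter):
            I_syn = sample_p(I_syn, filters, lam, rng=rng)
            H_syn = [filter_histogram(I_syn, F) for F in filters]
            if sum(0.5 * np.abs(hs - ho).sum() for hs, ho in zip(H_syn, H_obs)) < eps:
                break                               # all marginals matched to tolerance
            for k in range(len(filters)):
                lam[k] = lam[k] + eta * (H_syn[k] - H_obs[k])   # d lambda = <H_syn> - H_obs
        return lam, I_syn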


Filter Selection Algorithm

Let $B$ be a general filter bank, $S$ the set of selected filters, $I_{\mathrm{obs}}$ the observed texture image, and $I_{\mathrm{syn}}$ the synthesized texture image.
Initialize $k = 0$, $S \leftarrow \emptyset$, $p(I) = U[0, G-1]$, and $I_{\mathrm{syn}} \sim U[0, G-1]$. For $\alpha = 1, \ldots, |B|$ compute $H^{(\alpha)}_{\mathrm{obs}}$ from $I_{\mathrm{obs}}$.
Repeat:
  Calculate $H^{(\alpha)}_{\mathrm{syn}}$ from $I_{\mathrm{syn}}$.
  $d(\alpha) = \frac{1}{2} \| H^{(\alpha)}_{\mathrm{syn}} - H^{(\alpha)}_{\mathrm{obs}} \|$.
  Choose $F^{(k+1)}$ so that $d(k + 1) = \max\{d(\alpha) : F^{(\alpha)} \in B \setminus S\}$.
  $S \leftarrow S \cup \{F^{(k+1)}\}$, $k \leftarrow k + 1$.
  Update $p(I)$ and $I_{\mathrm{syn}}$ with the FRAME algorithm.
Until $d(\alpha) < \epsilon$.
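A sketch of this greedy selection loop, reusing filter_histogram and frame from the sketches above; the stopping tolerance is arbitrary.

    import numpy as np

    def select_filters(I_obs, bank, eps=0.05):
        chosen = []                                  # indices into the bank B
        rng = np.random.default_rng(1)
        I_syn = rng.uniform(size=I_obs.shape)        # a sample of the uniform model
        H_obs = [filter_histogram(I_obs, F) for F in bank]
        while True:
            d = np.array([0.5 * np.abs(filter_histogram(I_syn, F) - H_obs[a]).sum()
                          for a, F in enumerate(bank)])
            d[chosen] = -np.inf                      # restrict the max to B \ S
            best = int(np.argmax(d))
            if d[best] < eps:                        # until d(alpha) < eps
                break
            chosen.append(best)                      # S <- S U {F^(k+1)}
            _, I_syn = frame(I_obs, [bank[a] for a in chosen])  # update p and I_syn
        return chosen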

Reported Results: K = 0, 1, 2, 3, 6 filters


Reported Results: histograms and Lagrange multipliers for subband images


Graphically, we have


Overview

We now give a brief review of a follow-up paper by Song Chun Zhu and Xiuwen Liu, Learning in Gibbsian Fields: How Fast and How Accurate Can It Be? [IEEE Transactions on Pattern Analysis and Machine Intelligence 24(7), July 2002].
The authors identify two major issues in Gibbsian learning:
1. the efficiency of likelihood functions, and
2. the variance in approximating partition functions using Monte Carlo integration.


This paper proposes three algorithms for learning Gibbs distribution parameters (Gibbsian learning):
1. a maximum partial likelihood estimator,
2. a maximum patch likelihood estimator, and
3. a maximum satellite likelihood estimator.
They find that these algorithms have different benefits and drawbacks, but generally outperform standard MCMC Gibbsian learning. They claim that the third algorithm offers the best trade-off between accuracy and speed of estimation.


The Common Framework of Gibbsian Learning

Let $I_S$ be an input texture image, $I_{\partial S}$ its boundary conditions, and $S$ the underlying lattice.
The feature statistics are $h(I_S \mid I_{\partial S})$.
The Gibbs distribution is
\[ p(I_S \mid I_{\partial S}; \beta) = \frac{1}{Z(I_{\partial S}, \beta)}\, e^{-\langle \beta,\, h(I_S \mid I_{\partial S}) \rangle}. \]
We wish to estimate
\[ \beta^* = \operatorname*{argmax}_{\beta} \{\mathcal{G}(\beta)\}, \quad \text{with} \quad \mathcal{G}(\beta) = \sum_{i=1}^{M} \log p(I_{S_i} \mid I_{\partial S_i}; \beta). \]
Here $S_i$, $i = 1, 2, \ldots, M$, indicates that the lattice $S$ has been segmented into $M$ regions.
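To make the conditional Gibbs likelihood concrete, here is a toy enumeration (not from the paper): a three-site 1D strip with fixed boundary values and a single scalar feature, for which $Z(I_{\partial S}, \beta)$ can be computed exactly.

    import itertools
    import numpy as np

    def h(x, left, right):
        # One toy feature: sum of nearest-neighbor products along a short 1D
        # strip, including the two fixed boundary values (an assumption).
        full = (left, *x, right)
        return sum(full[i] * full[i + 1] for i in range(len(full) - 1))

    def log_p(x, left, right, beta):
        # log p(I_S | I_dS; beta) by brute-force enumeration of the 2^3
        # interior configurations; beta is a scalar here for simplicity.
        states = itertools.product([-1, 1], repeat=len(x))
        logZ = np.log(sum(np.exp(-beta * h(s, left, right)) for s in states))
        return -beta * h(x, left, right) - logZ

    print(log_p((1, 1, -1), left=1, right=-1, beta=0.7))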

The Common Framework of Gibbsian Learning

The authors identify two choices that need to be made in the Gibbsian learning problem:
1. the number, sizes, and shapes of the foreground patches $S_i$ and corresponding backgrounds $\partial S_i$, $i = 1, 2, \ldots, M$;
2. the reference models used to estimate the partition functions.


Choice 1: The foreground and background

The foreground pixels $S_i$ and corresponding backgrounds $\partial S_i$, $i = 1, 2, \ldots, M$, are shown in light and dark shading, respectively. (a)-(c) are $m \times m$ patches. In one extreme, the log-likelihood $\mathcal{G}(\beta)$ in (a) chooses $m = N - 2w$ and is used in MCMCMLE methods. The other extreme in (c) chooses $m = 1$, and $\mathcal{G}$ is the pseudo-likelihood used in MPLE. The midpoint is shown in (b), where $\mathcal{G}$ is the log-patch-likelihood. The choice in (d) has $M = 1$ irregular patch, $\Lambda_1$, with pixels randomly selected; the rest of the lattice is the background $\partial \Lambda_1$, and $\mathcal{G}$ is the log-partial-likelihood. In (b) and (c) patches are allowed to overlap.

Choice 2: Reference model for estimation of Z

Now we need to estimate $Z_\beta(I^{\mathrm{obs}}_{\partial S_i})$ for each $S_i$, $i = 1, \ldots, M$, by Monte Carlo integration using a reference model at $\beta = \beta_0$:
\[ Z_\beta(I^{\mathrm{obs}}_{\partial S_i}) \approx \frac{Z_{\beta_0}(I^{\mathrm{obs}}_{\partial S_i})}{L} \sum_{j=1}^{L} e^{-\langle \beta - \beta_0,\, h(I^{\mathrm{syn}}_{ij} \mid I^{\mathrm{obs}}_{\partial S_i}) \rangle}, \]
where $\{I^{\mathrm{syn}}_{ij}\}_{j=1}^{L}$ are typical samples of the reference model. The log-likelihood can then be estimated iteratively by gradient descent. (In the accompanying figure, the dashed line shows the inverse Fisher information and the solid curves show the variance in a sequence of models approaching the true parameter value.)
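A sketch of this importance-sampling estimate on a toy model small enough to check against the exact partition function; the feature, the beta values, and the sample count are arbitrary.

    import itertools
    import numpy as np

    rng = np.random.default_rng(2)
    states = np.array(list(itertools.product([-1, 1], repeat=10)), dtype=float)
    h = states.sum(axis=1)                    # a single scalar feature h(I)

    def Z(beta):
        return np.exp(-beta * h).sum()        # exact partition function (enumerable toy)

    beta0, beta, L = 0.2, 0.5, 5000           # reference model, target, sample count (arbitrary)
    p0 = np.exp(-beta0 * h)
    p0 /= p0.sum()
    samples = rng.choice(len(h), size=L, p=p0)            # typical samples of p(I; beta0)
    Z_hat = Z(beta0) / L * np.exp(-(beta - beta0) * h[samples]).sum()
    print("estimate:", Z_hat, "exact:", Z(beta))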

Algorithm 1: Maximizing partial likelihood (MPLE)

We choose $S$ as in the figure by randomly selecting 1/3 of the pixels as foreground. The log-partial-likelihood is
\[ \mathcal{G}(\beta) = \log p(I^{\mathrm{obs}}_{S_1} \mid I^{\mathrm{obs}}_{S \setminus S_1}; \beta). \]
Maximizing $\mathcal{G}$ by gradient descent, we update $\beta$ iteratively. This is the same setup as in FRAME, although MPLE trades off accuracy (lower Fisher information) for speed (roughly a factor of 25) in a better way than FRAME. This is mainly because FRAME synthesizes images under nontypical conditions (initializing $I_{\mathrm{syn}}$ to noise), whereas MPLE always has typical boundary conditions.
Algorithm 2: Maximizing patch likelihood (MPaLE)

The foreground is a set of overlapping patches from $I^{\mathrm{obs}}_S$, with a hole $S_i$ dug in each patch as in the figure. The patch likelihood is
\[ \mathcal{G}(\beta) = \sum_{i=1}^{M} \log p(I^{\mathrm{obs}}_{S_i} \mid I^{\mathrm{obs}}_{S \setminus S_i}; \beta). \]
Maximizing $\mathcal{G}$ by gradient descent, we update $\beta$ iteratively. Algorithms 1 and 2 have similar performance.
Algorithm 3: Maximizing satellite likelihood (MSLE)

In contrast to algorithms 1 and 2, MSLE does not synthesize images online (within the learning algorithm), which is computationally intensive.
We select a set of reference models in the exponential family: $R = \{p(I; \beta_j) : j = 1, 2, \ldots, s\}$. Each model is sampled to synthesize a large image. The log-satellite-likelihood is given by
\[ \mathcal{G}(\beta) = \sum_{j=1}^{s} \mathcal{G}^{(j)}(\beta; \beta_j); \qquad \mathcal{G}^{(j)}(\beta; \beta_j) = \sum_{i=1}^{M} \left[ -\langle \beta,\, h(I^{\mathrm{obs}}_{S_i} \mid I^{\mathrm{obs}}_{S \setminus S_i}) \rangle - \log Z^{(j)}_i \right], \]
where
\[ Z^{(j)}_i = \frac{Z_{\beta_j}(I^{\mathrm{obs}}_{\partial S_i})}{L} \sum_{\ell=1}^{L} e^{-\langle \beta - \beta_j,\, h(I^{\mathrm{syn}}_{ij\ell} \mid I^{\mathrm{obs}}_{\partial S_i}) \rangle} \]
is estimated by Monte Carlo integration. In the above, the index $1 \le \ell \le L$ runs over the different realizations of the reference models; $1 \le j \le s$ runs over the different models; and $1 \le i \le M$ runs over the foreground lattices. Maximizing $\mathcal{G}$ by gradient descent, we update $\beta$ iteratively.
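A sketch of evaluating one satellite term $\mathcal{G}^{(j)}$ from precomputed reference statistics; the feature vectors are abstract arrays, a single $Z$ estimate is shared across patches for brevity, and every name here is illustrative rather than the authors' code.

    import numpy as np

    def satellite_term(beta, beta_j, logZ_j, h_obs_list, h_syn):
        # G^(j)(beta; beta_j) = sum_i [ -<beta, h_i^obs> - log Z_i^(j) ], with
        # Z^(j) estimated from L precomputed samples of p(I; beta_j); h_syn
        # has shape (L, d), and logZ_j stands for log Z_{beta_j}.
        logw = -(h_syn @ (beta - beta_j))                 # log importance weights
        logZ = logZ_j + np.log(np.mean(np.exp(logw)))     # MC estimate of log Z_beta
        return sum(-(beta @ h_obs) - logZ for h_obs in h_obs_list)

    # Toy usage with made-up statistics:
    rng = np.random.default_rng(3)
    d, L, M = 4, 1000, 3
    h_obs_list = [rng.random(d) for _ in range(M)]
    h_syn = rng.random((L, d))
    print(satellite_term(rng.random(d), rng.random(d), 0.0, h_obs_list, h_syn))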

Reported results: FRAME used as truth


Results

Top row: the difference between the two MSLE-synthesized images is that the result in (b) ignores all boundary conditions, whereas (c) uses observed boundary conditions.
Bottom row: $\beta$ was learned with MSLE for different hole sizes: (a) $m = 2$; (b) $m = 6$; and (c) $m = 9$.

Summary of Algorithms

Group 1. In (a), ML estimators (FRAME, MPLE, MPaLE, MCMCMLE) generate a sequence of satellites $\beta_0, \beta_1, \beta_2, \ldots, \beta_k$ online.
Group 2. In (c), the maximum pseudo-likelihood estimator uses a uniform model $\beta_0 = 0$ to estimate any model and thus has large variance.
Group 3. In (b), the MSLEs use a general set of satellites which are precomputed and sampled offline. To save time, one can compute the difference $d(j) = |h(I^{\mathrm{syn}}_j) - h(I^{\mathrm{obs}})|$; the index values that return the smallest $s$ values correspond to satellites that are closer to the truth.

THANK YOU!
