
UNSUPERVISED ALGORITHM FOR THE SEGMENTATION OF THREE-DIMENSIONAL

MAGNETIC RESONANCE BRAIN IMAGES


A. S. Capelle, O. Alata, C. Fernandez-Maloigne

J. C. Ferrie

Laboratoire IRCOM-SIC - UMR CNRS 6615


Univ. de Poitiers - Bât. SP2MI
86962 Chasseneuil-Futuroscope, France

CHU of Poitiers
Unité IRM, Scanner
BP 577, 86021 Poitiers cedex, France

ABSTRACT
This paper presents a multiple resolution algorithm for the
segmentation of three-dimensional magnetic resonance (MR)
images. The algorithm consists in the unsupervised segmentation of the MR volume into regions of different statistical behavior. First, an unsupervised merging algorithm
estimates a block segmentation of the volume while determining the number of regions and the parameters of those regions. This estimation is computed by minimizing a global
information criterion. Next, the small regions are eliminated using statistical criteria. Finally, the segmentation is performed using the neighboring relationships between voxels
via Hidden Markov Random Fields and a Multiple Resolution Iterated Conditional Mode algorithm. Some results on
volumetric brain MR images are presented and discussed.
1. INTRODUCTION
Magnetic Resonance (MR) imaging is commonly used
for anatomical observation of the brain and for the diagnosis and
treatment of brain diseases. In particular, observations of pathological regions and measurements of the tumoral response (volume, degree of necrosis) have a great influence on the
choice of therapies. Thus, segmentation of MR images into
different tissues, i.e. gray matter, white matter, cerebrospinal fluid, tumors or ecchymosis, is a prerequisite.
Many unsupervised segmentation techniques have been
studied in the literature [1]. In [2], Bouman proposed an
unsupervised method for the segmentation of two-dimensional
textured images into regions of different statistical behavior.
Each region is characterized by its mean, its variance and
a stochastic parameter model for second order stationary
random fields. This algorithm is based on the unsupervised
block estimation of the number of regions and their parameters by minimizing a global Information Criterion (IC). A Multiple Resolution Iterated Conditional Mode
(MR-ICM) algorithm is used to estimate the final segmentation. Moreover, neighboring information between pixels
is introduced in the model via a Hidden Markov Random
Field (HMRF).

0-7803-6725-1/01/$10.00 ©2001 IEEE


The aim of our work is to separate a data volume made
of a stack of MR brain images into regions of different statistical behavior. The MR volume used is restricted to the
brain, thanks to a first segmentation [3] of the original
data. A common hypothesis [1] is to consider that each brain
region can be modeled by an independent and identically
distributed (i.i.d.) Gaussian process defined by its mean and
its variance. For this reason, we have extended Bouman's
algorithm into a new model-based segmentation method for
3D MR brain images. In this algorithm, we have included
a new stage which eliminates the small classes in order to
avoid oversegmentation.
Section 2 presents the stochastic model-based segmentation algorithm. Section 3 successively describes
the statistical model used (3.1), the IC minimization method
(3.2), and the tiny class deletion algorithm (3.3). Finally,
section 4 provides some segmentation results.
2. STOCHASTIC MODEL-BASED APPROACH
2.1. Information criterion and MAP estimation

An unsupervised segmentation of an image is generally
composed of two parts: the first one provides an estimated
model for the image and the second one estimates the segmentation of the image given the estimated model. In this
section, we briefly describe the different steps of the proposed algorithm.
We assume that the MR volume is made up of $m$ classes. Each class $C_i$, $i = 1, \ldots, m$, is modeled by an i.i.d.
Gaussian process defined by $\mu_i$ and $\sigma_i^2$, respectively the
mean and the variance. Let $\Gamma = \{C_1, \ldots, C_m\}$ be the set of
classes of the volume. The volume distribution can then be
considered as a Gaussian mixture defined by the number of
classes, the mean, the variance and the a priori probability
$\alpha_i$ of each class in the mixture model. Hence, one particular class is completely defined by $\theta_i = \{\alpha_i, \mu_i, \sigma_i^2\}$. Let
$\Theta^m = \{\theta_1, \ldots, \theta_m\}$ be the theoretical model associated
with the MR volume. The estimation $\hat{\Theta}^m$ of $\Theta^m$ is computed by minimizing a global IC in the first step of our algorithm.
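As an illustration of this parametrization, the following minimal Python sketch stores one triplet (a priori probability, mean, variance) per class and evaluates the resulting Gaussian mixture density. The names `GaussianClass`, `mixture_pdf` and `log_likelihood`, as well as the numerical values, are assumptions made for the example and are not part of the proposed method.

```python
import math
from dataclasses import dataclass


@dataclass
class GaussianClass:
    alpha: float     # a priori probability alpha_i of the class
    mean: float      # mu_i
    variance: float  # sigma_i^2


def gaussian_pdf(y, mean, variance):
    """Density of a univariate Gaussian at intensity y."""
    return math.exp(-(y - mean) ** 2 / (2.0 * variance)) / math.sqrt(2.0 * math.pi * variance)


def mixture_pdf(y, model):
    """Mixture density: sum over classes of alpha_i * N(y; mu_i, sigma_i^2)."""
    return sum(c.alpha * gaussian_pdf(y, c.mean, c.variance) for c in model)


def log_likelihood(voxels, model):
    """Log-likelihood of independent voxel intensities under the mixture."""
    return sum(math.log(mixture_pdf(y, model)) for y in voxels)


# Hypothetical two-class model (values chosen for illustration only).
model = [GaussianClass(0.6, 40.0, 25.0), GaussianClass(0.4, 110.0, 100.0)]
print(log_likelihood([38.0, 42.0, 105.0, 118.0], model))
```

The a priori probabilities must sum to one; the sketch does not enforce this constraint.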
In the second step, we estimate the segmented field by a
maximum a posteriori (MAP) estimation. Let us define $y$ the
observation, $Y$ its associated random field and $x$ a realization of the segmented random field $X$. Then the segmented
field $\hat{x}_{MAP}$ given the model $\hat{\Theta}^m$ is:

$$\hat{x}_{MAP} = \arg\max_x \{ P(X = x \mid Y = y) \} = \arg\min_x \{ -\log f(y \mid x) - \log f(x) \} \qquad (1)$$

where $f(y \mid x) = P(Y = y \mid X = x)$ and $f(x) = P(X = x)$.

In order to use the neighboring relationships between the
voxels in the segmented field $X$ during the MAP estimation, we assume that $X$ is a HMRF. Thus, its distribution
is a Gibbs distribution [4]. The Iterated Conditional Mode
(ICM) algorithm is an efficient method [2] to estimate $\hat{x}_{MAP}$
by minimizing (eq. 1). In order to reduce the computation time and to improve the convergence of ICM, we use a
multiple resolution ICM algorithm. Starting from an initial
block segmentation at coarse resolution, the segmentation is
successively refined towards the finest resolution.
2.2. Proposed algorithm

Due to the multiple resolution algorithm, we estimate
the model $\Theta^m$ at a coarse resolution of the images. The volume is then divided into blocks of equal size and it is assumed
that each block initially constitutes one unique class. The
block statistical model is detailed in section 3.1. The
relationships between blocks are ignored and the blocks are
supposed independent. The model $\Theta^m$ is then estimated by
minimizing the global information criterion $IC(m)$ (eq. 4).
This minimization process is described in section 3.2.
Experiments made on different MR volumes show that
the estimated number of classes is too large and leads to an
oversegmentation of the MR volume. For this reason, we
propose to add an iterative process that eliminates classes
without enough voxels. Each class deletion is followed by
an update of $\hat{\Theta}^m$.
This step is repeated until each $\alpha_i$, $i = 1, \ldots, m$, is greater than a given threshold. The deletion step is described in section 3.3.
$\hat{\Theta}^m$ being estimated, the segmentation is carried out by maximizing the MAP criterion using neighboring information,
via HMRF and the multiple resolution ICM (MR-ICM) algorithm.
The different steps of the proposed segmentation algorithm are:

1. Estimation of the model $\Theta^m$:
    (a) Initial block estimation of $\Theta^m$ by minimizing
    the information criterion $IC(m)$ based on the
    hypothesis of block independence,
    (b) Deletion of small classes and estimation of the
    final model $\hat{\Theta}^m$.
2. MAP estimation of the segmented field by a MR-ICM
algorithm using $\hat{\Theta}^m$.

3. PARAMETER ESTIMATION

In this section, we describe some particular aspects of
the algorithm. We present the statistical model we chose,
then we introduce the information criterion minimization
process and finally the deletion step.
3.1. Statistical model

We suppose that the distribution of the MR volume given $\Theta^m$ is a Gaussian mixture model defined by [5]:

$$P(Y = y \mid \Theta^m) = \sum_{i=1}^{m} \alpha_i \, p_i(Y = y \mid \theta_i) \qquad (2)$$

Let $S$ be the lattice, composed of $n$ voxels, associated to the
random field $Y = \{Y_s\}$, $s \in S \subset \mathbb{Z}^3$. As each $Y_s$ is an
independent random variable, we have

$$P(Y = y \mid \Theta^m) = \prod_{s \in S} \left( \sum_{i=1}^{m} \alpha_i \, p_i(Y_s = y_s \mid \theta_i) \right) \qquad (3)$$

As Bouman [2], we chose to first divide the MR volume into $M$ blocks of equal size. The lattice $S$ is then composed of a partition set $\{A_k\}$ of $M$ blocks for $k = 1, \ldots, M$.
We assume that each block $A_k$, $k = 1, \ldots, M$, corresponds
to one unique class. The mean and the variance of each class
are respectively estimated as the empirical mean and the empirical variance. Moreover, we assume that each block is independent. Thus, the initial a priori probability of any class
is equal to $1/M$. It should be noticed that this is not equivalent to assuming that each block is an independent random
variable. However, the resulting distribution is still a multinomial distribution, parametrized by the a priori probability
of each block.
Thanks to these hypotheses, we have at our disposal an
initial block estimation of the mixture model $\Theta^m$. Then, the
block model is estimated by block merging during the
IC minimization stage. In the remainder of the article, we use the
notation $\Theta^m$ for the mixture model.

3.2. Information Criterion (IC) minimization

The problem is to jointly estimate the model $\Theta^m$ of the
mixture and the number $m$ of classes in the mixture.
$\Theta^m$ can
be estimated, for a fixed value of $m$, via a Maximum Likelihood Estimation (MLE). However, the MLE does not allow
$m$ and $\Theta^m$ to be estimated jointly, because such an estimation will
generally give a large number of classes, theoretically equal
to the number of data, which is unacceptable. One possibility [5] is to estimate $\Theta^m$ by minimizing an IC, which is the
maximized likelihood of the model penalized by a term depending on the
number of observed data and the number of free parameters in the model. Different information criteria exist in the
literature, such as the Akaike IC [6] used in [2] by Bouman, the
Bayesian IC [7] or the $\varphi_\beta$ IC [8]. Because it has been proved that the Akaike IC is not consistent, we choose to use the
$\varphi_\beta$ IC. Its general form [8] is:

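$$IC(m) = -2 \log P(Y \mid \Theta^m) + C_{\varphi_\beta}(n) \, p \qquad (4)$$

where

- $p = 3m - 1$ is the number of free parameters in $\Theta^m$,
- $C_{\varphi_\beta}(n) = n^{\beta} \log(\log(n))$ with $\frac{\log(\log(n))}{\log(n)} \le \beta \le 1 - \frac{\log(\log(n))}{\log(n)}$.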
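The evaluation of such a penalized criterion can be sketched in a few lines of Python. The sketch assumes the usual form of the $\varphi_\beta$ penalty, $C(n) = n^{\beta} \log(\log(n))$, and the lower bound $\log(\log(n)) / \log(n)$ for $\beta$; both are assumptions of this example, as are the function names and the log-likelihood values, which are purely illustrative.

```python
import math


def beta_lower_bound(n):
    """Assumed lower bound of beta: log(log(n)) / log(n)."""
    return math.log(math.log(n)) / math.log(n)


def phi_beta_penalty(n, beta):
    """Assumed penalty weight C(n) = n**beta * log(log(n))."""
    return (n ** beta) * math.log(math.log(n))


def information_criterion(log_likelihood, m, n, beta):
    """IC(m) = -2 log P(Y | model) + C(n) * p, with p = 3m - 1."""
    p = 3 * m - 1  # m means, m variances, m - 1 free a priori probabilities
    return -2.0 * log_likelihood + phi_beta_penalty(n, beta) * p


n = 100_000  # number of voxels (illustrative)
beta = beta_lower_bound(n)
# Hypothetical log-likelihoods for two candidate models of the same data.
ic_5 = information_criterion(-420_000.0, m=5, n=n, beta=beta)
ic_6 = information_criterion(-419_990.0, m=6, n=n, beta=beta)
print("keep m = 5" if ic_5 <= ic_6 else "keep m = 6")
```

Under these assumptions, choosing $\beta$ at its lower bound gives $n^{\beta} = \log(n)$, so the penalty weight reduces to $\log(n) \log(\log(n))$: a sixth class is retained only if it improves $-2 \log P$ by more than three times this weight.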

In [8], accurate estimation results are obtained by using
the lower bound of $\beta$. Thus, we fixed here $\beta$ to this value.

However, the estimation of the mixture model by minimizing $IC(m)$ is still substantially difficult. Bouman proposed to use the modified criterion:

$$IC'(m) = -2 \log P(Y, A_1, \ldots, A_{m-1} \mid \Theta^m) + C_{\varphi_\beta}(n) \, p \qquad (5)$$

The estimation of $\Theta^m$ is an iterative process which successively minimizes $IC'(m)$ by three distinct procedures. During the estimation of one of the parameters, the others are
supposed to be fixed. Let $j$ be a particular step of the process. The three procedures are then:

1. the estimation of $\hat{\theta}_i^{(j)} = (\alpha_i, \mu_i, \sigma_i^2)^{(j)}$ by the empirical frequency, mean and variance of the current block
partition $\{A_i^{(j-1)}\}$ for $i = 1, \ldots, m^{(j-1)}$,

2. the merging of the two classes:
For each couple of classes $\{C_k, C_l\}$,
$(k, l) \in \Lambda = [1, \ldots, m^{(j-1)}] \times [1, \ldots, m^{(j-1)}]$,
$k \neq l$, we compute the change caused in (eq. 5) by
the merging of the classes $C_k$ and $C_l$.
Let $c(k, l)$ be
this cost. Thus, the merging of any classes $C_k$ and
$C_l$ when the cost $c(k, l)$ is negative will decrease
$IC'(m)$.
Consequently, we merge the two classes $C_{\hat{k}}$ and $C_{\hat{l}}$
defined by $c(\hat{k}, \hat{l}) = \min_{(k,l) \in \Lambda, k \neq l} \{c(k, l) < 0\}$, and
$m^{(j)} = m^{(j-1)} - 1$,

3. the maximum likelihood estimation of the partition
set $\{A_i^{(j)}\}$ for $i = 1, \ldots, m^{(j)}$.
The class assigned to $A_i^{(j)}$ is the most likely class given the observed data.

These three procedures are repeated until the merging
cost is greater than or equal to zero for all the couples of
classes.
For more precise details, we invite the reader to refer to [2].

3.3. Class deletion algorithm

The aim of this step is to reduce the number of classes
when it is considered as too large and leads to oversegmentation of the MR volume.
At the end of the IC minimization step, $\hat{\Theta}^m$ is the best
model according to $IC(m)$. However, in the case of oversegmentation, we propose to improve the result of the segmentation by eliminating the regions which do not contain
enough voxels. This set of regions is defined by

$$\Omega = \{ C_k, \ \forall k = 1, \ldots, \hat{m} \mid \alpha_k < \varepsilon_\alpha \}, \qquad (6)$$

where $\varepsilon_\alpha$ is the deletion threshold and $\Omega$ is ordered in increasing values of $\alpha_k$. The class elimination is made by matching and merging each class of $\Omega$ with one class of $\Gamma$. Let
$C_k$ be a class to be eliminated and $C_l$ its matching class. The
merging of $C_k$ with $C_l$ is the one which penalizes the least
the quality of the estimated model. Thus, $c(k, l)$ verifies:

$$c(k, l) = \min_{i} c(k, i), \quad i = 1, \ldots, \hat{m}; \ i \neq k \qquad (7)$$
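To make the role of the merging cost $c(k, l)$ concrete, the following Python sketch computes one plausible version of it and selects, for a given class, the partner with the lowest cost. It assumes that each class is summarized by its voxel count, empirical mean and empirical (maximum likelihood) variance, and that the cost is the change of a penalized criterion of the form $-2 \log P + C \cdot p$ when two classes are pooled. The exact cost of [2] may differ; the names `pooled`, `merge_cost` and `best_match` are hypothetical.

```python
import math


def pooled(a, b):
    """Sufficient statistics (count, mean, variance) of the union of two classes."""
    na, ma, va = a
    nb, mb, vb = b
    n = na + nb
    mean = (na * ma + nb * mb) / n
    var = (na * va + nb * vb) / n + na * nb * (ma - mb) ** 2 / n ** 2
    return n, mean, var


def neg2_log_lik(stats, n_total):
    """-2 log-likelihood of one class: label term plus Gaussian term at the ML estimates."""
    n, _, var = stats
    return -2.0 * n * math.log(n / n_total) + n * (math.log(2.0 * math.pi * var) + 1.0)


def merge_cost(a, b, n_total, penalty):
    """Assumed c(k, l): change of the penalized criterion when a and b are merged.

    Merging removes three free parameters (one mean, one variance, one
    a priori probability), hence the -3 * penalty term.
    """
    merged = pooled(a, b)
    return (neg2_log_lik(merged, n_total)
            - neg2_log_lik(a, n_total) - neg2_log_lik(b, n_total)
            - 3.0 * penalty)


def best_match(k, classes, n_total, penalty):
    """Index l != k minimizing c(k, l), and the corresponding cost."""
    costs = {l: merge_cost(classes[k], classes[l], n_total, penalty)
             for l in range(len(classes)) if l != k}
    l = min(costs, key=costs.get)
    return l, costs[l]


# Illustrative classes as (voxel count, empirical mean, empirical variance).
classes = [(100, 50.0, 25.0), (100, 50.0, 25.0), (100, 120.0, 25.0)]
print(best_match(0, classes, n_total=300, penalty=10.0))
```

In this toy example the first two classes have identical statistics, so merging them has a negative cost, whereas merging the first and third classes has a positive cost because the pooled variance increases sharply.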


It is obvious that class merging increases $IC'(m)$, but we
estimate that the quality of the segmentation is improved
in case of oversegmentation. Moreover, the rise of energy in the model can be limited by the following modification: if $c(k, l) > \delta$, with $c(k, l) = \min_i c(k, i)$, $i = 1, \ldots, \hat{m}$; $i \neq k$, and $\delta$ a restrictive threshold, then the merging process is not performed.
The deletion step is followed by an update of the model
$\hat{\Theta}^m$. These operations are repeated until the a priori probability $\alpha_i$ is greater than $\varepsilon_\alpha$ for $i = 1, \ldots, \hat{m}$.
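The deletion loop can be sketched as follows in Python. This is only an illustration under stated assumptions: classes are summarized by (voxel count, mean, variance), and the cost `ward_cost` (increase of the within-class sum of squares) is a simple stand-in for the merging cost $c(k, l)$, not the cost used in the paper. The function names and the numerical values are hypothetical.

```python
def merge(a, b):
    """Pool two classes given as (voxel count, mean, variance)."""
    na, ma, va = a
    nb, mb, vb = b
    n = na + nb
    mean = (na * ma + nb * mb) / n
    var = (na * va + nb * vb) / n + na * nb * (ma - mb) ** 2 / n ** 2
    return n, mean, var


def ward_cost(a, b):
    """Stand-in for c(k, l): increase of the within-class sum of squares."""
    na, ma, _ = a
    nb, mb, _ = b
    return na * nb * (ma - mb) ** 2 / (na + nb)


def delete_small_classes(classes, eps_alpha, delta, cost=ward_cost):
    """Merge classes whose a priori probability is below eps_alpha.

    A small class is merged with its cheapest partner unless that cost
    exceeds delta, in which case it is left untouched.
    """
    classes = list(classes)
    while len(classes) > 1:
        total = sum(c[0] for c in classes)
        # Candidates, ordered by increasing a priori probability.
        small = sorted((k for k, c in enumerate(classes) if c[0] / total < eps_alpha),
                       key=lambda k: classes[k][0])
        for k in small:
            l = min((i for i in range(len(classes)) if i != k),
                    key=lambda i: cost(classes[k], classes[i]))
            if cost(classes[k], classes[l]) <= delta:
                merged = merge(classes[k], classes[l])
                classes = [c for i, c in enumerate(classes) if i not in (k, l)]
                classes.append(merged)  # model updated after each deletion
                break
        else:
            break  # no small class left, or none can be merged within delta
    return classes


classes = [(500, 40.0, 25.0), (480, 110.0, 30.0), (20, 105.0, 20.0)]
print(delete_small_classes(classes, eps_alpha=0.05, delta=1000.0))
```

With these values the third class (2% of the voxels) is absorbed by the second one, whose mean is closest. With a very small `delta` the merge is refused and the three classes are kept.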
The final model being estimated, the segmentation is estimated by the MR-ICM.
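As a rough illustration of the MAP step, the Python sketch below runs a plain, single-resolution ICM on a small volume. It is a simplification: the multiple resolution scheme is not reproduced, and the Gibbs prior is assumed to be a Potts model on the 6-neighborhood with a hypothetical weight `smoothing`, which is not necessarily the potential used in the paper. The class list holds (mean, variance) pairs.

```python
import math


def icm(volume, classes, smoothing=1.0, max_sweeps=10):
    """Single-resolution ICM on a volume stored as nested lists [z][y][x]."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])

    def data_cost(value, k):
        # -log of the Gaussian density of class k at this intensity.
        mean, var = classes[k]
        return 0.5 * math.log(2.0 * math.pi * var) + (value - mean) ** 2 / (2.0 * var)

    # Initial labels: most likely class of each voxel taken alone.
    labels = [[[min(range(len(classes)), key=lambda k: data_cost(volume[z][y][x], k))
                for x in range(nx)] for y in range(ny)] for z in range(nz)]
    offsets = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

    for _ in range(max_sweeps):
        changed = 0
        for z in range(nz):
            for y in range(ny):
                for x in range(nx):
                    neigh = [labels[z + dz][y + dy][x + dx]
                             for dz, dy, dx in offsets
                             if 0 <= z + dz < nz and 0 <= y + dy < ny and 0 <= x + dx < nx]

                    def energy(k):
                        # Data term plus Potts penalty for disagreeing neighbors.
                        return (data_cost(volume[z][y][x], k)
                                + smoothing * sum(1 for lab in neigh if lab != k))

                    best = min(range(len(classes)), key=energy)
                    if best != labels[z][y][x]:
                        labels[z][y][x] = best
                        changed += 1
        if changed == 0:
            break  # local minimum of the posterior energy reached
    return labels


# One slice with an isolated ambiguous voxel in the middle (illustrative values).
volume = [[[40.0, 40.0, 40.0], [40.0, 80.0, 40.0], [40.0, 40.0, 40.0]]]
classes = [(40.0, 100.0), (110.0, 100.0)]
print(icm(volume, classes, smoothing=1.0)[0][1][1])
```

In this example the central voxel is closer to the second class when considered alone, but the neighborhood term relabels it to the first class; with `smoothing=0.0` the voxel keeps its initial label.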

4. EXPERIMENTS AND RESULTS


We tested the proposed segmentation algorithm on about
15 MR volumes. The images were acquired on a GE 1.5 Tesla
MR scanner of the General University Hospital (CHU) of
Poitiers. The acquisition type we used is T1. The standard dimension of a volume is about 256 x 256 x 100. The
dimension of an initial block is 4 x 4 x 4, allowing the creation of an initial model made of more than one hundred
thousand classes. The use of a mask which limits the region of interest to the brain [3] and a requantization of the
mean and the variance allow the block number to be decreased
to about three hundred.
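The size of the initial model follows directly from the volume and block dimensions, as the short Python check below shows. The helper name `block_count` is hypothetical and the computation ignores the brain mask.

```python
import math


def block_count(shape, block):
    """Number of whole blocks that tile a volume of the given shape."""
    return math.prod(dim // size for dim, size in zip(shape, block))


print(block_count((256, 256, 100), (4, 4, 4)))  # 102400
```

That is 64 x 64 x 25 blocks, hence slightly more than one hundred thousand initial classes before masking and requantization.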

The number of classes $\hat{m}$ obtained after the first estimation step ranges from 15 to 35, which leads to an oversegmentation of the volume. Thus, the class deletion step is
necessary. During this step, the number of classes is reduced
to $\hat{m}$ with $\hat{m} \in [5, 9]$ for $\varepsilon_\alpha = \ldots$
and $\delta = 2\%$ of $IC(\hat{m})$.
Figure (1-a) shows two slices of a
volume.
We added on these slices the frontiers between the main brain
structures. These frontiers were manually drawn by an expert. Figure (1-b) represents the corresponding slices
segmented with our algorithm. Seven regions were found.
As we can see, the regions are consistent with the frontiers.
Figure (1-c) represents the slices segmented with the fuzzy c-means (FCM)
algorithm. We notice that, thanks to the Markovian regularization, the classes obtained with our algorithm are less
scattered than the ones obtained by the fuzzy one. Moreover, our algorithm manages to differentiate the tumor from
the white matter whereas these regions are merged together
with the FCM algorithm.

- a - original slices

- b - slices segmented by the proposed algorithm

- c - slices segmented by FCM algorithm

FIG. 1. Slices 45 (left) and 75 (right) of a volume composed
of 100 slices (in black: expert frontiers)

5. CONCLUSION

This article presents an unsupervised algorithm for the
automatic segmentation of three-dimensional MR brain images. It is divided into three distinct stages. The first one
estimates a preliminary Gaussian mixture model thanks to
a block merging algorithm, by minimizing an Information
Criterion. In the second stage, we propose to reduce the
class number by merging the smallest classes with respect
to the model quality. The last stage consists in estimating
the segmentation by a multiple resolution Iterated Conditional Mode algorithm. In our future work, we will study more
precisely the influence of the parameter $\beta$ and the influence
of the deletion step on the convergence of the algorithm. In
particular, we will study the impact of different thresholds.
At the same time, an expert evaluation of the images will be
done by Dr. Ferrie, radiologist at the CHU of Poitiers. This
expertise will allow us to determine heuristics adapted to
3D MR images for all the parameters of the algorithm.

6. REFERENCES
[1] T. Géraud, Segmentation des structures internes du cerveau en
imagerie par résonance magnétique tridimensionnelle, Ph.D.
thesis, Télécom Paris, ENST, 1998.
[2] C. Bouman and B. Liu, Multiple resolution segmentation of textured
images, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 2, pp. 99-113, 1991.
[3] A. S. Capelle, O. Alata, C. Fernandez, S. Lefevre, and J. C. Ferrie, Unsupervised segmentation for automatic detection of
brain tumors in MRI, IEEE ICIP 2000, vol. 1, pp. 613-616,
Sept. 2000.
[4] S. Geman and D. Geman, Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans.
Pattern Anal. Machine Intell., vol. PAMI-6, Nov. 1984.
[5] J. Zhang and J. W. Modestino, Unsupervised image segmentation using a Gaussian model, Conf. Information Sciences
and Systems, Princeton, 1988.
[6] H. Akaike, Information theory and an extension of the maximum likelihood principle, Second Int. Symposium on Information Theory, pp. 267-291, 1973.
[7] G. Schwarz, Estimating the dimension of a model, Ann.
Statist., vol. 6, pp. 461-464, 1978.
[8] O. Alata and C. Olivier, Order selection of 2-D AR model
using a lattice representation and information criteria for texture analysis, Signal Processing, EUSIPCO, pp. 1823-1826,
2000.