Intelligent Skin Model Selection for Face Detection


Setiawan Hadi (1), Adang Suwandi A (2), Iping Supriana S (3), Farid Wazdi (3)

(1) Mathematics Department
Universitas Padjadjaran, Jalan Dipati Ukur 35 Bandung, Indonesia
Tel: +62-22-2503271, Fax: +62-22-77946963, E-mail: setiawanhadi@ieee.org

(2) Department of Electrical Engineering
Institut Teknologi Bandung, Jalan Ganesha 10 Bandung, Indonesia
Tel: +62-22-3687802, Fax: +62-22-365-7443, E-mail: asaisrg@yahoo.com

(3) Informatics Engineering Department
Institut Teknologi Bandung, Jalan Ganesha 10 Bandung, Indonesia
Tel: +62-22-3687802, Fax: +62-22-365-7443, E-mail: iping@Informatika.org, faridwazdi@informatika.org

Abstract

We propose an algorithm for selecting the appropriate skin model for face localization in digital images. Nine skin models created in a previous experiment are analyzed. Statistical information related to the skin models is collected and examined. The problem to be solved is how to choose an appropriate and accurate model that gives the best result in the detection stage. Several preprocessing steps have been implemented to enhance the result. The method used for final face detection is adaptive k-means using a validity function for centroid determination. Experiments have been conducted using single and multiple face images.

Keywords:
Face detection, skin color model, artificial intelligence, k-means, centroid determination

Introduction

Computer-based face detection has drawn much attention because it simulates the human ability to detect people's faces using a machine (computer). Applications of face detection are wide and varied, such as face identification, head gesture recognition and human-head reaction understanding. A number of face locating methods have been proposed and published in the literature, which can be categorized into two approaches: (i) geometric-information-based approaches and (ii) skin-color-information-based approaches.

No approach is able to show a perfect and fully accurate result. Each approach has its advantages and disadvantages depending on the internal and external situation and conditions. However, color-information-based approaches have been proven to offer several advantages over geometric-information-based approaches in terms of robustness under partial occlusion, rotation in depth, scale changes, resolution changes, and different lighting conditions [1].

In the previous experiment [2], a mathematical model of skin color was developed and implemented using a single facial image. Several skin models were generated. Then, a skin model was used for detection of a single face image in three color representations. However, a problem arises if we use different skin models for the same face image. Experiments showed that using an inappropriate skin model gives a false positive or false negative detection result.

In this paper, the skin models are analyzed and evaluated, and an algorithm for selecting the best model is constructed. To improve the algorithm, pre-processing steps are involved, such as spatial noise filtering approaches and morphological filters. For localizing faces, the k-means clustering algorithm has been implemented with an improvement in centroid selection by evaluating a validity function [3] that checks the optimal result of the centroid location.

Approach and Methods

General Framework

This research is focused on the intelligent selection of skin models to obtain an appropriate and accurate face detection result. Generation of skin models has been explored and reported in [2]. Figure 1 shows the general framework of our approach for face detection, which evaluates the result based on every skin model that is used.

[Figure 1 diagram labels: Skin Models, Probe Image]

Figure 1 General Framework

The probe image is the input for the process, and each skin model is selected and evaluated to obtain the best result. Several
detection parameters are implemented and explained in the next part of this paper.

Thresholding in skin-color detection

Detection of skin color in a digital image is performed by applying a skin color model to the image. This is done under the assumption that the distribution of skin color is located in a single cluster and can be modeled using a Gaussian distribution (or normal distribution) $G_1 = (m_1, V_1^2)$, where

$$m_1 = (\bar{r}, \bar{g})$$

$$V_1 = \begin{pmatrix} \sigma_{rr} & \sigma_{rg} \\ \sigma_{gr} & \sigma_{gg} \end{pmatrix}$$

$m_1$ and $V_1$ are the mean and covariance of skin color in the related color space, respectively. For example in RGB space, for each image pixel, it is necessary to quantify the possibility of the pixel belonging to the skin segments. This can be realized by calculating

$$d(x, y) = [I(x, y) - m_1]^T V_1^{-1} [I(x, y) - m_1]$$

where $I(x, y)$ is the color intensity vector at $(x, y)$ in r-g color space and $d(x, y)$ is a similarity map function.

To determine whether a pixel is a candidate pixel of the face region, a threshold d is set in advance. The threshold value can be calculated based on the mean and standard deviation of each skin color model. There are two interpretation methods of standard deviation: (i) based on the empirical rule, and (ii) based on the Chebyshev theorem. In the empirical rule, if a sample of observations has a mound-shaped distribution, the interval within k standard deviations of the mean contains about 68%, 95% and 99.7% of the observations for k = 1, 2, 3 respectively, where k is the constant multiplier of the standard deviation in the Gaussian distribution interval. In the Chebyshev theorem, the proportion of observations in any sample that lies within k standard deviations of the mean is at least $(1 - 1/k^2)$ for $k > 1$.

Filtering and filling

The result of skin detection is a binary image that may contain noise pixels. The noise can be eliminated using three methods: (i) simple noise removal, where each orphan pixel is categorized as noise and removed (pixel set to 0); (ii) component labeling noise removal, where every pixel and its 4- or 8-neighbouring pixels are labeled, and every labeled component smaller than a threshold is considered noise and eliminated; (iii) kFill noise removal, which is designed to reduce isolated noise and noise on contours up to a selected limit in size. A detailed implementation of this method can be found in [4].

Morphological filters are commonly used for noise reduction and feature detection, with the objective that noise be reduced as much as possible without eliminating essential features. In this research, dilation and erosion morphological filters are utilized to eliminate noise in a digital face image.

Face localization methods

Face localization detects the face or faces in the image after the previous preprocessing steps. After skin blobs have been detected, we try to estimate how many faces are in the image and draw a boundary (ellipse or rectangle) around the detected faces. This can be performed using the k-means clustering algorithm or the Hough transform for ellipse/circle detection. Other methods such as template matching and curve detection can also be used, including image projection.

The k-means algorithm is a widely known classification algorithm which is capable of providing useful performance, although it does have some limitations. If we have a multivariate input data set X which is defined as an M x N matrix, there are M input points in N-dimensional space. It is assumed there exist k compact classes of data, where k < n. The data is classified by allocating each data point to a class and then iteratively moving the data points between classes until we obtain the tightest overall cluster of points in each class. The specific algorithm is defined as follows:
1. Choose the number of classes k.
2. If not supplied, randomly determine a set of k class centers from the data.
3. Classify each data point into the nearest class.
4. Compute the sample mean of each cluster.
5. Reclassify each data point to the cluster with the nearest mean.
6. If the change in the mean is small enough, stop. Otherwise go to step 4.

Figure 2 illustrates the process of clustering using k-means. The algorithm tries to find the best centroids of the pixel collection. This process is relatively easy for a human to perform using intuition and knowledge.

Figure 2. Clustering Visualization

Since different values of the k parameter can lead to different results, we iterate the k-means algorithm with
$k \in [2, n]$, and then select the optimal k value as the one minimizing the validity function V, defined as:

$$V = \frac{J'}{\sigma_{\min}^2} + \sum_{i=1}^{k'} (\text{penalty for small cluster } C_i)$$

where $k'$ represents the number of good clusters, i.e. clusters that are not too small, $J'$ is the k-means objective function, but now it only takes into account good clusters, $\sigma_{\min} = \min_{i \neq j} \sigma(\mu_i, \mu_j)$ is the minimum distance between cluster centroids, and $|C_i|$ is the cardinality of cluster $C_i$. The first term of the above equation represents the goal function $J'$ divided by $\sigma_{\min}^2$, i.e. well-separated clusters provide better solutions, whereas the second term represents a penalty factor for small clusters.

Skin models for experiment

The skin model images used for these experiments were generated from nine sets of face databases, collected from various sources such as the UNPAD students and lecturers database, the UNPAR students database, the ITB lecturers database and the FERET color database. The number and type of images vary, as does the face ethnicity. A detailed explanation of these features and how to generate them can be found in [2]. Figure 3 shows the skin models that are used in this experiment.

Figure 3. Skin Model Images

The nine skin models are then evaluated by estimating the statistical distribution for every color model used in this research (Normalized RGB, HSI and YCbCr). First, the average is calculated by enumerating all pixels in every color space and averaging each color element intensity. After that, the standard deviation is calculated to estimate the spread of the data. The covariance of each skin color chromaticity (rg for normalized RGB space, hs for HSI space and CbCr for YCbCr space) is calculated to measure the two-dimensional data of skin chromaticity for each skin color model.

Table 1 shows the statistical analysis results for all skin models in r-g space.

Table 1. Skin Model Statistical Values

Object   Mean (r, g)       Standard Deviation (r, g)   Covariance
SM1      0.4423, 0.3310    0.0316, 0.0118              -0.0002
SM2      0.4394, 0.3130    0.0067, 0.0045               0.0000
SM3      0.4337, 0.3102    0.0125, 0.0049              -0.0001
SM4      0.4612, 0.3044    0.0172, 0.0062              -0.0001
SM5      0.4258, 0.3154    0.0076, 0.0037               0.0000
SM6      0.4158, 0.3166    0.0110, 0.0057              -0.0001
SM7      0.4253, 0.3026    0.0139, 0.0055              -0.0001
SM8      0.3957, 0.3493    0.0125, 0.0047               0.0000
SM9      0.4057, 0.3275    0.0114, 0.0043               0.0000

Referring to Table 1, every skin model has a covariance of zero or very nearly zero, which is taken to mean that the values of the color components in each color space are independent (a change in one color element does not influence the value of the other color component). The distribution of skin color in all skin color models is Gaussian, as illustrated in Figure 4.

[Figure 4 panels: Red Channel, Green Channel, Blue Channel]

Figure 4. Histogram of skin chromaticity of Skin Model 2

Evaluation criteria for skin model selection

We try to evaluate which skin models give an accurate result (i.e. the correct face detection result). Based on the framework explained before, we designed an algorithm for face detection built from a combination of several steps, each of which has its own parameters.

After skin detection, the resulting detected image is evaluated based on the image profile obtained by projecting the image in both directions, vertical and horizontal. Projections of a binary image indicate the number of 1 pixels in each column or row (or diagonal if necessary) in that image.

Kotropoulos and Pitas [7] presented a rule-based localization method based on the projection method. First, facial features are located with a projection method to locate the boundary of a face. Let $I(x, y)$ be the intensity value of an m x n image at position $(x, y)$; the horizontal and vertical projections of the image are defined as

$$HI(x) = \sum_{y=1}^{n} I(x, y)$$

$$VI(y) = \sum_{x=1}^{m} I(x, y)$$

The horizontal profile of an input image is obtained first, and then the two local minima, determined by detecting abrupt changes in HI, are said to correspond to the left and right side of the head. Similarly, the vertical profile is obtained and the local minima are determined for the locations of mouth lips, nose tip, and eyes. These detected features constitute a facial candidate.
Although projections occupy much less memory than the image they were derived from, they still contain essential information about it.
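
The two projections can be sketched in a few lines of Python with NumPy. This is only an illustration, assuming the binary image is stored as an array indexed `image[y, x]` (rows are y, columns are x); the function name `projections` and the small test mask are hypothetical and not taken from the experiment.

```python
import numpy as np

def projections(binary_image):
    """Return (HI, VI) for a binary image indexed as image[y, x]."""
    img = np.asarray(binary_image, dtype=np.int64)
    hi = img.sum(axis=0)  # HI(x): sum over y, one value per column x
    vi = img.sum(axis=1)  # VI(y): sum over x, one value per row y
    return hi, vi

# Hypothetical 3 x 4 skin mask (1 = skin pixel).
mask = np.array([[0, 1, 1, 0],
                 [0, 1, 1, 0],
                 [0, 0, 1, 0]])
hi, vi = projections(mask)
```

For this mask, `hi` counts the skin pixels in each column and `vi` counts them in each row, so a 12-pixel image is summarized by 7 numbers.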
Figure 5 illustrates the binary image projection for the image in Figure 6a, detected using skin model 8 with the selected parameters.

Figure 5. Vertical and Horizontal Projection of an Image

We can see that there are 7 filled curves in the horizontal projection and only one filled curve in the vertical projection. So, by analyzing the projection profile of the detected image, we can decide whether a skin model is accurate or inaccurate (in which case we skip to the next skin model).
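
One way to count the filled curves in a projection is sketched below. It assumes that a "filled curve" corresponds to a contiguous run of profile values at or above a minimum height; the function name `count_filled_curves`, the `min_height` parameter and the sample profiles are hypothetical, and the rule that turns the counts into an accept or skip decision is not specified here.

```python
import numpy as np

def count_filled_curves(profile, min_height=1):
    """Count contiguous runs where the profile is at least min_height."""
    active = np.asarray(profile) >= min_height
    if active.size == 0:
        return 0
    # A run starts wherever an active bin follows an inactive one (or the start).
    previous = np.concatenate(([False], active[:-1]))
    starts = active & ~previous
    return int(starts.sum())

# Hypothetical profiles: three separate blobs horizontally, one vertically.
horizontal = [0, 3, 5, 0, 0, 4, 4, 0, 2, 0]
vertical = [0, 6, 9, 7, 0]
n_h = count_filled_curves(horizontal)
n_v = count_filled_curves(vertical)
```

Raising `min_height` makes the count less sensitive to isolated noise pixels that survive filtering.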

Results of experiment

As mentioned above, nine skin models have been used to evaluate which one is the best model. The experiment was conducted by applying the models to a selected face image. Figure 6 illustrates this process. Figure 6a is the selected face image in which faces are to be detected. This image is processed with skin model 1 (SM1) to skin model 9 (SM9) in three color spaces. The results of the detection are shown in Figures 6b to 6j consecutively. Figure 6b is the result of applying skin model 1 to the original image, Figure 6c is the result of applying our algorithm using skin model 2, and so on. The selected parameters used in Figures 6b to 6j are: (i) the color model is Normalized RGB, (ii) the confidence level of detection is 93.75% by the Chebyshev theorem, (iii) the initial centroid location is at the beginning of the original image.
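
The 93.75% confidence level corresponds to k = 4 in the Chebyshev bound, since 1 - 1/4^2 = 0.9375. The sketch below shows how such a threshold could be applied to the distance d(x, y) in normalized r-g space. It rests on several assumptions that are not stated in the experiment: pixels are kept when d(x, y) <= k^2, the covariance matrix is built from the SM8 standard deviations in Table 1 with zero off-diagonal terms, and the names `chebyshev_bound` and `skin_mask` as well as the two sample pixels are hypothetical.

```python
import numpy as np

def chebyshev_bound(k):
    """Lower bound on the fraction of samples within k standard deviations."""
    return 1.0 - 1.0 / (k * k)

def skin_mask(rgb, mean, cov, k):
    """Binary skin map from the distance d(x, y) in normalized r-g space."""
    rgb = np.asarray(rgb, dtype=np.float64)
    total = rgb.sum(axis=-1)
    total[total == 0] = 1.0  # avoid dividing by zero on black pixels
    rg = rgb[..., :2] / total[..., None]  # r = R/(R+G+B), g = G/(R+G+B)
    diff = rg - np.asarray(mean)
    inv_cov = np.linalg.inv(np.asarray(cov))
    # d = diff^T * V^-1 * diff, evaluated for every pixel at once.
    d = np.einsum("...i,ij,...j->...", diff, inv_cov, diff)
    return d <= k * k  # assumption: keep pixels with d <= k^2

# SM8 statistics from Table 1 (covariance 0.0000, so V1 is taken as diagonal).
mean_sm8 = (0.3957, 0.3493)
cov_sm8 = np.diag([0.0125 ** 2, 0.0047 ** 2])
k = 4.0
confidence = chebyshev_bound(k)

# One skin-like pixel and one pure blue pixel (hypothetical values).
pixels = np.array([[[113, 100, 73], [0, 0, 255]]], dtype=np.uint8)
mask = skin_mask(pixels, mean_sm8, cov_sm8, k)
```

Under these assumptions the first pixel falls inside the model and the second is rejected; swapping in the mean and deviations of another row of Table 1 changes which pixels pass.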
Figure 6. (from top to bottom) 6a is the original image; Figures 6b to 6j are the results of detection on the original image using SM1 to SM9.

The result of detection is in binary image format, meaning that only two intensity values are available in the resulting image. Figures 6e and 6i show representative results. Face regions are detected, although noise pixels are also detected around the targeted faces. There are two types of errors in the detection process: false negatives and false positives. A false negative, also called a miss, occurs when detection reports, incorrectly, that a face was not detected when, in fact, it was present. A false positive, also called a false alarm, occurs when detection reports, incorrectly, that it has found a face where none exists in the image.

Table 2 presents the number of matched pixels (true positives) in the three color representations. The pixel counts do not represent the performance of an algorithm: a larger number of pixels cannot be interpreted as meaning that the skin model used is the best skin model for the detection. However, from the table we can obtain general information about the performance of the models.

In the experiment, the image size is 631x244 pixels in PNG format (24bpp). The size of each skin model is 100x100 in 24bpp BMP format. The counting process simply enumerates the binary image for the matched pixels (pixels that are located within the Gaussian distribution curve of the model).

Table 2. Pixel match counting result

Object   Pixel Match
         NRGB      HSI       YCbCr
SM1      144219    111535    153340
SM2      7667      61158     15268
SM3      15279     50707     101321
SM4      29179     60302     30867
SM5      8989      57735     46875
SM6      27194     86667     70202
SM7      15342     141358    131120
SM8      26217     56570     28260
SM9      58849     91769     101216

The face localization performance using k-means is presented in Table 3. It shows that skin model 8 gives the best result in the normalized RGB color model. Using this setup (the skin model and selected parameters), all faces are detected and cropped accurately, as seen in Figure 7.

Table 3. Result of Face Detection using Selected Parameters (see explanation in the text)

Object   %Accuracy
         NRGB    HSI     YCbCr
SM1      n.a.    n.a.    n.a.
SM2      50%     n.c.    36%
SM3      57%     32%     n.a.
SM4      50%     21%     68%
SM5      39%     46%     51%
SM6      n.c.    32%     32%
SM7      29%     n.a.    n.a.
SM8      100%    n.c.    71%
SM9      14%     7%      n.a.

During the experiment, several skin models in the detection stage gave a non-convergence condition (n.c.), which means that the centroid could not be found after a number of iterations. In other words, the objective function could not be reached within a threshold number of iterations. Meanwhile, n.a. means that the number of centroids is too large (the threshold is > 100).

Figure 7 Multiple Face Detection Result based on the Selected Parameters
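
The k-means localization step and the non-convergence (n.c.) condition reported in Table 3 can be illustrated with a plain k-means loop that gives up after a fixed number of iterations. This is a minimal sketch and not the adaptive variant with the validity function; the function name `kmeans`, the parameters `max_iter`, `tol` and `seed`, and the two sample blobs are all hypothetical.

```python
import numpy as np

def kmeans(points, k, max_iter=100, tol=1e-6, seed=0):
    """Plain k-means; returns (centroids, labels, converged)."""
    pts = np.asarray(points, dtype=np.float64)
    rng = np.random.default_rng(seed)
    # Pick k distinct data points as the initial class centers.
    centroids = pts[rng.choice(len(pts), size=k, replace=False)].copy()
    labels = np.zeros(len(pts), dtype=np.int64)
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        dist = np.linalg.norm(pts[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new = centroids.copy()
        for j in range(k):
            members = pts[labels == j]
            if len(members):
                new[j] = members.mean(axis=0)
        shift = np.abs(new - centroids).max()
        centroids = new
        if shift < tol:
            return centroids, labels, True
    return centroids, labels, False  # would be reported as n.c.

# Two well-separated blobs of skin-pixel coordinates (x, y).
blob_a = [[10, 10], [11, 10], [10, 11], [11, 11]]
blob_b = [[50, 40], [51, 40], [50, 41], [51, 41]]
centroids, labels, converged = kmeans(blob_a + blob_b, k=2)
```

Each returned centroid can then serve as the center of a face boundary, while a `False` value of `converged` signals that the skin model should be skipped for this image.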

Figure 8 shows another result of our algorithm. In this case we use a different color space, namely YCbCr.

Figure 8. Another Detection Result using YCbCr space

Discussion

In this research, the step for eliminating non-face skin objects such as fingers, hands, arms and necks remains to be explored. Several methods can be used, such as (i) template matching with pyramids run over the entire image, (ii) detecting the curve boundary between face and hair, and (iii) detecting face features such as eyes, nose and mouth.

The evaluation criteria for selecting the appropriate model depend heavily on skin model characteristics. Different skin models will give different results. However, to make the system more intelligent, several methods still have to be investigated, such as advanced neural networks and advanced machine learning.

The k-means clustering algorithm has a limitation if the number of data points (pixels in this case) is not large enough. Other methods should be explored to detect multiple faces, such as template matching and geometric-based detection.

Conclusion
1. A combined algorithm for detecting faces in an image has been proposed and successfully implemented using multiple face images.
2. Simple evaluation criteria for skin model selection, based on image profiling in horizontal and vertical projections, have been tested.
3. Nine skin models have been explored and used in the experiment for selecting the model that gives the best face detection result.
4. Several preprocessing steps in image processing, such as statistical thresholding using the empirical and Chebyshev rules and salt-and-pepper noise filtering, have been implemented and can be used for enhancement of the targeted image.
5. A face localization technique based on an intelligent k-means clustering algorithm has been implemented successfully.

Acknowledgement
This research is funded by the Indonesian Government through BPPS and supported by Universitas Padjadjaran and Institut Teknologi Bandung.

References
[1] Jinshan Tang and Scott Acton, Locating human faces in a complex background including non-face skin colors, Journal of Electronic Imaging, vol. 12, no. 3, pp. 423-430, 2003.
[2] Setiawan Hadi et al., Mathematical Model of Skin Color for Face Detection, International Conference on Applied Mathematics 2005, Institut Teknologi Bandung, Indonesia, 22-26 August 2005.
[3] Ilaria Bartolini, Paolo Ciaccia, Marco Patella, WINDSURF: a Region-Based Image Retrieval System, DEIS-SITE-CNR, University of Bologna, Italy.
[4] Michael Seul, Lawrence O'Gorman and Michael J. Sammon, Practical Algorithms for Image Analysis, Cambridge University Press, 2001.
[5] Setiawan Hadi, Laporan Pengembangan Metode 2 dan Percobaan/Analisis/Survey 2, Semester 2 2004-2005, Departemen Informatika ITB, 2005.
[6] Greg Hamerly and Charles Elkan, Learning the k in k-means, Neural Information Processing Systems (NIPS), 2003.
[7] C. Kotropoulos and I. Pitas, Rule-Based Face Detection in Frontal Views, Proc. Int'l Conf. Acoustics, Speech and Signal Processing, vol. 4, pp. 2537-2540, 1997.
