
Ain Shams University

Faculty of Computer
& Information Sciences
Computer Science Department

A New Illumination Normalization
Approach for Face Recognition
A thesis submitted to the Department of Computer Science, Faculty of Computer and
Information Sciences, Ain Shams University, in partial fulfillment of the requirements
for the degree of Master of Computer and Information Sciences

By:

Ahmed Salah ELDin Mohammed ELSayed


B.Sc. in Computer Science,
Faculty of Computer and Information Sciences,
Ain Shams University.
Cairo, Egypt

Supervised By:

Prof. Dr. Taha I. ELAreif
CS Dept., Faculty of Computer and Information Sciences,
Ain Shams University

Dr. Haitham ELMessairy
CS Dept., Faculty of Computer and Information Sciences,
Ain Shams University

Faculty of Computer and Information Sciences


Ain Shams University
Cairo – 2009
Acknowledgement

In the name of Allah, most Beneficent, most Merciful: “And whatever of comfort you
enjoy, it is from Allah…” Al-Nahl (53). First and foremost, I humbly give my deep
thanks to Allah and my parents for giving me the opportunity and the strength to
accomplish this work. Then, all thanks to the students, colleagues, relatives and
everyone who prayed for me to finish this work.

I would like to thank Prof. Dr. Taymoor Nazmy, the Vice Dean of the Faculty of
Computer and Information Sciences, Ain Shams University, for his support. My
great thanks to Prof. Dr. Saied ELGhonaimy, Computer Systems Dept., Faculty
of Computer and Information Sciences, Ain Shams University, for his
encouragement and support. Special thanks to Prof. Dr. Mohammed Hashem,
Head of the Information Systems Dept., Faculty of Computer and Information
Sciences, Ain Shams University, for his advice and support.

I would like to express my deep appreciation and thanks to all who supervised me:
Prof. Dr. Mostafa Seyam (God bless him), Prof. Dr. Taha I. El-Areif, Dr.
Khaled A. Nagaty and Dr. Haitham ELMessairy, for their great help and
encouragement during the execution of this work. Special thanks to Dr. Khaled
A. Nagaty for his follow-up, care, patience and support. Also, I would like to
thank my colleagues Mona Wagdy, Amr EL-Desoky, Kareem Emara,
Mahmoud Hossam and Mohammed Hamdy for their valuable help and
cooperation.

I would like to thank the team of the “Face Recognition using Eigenface, NN and
Mosaicing Techniques” graduation project, Faculty of Computer and Information
Sciences, Ain Shams University, 2005, for their project, which is used in the
experiments of this work.

Publications

The work presented in this thesis has been published in the following conferences:
1. T.I. El-Arief, K.A. Nagaty, and A.S. El-Sayed, “Eigenface vs.
Spectroface: A Comparison on the Face Recognition Problems”,
IASTED Signal Processing, Pattern Recognition, and Applications
(SPPRA’07), Austria, 2007.
2. S. El-Sayed, K. A. Nagaty, T. I. El-Arief, “An Enhanced Histogram
Matching Approach using the Retinal Filter’s Compression Function
for Illumination Normalization in Face Recognition”, ICIAR’08,
Springer-Verlag LNCS 5112, pp. 873–883, Portugal, 2008.

Abstract

Although many face recognition techniques and systems have been proposed,
evaluations of the state-of-the-art techniques and systems have shown that the
recognition performance of most current technologies degrades under variations of
illumination. The latest Face Recognition Vendor Test (FRVT 2006) concluded that
relaxing the illumination condition has a dramatic effect on performance. Moreover,
it has been proven both experimentally and theoretically that the variations between
images of the same face due to illumination are almost always larger than image
variations due to a change in face identity.

There has been much work dealing with illumination variations in face
recognition. Although most of these approaches cope well with illumination
variation, some may have a negative influence on images without illumination
variation. In addition, some approaches show great differences in performance
when combined with different recognition methods. Other approaches
require perfect alignment of the face within the image, which is difficult to achieve
in practical, real-life systems.

In this thesis, we propose an illumination normalization approach that is flexible
across different face recognition approaches and robust to the non-aligning of
faces, in addition to having the minimum negative influence on images without
illumination variations. The proposed approach, called GAMMA-HM-COMP, is
based on enhancing the image resulting from histogram matching.

To verify both the flexibility to face recognition approaches and the robustness to
the non-aligning of faces, the proposed approach is tested over two face recognition
methods representing the two broad categories of the holistic-based approach,
namely the standard Eigenface method from the Eigenspace-based category and
Spectroface from the frequency-based category. In each method, the testing is
done using both aligned and non-aligned versions of the Yale B database.

In order to compare the proposed approach with other approaches, we select the
four best-of-literature illumination normalization approaches from among 38
different approaches, based on a survey of nine comparative studies. All five
approaches are compared using the Eigenface and Spectroface methods on images
with illumination variations and on images with other facial and geometrical
variations.

Under illumination variation, the proposed approach gives the best results with the
Eigenface method and the second-best results with the Spectroface method when
images are not perfectly aligned. Moreover, the proposed approach is the least
affected (i.e., most robust) approach with respect to the non-aligning of faces on
both methods.

Under other facial and geometrical variations, the proposed approach has the
minimum negative influence on each of the two methods, compared with the four
other approaches.

In this work, all illumination normalization approaches are tested on two face
recognition methods representing the two broad categories of the holistic-based
approach. It is important to extend this work by including local-based face
recognition methods in testing these approaches, as they may show great
differences in performance when combined with such methods.

Moreover, this work introduces a technology evaluation of the proposed approach
and the other best-of-literature approaches. In order to complete the thorough
evaluation cycle, both scenario and operational evaluations need to be performed
for these approaches.

Table of Contents
ACKNOWLEDGEMENT .......................................................................................................................... II 
PUBLICATIONS........................................................................................................................................ III 
ABSTRACT ................................................................................................................................................ IV 
TABLE OF CONTENTS ..........................................................................................................................VII 
LIST OF FIGURES.................................................................................................................................... IX 
LIST OF TABLES.....................................................................................................................................XII 
CHAPTER 1: INTRODUCTION ............................................................................................................... 1 
1.1 BIOMETRICS AND FACE RECOGNITION ................................................................................................. 1 
1.2 PROBLEM DEFINITION........................................................................................................................... 3 
1.3 METHODS CATEGORIZATION ................................................................................................................ 3 
1.4 VARIATIONS CATEGORIZATION ............................................................................................................ 3 
1.5 SUCCESSFUL SCENARIOS ...................................................................................................................... 4 
1.6 COMMERCIAL SYSTEMS ........................................................................................................................ 5 
1.7 RECENT EVALUATIONS ......................................................................................................................... 6 
1.8 THESIS OBJECTIVES AND ORGANIZATION ............................................................................................. 7 
CHAPTER 2: FACE RECOGNITION APPROACHES .......................................................................... 9 
2.1 INTRODUCTION ..................................................................................................................................... 9 
2.2 LOCAL-BASED APPROACHES .............................................................................................................. 10 
2.3 HOLISTIC-BASED APPROACHES .......................................................................................................... 15 
2.3.1 Eigenspace-based Category ....................................................................................................... 16 
2.3.2 Frequency-based Category ........................................................................................................ 21 
2.3.3 Other Holistic-Based Approaches.............................................................................................. 24 
2.4 HYBRID APPROACHES......................................................................................................................... 27 
2.5 PERFORMANCE EVALUATIONS AND COMPARATIVE STUDIES.............................................................. 28 
2.5.1 Performance Evaluation ............................................................................................................ 28 
2.5.2 Comparative Studies .................................................................................................................. 30 
CHAPTER 3: ILLUMINATION NORMALIZATION APPROACHES .............................................. 31 
3.1 INTRODUCTION ................................................................................................................................... 31 
3.2 MODEL-BASED APPROACHES ............................................................................................................. 32 
3.3 IMAGE-PROCESSING-BASED APPROACHES ......................................................................................... 40 
3.3.1 Global Approaches .................................................................................................................... 40 
3.3.2 Local Approaches ...................................................................................................................... 45 
3.4 COMPARATIVE STUDIES & BEST-OF-LITERATURE APPROACHES ........................................................ 53 
CHAPTER 4: SETUP THE ENVIRONMENT ....................................................................................... 62 
4.1 INTRODUCTION ................................................................................................................................... 62 
4.2 METHODS DESCRIPTIONS ................................................................................................................... 63 
4.2.1 Standard Eigenface Method ....................................................................................................... 63 
4.2.2 Spectroface Method ................................................................................................................... 63 
4.3 DATABASES DESCRIPTIONS ................................................................................................................ 65 
4.3.1 UMIST database ........................................................................................................................ 65 
4.3.2 Yale B database.......................................................................................................................... 65 
4.3.3 Grimace database ...................................................................................................................... 66 
4.3.4 JAFFE database......................................................................................................................... 67 
4.3.5 Nott-faces database .................................................................................................................... 68 
4.3.6 Yale database ............................................................................................................................. 68 
4.3.7 Face 94 database ....................................................................................................................... 68 

4.4 EXPERIMENTAL RESULTS ................................................................................................................... 69 
4.4.1 Pose Variation ........................................................................................................................... 69 
4.4.2 Facial Expressions Variation ..................................................................................................... 70 
4.4.3 Non-Uniform Illumination Variation ......................................................................................... 72 
4.4.4 Translation Variation ................................................................................................................. 73 
4.4.5 Scaling Variation ....................................................................................................................... 75 
4.5 SUMMARY .......................................................................................................................................... 76 
CHAPTER 5: THE PROPOSED ILLUMINATION NORMALIZATION APPROACH................... 77 
5.1 INTRODUCTION ................................................................................................................................... 77 
5.2 IDEA OF THE PROPOSED APPROACH .................................................................................................... 77 
5.3 HISTOGRAM MATCHING ALGORITHM ................................................................................................. 79 
5.4 IMAGE ENHANCEMENT METHODS ...................................................................................................... 81 
5.4.1 Histogram Equalization (HE) .................................................................................................... 81 
5.4.2 Log Transformation (LOG) ........................................................................................................ 81 
5.4.3 Gamma Correction (GAMMA) .................................................................................................. 81 
5.4.4 Compression Function of the Retinal Filter (COMP) ................................................................ 82 
5.5 THE ENHANCED HM APPROACHES ..................................................................................................... 83 
5.5.1 Enhancement After HM .............................................................................................................. 83 
5.5.2 Enhancement Before HM ........................................................................................................... 84 
5.5.3 Further Enhancement ................................................................................................................ 85 
5.6 VERIFICATION OF THE SELECTION CONDITIONS ................................................................................. 85 
5.7 EXPERIMENTAL RESULTS ................................................................................................................... 87 
5.8 SUMMARY .......................................................................................................................................... 97 
CHAPTER 6: EVALUATE THE PROPOSED APPROACH................................................................ 99 
6.1 INTRODUCTION ................................................................................................................................... 99 
6.2 IMPLEMENTATION OF THE COMPARED APPROACHES .......................................................................... 99 
6.2.1 Preprocessing Chain Approach (CHAIN).................................................................................100 
6.2.2 Local Normal Distribution (LNORM) .......................................................................................101 
6.2.3 Single Scale Retinex with Histogram Matching (SSR-HM) ......................................................101 
6.2.4 Local Binary Patterns (LBP) ....................................................................................................102 
6.2.5 Proposed Approach (GAMMA-HM-COMP) ............................................................................102 
6.3 COMPARISON ON ILLUMINATION VARIATIONS ..................................................................................103 
6.3.1 Aligned Faces............................................................................................................................103 
6.3.2 Non-Aligned Faces....................................................................................................................103 
6.4 COMPARISON ON OTHER VARIATIONS ...............................................................................................106 
6.4.1 Pose Variations .........................................................................................................................107 
6.4.2 Facial Expressions Variations ..................................................................................................108 
6.4.3 Translation Variations ..............................................................................................................111 
6.4.4 Scaling Variations .....................................................................................................................115 
6.5 SUMMARY .........................................................................................................................................116 
CHAPTER 7: CONCLUSIONS AND FUTURE WORKS ....................................................................118 
7.1 CONCLUSIONS....................................................................................................................................118 
7.2 FUTURE WORKS ................................................................................................................................120 
REFERENCES ..........................................................................................................................................121 

List of Figures
Figure 1.1: Distribution of some biometrics over the market 1 
Figure 1.2: Number of published items and citations on face recognition between 1991 and 2006 2 
Figure 1.3: Easy scenarios in face recognition 4 
Figure 1.4: Easy scenarios in face recognition 4 
Figure 1.5: Difficult scenarios for face recognition 5 
Figure 2.1: Face bunch graph (FBG) serves as a general representation of faces. It is designed to
cover all possible variations in the appearance of faces. The FBG combines information from a
number of face graphs. Its nodes are labeled with set of jets, called bunches, and its edges are
labeled averages of distance vectors. During comparison to an image, the best fitting jet in each
bunch, indicated by gray shading, is selected independently. 11 
Figure 2.2: A visualized example for the steps of automatically localizing features. In (e), the black
cross on a white background indicates an extracted and stored feature vector at this location
while white cross on black background indicates an ignored feature vector. 12 
Figure 2.3: System overview of the component-based face detector using four components 13 
Figure 2.4: Sample of the normalized whole face image and the three regions that are used for the
local analysis 14 
Figure 2.5: Example for illustrating the basic LBP operator 15 
Figure 2.6: Examples of circular neighborhoods with number of sampled points P and radius R (P,R)
15 
Figure 2.7: Block diagram of the standard Eigenface method 16 
Figure 2.8: The subspace LDA face recognition system 17 
Figure 2.9: Flowchart for the Face recognition using evolutionary pursuit (EP) method 18 
Figure 2.10: Image synthesis model for Architecture 1. To find a set of IC images, the images in X are
considered to be a linear combination of statistically independent basis images, S, where A is an
unknown mixing matrix. The basis images are estimated as the learned ICA output U. 19 
Figure 2.11: Example of the projection map and the projection-combined image 19 
Figure 2.12: 3-level wavelet decomposition 22 
Figure 2.13: (a) input image, (b) the log-magnitude of its DCT, (c) the scanning strategy of
coefficients 22 
Figure 2.14: Most variant frequencies: a) real, b) imaginary and c) selected numbering 23 
Figure 2.15: Spectroface representation steps 24 
Figure 2.16: Examples of FBT of (A) an 8 radial cycles image, (B) a 4 angular cycles image and (C)
an image of the average of these images. The magnitude of the FBT coefficients is presented in
colored levels (red indicates the highest value) 24 
Figure 2.17: Block diagram for face recognition based on moments 25 
Figure 2.18: Examples of the Trace transform on (a) full image (b) masked with rectangular shape
and (c) masked with elliptical shape. 26 
Figure 2.19: Training and recognition stages of Face Recognition Using Local and Global Features
approach 27 
Figure 2.20 (a) A gray-scale face image, (b) it’s edginess image, and (c) the cropped eyes. 28 
Figure 3.1: The same individual imaged with the same camera and the same facial expression may
appear dramatically different with changes in the lighting conditions. 31 
Figure 3.2: Effect of applying QIR on an illuminated face image from Yale B database 34 
Figure 3.3: Effect of applying SQI approach to illuminated face images from Yale B and CMU PIE
databases. 34 
Figure 3.4: The effect of the scale, σ, on processing an illuminated facial image using the SSR. 37 
Figure 3.5: Histogram fitted version of SSR with σ = 6 38 

Figure 3.6: Discretization lattice for the PDE in equation 3.23 39 
Figure 3.7: Effect of applying GROSS approach on some illuminated face images from Yale B
database 40 
Figure 3.8: Effect of applying histogram equalization on an illuminated image 41 
Figure 3.9: Histogram matching process to an illuminated image 42 
Figure 3.10: Transformation functions of LOG and GAMMA (L: number of gray levels) 43 
Figure 3.11: Effect of applying LOG approach to an illuminated face image. 44 
Figure 3.12: Effect of applying GIC to an illuminated face image 45 
Figure 3.13: Effect of applying NORM approach to an illuminated image. (Note that the gray-level of
the resulting image is stretched to [0,255] for displaying purpose only) 45 
Figure 3.14: Effects of applying the three local normalization methods to an illuminated face image46 
Figure 3.15: An example of ideal region partition 46 
Figure 3.16: The four regions for illumination normalization 47 
Figure 3.17: The effects of applying region-based strategy of HE and GIC over the four face regions
47 
Figure 3.18: Block histogram matching. In each image pair, the left one is the input image while the
right one is the reference image. 47 
Figure 3.19: The windowing filter H used in the Block HM method 48 
Figure 3.20: Images before and after intensity normalization with BHM. (a) Input images, (b)
corresponding output images after applying BHM 49 
Figure 3.21: The LBP operator 49 
Figure 3.22: The extended LBP operator with (8,2) neighborhood. Pixel values are interpolated for
points which are not in the center of a pixel. 50 
Figure 3.23: Original image (left) processed by the LBP operator (right). 50 
Figure 3.24: Effects of applying the image processing steps proposed by [91] 51 
Figure 3.25: Examples of images of one person from the Extended Yale-B frontal database. The
columns respectively give images from subsets 1 to 5. 53 
Figure 3.26: Summarization for the first four comparative studies. For each study, it shows the
normalization approach to be compared, the face databases and the face recognition
approaches in addition to the best normalization approaches from each study (grayed boxes) 55 
Figure 3.27: Summarization for the nine comparative studies showing some relations between these
studies in addition to the final best normalization approaches from all studies (dark grayed
boxes). For each study, it shows the normalization approach to be compared, the face
databases and the face recognition approaches in addition to the best normalization approaches
from each study (light grayed boxes) 60 
Figure 4.1: Standard Eigenface block diagram 63 
Figure 4.2: Spectroface block diagram 64 
Figure 4.3: UMIST: selected images for one subject in both training and testing sets 66 
Figure 4.4: Yale B: Training images for one subject in the four subsets with the light angle of each
image 66 
Figure 4.5: Selected images for one subject from each database used for studying the facial
expression variation 67 
Figure 4.6: Example images from JAFFE database. The images in the database have been rated by
60 Japanese female subjects on a 5-point scale for each of the six adjectives. The majority vote
is shown underneath each image (with natural being defined through the absence of a clear
majority) 67 
Figure 4.7: Face 94: 15 images for each subject in both training and testing sets 69 
Figure 4.8: Translation Variation: example for translating with and without circulation 74 
Figure 5.1: Histogram matching process to an illuminated image 80 

Figure 5.2: Transformation functions of LOG and GAMMA (L: number of gray levels) 82 
Figure 5.3: Effect of the four enhancement methods on an illuminated face 83 
Figure 5.4: Block diagram of applying the image enhancement method after the HM 83 
Figure 5.5: Effects of applying the image enhancement methods after applying the HM 84 
Figure 5.6: Block diagram of applying the image enhancement method before the HM 84 
Figure 5.7: Effects of applying the image enhancement methods before applying the HM 84 
Figure 5.8: Block diagram showing the further enhancement of combinations in 5.5.1 and 5.5.2 85 
Figure 5.9: Effects of further enhancement on both HM-GAMMA and GAMMA-HM combinations
using each of the four enhancement methods 85 
Figure 5.10: Sample faces from Yale B database – automatically and manually cropped 86 
Figure 5.11: Eigenface method over YALE B-AUTO: Effects of further enhancement over the eight
single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP 89 
Figure 5.12: Eigenface method over YALE B-MANU: Effects of further enhancement over the eight
single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP 91 
Figure 5.13: Spectroface method over YALE B-AUTO: Effects of further enhancement over the eight
single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP 92 
Figure 5.14: Spectroface method over YALE B-MANU: Effects of further enhancement over the
eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP 93 
Figure 5.15: Effects of the five enhancement combinations that satisfy the three conditions 95 
Figure 6.1: Average increasing/decreasing in recognition rates after applying each of the five
illumination normalization approaches on YALE B-MANU version 105 
Figure 6.2: Average increasing/decreasing in recognition rates after applying each of the five
illumination normalization approaches on YALE B-AUTO version 106 
Figure 6.3: Performance decreasing of each normalization approach due to the non-aligning of faces
(i.e. subtracting the performance on YALE B-AUTO from the performance on YALE B-
MANU) 106 
Figure 6.4: Average difference in recognition rates after applying each of the five illumination
normalization approaches on UMIST database 108 
Figure 6.5: Average difference in recognition rates after applying each of the five illumination
normalization approaches on Yale database 109 
Figure 6.6: Average difference in recognition rates after applying each of the five illumination
normalization approaches on Grimace database 110 
Figure 6.7: Average difference in recognition rates after applying each of the five illumination
normalization approaches on JAFFE database 110 
Figure 6.8: Average difference in recognition rates after applying each of the five illumination
normalization approaches on Nott-faces database 110 
Figure 6.9: Average decreasing in recognition rates after translating with circulation 114 
Figure 6.10: Average decreasing in recognition rates after translating without circulation 115 
Figure 6.11: Average decreasing in recognition rates when applying each of the five illumination
normalization approaches before and after scaling the Face 94 database 116 

List of Tables
Table 1.1: Different applications of face recognition 2 
Table 2.1: A brief comparison between holistic-based and local-feature-based approaches 9 
Table 3.1: Default parameter settings for CHAIN approach 52 
Table 3.2: List for 24 illumination normalization approaches that LNORM perform better than them
57 
Table 3.3: The 38 different illumination normalization approaches appear in the above nine
comparative studies together with the corresponding studies numbers. (Note that the cited
approaches, from 29 to 38, are not described in details in their corresponding comparative
studies) 61 
Table 4.1: Comparison between results in Lai et al. [43] and in our implementation (better rates are
italic) 65 
Table 4.2: Pose Variation: recognition rates over 12 training cases (top four rates in each method are
italic) 70 
Table 4.3: Expressions Variation: recognition rates over four databases with two Eigenface tests 71 
Table 4.4: Illumination Variation: recognition rates over 25 training cases (top three rates in each
method are italic) 73 
Table 4.5: Translation Variation: chosen cases from the six databases and their recognition rates 73 
Table 4.6: Translation Variation: average decreasing in the recognition rates of both methods after
translating with circulation in the four directions 74 
Table 4.7: Translation Variation: average decreasing in the recognition rates of both methods after
translating without circulation in the four directions 75 
Table 4.8: Scaling Variation: description of the training cases 75 
Table 4.9: Scaling Variation: decreasing in recognition rates after scaling all images in the testing set
76 
Table 5.1: The 25 different training cases used in testing 87 
Table 5.2: The number of combinations that lead to increase the recognition rates after using each of
the enhancement methods for further enhancement 94 
Table 5.3: Results of using the best five combinations with the Eigenface method over the two
versions of the database. Average recognition rate is calculated over the 25 different training
cases. (The best average differences are italic) 96 
Table 5.4: Results of using the best five combinations with the Spectroface method over the two
versions of the database. Average recognition rate is calculated over the 25 different training
cases. (The best average differences are italic) 97 
Table 6.1: Default parameter settings for CHAIN approach 100 
Table 6.2: Results of applying CHAIN with and without sliding on Spectroface method on both
versions of the YALE B database 100 
Table 6.3: Difference between our implementation of the LNORM and the original one 101 
Table 6.4: Results of applying LNORM with and without sliding on Spectroface method on both
versions of the YALE B database 101 
Table 6.5: Difference between our implementation of the SSR-HM and the original one 102 
Table 6.6: Results of applying each of the five illumination normalization approaches with both
Eigenface and Spectroface methods over YALE B-MANU version. Average recognition rate is
calculated over the 25 different training cases. 104 
Table 6.7: Results of applying each of the five illumination normalization approaches with both
Eigenface and Spectroface methods over YALE B-AUTO version. Average recognition rate is
calculated over the 25 different training cases. (0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4:
SSR-HM, 5: GAMMA-HM-COMP, nor: normal, ver: vertical, hor: horizontal) 105 

Table 6.8: Results of applying each of the five illumination normalization approaches with both
Eigenface and Spectroface methods over UMIST database. Average recognition rate is
calculated over all training cases. 107 
Table 6.9: Results of applying each of the five illumination normalization approaches with both
Eigenface and Spectroface methods over Grimace, Yale, JAFEE, and Nott-faces databases.
Average recognition rate is calculated over all training cases. 109 
Table 6.10: Average decreasing in the recognition rates of both methods after translating with
circulation in the four directions while applying (a) LNORM, (b) LBP, (c) CHAIN, (d) SSR-
HM and (e) GAMMA-HM-COMP approaches as preprocessing step. 111 
Table 6.11: Average decreasing in the recognition rates of both methods after translating without
circulation in the four directions while applying (a) LNORM, (b) LBP, (c) CHAIN, (d) SSR-
HM and (e) GAMMA-HM-COMP approaches as preprocessing step. 113 
Table 6.12: Decreasing in recognition rates after applying each of the five illumination normalization
approaches with both Eigenface and Spectroface methods over Face 94 database. Average
decreasing in recognition rate is calculated over all training cases. 115 

CHAPTER 1: Introduction
1.1 Biometrics and Face Recognition
Biometric recognition [1] refers to the use of distinctive physiological (e.g., fingerprints,
face, retina, iris) and behavioral (e.g., gait, signature) characteristics, called biometric
identifiers, for automatically recognizing individuals. Because biometric identifiers
cannot be easily misplaced, forged, or shared, they are considered more reliable for
person recognition than traditional token- or knowledge-based methods. Other typical
objectives of biometric recognition are user convenience (e.g., service access without a
Personal Identification Number) and better security (e.g., access that is difficult to forge).
Fig. 1.1 shows the distribution of some biometrics over the market.

Figure 1.1: Distribution of some biometrics over the market


As one of the most successful applications of image analysis and understanding, face
recognition has recently gained significant attention, especially during the past several
years. This is evidenced by the emergence of specific face recognition conferences such
as AFGR and CVPR and by the systematic empirical evaluation of face recognition
techniques (FRT), including XM2VTS, FERET, FRGC and FRVT. Fig. 1.2 shows how
many items on face recognition were published between 1991 and 2006, together with
the number of citations to source items indexed within WoS (Web of Science). The figure
shows that face recognition is still a hot research area. There are at least two reasons for
such a trend: the first is the wide range of commercial and law enforcement applications,
and the second is the availability of feasible technologies after 35 years of research.

Figure 1.2: Number of published items and citations on face recognition between 1991 and 2006
The strong demand for user-friendly systems which can secure our assets and protect our
privacy without losing our identity in a sea of numbers is obvious. At present, one needs
a PIN to get cash from an ATM, a password for a computer, a dozen others to access the
internet, and so on. Although extremely reliable methods of biometric personal
identification exist, e.g., fingerprint analysis and retinal or iris scans, these methods have
yet to gain acceptance by the general population due to their need for the cooperation of
the participants. A personal identification system based on analysis of frontal or profile
images of the face is non-intrusive and therefore user friendly. Moreover, personal
identity can often be ascertained without the participant’s cooperation or knowledge. In
addition, the need for applying FRT has been boosted by recent advances in multimedia
processing along with other technologies such as IP (Internet Protocol). Table 1.1 lists
some of the applications of face recognition [2]:
Table 1.1: Different applications of face recognition
Area                              | Specific Applications
Biometrics                        | Drivers’ Licenses, Entitlement Programs, Immigration, National ID, Passports, Voter Registration, Welfare Fraud
Information Security              | Desktop Logon (Windows XP, Windows Vista), Application Security, Database Security, File Encryption, Intranet Security, Internet Access, Medical Records, Secure Trading Terminals
Law Enforcement and Surveillance  | Advanced Video Surveillance, CCTV Control, Portal Control, Post-Event Analysis, Shoplifting and Suspect Tracking and Investigation
Smart Cards                       | Stored Value Security, User Authentication
Access Control                    | Facility Access, Vehicular Access

1.2 Problem Definition
A general statement of the face recognition problem can be formulated as follows [2]: given
still or video images of a scene, identify or verify one or more persons in the scene using
a stored database of faces. Available collateral information such as race, age, gender,
facial expression and speech may be used to narrow the search (enhancing
recognition). The solution of the problem involves segmentation of faces (face detection)
from cluttered scenes, feature extraction from the face region, and identification or
verification. In identification problems, the input to the system is an unknown face, and
the system reports back the decided identity from a database of known individuals,
whereas in verification problems, the system needs to confirm or reject the claimed
identity of the input face.
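
To make the two operating modes concrete, a minimal sketch follows (not from the thesis; the gallery layout, the Euclidean distance, and the threshold value are illustrative assumptions placed on top of whatever feature extractor is used):

```python
import numpy as np

def identify(probe_feat, gallery):
    """Identification: return the gallery identity whose features are closest to the probe."""
    # gallery: dict mapping identity -> feature vector (e.g., projection weights)
    distances = {pid: np.linalg.norm(probe_feat - feat) for pid, feat in gallery.items()}
    return min(distances, key=distances.get)

def verify(probe_feat, claimed_feat, threshold=0.5):
    """Verification: accept or reject the claimed identity of the probe."""
    return np.linalg.norm(probe_feat - claimed_feat) <= threshold
```

In both modes the quality of the preceding steps (face detection, feature extraction, and any illumination normalization) directly determines how meaningful these distance values are.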

1.3 Methods Categorization


A number of intensity-image face recognition methods have been proposed and
implemented in commercial systems. These methods fall into two broad approaches,
namely, holistic-based – where features are mainly extracted from the whole face – and
local-feature-based, in which features are mainly extracted from specific locations in the
face. Even though approaches of both types have been successfully applied to the task
of face recognition, they do have certain advantages and disadvantages. Thus an
appropriate approach should be chosen based on the specific requirements of a given
task.
However, most current face recognition techniques assume that several (at least two)
samples of the same person are always available for training. Unfortunately, in many
real-world applications, the number of training samples we actually have is far smaller
than the number we are supposed to have [3]. More specifically, in many application
scenarios, especially in large-scale identification applications such as law enforcement,
driver license or passport card identification, there is usually only one training sample per
person in the database. In addition, we seldom have the opportunity to add more samples
of the same person to the underlying database, because collecting samples is costly even
when it is possible. This raises the need for developing face recognition techniques that
specifically deal with the one-sample-per-person problem.

1.4 Variations Categorization


Many issues hinder research efforts in the field of face recognition. Variation exists in
every imaging modality used, and finding fast, simple algorithms that are robust to
variation is difficult (as evidenced by years of research). Categorizing the variation may
be helpful in the development of effective face recognition algorithms [4]. Intrinsic
sources of variation include identity, facial expression, speech, gender, and age [7].
Extrinsic sources of variation include viewing geometry, illumination, imaging processes,
and other objects. Viewing geometry includes pose changes, either by the observer or the
object to be recognized; illumination changes include shading, color, self-shadowing, and
specular highlights; imaging process variations include resolution, focus, imaging noise,
sampling technique, and perspective distortion effects; variations from other objects
include occlusions, shadowing, and indirect illumination. These sources of variation may
or may not hinder the recognition process depending on which algorithm is used. It is
possible that the variation due to factors such as facial expression, lighting, occlusions,
and pose is larger than the variation due to identity [6], [7]. That makes identification
under such varying environments a difficult task. However, human proficiency at face
recognition [8] has motivated enormous research in this area despite these challenges.
(The ability of humans to recognize faces is also an actively researched field with widely
varying results depending on numerous factors. Additional information on this topic can
be found predominantly in the psychology literature [9], [10].)

1.5 Successful Scenarios


The approaches proposed in recent years have been able to solve specific still-image
face recognition applications. Examples of scenarios where face recognition achieves
very good results are given in Fig. 1.3 and Fig. 1.4 [11].

Figure 1.3: Easy scenarios in face recognition

Figure 1.4: Easy scenarios in face recognition


When the scenario departs from these easy cases, face recognition approaches
experience severe problems. Among the special challenges are pose variation,
illumination conditions, scale variability, images taken years apart, glasses, moustaches,
beards, low-quality image acquisition, partially occluded faces, etc. Fig. 1.5 shows
different images which present some of the problems encountered in face recognition. In
the search for solutions to difficult face recognition scenarios, some help is found in
two broad areas: video-based face recognition and multimodal approaches.

Figure 1.5: Difficult scenarios for face recognition

1.6 Commercial Systems


Currently there are many commercial face tracking and recognition systems available.
For obvious reasons, many companies are reluctant to disclose the technology used in
their products. We list below several commercializations of face recognition technology
together with the specific approach used [4]:
1. HNeT (Holographic/quantum Neural Technology) Facial Recognition System by
AcSys Biometrics Corporation [122]. This neural network approach was
developed by John Sutherland.
2. ZN-Face by ZN Vision Technologies uses an undisclosed neural network
approach [135].
3. Nvisage by Neurodynamics uses an undisclosed neural network approach [132].
4. FaceTools by Viisage uses a proprietary algorithm based on the eigenfaces
approach developed at the MIT Media Lab [133].
5. Biometrica uses eigenfaces, but does not disclose further details [124].
6. FaceIt by Identix (formerly Visionics) uses Local Feature Analysis (LFA)
developed by Dr. Joseph J. Atick to generate and measure intra-feature distances
for recognition [129].
7. Face Guardian by Keyware uses local feature analysis. No information on this
product is available on their website [131].
8. Visec-FIRE by Berninger Software uses a facial processing approach [123].
9. ID2000 by Imagis uses a proprietary wavelet representation of the face for
recognition [130].
10. BioID uses an undisclosed multimodal system implementing face, voice, and lip
movement identification [126], [128].
11. FaceVACS by Cognitec applies transforms to specific areas of the face in order to
create a user specific feature vector [125].
12. UnMask by Vision Sphere Technologies Inc. uses a proprietary feature analysis
algorithm [134].
13. Face Key by Intelligent Verification Systems uses an undisclosed face and
fingerprint recognition algorithm [127].

1.7 Recent Evaluations
Since 1993, a series of six face recognition technology evaluations sponsored by the U.S.
Government has been held. In thirteen years, performance has improved by two orders
of magnitude, and numerous companies now sell face recognition systems [12].
The evaluations provided regular assessments of the state of the technology and helped to
identify the most promising approaches. The challenge problems also nurtured research
efforts by providing large datasets for use in developing new algorithms. The Face
Recognition Technology (FERET) program, the Face Recognition Grand Challenge (FRGC)
and the Face Recognition Vendor Test (FRVT) evaluations and challenge problems were
instrumental in advancing face recognition technology, and they show the potential of
the evaluation and challenge-problem paradigm to advance biometric, pattern recognition,
and computer vision technologies.
One of the main conclusions of the Face Recognition Vendor Test FRVT 2002 is that face
recognition from outdoor imagery remains a research challenge.
Moreover, the primary goal of the latest technology evaluation in the series (FRVT 2006)
was to look at recognition from high-resolution still images and three-dimensional (3D)
face images, and to measure performance for still images taken under controlled and
uncontrolled illumination. The following are some conclusions from this evaluation [12]:
• The FRVT 2006 results show that relaxing the illumination condition still has a
dramatic effect on performance.
• Face recognition performance on still frontal images taken under controlled
illumination has improved by an order of magnitude since the FRVT 2002. There are
three primary components to the improvement in algorithm performance since the
FRVT 2002:
1. The recognition technology,
2. Higher resolution imagery,
3. Improved quality due to greater consistency of lighting.
• Since performance was measured on the low-resolution dataset in both the FRVT
2002 and the FRVT 2006, it is possible to estimate the improvement in performance
due to algorithm design. The improvement in algorithm design resulted in an increase
in performance by a factor of between four and six depending on the algorithm. For
the results on the high and very-high resolution datasets, the improvement in
performance comes from a combination of algorithm design and image size and
quality. This is because new recognition techniques have been developed to take
advantage of the larger high quality face images.
• The FRVT 2006 and the Iris Challenge Evaluation (ICE 2006) compared recognition
performance from very-high resolution still face images, 3D face images, and single-
iris images. On the FRVT 2006 and the ICE 2006 datasets, recognition performance
of all three biometrics is comparable when all three biometrics are acquired under
controlled illumination.
The human visual system contains a very robust face recognition capability that is
excellent at recognizing familiar faces [5]. However, human face recognition capabilities
on unfamiliar faces fall far short of the capability for recognizing familiar faces. The
FRVT 2006, for the first time, integrated the measurement of human face recognition
capability into an evaluation. Performance of humans and computers was compared on
the same set of images. The FRVT 2006 human and computer experiment measures the
ability to recognize faces across illumination changes. This experiment found that
algorithms are capable of human performance levels, and that at false accept rates
around 0.05, machines can outperform humans.

1.8 Thesis Objectives and Organization


As we can see from the conclusions of the recent technology evaluations, although many
face recognition techniques and systems have been proposed, the recognition
performance of most current technologies degrades under variations of illumination.
Moreover, it has been proven both experimentally [13] and theoretically [14] that the
variations between images of the same face due to illumination are almost always
larger than image variations due to a change in face identity.
There has been much work dealing with illumination variations in face recognition.
Although most of these approaches cope well with illumination variation, some may
have a negative influence on images without illumination variation. In addition, some
approaches show great differences in performance when combined with different
recognition approaches. Other approaches require perfect alignment of the face within
the image, which is difficult to achieve in practical, real-life systems.
This thesis aims to propose an illumination normalization approach that is flexible
across different face recognition approaches and independent of face alignment, in
addition to having the minimum negative influence on images without illumination
variations. We do this through the following:
1. Study the face recognition approaches.
2. Study the illumination normalization approaches for face recognition.
3. Propose an illumination normalization approach that is flexible across different
face recognition approaches and independent of face alignment.
4. Make a comparative study between the proposed approach and other best-of-
literature approaches over images with and without illumination variations.
The thesis is organized as follows:
Chapter 2 surveys the three main face recognition approaches, namely local-based,
holistic-based and hybrid approaches, with brief descriptions of some methods under
each approach. In addition, the types of performance evaluations and the literature
comparative studies are introduced in this chapter.
Chapter 3 surveys the two main illumination normalization approaches for face
recognition, namely model-based and image-processing-based approaches, with brief
descriptions of some methods under each approach. In addition, nine comparative
studies are introduced at the end of the chapter to select the best-of-literature
approaches.
Chapter 4 introduces detailed descriptions of the environment that we built in order to
test our proposed illumination normalization approach and the other approaches. The
chapter includes descriptions of the selected face recognition methods and of the
selected databases, which cover five different face recognition variations. The
experimental results of the selected methods over each database are also introduced
in this chapter. All experiments are done without applying any illumination
normalization approach, which allows us to study the effects of any illumination
normalization approach on the selected methods over each variation separately.
Chapter 5 proposes an illumination normalization approach based on enhancing the
image resulting from histogram matching. Four different image enhancement methods
are experimentally tried in two different ways: 1) after HM, on the image resulting
from HM; and 2) before HM, on the reference image before the input image is matched
to it. The best combination is chosen such that it proves flexible with the two selected
face recognition methods and independent of face alignment (a rough sketch of this
kind of pipeline is given after this chapter overview).
Chapter 6 evaluates the proposed illumination normalization approach and the other
best-of-literature approaches over images with illumination variation and images with
other facial and geometrical variations, using the two selected face recognition methods.
Chapter 7 contains the final conclusions of this work, in addition to suggestions for
future work.
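
Before moving on, the following minimal sketch (not taken from the thesis) illustrates the general shape of the pipeline that Chapter 5 investigates: gamma correction of the reference image before histogram matching, and a retinal-style compression of the matched result afterwards. The parameter values, the exact compression formula, and the use of scikit-image's match_histograms are illustrative assumptions rather than the thesis's implementation.

```python
import numpy as np
from skimage.exposure import match_histograms  # histogram matching utility

def gamma_correct(img, gamma=0.5):
    """Brighten dark regions by raising normalized intensities to a power < 1."""
    x = img.astype(np.float64) / 255.0
    return np.power(x, gamma) * 255.0

def retinal_compression(img, x0=None):
    """Naka-Rushton-like compression of the kind used in retinal filter models (assumed form)."""
    x = img.astype(np.float64)
    if x0 is None:
        x0 = x.mean()                      # adaptation level: global mean (assumption)
    return (x.max() + x0) * x / (x + x0 + 1e-8)

def normalize(input_img, reference_img):
    ref = gamma_correct(reference_img)          # enhancement before HM (on the reference)
    matched = match_histograms(input_img, ref)  # histogram matching of the input to it
    return retinal_compression(matched)         # enhancement after HM (on the result)
```

The same structure accommodates the other enhancement methods tried in Chapter 5 (HE, LOG, COMP) simply by swapping the two enhancement functions.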

As a start for this work, the following chapter surveys the three main face recognition
approaches, namely local-based, holistic-based and hybrid approaches, with brief
descriptions of some methods under each approach, in addition to the types of
performance evaluations and the literature comparative studies.

CHAPTER 2: Face Recognition Approaches

2.1 Introduction
A number of intensity-image face recognition methods have been proposed and
implemented in commercial systems. Basically, they can be divided into holistic-based,
local-feature-based, and hybrid approaches. Even though approaches of all these types
have been successfully applied to the task of face recognition, they do have certain
advantages and disadvantages. Thus an appropriate approach should be chosen based on
the specific requirements of a given task.
Local-feature-based methods rely on the identification of certain fiducial points on the
face such as the eyes, the nose, the mouth, etc. The location of those points can be
determined and used to compute geometrical relationships between the points as well as
to analyze the surrounding region locally. Thus, independent processing of the eyes, the
nose, and other fiducial points is performed and then combined to produce recognition of
the face.
Holistic-based methods treat the image data simultaneously without attempting to
localize individual points. The face is recognized as one entity without explicitly isolating
different regions in the face. Holistic techniques utilize statistical analysis, neural
networks, and transformations. They usually require large samples of training data. The
advantage of holistic-based methods is that they utilize the face as a whole and do not
destroy any information by exclusively processing only certain fiducial points. Thus, they
generally provide more accurate recognition results. However, such techniques are
sensitive to variations in position, scale, etc., which restrict their use to standard, frontal
mug-shot images. Table-2.1 shows a brief comparison between both approaches.
Table 2.1: A brief comparison between holistic-based and local-feature-based approaches
Holistic-based                              | Local-feature-based
Extract feature vector from the whole face  | Extract feature vectors at certain locations
Sensitive to pose and illumination changes  | Robust to pose and illumination changes
Feature detection is not required           | Depends on accurate feature detection, which is not simple
Computationally less expensive              | Computationally more expensive
The rest of this chapter is organized as follows: Section 2.2 contains brief descriptions of
some local-based approaches, Section 2.3 describes some holistic-based approaches,
examples of hybrid approaches appear in Section 2.4, and Section 2.5 describes the types
of performance evaluation and introduces results from existing comparative studies.

2.2 Local-Based Approaches

1. Elastic Bunch Graph Matching 1997


This method is considered one of the most famous local-based methods in the literature. The
work in [15] presents a system for recognizing human faces from single images out of a
large database containing one image per person. Faces are represented by labeled graphs,
based on a Gabor wavelet transform, in which nodes are located at facial landmarks and
labeled with 40-dimensional Gabor-based complex vectors, called jets. The edges are
labeled with two-dimensional distance vectors between corresponding nodes.
In order to extract the image graphs automatically, the face bunch graph (FBG) is first
constructed for a certain pose by combining a representative set of individual model graphs
into a stack-like structure, as shown in Fig. 2.1. Each model graph has the same grid
structure and the nodes refer to identical fiducial points. The first set of model graphs is
generated manually. A set of jets referring to one fiducial point is called a bunch. An eye
bunch, for instance, may include jets from closed, open, female and male eyes, etc., to
cover these local variations. The corresponding FBG is then given the same grid
structure as the individual model graphs; its nodes are labeled with the bunches of jets
and its edges are labeled with the averaged distances between these jets. Once the system
has an FBG, graphs for new images can be generated automatically by elastic bunch
graph matching, which is based on maximizing a graph similarity between an image graph
and the FBG of identical pose. A heuristic algorithm is used to find the image graph
which maximizes the graph similarity function. First, the location of the face is found by
a sparse scanning of the FBG over the image. Then, the FBG is varied in size and aspect
ratio to adapt to the right format of the face. These steps are of no cost in the topography
term of the similarity function because the edge labels are transformed accordingly.
Finally, all nodes are moved locally and relative to each other to optimize the graph
similarity further.
After extracting model graphs from the gallery images and an image graph from the
probe image, recognition is done by comparing the image graph to all model graphs and
selecting the one with the highest similarity value.

2. Face Recognition Based on Multiple Facial Features 2000


In [16], a different facial feature detection scheme, which is based on the framework of
Elastic Bunch Graph Matching in [15], is utilized. Only 17 facial features instead of the
48 in [15], all of which have clear meanings and exact correct positions, are localized for
each new face image. The whole facial feature detection process consists of three
stages— global face search, individual facial feature localization and graph adjusting.
The first stage serves to find a face in an image and provides near-optimal starting points
for the following individual facial feature localization stage. In the second stage, each of

the 17 facial features is localized individually, without taking its relative positions to
other facial features into consideration. In the graph-adjusting stage, relative positions
between facial features are utilized to localize those misplaced facial features from the
second stage.
After facial feature detection, each of the 17 basic facial features is labeled with a
40-dimensional Gabor-based complex vector for each new face image. Face recognition is
then executed on the basis of these complex vectors, which represent local features of the
areas around the multiple facial features. Two face recognition approaches, named Two-
Layer Nearest Neighbor (TLNN) and Modular Nearest Feature Line (MNFL)
respectively, are proposed.

Figure 2.1: Face bunch graph (FBG) serves as a general representation of faces. It is designed to
cover all possible variations in the appearance of faces. The FBG combines information from a
number of face graphs. Its nodes are labeled with sets of jets, called bunches, and its edges are labeled
with averages of distance vectors. During comparison to an image, the best fitting jet in each bunch,
indicated by gray shading, is selected independently.

3. Biometric system: A Face recognition approach 2000


The work in [17] proposes an automatic system in which informative feature locations in
the face image are automatically located by Gabor filters, which makes the system
independent of accurate detection of facial features. The system starts by filtering the
image with a set of Gabor filters. The filtered image is then multiplied with a 2-D
Gaussian to focus on the center of the face, and avoid extracting features at the face
contour. This Gabor filtered and Gaussian weighted image is then searched for peaks,
which are considered interesting feature locations for face recognition. At each peak, a
feature vector consisting of Gabor coefficients is extracted. In testing, the Euclidean
distances to all feature vectors in the gallery are calculated and used to rank the gallery accordingly. A
visualized example for the steps of localizing features is shown in Fig.2.2.

Figure 2.2: A visualized example for the steps of automatically localizing features. In (e), the black
cross on a white background indicates an extracted and stored feature vector at this location while
a white cross on a black background indicates an ignored feature vector.
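The following short Python sketch illustrates the feature-localization steps described above (Gabor filtering, Gaussian weighting centered on the face, and peak picking). It is only an illustration of the idea, not the implementation of [17]: the Gabor kernel is built by hand and all parameter values (kernel size, sigma, wavelength, number of orientations, number of kept peaks) are assumed for the example.

    import numpy as np
    from scipy.ndimage import convolve, maximum_filter

    def gabor_kernel(size=21, sigma=4.0, theta=0.0, wavelength=8.0):
        # Real part of a Gabor kernel (illustrative parameters).
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        xr = x * np.cos(theta) + y * np.sin(theta)
        yr = -x * np.sin(theta) + y * np.cos(theta)
        return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

    def localize_features(face, n_orientations=4, n_peaks=30):
        # Gabor-filter the face, weight by a centered 2-D Gaussian, return peak locations.
        h, w = face.shape
        response = np.zeros_like(face, dtype=float)
        for k in range(n_orientations):
            response += np.abs(convolve(face.astype(float),
                                        gabor_kernel(theta=k * np.pi / n_orientations)))
        # Gaussian weighting to focus on the face centre and suppress the contour.
        y, x = np.mgrid[0:h, 0:w]
        gauss = np.exp(-(((x - w / 2) / (0.3 * w))**2 + ((y - h / 2) / (0.3 * h))**2) / 2)
        weighted = response * gauss
        # Local maxima are taken as candidate feature locations.
        peaks = (weighted == maximum_filter(weighted, size=9)) & (weighted > 0)
        coords = np.argwhere(peaks)
        order = np.argsort(weighted[peaks])[::-1][:n_peaks]
        return coords[order]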

4. Face Recognition with Support Vector Machines: Global versus Component-based Approach 2001
In [18], a local-based method and two holistic-based methods based on using the support
vector machine (SVM) are presented and evaluated with respect to robustness against pose
changes. Extensive tests are performed on a database which included faces rotated up to
about 40° in depth. The local-based method clearly outperformed both holistic-based
methods on all tests.
The Local-based method starts with locating 14 different facial components, extracting
and combining them into a single feature vector which is classified by a Support Vector
Machine (SVM). To locate facial components, a two-level, component-based face
detector is implemented. On the first level, 14 component classifiers independently detect
the facial components. On the second level, a geometrical configuration classifier
performs the final face detection by combining the results of the component classifiers.
The steps of the component-based face detector are illustrated in Fig.2.3. Given a 58 × 58
window over the input image, the maximum continuous outputs of the component
classifiers within rectangular search regions around the expected positions of the
components are used as inputs to the geometrical configuration classifier. The search
regions have been calculated from the mean and standard deviation of the components’
locations in the training images. The geometrical classifier is also provided with the X–Y
locations of the maxima of the component classifier outputs relative to the upper left
corner of the 58 × 58 window.
To train the component-based face detector, each component is located and extracted from
a set of synthetic images to build a positive component training set. The negative
component training set is extracted from non-face patterns. Thus, 14 linear SVMs are
trained on the component data and applied to the whole training set in order to generate

the training data for the geometrical classifier. The geometrical configuration classifier,
which is again a linear SVM, is trained on the X–Y locations and continuous outputs of
the 14 component classifiers.

Figure 2.3: System overview of the component-based face detector using four components
In the training stage, the component-based face detector is first run over each image in the
training set to extract the local facial components. Only 10 out of the 14 local facial
components are kept for face recognition, removing those that either contained few gray
value structures (e.g. cheeks) or strongly overlapped with other components. Each of the
10 components is then normalized in size and their gray values are combined into a single
feature vector. These feature vectors are used to train a one-vs-all linear SVM for every
person in the database. In testing stage, the feature vector of the probe image is first
extracted and then provided to the one-vs-all linear SVM for every person in order to
recognize it. The matched person is the one that its corresponding SVM gives the
maximum value for the probe feature vector.
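A minimal sketch of this one-vs-all recognition step is given below, assuming the component feature vectors have already been extracted and using scikit-learn's LinearSVC as the linear SVM; it is not the authors' implementation, and the parameter C=1.0 is illustrative.

    import numpy as np
    from sklearn.svm import LinearSVC

    def train_one_vs_all(features, labels):
        # Train one linear SVM per person: that person's vectors against everyone else's.
        classifiers = {}
        for person in np.unique(labels):
            svm = LinearSVC(C=1.0)
            svm.fit(features, (labels == person).astype(int))
            classifiers[person] = svm
        return classifiers

    def recognize(probe_vector, classifiers):
        # The matched person is the one whose SVM gives the largest decision value.
        scores = {p: svm.decision_function(probe_vector.reshape(1, -1))[0]
                  for p, svm in classifiers.items()}
        return max(scores, key=scores.get)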

5. Automatic Face Recognition System Based on Local Fourier-Bessel Features 2005
The work in [19] presents an automatic face verification system inspired by known
properties of biological systems. In the proposed algorithm, the whole image is converted
from the spatial to polar frequency domain by a Fourier-Bessel Transform (FBT). The
local feature vector is then constructed from the FBT coefficients of the upper right
region, upper middle region, and the upper left region of the face based on ground-truth
information as shown in Fig.2.4. Using local features is compared to the case where the
whole image FBT coefficients are considered. The resulting representations are

embedded in a dissimilarity space, where each image is represented by its distance to all
the other images, and a Pseudo-Fisher discriminator is built.

Figure 2.4: Sample of the normalized whole face image and the three regions that are used for the
local analysis
Verification test results on the FERET database show that the local-based algorithm
outperforms the global-FBT version. The local-FBT algorithm performs on par with state-of-the-art
methods under different testing conditions, indicating that the proposed system is
highly robust to expression, age, and illumination variations. In addition, the
performance of the proposed system is also evaluated under strong occlusion conditions,
and it is found to be highly robust for up to 50% face occlusion. However, when the
verification system is automated completely by implementing face and eye detection
algorithms, the performance of the local approach is reduced and becomes only slightly
superior to the global approach.

6. Local Features for Biometrics-Based Recognition 2004


The work in [20] introduces an approach combining a simple local representation method
with a k-nearest neighbors-based direct voting scheme for both face and speaker
recognition. The extraction of local features starts by first selecting pixels that have local
variance above a certain global threshold. Then, for each selected pixel, a w²-dimensional
vector is obtained by applying a w × w window around it. Finally, the dimensionality of
this vector is reduced using PCA and each vector is labeled with an identifier of the class.
Each test image is then classified by first computing the k-nearest neighbors of each of its
corresponding feature vectors. Then the class with the largest number of votes
accumulated over all the vectors is selected.
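A compact sketch of this local-voting scheme follows, assuming gray-scale images and illustrative values for the window size w, the variance threshold, the number of PCA components and k; it is meant only to make the voting idea concrete, not to reproduce [20].

    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.neighbors import NearestNeighbors

    def extract_local_vectors(image, w=9, var_threshold=50.0):
        # Collect w*w patches around pixels whose local variance exceeds a global threshold.
        half = w // 2
        vectors = []
        for i in range(half, image.shape[0] - half):
            for j in range(half, image.shape[1] - half):
                patch = image[i - half:i + half + 1, j - half:j + half + 1].astype(float)
                if patch.var() > var_threshold:
                    vectors.append(patch.ravel())
        return np.array(vectors)

    def build_model(train_images, train_labels, n_components=20):
        # Reduce every local vector with PCA and remember the class label of its image.
        vectors, owners = [], []
        for img, label in zip(train_images, train_labels):
            v = extract_local_vectors(img)
            vectors.append(v)
            owners.extend([label] * len(v))
        vectors = np.vstack(vectors)
        pca = PCA(n_components=n_components).fit(vectors)
        knn = NearestNeighbors(n_neighbors=5).fit(pca.transform(vectors))
        return pca, knn, np.array(owners)

    def classify(test_image, pca, knn, owners):
        # Each test vector votes for the classes of its k nearest neighbours; majority wins.
        test_vectors = pca.transform(extract_local_vectors(test_image))
        _, idx = knn.kneighbors(test_vectors)
        votes = owners[idx].ravel()
        labels, counts = np.unique(votes, return_counts=True)
        return labels[np.argmax(counts)]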

7. Face Description with Local Binary Patterns: Application to Face Recognition 2006
In [21], a novel and efficient facial representation is proposed. It is based on dividing a
facial image into small regions and computing a texture description of each region using
local binary patterns (LBP). These descriptors are then combined into a spatially
enhanced histogram (or feature vector). The spatially enhanced histogram encodes both
the appearance and the spatial relations of facial regions.
To extract the LBP texture descriptor of a region, the operation starts by first assigning a
label to every pixel of a region by thresholding the 3 × 3-neighborhood of each pixel with

the center pixel value and considering the result as a binary number. Then the histogram
of the labels can be used as a texture descriptor. See Fig.2.5 for an illustration of the basic
LBP operator.

Figure 2.5: Example for illustrating the basic LBP operator


To be able to deal with textures at different scales, a circular LBP operator is used.
Defining the local neighborhood as a set of sampling points evenly spaced on a circle
centered at the pixel to be labeled allows any radius and number of sampling points.
Bilinear interpolation is used when a sampling point does not fall in the center of a pixel.
Fig.2.6 shows some examples of circular neighborhoods.

Figure 2.6: Examples of circular neighborhoods (P, R) with P sampled points and radius R: (8,1), (16,2) and (8,2)
Once the m facial regions have been determined, a histogram of the LBP texture description
is computed independently within each of the m regions. The resulting m histograms are
combined yielding the spatially enhanced histogram with size m × n where n is the length
of a single LBP histogram.
In the spatially enhanced histogram, the face is effectively described on three different
levels of locality: the LBP labels for the histogram contain information about the patterns
on a pixel-level, the labels are summed over a small region to produce information on a
regional level and the regional histograms are concatenated to build a global description
of the face. However, since some facial features (such as eyes) play more important roles
in human face recognition than other features, the regions can be weighted based on the
importance of the information they contain. Hence, a weighted distance, such as the weighted
Chi-square distance, can be used for classification.
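The basic 3 × 3 LBP operator, the spatially enhanced histogram and the weighted Chi-square distance can be sketched as follows; the 7 × 7 region grid and the per-region weights are assumptions of this illustration, not values prescribed by [21].

    import numpy as np

    def lbp_image(img):
        # Basic 3x3 LBP: threshold the 8 neighbours of each pixel against the centre value.
        img = img.astype(int)
        c = img[1:-1, 1:-1]
        shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
        labels = np.zeros_like(c)
        for bit, (dy, dx) in enumerate(shifts):
            neighbour = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
            labels += (neighbour >= c).astype(int) << bit
        return labels

    def spatially_enhanced_histogram(img, grid=(7, 7)):
        # Split the LBP label image into m regions and stack the regional histograms.
        labels = lbp_image(img)
        gh, gw = labels.shape[0] // grid[0], labels.shape[1] // grid[1]
        hists = []
        for r in range(grid[0]):
            for c in range(grid[1]):
                block = labels[r * gh:(r + 1) * gh, c * gw:(c + 1) * gw]
                h, _ = np.histogram(block, bins=256, range=(0, 256), density=True)
                hists.append(h)
        return np.array(hists)          # shape (m, n): m regions, n bins per LBP histogram

    def weighted_chi_square(h1, h2, region_weights):
        # Weighted Chi-square distance between two spatially enhanced histograms.
        eps = 1e-10
        per_region = ((h1 - h2) ** 2 / (h1 + h2 + eps)).sum(axis=1)
        return float((region_weights * per_region).sum())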

2.3 Holistic-Based Approaches


Most holistic-based approaches can be classified into two broad categories: the Eigenspace-based
category and the frequency-based category [22]. In the following subsections, we give
an introduction to each category together with brief descriptions of existing
methods under both categories. Moreover, we also briefly describe some
other holistic-based methods that do not belong to either of the two categories.

2.3.1 Eigenspace-based Category
In Eigenspace-based category, Principal Component Analysis (PCA) – usually called
Eigenface – plays a key role in many holistic methods. Sirovich and Kirby [23] propose a
method that uses Karhunen-Loève transform to represent human faces. In 1991, Turk and
Pentland [24] develop a face recognition system using PCA (K-L expansion). Along this
direction, many Eigenspace-based recognition systems have been developed; they differ
mostly in the kind of projection approaches (standard-, differential- or kernel-
Eigenspace), in the projection algorithm employed (PCA, ICA and FLD), in the use of
simple or differential images before/after projection, and in the similarity matching
criterion or classification method employed (Euclidean, Mahalanobis, Cosine distances
and SOM-Clustering, RBF, LDA, and SVM). Many comparative studies between
different Eigenspace-based methods have been established. Following are some brief
descriptions of existing Eigenspace-based methods in addition to some Eigenspace-based
comparative studies.

1. Eigenfaces for Recognition 1991


The method in [24] plays a key role in the development of many Eigenspace-based
methods in the literature. The system is initialized by first acquiring the training set (ideally a
number of examples of each subject with varied lighting and expression). Eigenvectors
and eigenvalues are computed on the covariance matrix of the training images. The first
eigenvectors with the highest eigenvalues are kept to construct the face space. Finally, the
known individuals are projected into this face space, and their weights are stored. This
process is repeated as necessary.
The new image is then projected into the same face space. The Euclidean distance
measures the distance between the new projected image and a class of projected faces. If
the minimum distance measure is less than a threshold, the face is recognized. Fig.2.7
shows the block diagram of this method.

Figure 2.7: Block diagram of the standard Eigenface method
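A minimal NumPy sketch of the Eigenface pipeline described above, using the usual small-covariance trick (eigenvectors of A A^T instead of A^T A); the number of kept eigenvectors and the rejection threshold are illustrative values, not those of [24].

    import numpy as np

    def train_eigenfaces(train_images, n_components=50):
        # Build the face space from flattened training images (rows of X).
        X = np.array([img.ravel() for img in train_images], dtype=float)
        mean_face = X.mean(axis=0)
        A = X - mean_face
        # Eigenvectors of the small MxM matrix A A^T, mapped back to image space.
        eigvals, V = np.linalg.eigh(A @ A.T)
        order = np.argsort(eigvals)[::-1][:n_components]
        eigenfaces = A.T @ V[:, order]
        eigenfaces /= np.linalg.norm(eigenfaces, axis=0)
        weights = A @ eigenfaces                  # projections of the known individuals
        return mean_face, eigenfaces, weights

    def recognize(probe, mean_face, eigenfaces, weights, threshold=1e4):
        # Project the probe into the face space and find the nearest stored weight vector.
        w = (probe.ravel().astype(float) - mean_face) @ eigenfaces
        distances = np.linalg.norm(weights - w, axis=1)
        best = int(np.argmin(distances))
        return best if distances[best] < threshold else None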

2. Subspace Linear Discriminant Analysis (LDA) 1999
The method in [25] consists of two steps: first, the face image is projected from the
original vector space to a face subspace via PCA where the subspace dimension is
carefully chosen; then, Linear Discriminant Analysis (LDA) is used to obtain a linear
classifier in the subspace. The criterion that is used to choose the subspace dimension
enables the system to generate class-separable features via LDA. Fig.2.8 shows the main
steps of this method.

Figure 2.8: The subspace LDA face recognition system

3. Evolutionary Pursuit and Its Application to Face Recognition 2000


Evolutionary Pursuit (EP) implements strategies characteristic of genetic algorithms
(GAs) for searching the space of possible solutions to determine the optimal basis. In
[26], EP starts by projecting the original data into a lower dimensional whitened Principal
Component Analysis (PCA) space. Directed but random rotations of the basis vectors in
this space are then searched by GAs where evolution is driven by a fitness function
defined in terms of performance accuracy (empirical risk) and class separation
(confidence interval). Accuracy indicates the extent to which learning has been successful
so far, while separation gives an indication of the expected fitness on future trials. Fig.2.9
shows a flowchart for the main steps of this method.

4. Face Recognition Using ICA and SVM 2003


The work in [27] uses the Independent Component Analysis (ICA) for feature extraction,
which can be considered as a generalization of the PCA, followed by using the Support
Vector Machine (SVM) as a classifier. It shows that the results obtained by using the
combination PCA/SVM are not very far from those obtained with ICA/SVM. They
suggest that SVMs are relatively insensitive to the representation space.

Figure 2.9: Flowchart for the Face recognition using evolutionary pursuit (EP) method

5. Face Recognition by Independent Component Analysis 2002


In [28], a version of ICA derived from the principle of optimal information transfer
through sigmoidal neurons is used. ICA is performed on face images under two different
architectures, one which treats the images as random variables (mixtures) and the pixels
as outcomes (sources), and a second which treats the pixels as random variables and the
images as outcomes. It is found that both ICA representations are superior to
representations based on PCA for recognizing faces across days and changes in
expression. In addition, a classifier that combines the two ICA representations gives the
best performance.
In architecture 1 for example, the ICA model is given by:
U_{M×N} = W_{M×M} X_{M×N}    (2.1)
where X_{M×N} contains M mixtures (training images), each of dimension N, W_{M×M} is the
unmixing matrix, and U_{M×N} contains M sources, each of dimension N. The high
dimensionality of the mixtures (images) makes solving for the W matrix directly
intractable and time-consuming. Instead, the dimensionality of the input images is first reduced
by projecting them onto a PCA subspace, and this lower-dimensional representation is then used to solve for W.
Fig.2.10 explains the image synthesis model for architecture 1.
In training, each image is first projected onto the PCA subspace, and the result is then projected onto the
ICA space to get the feature vector. In testing, the same steps are applied to the probe image to
extract its feature vector; the cosine distance is then used as the similarity measure
between the input and the stored feature vectors.

Figure 2.10: Image synthesis model for Architecture 1. To find a set of IC images, the images in X are
considered to be a linear combination of statistically independent basis images, S, where A is an
unknown mixing matrix. The basis images are estimated as the learned ICA output U.
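A simplified sketch of these training and testing steps (PCA for dimensionality reduction, then ICA, then cosine similarity) using scikit-learn's PCA and FastICA is given below; it glosses over the exact Architecture 1 formulation of [28], and the component counts are illustrative.

    import numpy as np
    from sklearn.decomposition import PCA, FastICA

    def train(train_images, n_components=40):
        # PCA for dimensionality reduction, then ICA on the reduced representation.
        X = np.array([img.ravel() for img in train_images], dtype=float)
        pca = PCA(n_components=n_components, whiten=True).fit(X)
        ica = FastICA(n_components=n_components, max_iter=1000).fit(pca.transform(X))
        gallery = ica.transform(pca.transform(X))      # stored feature vectors
        return pca, ica, gallery

    def match(probe, pca, ica, gallery):
        # Cosine similarity between the probe feature vector and the stored gallery vectors.
        f = ica.transform(pca.transform(probe.ravel().astype(float)[None, :]))[0]
        sims = gallery @ f / (np.linalg.norm(gallery, axis=1) * np.linalg.norm(f) + 1e-12)
        return int(np.argmax(sims))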

6. Face Recognition with One Training Image per Person 2002


In [29], an extension of the Eigenface technique called Projection-Combined Principal
Component Analysis (PC)2A, is proposed. (PC)2A combines the original face image with
its horizontal and vertical projections and then performs principal component analysis on
the enriched version of the image. It requires less computational cost than the standard
Eigenface technique and experimental results show that on a gray-level frontal view face
database where each person has only one training image, (PC)2A achieves 3%-5% higher
accuracy than the standard Eigenface technique while using 10%-15% fewer
Eigenfaces. Fig.2.11 shows an example of the projection map and the projection-
combined image.

Figure 2.11: Example of the projection map and the projection-combined image
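One plausible reading of the projection-combination step is sketched below, under the assumption that the projection map is the outer product of the row and column projections normalized by the mean intensity, and that it is blended with the original image by an illustrative weight alpha; the enriched image would then be fed to the standard Eigenface pipeline.

    import numpy as np

    def projection_combined(image, alpha=0.25):
        # Hypothetical combination: build a projection map from the row/column
        # projections and blend it with the original image (alpha is illustrative).
        I = image.astype(float) / 255.0
        v = I.mean(axis=1, keepdims=True)       # vertical projection (one value per row)
        h = I.mean(axis=0, keepdims=True)       # horizontal projection (one value per column)
        P = v @ h / (I.mean() + 1e-12)          # projection map
        return (I + alpha * P) / (1.0 + alpha)  # enriched image for the PCA stage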

7. Bayesian Modeling of Facial Similarity 1998


In [30], the intrapersonal space (IPS) and extrapersonal space (EPS) are constructed first
by computing the intrapersonal differences (i.e. difference images between any two
image pairs belonging to the same individual) and the extrapersonal differences (by
matching images of different individuals in the gallery), then, performing a separate PCA

analysis on each. All images are then projected on both intrapersonal and extrapersonal
spaces. In testing, the probe image is projected on both spaces and then the Euclidean
distances are computed between the interior projected vectors and the exterior projected
vectors of both the input image and training images in order to get the Bayesian similarity
score, which is used for recognition.

8. Intra-Personal Kernel Space for Face Recognition 2004


In [31], an intrapersonal space (IPS) is constructed first by collecting all the difference
images between any two image pairs belonging to the same individual, to capture all
intra-personal variations. Then, the probabilistic analysis of kernel principal components
(PKPCA) is performed on this IPS which actually derives the intrapersonal kernel
subspace. Finally, the Mahalanobis distance is used for recognition. The recognition
performance demonstrates the advantage of this approach over other traditional subspace
approaches including PCA, Kernel PCA, ICA, Kernel ICA, Fisher Discriminant Analysis
FDA and Kernel FDA.

9. Eigenspace-based Comparative Studies


Many comparative studies between different Eigenspace-based methods have been
established [28], [32-40]. Among these comparisons, [39] presented an independent,
comparative study of the three most popular appearance-based face recognition projection
methods (PCA, ICA and LDA) and their accompanied four distance metrics (L1, L2,
cosine and Mahalanobis) in completely equal working conditions. The results show that
no particular projection-metric combination is the best across all standard FERET tests
and the choice of appropriate projection-metric combination can only be made for a
specific task.
In addition, the work in [40] presents another independent comparative study among
different Eigenspace-based approaches. The study considers standard, differential and
kernel Eigenspace-based methods. In the case of the standard ones, three different
projection algorithms (PCA, FLD and EP) and eight different similarity measures
(Euclidean, Whitening Euclidean (Mahalanobis), Cosine, and Whitening Cosine
distances, SOM and Whitening SOM Clustering, FFC and Whitening FFC) are
considered. In the case of the differential methods, two approaches are used, the pre-
differential and the post-differential. In both cases Bayesian and SVM classification are
employed. Finally, regarding kernel methods, Kernel PCA and Kernel FD are used
together with the eight similarity measures employed for the standard approaches.
Simulations are performed using the Yale Face Database, a database with few classes and
several images per class, and FERET, a database with many classes and few images per
class. They conclude that:

• Considering recognition rates, generalization ability as well as processing time, the
best results are obtained with the post-differential approach, using either a Bayesian
Classifier or SVM.
• In the specific case of the Yale Face Database, where the requirements are not very
high, any of the compared approaches gives rather similar results. Thanks to their
simplicity, Eigenfaces or Fisherfaces are probably the best alternatives.
• Although kernel methods obtain the best recognition rates, they suffer from problems
such as low processing speed and the difficulty of adjusting the kernel parameters.

2.3.2 Frequency-based Category


In the frequency-based category, the main idea is to map the image from the spatial domain to the
frequency domain, and then construct the feature vector from this domain. Z. Pan et al.
[41] and Spiess H. et al. [42] use the DCT and FFT, respectively, to extract the most important
features. J. H. Lai et al. [43] apply the FWT to make the image less sensitive to expression
variations, then apply the FFT twice to make the feature set invariant to translation, scale,
and on-the-plane rotation. Also, Dai D. et al. [44] and [45] suggest the use of the wavelet
transform, followed by applying LDA as a classifier in [44] or PCA as a representation
method in [45]. Following are some brief descriptions of existing frequency-based
methods.

1. Human Face Recognition Using PCA on Wavelet Subband 2000


In [45], an image is decomposed into a number of subbands with different frequency
components using the wavelet transform (WT). A mid-range frequency subband image
(HH after three decomposition levels) with resolution 16 × 16 is selected, subband
number 4 in Fig.2.12. The subbands of the training images are used to construct the PCA
subspace in which all training images are projected on it to get their corresponding
feature vectors. In the recognition stage, the mean value of the reference images is first subtracted
from the probe image; then, a mid-range frequency subband image (HH after three
decomposition levels) is extracted followed by projecting it on the PCA subspace to get
the feature vector. Finally, the similarity measurement between the feature vectors of the
probe image and the reference images is performed to determine whether the input probe
image matched any of the images.
The proposed method reduces the computational complexity significantly. Moreover,
experimental results demonstrated that applying PCA on WT sub-image with mid-range
frequency components gives better recognition accuracy and discriminatory power than
applying PCA on the whole original image.

Figure 2.12: 3-level wavelet decomposition
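A brief sketch of the subband-selection and projection steps using PyWavelets and scikit-learn follows; the wavelet family is an assumption of this illustration, and the mean subtraction is done in subband space for simplicity rather than on the raw probe image.

    import numpy as np
    import pywt
    from sklearn.decomposition import PCA

    def midrange_subband(image, wavelet='haar', level=3):
        # 3-level 2-D wavelet decomposition; return the HH (diagonal detail) subband
        # of the deepest level (16 x 16 for a 128 x 128 input).
        coeffs = pywt.wavedec2(image.astype(float), wavelet, level=level)
        return coeffs[1][2].ravel()              # coeffs[1] = (cH3, cV3, cD3) -> HH

    def train(train_images, n_components=30):
        X = np.array([midrange_subband(img) for img in train_images])
        mean = X.mean(axis=0)
        pca = PCA(n_components=n_components).fit(X - mean)
        return pca, mean, pca.transform(X - mean)

    def match(probe, pca, mean, gallery):
        f = pca.transform((midrange_subband(probe) - mean)[None, :])[0]
        return int(np.argmin(np.linalg.norm(gallery - f, axis=1)))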

2. Face Recognition Based on Local Fisher Features 2000


Authors of [44] propose a Localized LDA (LLDA) system based on applying the LDA on
the mid-range subband of wavelet transform. The experiments show that this system has
good classification power. The LLDA features capture the edge information of the images. The method
concentrates on a mid-range subband for two reasons: (1) it carries edge information;
(2) it overcomes the difficulty of solving a singular eigenvalue problem.

3. High speed face recognition based on discrete cosine transforms and neural networks 2000
In this method [47], discrete cosine transforms (DCTs) are used to reduce the
dimensionality of face space by truncating high frequency DCT components. The
remaining coefficients are fed into a neural network for classification. The selection of
the DCT coefficients is done as in Fig.2.13. Because only a small number of low
frequency DCT components are necessary to preserve the most important facial features,
the proposed DCT-based face recognition system is much faster than other approaches.

Figure 2.13: (a) input image, (b) the log-magnitude of its DCT, (c) the scanning strategy of
coefficients
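A short sketch of the coefficient-selection step is given below: a 2-D DCT of the face image followed by keeping the first coefficients in a zigzag-style order; the number of kept coefficients is illustrative, and the resulting vectors would then be fed to a neural network classifier.

    import numpy as np
    from scipy.fft import dctn

    def dct_features(image, n_coeffs=64):
        # 2-D DCT of the face image; keep the lowest-frequency coefficients first,
        # discarding the high-frequency remainder.
        c = dctn(image.astype(float), norm='ortho')
        h, w = c.shape
        # Low frequencies first: sort indices by diagonal (i + j),
        # alternating the secondary sort key between row and column index.
        idx = sorted(((i, j) for i in range(h) for j in range(w)),
                     key=lambda p: (p[0] + p[1], p[1] if (p[0] + p[1]) % 2 else p[0]))
        return np.array([c[i, j] for i, j in idx[:n_coeffs]])

    # The truncated DCT vectors can then be fed to any classifier, e.g. a small MLP
    # (sklearn.neural_network.MLPClassifier), standing in for the neural network of [47].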

4. Face Recognition in Fourier Space 2000
The work in [42] describes a simple face recognition system based on an analysis of faces
via their Fourier spectra. The feature vectors are constructed by taking the Fourier
coefficients at selected frequencies as shown in Fig.2.14. Recognition is done by finding
the closest match between feature vectors using the Euclidean distance classifier.

Figure 2.14: Most variant frequencies: a) real, b) imaginary and c) selected numbering

5. Face Recognition Using Holistic Fourier Invariant Features 2001


Authors of [43] introduce the Spectroface representation which is based on the wavelet
transform and holistic Fourier invariant features as illustrated in Fig.2.15. Wavelet
transform is applied to the face image to eliminate the effect of facial expressions. Then,
the holistic Fourier invariant features (Spectroface) are extracted from the low frequency
subband image (LL) by applying Fourier transform twice. The first Fourier transform is
applied to the low frequency subband to make it invariant to the spatial translation. Then
the second Fourier transform is applied to the polar transformation of the result to make it
invariant to scale and on-the-plane rotation. Recognition is done by finding the closest
match, using the Euclidean distance, between the Spectroface of the probe image and those
stored in the gallery.

6. Face Recognition Based on Polar Frequency Features 2006


A novel biologically motivated face recognition algorithm based on polar frequency is
presented in [48]. Polar frequency descriptors are extracted from face images by Fourier-
Bessel transform (FBT), which is based on converting the image from Cartesian to polar
coordinates followed by extracting the Fourier-Bessel series. Examples of the FBT of
some images are shown in Fig.2.16. Next, the Euclidean distance between the FBTs of all
images is computed, and each image is then represented by its dissimilarity to the other
images. A Pseudo-Fisher Linear Discriminant is built on this dissimilarity space. The
results indicate the high informative value of the polar frequency content of face images
in relation to recognition and verification tasks.

Figure 2.15: Spectroface representation steps

Figure 2.16: Examples of FBT of (A) an 8 radial cycles image, (B) a 4 angular cycles image and (C)
an image of the average of these images. The magnitude of the FBT coefficients is presented in
colored levels (red indicates the highest value)

2.3.3 Other Holistic-Based Approaches

1. A Hybrid Feature Extraction Approach for Face Recognition based on Moments 2004
In this method [49], different feature extraction techniques such as Fourier descriptors,
Zernike moments, Hu moments and Legendre moments are considered and classification
techniques such as Nearest Neighbor classifiers, Linear Discriminant Analysis LDA
classifiers and neural network classifiers are compared. Results on ORL [112] database
show that using hybrid features composed of Fourier descriptors and Zernike moments
with back-propagation NN as a classifier give the best recognition results. Fig.2.17 shows
the block diagram for this face recognition system.

Figure 2.17: Block diagram for face recognition based on moments

2. Face Recognition with Support Vector Machines: Global versus Component-based Approach 2001
Authors of [18] present a local-based method and two holistic-based methods based on
using the support vector machine (SVM) and evaluate them with respect to robustness
against pose changes. The local-based method starts with locating facial components,
extracting and combining them into a single feature vector which is classified by a
Support Vector Machine (SVM), described above in the Local-based section. The two
holistic-based methods recognize faces by classifying a single feature vector consisting of
the gray values of the whole face image. In the first one, a single SVM classifier is
trained for each person in the database. The second system consists of sets of viewpoint-
specific SVM classifiers and involves clustering during training. Extensive tests are
performed on a database which included faces rotated up to about 40° in depth. The local-
based method clearly outperformed both holistic-based methods on all tests.

3. Face Recognition with Pose and Illumination Variations using new SVRDM Support Vector Machine 2005
A new support vector representation and discrimination machine (SVRDM) classifier is
proposed in [50] and face recognition-rejection results are presented using the CMU PIE
face database; both pose and illumination variations are considered. The recognition
approach is based on a view-based two-step strategy, in which the pose of a test input is
first estimated using SVRDM and this is followed by an identity classification with
another SVRDM assuming the estimated pose. Four different classifiers are compared;
namely SVM, SVRDM, Eigenface and Fisherface classifiers. Experimental results show
that the SVRDM performs best among all classifiers using the two-step strategy and that
the SVRDM is less sensitive to the size of the classification problem than are other
classifiers.

4. Face Recognition using a New Texture Representation of Face Images 2003
Authors of [51] present a new texture representation of face image using a robust feature
from the Trace transform. The masked Trace transform (MTT) offers texture information
for face representation which is used to reduce the within-class variance by masking out
the background and non pure-face information. Fig.2.18 shows an example of the trace
transform on a full face image and its masked version. The method starts with
transforming the image space to the Trace transform space to produce the MTT.
Weighted Trace transform (WTT) is then calculated which identifies the tracing lines of
the MTT that produce similar values irrespective of intra-class variations. Finally, a new
distance measure is proposed by incorporating the WTT for measuring the dissimilarity
between reference and test images.

Figure 2.18: Examples of the Trace transform on (a) full image (b) masked with rectangular shape
and (c) masked with elliptical shape.

5. Embedded Bayesian networks for face recognition 2002


The work in [52] introduces a family of embedded Bayesian networks (EBN), which
is considered a generalization of the embedded hidden Markov models, and investigates
their performance for face recognition. Results show that the members of the EBN family
outperform some of the existing approaches such as the Eigenface method and the
embedded HMM method.

2.4 Hybrid Approaches

1. Face Recognition Using Local and Global Features 2004


The work in [53] proposes to combine local and global facial features for face
recognition. Four popular face recognition methods, namely, Eigenface [24], Spectroface
[43], independent component analysis (ICA) [54], and local Gabor wavelet [15] are
selected for combination.
Since each of Spectroface, PCA, and ICA uses a distance measurement for classification,
while the local Gabor wavelet method uses a similarity measurement, these measurements should be
normalized to the same scale to be able to combine the four methods. Two normalization
methods, namely, linear-exponential normalization method and distribution-weighted
Gaussian normalization method, are proposed here.
In addition, to choose the best set of classifiers for recognition, a simple but effective
algorithm for classifiers selection is proposed. It is based on the leave-one-out algorithm
through an iterative scheme. The basic idea of the scheme is that if one classifier is
redundant, the accuracy will increase if that classifier is removed from the combination.
Finally, a weighted combination of classifiers based on the sum rule is used instead of
assigning equal weight to each classifier. Fig.2.19 shows both the training and
recognition stages. The experimental results show that the proposed method has 5–7%
accuracy improvement over using a single global/local classifier.

Figure 2.19: Training and recognition stages of Face Recognition Using Local and Global Features
approach

2. A Probabilistic Fusion Methodology for Face Recognition 2005


The work in [55] considers three facial features, two global and one local, which are the
entire face (i.e., the gray-level image of the face), the edginess image of the face, and the
eyes, respectively. Fig.2.20 shows the three facial features. The edginess image is a
global facial feature that is reasonably robust to illumination. It is a measure of the
change in intensity from one pixel to the next. The eyes are manually located and cropped

to be used as a facial feature that is robust to facial expressions and occlusions (especially
when the lower part of the face is fully occluded).

Figure 2.20: (a) A gray-scale face image, (b) its edginess image, and (c) the cropped eyes.
Next, the facial features are encoded to lower-dimensional feature spaces using the
principal component analysis (PCA) in conjunction with Fisher’s Linear Discriminant
(FLD). Three individual spaces are constructed corresponding to the three facial features.
The distance-in-feature-space (DIFS) values are calculated for all the images in the
training set and in each of the feature spaces. These values are used to compute the
distributions of the DIFS values. Given a new test image, the three facial features are first
extracted and their DIFS values are computed in each feature space. Each feature
provides an opinion on the claim in terms of a confidence value. The confidence values
of all the three features are fused for final recognition. The identity established by the
proposed fusion technique is more reliable compared to the case when features are used
individually.

2.5 Performance Evaluations and Comparative Studies


2.5.1 Performance Evaluation
Performance evaluations of biometric technology are divided into three categories:
technology, scenario, and operational. Each category of evaluation takes a different
approach and studies different aspects of the system. A thorough evaluation of a system
for a specific purpose starts with a technology evaluation, followed by a scenario
evaluation and finally an operational evaluation [113].
The goal of a technology evaluation is to compare competing algorithms from a single
technology, which in this case is facial recognition. Testing of all algorithms is done on a
standardized database collected by a "universal" sensor and should be performed by an
organization that will not see any benefit should one algorithm outperform the others.
The use of a test set ensures that all participants see the same data. Someone with a need
for facial recognition can look at the results from the images that most closely resemble
their situation and can determine, to a reasonable extent, what results they should expect.
Technology evaluations are always completely repeatable. Results from a technology
evaluation typically show specific areas that require future research and development, as
well as provide performance data that is useful when selecting algorithm(s) for scenario

evaluations. This evaluation class includes the FERET (face recognition technology)
series of face recognition evaluations and the FRVT (face recognition vendor test) series
[114].
Scenario evaluations aim to evaluate the overall capabilities of the entire system for a
specific application. In face recognition, a technology evaluation would study the face
recognition algorithms only but the scenario evaluation studies the entire system,
including camera and camera-algorithm interface, for a specific application. An example
is face recognition systems that verify the identity of a person entering a secure room.
Each tested system would normally have its own acquisition sensor and would thus
receive slightly different data. Scenario evaluations are not always completely repeatable
for this reason, but the approach used can always be completely repeatable. Scenario
evaluations typically take a few weeks to complete because multiple trials, and for some
scenario evaluations, multiple trials of multiple subjects/areas, must be completed.
Results from a scenario evaluation typically show areas that require additional system
integration, as well as provide performance data on systems for a specific application. An
example of the scenario evaluation is the UK Biometric Product Testing [56].
At first glance, an operational evaluation appears very similar to a scenario evaluation,
except that the test is at the actual site and uses actual subjects. Rather than testing for
performance, however, operational evaluations aim to study the workflow impact of
specific systems installed for a specific purpose. Operational evaluations are not very
repeatable unless the actual operational environment naturally creates repeatable data.
Operational evaluations typically last from several weeks to several months. The
evaluation team must first examine workflow performance prior to technology insertion,
and again after users are familiar with the technology. Accurate analysis of the benefit of
the new technology requires a comparison of the workflow performance before and after
the technology insertion.
In an ideal three-step evaluation process, technology evaluations are performed on all
applicable technologies that could conceivably meet requirements. The technical
community will use the results to plan future research and development (R&D) activities,
while potential end-users will use the results to select promising systems for application-
specific scenario evaluations. Results from the scenario evaluation will enable end-users
to find the best system for their specific application and have a good understanding of
how it will operate at the proposed location. This performance data, combined with
workflow impact data from subsequent operational evaluations, will enable decision
makers to develop a solid business case for a large-scale installation.

2.5.2 Comparative Studies
Many comparative studies have been established in the last 10 years due to the increase in
the number of available algorithms and techniques. These comparative studies usually try
to evaluate two or more face recognition algorithms using one or more small to medium
size databases. They can be classified into two categories according to the nature of
databases they work on:
1. Comparative studies using general database(s), which usually contain more than one
variation, to determine which algorithm is better over these databases, as in [40],
[57-60].
2. Comparative studies for specific variation(s) using suitable database(s) each with
one variation only to study the algorithms over each variation separately, as in
[21], [39], [61-65].
Under the first category, [57] evaluate three holistic-based algorithms and a local one
using seven different classifiers over FERET database with variations in illumination and
aging. In [40], seven Eigenspace-based algorithms with five similarity matching criteria
are compared over two databases, Yale with variations in illumination and expressions
and FERET. Five algorithms based on local binary patterns are compared with a holistic
and a local algorithm in [59] over two databases, BANCA with complex background and
difficult lighting conditions and XM2VTS with uniform background. In [60], two
holistic algorithms and a local one are compared over four different databases, each with
two or more variations.
In the second category, only the pose variation is considered in [61] and [62]. In [61], two
holistic-based algorithms are compared using two databases (ALAN and UMIST). In
[62], five holistic-based algorithms are compared using one database (FERET). In [39]
and [21], expressions, illumination and aging variations are tested separately using
FERET database over three holistic-based algorithms in [39] and two holistic-based and
two local-based algorithms in [21]. Non-uniform illumination variation is considered in
[63] in which five holistic-based algorithms are compared over two different databases,
CMU-PIE and YALE B each with illumination variations only.

As the main aim of this thesis is to propose an illumination normalization approach, the
next chapter will discuss the different illumination normalization approaches in the
literature. Also, nine comparative studies are introduced at the end of the chapter to select
the best-of-literature approaches.

CHAPTER 3: Illumination Normalization Approaches
3.1 Introduction
Although many face recognition techniques and systems have been proposed in the last
20 years, evaluations of the state-of-the-art techniques and systems have shown that
recognition performance of most current technologies degrades due to the variations of
illumination [78], [12]. In the last face recognition vendor test FRVT 2006 [12], they
conclude that relaxing the illumination condition has a dramatic effect on the
performance. Moreover, it has been proven both experimentally [84] and theoretically
[14] that the variations between the images of the same face due to illumination are
almost always larger than image variations due to change in face identity. As is evident in
Fig.3.1, the same subject, with the same facial expression, can appear strikingly different
when light source direction and viewpoint vary [79].

Figure 3.1: The same individual imaged with the same camera and the same facial expression may
appear dramatically different with changes in the lighting conditions.
There has been much work dealing with illumination variations in face recognition.
Generally, these approaches can be classified into two categories: model-based and
image-processing-based approaches.
Model-based approaches derive a model of an individual face, which will account for
variation in lighting conditions. Examples of this approach include spherical harmonics
representation [81], Eigen Light-Fields [82], illumination cone [64], Quotient Image
[100], Self Quotient Image [107] and Retinex algorithms [89], [90], [66]. Though the
model-based approaches are perfect in theory, they require a training set with several
different lighting conditions for the same subject, which can be considered as a weakness
for realistic applications. Although some work has been done to enlarge a small learning
set by virtually re-imaging the input face image as in [83], [96], the requirements of
additional constraints or assumptions in addition to the highly computational cost make
the model-based approaches unsuitable for realistic applications [80], [71].
Image-processing-based approaches attempt to normalize the variation in appearance due
to illumination, either by image transformations or by synthesizing a new image from the
given image in some normalized form. Recognition is then performed using this

canonical form. Examples of this approach include histogram equalization/matching
HE/HM [71], gamma intensity correction GIC [84], local binary patterns LBP [80] and
local normal distribution LNORM [72]. Compared to the model-based approach,
preprocessing has two main advantages: it is completely stand-alone and thus can be used
with any classifier. Moreover, it transforms images directly without any training images,
assumptions or prior knowledge. Therefore, image-processing-based approaches are more commonly used in practical
systems for their simplicity and efficiency.
The rest of this chapter is organized as follows: sections 2 and 3 contain the description
of some model-based and image-processing-based approaches, respectively. Section 4
contains the results of existing comparative studies among different illumination
normalization approaches focusing on the best approaches of each comparison and then
concludes the best-of-literature approaches from these studies.

3.2 Model-Based Approaches

1. Quotient Illumination Relighting (QIR)


The work in [84] first proposes the Quotient Illumination Relighting (QIR) approach
for robust face recognition under varying lighting conditions. QIR is based on the
Lambertian model in which the face image can be described by the product of the albedo
and the cosine angle between a point light source and the surface normal, as follows:
I(x, y) = ρ(x, y) n(x, y)^T s    (3.1)
where 0 ≤ ρ(x, y) ≤ 1 is the surface reflectance (albedo) associated with point x, y in the
image, n(x, y) is the surface normal direction associated with point x, y in the image, and
s is the light source direction (point light source), whose magnitude is the light source
intensity [100].
To understand the idea of QIR, we firstly need to consider the following three definitions
from [84]:
1. Ideal class of objects: is a collection of 3D objects that have the same shape but
differ in the surface albedo function. The image space of such a class is
represented by:
ρ_i(x, y) n(x, y)^T s_j    (3.2)
where ρi (x, y) is the albedo of object i of the class, n(x, y) is the surface normal of
the object (the same for all objects of the class), and sj is the light source
direction, which can vary arbitrarily.
2. Quotient Illumination: for the lighting condition sj of an ideal class of objects
(whose shape is n) is defined as:
R_j(x, y) = [n(x, y)^T s_j] / [n(x, y)^T s_0]    (3.3)

where s_0 (point light source) is a pre-defined canonical lighting condition.
Obviously, the Quotient Illumination is completely independent of the surface
reflectance (albedo), and depends only on the variance of the lighting condition
from the pre-defined canonical lighting one (considering all the shapes are
assumed to be the same). Thus, Quotient Illumination can be computed easily by
calculating the quotient between the images of the object i of the ideal class of
objects as follows:
R_j(x, y) = [ρ_i(x, y) n(x, y)^T s_j] / [ρ_i(x, y) n(x, y)^T s_0] = I_ij(x, y) / I_i0(x, y)    (3.4)
where Iij is the image of object i captured under the j-th lighting condition, and Ii0
is the image of the same object i captured under the canonical lighting condition.
3. Quotient illumination bootstrap set: since faces are not strictly ideal class of
objects as the 3D shapes of faces are different despite their approximate
similarity. Therefore, a Quotient illumination bootstrap set needs to be
constructed. It consists of a set of pairs of face images captured under some non-
canonical lighting condition and under the pre-defined canonical lighting
condition, as follows:
{(I_ij, I_i0) | i = 1, 2, ..., N; j = 1, 2, ..., L}    (3.5)
where N is number of persons and L is number of non-canonical lighting
conditions in the system. Given such a bootstrap set, Quotient Illumination can be
statistically modeled or computed simply as the mean over all the faces in the set,
for instance:
R_j(x, y) = (1/N) Σ_{i=1}^{N} [I_ij(x, y) / I_i0(x, y)],   j = 1, 2, ..., L    (3.6)

Finally, the Quotient Illumination Relighting (QIR) can be computed as follows:


Given an image of arbitrary face, Iij, assume that it is lighted by the j-th known lighting
condition, and the j-th quotient illumination Rj has been computed too. Then, its
canonical image captured under the pre-defined 0-th lighting condition can be derived by:
I_i0(x, y) = I_ij(x, y) / R_j(x, y)    (3.7)
This provides a direct and simple way for illumination normalization but under the
condition that the direction of the lighting source of the image can be known. See Fig.3.2
for its intuitive effect on an illuminated face image from Yale B database.
Note that the QIR is based on the assumption that the lighting modes of the images, both
probe and references, are known or can be estimated. This is a strong constraint in a
practical application system.
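A minimal sketch of equations 3.6 and 3.7 is given below, assuming the bootstrap pairs are grouped by lighting condition and that the lighting condition j of the probe is known; the small constant added to the denominators is only there to avoid division by zero.

    import numpy as np

    def estimate_quotient_illuminations(bootstrap_pairs_by_condition):
        # Equation (3.6): R_j is the mean of I_ij / I_i0 over the bootstrap faces,
        # one quotient-illumination map per non-canonical lighting condition j.
        R = {}
        for j, pairs in bootstrap_pairs_by_condition.items():
            ratios = [lit.astype(float) / (canonical.astype(float) + 1e-6)
                      for lit, canonical in pairs]
            R[j] = np.mean(ratios, axis=0)
        return R

    def relight_to_canonical(image, j, R):
        # Equation (3.7): divide the probe by the quotient illumination of its
        # (known or estimated) lighting condition j.
        return image.astype(float) / (R[j] + 1e-6)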

Figure 3.2: Effect of applying QIR on an illuminated face image from the Yale B database: (a) illuminated image, (b) QIR image

2. Self-Quotient Image (SQI)


To avoid the requirement for knowing/estimating the lighting modes of images in QIR,
the concept of Self-Quotient Image (SQI) is first introduced in [107] for robust face
recognition under varying lighting conditions. SQI is based also on the Lambertian model
[107] rather than the reflectance illumination model [107], and it is defined by:
Q = I / Î = I / (F * I)    (3.8)
where Î is the smoothed version of I, F is the smoothing kernel, and the division is
point-wise. Fig.3.3 shows the effect of applying SQI on illuminated face images from two
face databases, Yale B and CMU PIE.

Figure 3.3: Effect of applying the SQI approach to illuminated face images from (a) Yale B and (b) CMU PIE databases.
The only processing needed for SQI is smoothing filtering. A weighted Gaussian filter is
designed for anisotropic smoothing according to the following equation:
F = G * W    (3.9)

where W is the weight and G is the Gaussian kernel, and N is the normalization factor for
which:
(1/N) Σ_Ω W G = 1    (3.10)

where Ω is the convolution kernel size. The convolution region is divided into two sub-
regions M1 and M2 with respect to a threshold τ, where M1 has more pixels than M2. The threshold τ is
calculated by:
τ = Mean(I_Ω)    (3.11)
For the two sub-regions, W has corresponding value:
W(i, j) = 0 if I(i, j) ∈ M2;   W(i, j) = 1 if I(i, j) ∈ M1    (3.12)
If the convolution image region is smooth, i.e. little gray value variation (non-edge
region), there is little difference between smoothing the whole region and smoothing only part of
the region. If there is a large gray value variation in the convolution region, i.e. an edge region,
the threshold can divide the convolution region into two parts M1 and M2 along the edge
and the filter kernel will convolute only with the large part M1, which contains more
pixels. Therefore the halo effects can be significantly reduced by the weighted Gaussian
kernel.
The essence of this anisotropic filter is that it smoothes only the main part of convolution
region (i.e. only one side of edge region in case of step edge region).
The division operation in the SQI may magnify high-frequency noise, especially in low
signal-to-noise-ratio regions, such as shadows. To reduce noise in Q, a nonlinear
transformation function is used to transform Q into D:
D = T(Q)    (3.13)
where T is a nonlinear transform which may be Log, Arctangent or Sigmoid nonlinear
function.
The implementation of SQI approach is summarized below:
1. Select several smoothing kernels G1, G2, …, Gn and calculate the corresponding
weights W1, W2, …, Wn according to image I, and then smooth I by each weighted
anisotropic filter WGi.
Î_k = (1/N) I * (W G_k),   k = 1, 2, ..., n    (3.14)
Calculate the self-quotient image (SQI) between the input image I and its smoothed version:
Q_k = I / Î_k,   k = 1, 2, ..., n    (3.15)

2. Transform each self-quotient image (SQI) with a nonlinear function:
D_k = T(Q_k),   k = 1, 2, ..., n    (3.16)
3. Sum the nonlinearly transformed results:
Q = Σ_{k=1}^{n} m_k D_k    (3.17)

The m_1, m_2, …, m_n are the weights for each filter scale and are all set to one in the
experiments of [107].
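A compact (and deliberately slow, per-pixel) sketch of the SQI steps above is given below, with the logarithm used as the nonlinear transform T and equal weights m_k = 1; the kernel sizes and σ values of the scales are assumptions of this illustration rather than values prescribed by [107].

    import numpy as np

    def gaussian_kernel(size, sigma):
        half = size // 2
        y, x = np.mgrid[-half:half + 1, -half:half + 1]
        return np.exp(-(x**2 + y**2) / (2 * sigma**2))

    def weighted_smooth(I, size, sigma):
        # Anisotropic smoothing: inside each window, only pixels on the larger side of
        # the local mean (region M1) contribute to the weighted Gaussian average.
        G = gaussian_kernel(size, sigma)
        half = size // 2
        padded = np.pad(I.astype(float), half, mode='edge')
        out = np.zeros_like(I, dtype=float)
        for i in range(I.shape[0]):
            for j in range(I.shape[1]):
                win = padded[i:i + size, j:j + size]
                tau = win.mean()
                m1 = win >= tau if (win >= tau).sum() >= (win < tau).sum() else win < tau
                W = m1.astype(float)
                out[i, j] = (win * W * G).sum() / ((W * G).sum() + 1e-12)
        return out

    def self_quotient_image(I, scales=((3, 1.0), (9, 2.0), (15, 4.0))):
        # Multi-scale SQI: Q_k = I / smoothed_k, log as nonlinear transform, equal-weight sum.
        I = I.astype(float) + 1.0
        D = [np.log(I / (weighted_smooth(I, s, sig) + 1e-6)) for s, sig in scales]
        return sum(D)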

3. Single Scale Retinex with Histogram Matching (SSR-HM)


When the dynamic range of a scene exceeds the dynamic range of the recording medium,
the visibility of color and detail will usually be quite poor in the recorded image.
Dynamic range compression attempts to correct this situation by mapping a large input
dynamic range to a relatively small output dynamic range. Simultaneously, the colors
recorded from a scene vary as the scene illumination changes. Color constancy aims to
produce colors that look similar under widely different viewing conditions and
illuminants. The Retinex is an image enhancement algorithm that provides a high level of
dynamic range compression and color constancy [89].
The work in [66] proposes an illumination normalization approach based on applying the
Retinex followed by histogram matching for illumination invariant face recognition.
Unlike the QIR and SQI approaches, Retinex algorithms are based on the
reflectance illumination model [107] rather than the Lambertian model [107]:
I(x, y) = R(x, y) × L(x, y)    (3.18)
where I is the image, R is the reflectance of the scene and L is illuminance/lighting at
each point (x, y).
Many variants of the Retinex have been published over the years. The last version from
Land [90] is now referred to as the Single Scale Retinex (SSR) and is defined for a point
(x, y) in an image as:
I_R(x, y) = log I_i(x, y) − log[F(x, y) ⊗ I_i(x, y)]    (3.19)
where IR(x, y) is the Retinex output and Ii(x, y) is the image distribution in the i-th
spectral band. There are three spectral bands – one each for red, green and blue channels
in a color image.
In equation 3.19, the symbol ⊗ represents the convolution operator and F(x, y) is the
Gaussian surround function given by equation 3.20. The final image produced by Retinex
processing is denoted by IR:
F(x_1, x_2) = κ exp[−(x_1² + x_2²) / σ²]    (3.20)

where σ is the standard deviation of the filter and controls the amount of spatial detail that
is retained, and κ is a normalization factor that keeps the area under the Gaussian curve
equal to one.
The standard deviation of the Gaussian σ is referred to as the scale of the SSR. A small
value of σ provides very good dynamic range compression but at the cost of poorer color
rendition, causing graying of the image in uniform areas of color. Conversely, a large
scale provides better color rendition but at the cost of dynamic range compression [89].
Since face recognition is conventionally performed on grey-scale images, the loss of
color is not a concern here. Moreover, the dynamic range compression gained by small
scales is the essence of the illumination normalization process proposed in [66]. All the
shadowed regions are grayed out to a uniform color, eliminating soft shadows and
specularities and hence creating an illumination invariant signature of the original image.
Fig.3.4 illustrates the effect of Retinex processing on a facial image, I, for different
values of σ. As σ increases, the normalized image IN, contains reduced graying and lesser
loss of intensity values, as seen in Fig.3.4 (c) and (d). However, for larger values of σ, the
shadow is still visible. On the other hand, with σ = 6 in Fig.3.4 (b), the resulting image
has grayed out the shadow region to blend in with the rest of the face.

Figure 3.4: The effect of the scale σ on processing an illuminated facial image using the SSR: (a) illuminated image, (b) I_R with σ = 6, (c) I_R with σ = 50, (d) I_R with σ = 100.
Finally, histogram matching/fitting is applied to bring all the images that have
been processed by the SSR to the same dynamic range of intensity [66]. The histogram of
IR is modified to match a histogram of a specified target image ÎR as shown in Fig.3.5.
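The SSR followed by histogram matching can be sketched as follows; the Gaussian smoothing uses SciPy, the histogram matching is a simple CDF-based mapping, and the offsets added before the logarithms are only there to avoid log(0).

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def single_scale_retinex(image, sigma=6.0):
        # Equation (3.19): log of the image minus log of its Gaussian-smoothed version.
        I = image.astype(float) + 1.0
        return np.log(I) - np.log(gaussian_filter(I, sigma) + 1.0)

    def match_histogram(source, target):
        # Map the SSR output so that its histogram matches that of a target SSR image.
        s_values, s_idx, s_counts = np.unique(source.ravel(),
                                              return_inverse=True, return_counts=True)
        t_values, t_counts = np.unique(target.ravel(), return_counts=True)
        s_cdf = np.cumsum(s_counts).astype(float) / source.size
        t_cdf = np.cumsum(t_counts).astype(float) / target.size
        mapped = np.interp(s_cdf, t_cdf, t_values)
        return mapped[s_idx].reshape(source.shape)

    # usage: normalized = match_histogram(single_scale_retinex(probe, sigma=6),
    #                                     single_scale_retinex(reference, sigma=6))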

4. GROSS Method
The work in [85] proposes an illumination normalization approach, which we call the GROSS
approach, for illumination invariant face recognition. Like the Retinex algorithms,
GROSS is based on the reflectance illumination model rather than the Lambertian model,
see equation 3.18.
The GROSS approach is motivated by two widely accepted assumptions about human
vision:
1. Human vision is mostly sensitive to scene reflectance and mostly insensitive to
the illumination conditions.

2. Human vision responds to local changes in contrast rather than to global
brightness levels.
Having these assumptions, the goal is to find an estimate of L(x, y) such that when it
divides I(x, y) it produces R(x, y) in which the local contrast is appropriately enhanced.

Figure 3.5: Histogram-fitted version of SSR with σ = 6: (a) illuminated image I, (b) I_R with σ = 6, (c) source SSR histogram, (d) well-lit image Î, (e) Î_R with σ = 6, (f) target SSR histogram, (g) histogram matched/fitted image, (h) histogram of the image in (g).


This view is called the perception gain model, in which R(x, y) takes the place of the perceived
sensation, I(x, y) takes the place of the input stimulus, and 1/L(x, y) is then called the perception
gain, which maps the input stimulus into the perceived sensation, that is:
R(x, y) = [1 / L(x, y)] · I(x, y)    (3.21)
The solution for L(x, y) is found by minimizing:
J(L) = ∫∫_Ω ρ(x, y)(L − I)² dx dy + λ ∫∫_Ω (Lx² + Ly²) dx dy    (3.22)
where the first term drives the solution to follow the perception gain model, while the
second term imposes a smoothness constraint. Here Ω refers to the image. The parameter

λ controls the relative importance of the two terms. The space varying permeability
weight ρ(x, y) controls the anisotropic nature of the smoothing constraint.
The Euler-Lagrange equation for this calculus of variation problem yields:
L + (λ/ρ)(Lxx + Lyy) = I    (3.23)
Discretized on a rectangular lattice, this linear partial differential equation (PDE)
becomes:
Li,j + λ [ (1/(h·ρi,j−½))(Li,j − Li,j−1) + (1/(h·ρi,j+½))(Li,j − Li,j+1) + (1/(h·ρi−½,j))(Li,j − Li−1,j) + (1/(h·ρi+½,j))(Li,j − Li+1,j) ] = I    (3.24)

where h is the pixel grid size and the value of each ρ is taken in the middle of the edge
between the center pixel and each of the corresponding neighbors (see Fig.3.6). In this
formulation, ρ controls the anisotropic nature of the smoothing by modulating
permeability between pixel neighbors. Equation 3.24 can be solved numerically using
multi-grid methods for boundary value problems [108] in order O(N).

Figure 3.6: Discretization lattice for the PDE in equation 3.23


The smoothness is penalized at every edge of the lattice by the weights ρ (see Fig.3.6). To
make the weights change proportionally with the strength of the discontinuities, the
following relative measure of local contrast is used, which equally "respects"
boundaries in shadows and in bright regions:
h·ρ(a+b)/2 = ΔI / I = |Ia − Ib| / min(Ia, Ib)    (3.25)
where ρ(a+b)/2 is the weight between two neighboring pixels whose intensities are Ia and Ib.

Fig.3.7 shows the effect of applying the GROSS approach to some illuminated face images
from the Yale B database.
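To make the procedure concrete, the following is a rough MATLAB-style sketch of the GROSS smoothing step, using plain Gauss–Seidel sweeps over the interior pixels instead of the multigrid solver of [108]; the file name, smoothness weight and fixed iteration count are hypothetical assumptions:

    I = im2double(imread('face.pgm'));          % hypothetical input image
    lambda = 1; nIter = 200; e = 1e-6;          % assumed parameters
    L = I;                                      % initialise the illuminance estimate
    [rows, cols] = size(I);
    for it = 1:nIter
        for i = 2:rows-1
            for j = 2:cols-1
                nb   = [I(i,j-1) I(i,j+1) I(i-1,j) I(i+1,j)];        % 4 neighbours
                hrho = abs(I(i,j) - nb) ./ max(min(I(i,j), nb), e);  % eq. 3.25
                w    = 1 ./ (hrho + e);                              % 1/(h*rho) in eq. 3.24
                Lnb  = [L(i,j-1) L(i,j+1) L(i-1,j) L(i+1,j)];
                L(i,j) = (I(i,j) + lambda * sum(w .* Lnb)) / (1 + lambda * sum(w));
            end
        end
    end
    R = I ./ max(L, e);                         % reflectance estimate, R = I / L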

Note that the GROSS approach does not require any training steps, knowledge of 3D face
models or reflective surface models. Only a single parameter, whose meaning is intuitive
and simple to understand, needs to be adjusted by the user.

(a) Illuminated images from Yale B

(b) Processed images by GROSS approach

Figure 3.7: Effect of applying GROSS approach on some illuminated face images from Yale B
database

3.3 Image-Processing-Based Approaches


The Image-processing-based approaches are applied either globally on the whole image
or locally over blocks or regions. The global approaches usually produce realistic images
while the local approaches have the disadvantage that the output is not necessarily
realistic. But since the objective is to obtain a representation of the face that is invariant
to illumination, while keeping the information necessary to allow a discriminative
recognition of the subjects, it’s possible to use them for illumination normalization.
However, since local approaches are highly dependent on the local distribution of the
pixels within the image, they are sensitive to geometrical effects on the images such as
translation, rotation and scaling, unlike global approaches, which are not affected by such
geometrical changes. Following are some examples under each approach.

3.3.1 Global Approaches

1. Histogram Equalization (HE)


It’s one of the most common illumination normalization approaches [74]. It aims to
create an image with uniform distribution over the whole brightness scale by using the
cumulative density function of the image as a transfer function. Thus, for an image of

size M × N with G gray levels and cumulative histogram H(g), the transfer function at
certain level T(g) is given as follows:
T(g) = H(g) × (G − 1) / (M × N)    (3.26)
Fig.3.8 shows the effect of applying histogram equalization to an illuminated image. The
illuminated image and its histogram are shown in Fig.3.8 (a), (b) while the equalized
image and its corresponding histogram are shown in Fig.3.8 (c), (d).
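A minimal MATLAB-style sketch of equation 3.26 for an 8-bit grayscale image could look as follows (the file name is hypothetical; the built-in histeq of the Image Processing Toolbox implements the same idea):

    X = imread('face.pgm');                    % uint8 image of size M x N
    [M, N] = size(X);
    G = 256;                                   % number of gray levels
    H = cumsum(imhist(X));                     % cumulative histogram H(g)
    T = uint8(round(H * (G - 1) / (M * N)));   % transfer function T(g), eq. 3.26
    Y = T(double(X) + 1);                      % map every pixel through T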

(a) Illuminated image   (b) its histogram   (c) Equalized image   (d) its histogram
Figure 3.8: Effect of applying histogram equalization on an illuminated image

2. Histogram Matching/Fitting (HM)


HM is one of the most commonly used techniques of histogram adjustment. Given an illuminated
face image X and a well-lit face image Y, histogram matching [74] is applied to bring the
illumination level of the input image X to that of the reference image Y. This is done by
making the histogram of X approximately "match" the histogram of Y, which gives
both images roughly the same mean and variance in their histograms. Fig.3.9
demonstrates the histogram matching process to an illuminated image. The illuminated
image (source) and its histogram are shown in Fig.3.9 (a), (b). The well-lit image (target)
and its corresponding histogram are shown in Fig.3.9 (c), (d) respectively. The resulting
image and its histogram after applying the histogram matching are shown in Fig.3.9 (e),
(f).

(a) Illuminated image (source)   (b) its histogram   (c) Well-lit image (target)   (d) its histogram   (e) Resulting image   (f) its histogram
Figure 3.9: Histogram matching process to an illuminated image
To explain the algorithm, let H(i) be the histogram function of an illuminated image X,
and G(i) be the desired histogram of the well-lit image Y; we wish to map H(i) to G(i) via
a transformation FH→G(i). We first compute a transformation function for both H(i) and
G(i) that maps each histogram to a uniform distribution, U(i). These functions are
FH→U(i) and FG→U(i), respectively. Equations 3.27 and 3.28 depict the mapping to a
uniform distribution, which is also known as histogram equalization [74].
FH→U(i) = ∑_{j=0}^{i} H(j) / ∑_{j=0}^{N−1} H(j)    (3.27)

FG→U(i) = ∑_{j=0}^{i} G(j) / ∑_{j=0}^{N−1} G(j)    (3.28)

where N is the number of discrete intensity levels; N = 256 for 8-bit grayscale images.

To find the mapping function, FH→G(i), the function FG→U(i) is inverted to obtain FU→G(i).
Since the domain and the range of functions of this form are identical, the inverse
mapping is trivial and is found by cycling through all values of the function. However,
due to the discrete nature of these functions, inverting can yield a function which is
undefined for certain values. Thus, linear interpolation is used, assuming smoothness,
to fill the undefined points of the inverse function according to the values of the well-defined
points. As a result, a fully defined mapping FU→G(i) is generated which
transforms a uniform histogram distribution into the distribution found in histogram G(i).
The mapping FH→G(i) can then be defined as in equation 3.29, [66].
FH→G(i) = FU→G(FH→U(i))    (3.29)
It is common in the literature to match all images, in both the training and testing sets, with a
single histogram of either a fixed well-lit image as in [71], [67] or an average image as in
[72].
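The mapping of equations 3.27–3.29 can be sketched as follows. This is a minimal MATLAB-style sketch, assuming X (illuminated input) and Y (a hypothetical well-lit reference) are uint8 grayscale images already loaded; the Image Processing Toolbox shortcut histeq(X, imhist(Y)) gives a similar result:

    N = 256;
    FHU = cumsum(imhist(X)); FHU = FHU / FHU(N);   % F_{H->U}(i), eq. 3.27
    FGU = cumsum(imhist(Y)); FGU = FGU / FGU(N);   % F_{G->U}(i), eq. 3.28
    [FGUu, idx] = unique(FGU);                     % keep unique values so that
    FUG = interp1(FGUu, idx - 1, FHU, 'linear', 'extrap');  % the inverse is defined
    map = uint8(min(max(round(FUG), 0), N - 1));   % F_{H->G}(i), eq. 3.29
    Z = map(double(X) + 1);                        % histogram-matched image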

3. Logarithmic Transform (LOG)


LOG is a frequently used technique of gray-scale transform. It simulates the logarithmic
sensitivity of the human eye to the light intensity. The general form of the log
transformation [74] is:
s = c log(1 + r ) (3.30)
where r and s are the old and new intensity values, respectively, and c is a gray-stretch
parameter used to linearly scale the result into the range [0, 255]. The shape of
the log curve in Fig.3.10 shows that this transformation maps a narrow range of dark
input gray-levels (shadows) into a wider range of output gray levels. The opposite is true
for the higher values of the input gray-levels. Fig.3.11 shows the effect of applying LOG
to an illuminated face image.

Figure 3.10: Transformation functions of LOG and GAMMA (L: number of gray levels)

Illuminated image → LOG → Resulting image
Figure 3.11: Effect of applying LOG approach to an illuminated face image.

4. Gamma intensity correction (GIC)


Gamma correction is a technique commonly used in the field of Computer Graphics. It
concerns how to display an image accurately on a computer screen. Images that are not
properly corrected can look either bleached out, or too dark. Gamma correction can
control the overall brightness of an image by changing the Gamma parameter, see
Fig.3.10 for the effect of choosing Gamma (γ) greater than one, which maps a narrow
range of dark input values (shadows) into a wider range of output values.
Unlike the traditional Gamma correction technique in Computer Graphics, but motivated
by its idea, the Gamma Intensity Correction (GIC) method is proposed by [84] to correct
the overall brightness of face images towards a pre-defined "canonical" face image. It is
formulated as follows:
Predefine a canonical face image, I0, which should be lit under some normal lighting
condition. Then, given any face image I captured under some unknown lighting
condition, its canonical image is computed by a Gamma transform applied pixel by pixel over the
image positions (x, y):
I′xy = G(Ixy; γ*)    (3.31)
where the Gamma coefficient γ* is computed by the following optimization process,
which aims at minimizing the difference between the transformed image and the
predefined normal face image I0:
γ* = arg min_γ ∑_{x,y} [G(Ixy; γ) − I0(x, y)]²    (3.32)

where Ixy is the gray level at image position (x, y), and
G(Ixy; γ) = c · Ixy^(1/γ)    (3.33)
is the Gamma transform; c is a gray-stretch parameter used to linearly scale the result
into the range [0, 255], and γ is the Gamma coefficient.
From equations 3.32 and 3.33, intuitively, the GIC is expected to make the overall
brightness of the input images best fit that of the pre-defined normal face images. Thus,
its intuitive effect is that the overall brightness of all the processed face images is

adjusted to the same level as that of the common normal face I0. See Fig.3.12 for its
intuitive effect.

Illuminated image (I), Normal image (I0), GIC image (I′)

Figure 3.12: Effect of applying GIC to an illuminated face image

5. Normal Distribution (NORM)


This technique normalizes the image by assuming that the gray values form a normal
distribution [72]. The idea is to make the mean (µr) and the standard deviation (σr) of the
resulting image zero and one, respectively. For an image with mean (µi) and standard
deviation (σi), the output image is calculated using the following equation.
f(I(x, y)) = (I(x, y) − µi) / σi    (3.34)
The effect of applying the NORM to an illuminated face image is shown in Fig.3.13.

Illuminated image → NORM → Resulting image

Figure 3.13: Effect of applying NORM approach to an illuminated image. (Note that the gray-level of
the resulting image is stretched to [0,255] for displaying purpose only)

3.3.2 Local Approaches

1. Local Normalization Methods


The local normalization methods have the disadvantage that the output is not necessarily
realistic. In the face recognition problem the objective is not to have a realistic image but
to obtain a representation of the face that is invariant to illumination, while keeping the
information necessary to allow a discriminative recognition of the subjects. With this idea
in mind, it makes sense to use local illumination normalization methods for this type of
application.

There are three local normalization methods proposed by [72] which are: Local
Histogram Equalization (LHE), Local Histogram Matching (LHM) and Local Normal
Distribution (LNORM). They are the same as their global counterparts described in
section 3.3.1 but applied locally. Applying a function locally means the following: take a
window from the image, starting in the upper-left corner, with a window size considerably
smaller than the image size, and apply the global normalization function to the
windowed image. This process is repeated by moving the window pixel by pixel over the whole
image and applying the normalization function at each position. Because the windows
overlap, the final pixel value is the average of all the results for that particular pixel, as sketched below.
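The sketch below illustrates this sliding-window scheme in MATLAB-style code for the LNORM case (zero mean, unit variance inside each window); the file name and window size are hypothetical choices:

    I = im2double(imread('face.pgm'));
    w = 5;                                                % assumed window size
    [rows, cols] = size(I);
    acc = zeros(rows, cols); cnt = zeros(rows, cols);
    normfun = @(B) (B - mean(B(:))) / (std(B(:)) + eps);  % local NORM, eq. 3.34
    for r = 1:rows - w + 1
        for c = 1:cols - w + 1
            B = normfun(I(r:r+w-1, c:c+w-1));             % normalise the windowed image
            acc(r:r+w-1, c:c+w-1) = acc(r:r+w-1, c:c+w-1) + B;
            cnt(r:r+w-1, c:c+w-1) = cnt(r:r+w-1, c:c+w-1) + 1;
        end
    end
    out = acc ./ cnt;                                     % average of all overlapping results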
Fig.3.14 shows the effect of applying the three local normalization methods, LHE, LHM
and LNORM to an illuminated face image.

(a) Illuminated Image (b) LHE (c) LHM (d) LNORM

Figure 3.14: Effects of applying the three local normalization methods to an illuminated face image

2. Region-based strategy combining GIC and HE


It is obvious that both HE and GIC are global transforms over the whole image area.
Therefore, they are doomed to fail when side lighting exists. To partly solve this problem,
[84] propose to process the face images based on different local regions, that is,
performing HE or GIC in some pre-defined face regions in order to better alleviate the
highlight, shading and shadow effect caused by the unequal illumination. Ideally, it is
expected to strictly partition the face according to the structure of the facial organs, for
instance, as illustrated in Fig.3.15.

Figure 3.15: An example of ideal region partition


However, a complex region partition needs a complicated region segmentation approach,
which is often impractical. Moreover, side lighting mainly causes asymmetry
between the left and right parts of the face, as well as an intensity difference
between the top and bottom regions. In [84], the strategy is therefore to simply partition
the face into four regions according to the given eye centers, as shown in Fig.3.16.

Figure 3.16: The four regions for illumination normalization
After the coarse partition of the face regions, HE or GIC can be conducted in the four
regions separately. Hereafter, the region-based HE is abbreviated to RHE, and the region-
based GIC to RGIC. The effects of the RHE and RGIC can be seen from Fig.3.17.

(a) Illuminated Image (b) RHE (c) RGIC (d) RGIC + RHE

Figure 3.17: The effects of applying region-based strategy of HE and GIC over the four face regions

3. Block Histogram Matching (BHM)


In [55], a simple block histogram matching (BHM) technique is proposed for
illumination compensation. It assumes that a reference image taken under well-controlled
lighting conditions is available. Let X and Y be the input and the reference images,
respectively, of size N × N pixels. The goal is to bring the illumination level of the input
image X to that of the reference image Y by applying BHM. Consider a block image BI
from the input image X with pixel locations ranging from 1 to M and also a block image
BR from the reference image Y at the corresponding pixel locations (Fig.3.18). A
histogram matching is applied to the input image block BI to make the pixel intensity
distribution of BI equivalent to the pixel intensity distribution of BR.

Figure 3.18: Block histogram matching. In each image pair, the left one is the input image while the
right one is the reference image.

The intensity values of the histogram-matched block image are then scaled with a windowing filter
H, which is defined below and shown pictorially in Fig.3.19:
BO(n, m) = BO(n, m) · H(n, m),  1 ≤ n, m ≤ M    (3.35)
where
H(n, m) =
  4nm / M²,                      1 ≤ n ≤ M/2,  1 ≤ m ≤ M/2,
  4m(M − n + 1) / M²,            M/2 < n ≤ M,  1 ≤ m ≤ M/2,
  4n(M − m + 1) / M²,            1 ≤ n ≤ M/2,  M/2 < m ≤ M,
  4(M − n + 1)(M − m + 1) / M²,  M/2 < n, m ≤ M.    (3.36)

Figure 3.19: The windowing filter H used in the Block HM method


By simultaneously shifting the blocks in both the horizontal and the vertical directions in
steps of M/2 + 1 pixel locations (as shown in Fig.3.18), and adding pixel intensity values
in overlapping regions, the final image Z is achieved. The intensity changes are smoothed
out across adjacent blocks. The blocks are overlapped to avoid edges and patches from
appearing in the illumination compensated image. The window H is defined such that the
sum of the weights in the overlapping region is 1. Fig.3.20 shows examples of images
taken under different illumination directions and the corresponding intensity normalized
images using BHM method. The reference image was kept the same for all the images.


Figure 3.20: Images before and after intensity normalization with BHM. (a) Input images, (b)
corresponding output images after applying BHM

4. Local Binary Patterns (LBP)


The local binary pattern (LBP) is a non-parametric operator which describes the local
spatial structure of an image. Ojala et al. [86] first introduced this operator and showed its
high discriminative power for texture classification. At a given pixel position (xc, yc),
LBP is defined as an ordered set of binary comparisons of pixel intensities between the
center pixel and its eight surrounding pixels, Fig.3.21.

Figure 3.21: The LBP operator

The decimal form of the resulting 8-bit word (LBP code) can be expressed as follows:
LBP(xc, yc) = ∑_{n=0}^{7} s(in − ic) 2ⁿ    (3.37)

where ic corresponds to the grey value of the center pixel (xc, yc), in to the grey values of
the 8 surrounding pixels, and function s(x) is defined as:
s(x) = { 1, x ≥ 0;  0, x < 0 }    (3.38)
By definition, the LBP operator is unaffected by any monotonic gray-scale
transformation which preserves the pixel intensity order in a local neighborhood. Note
that each bit of the LBP code has the same significance level and that two successive bit

values may have a totally different meaning. In fact, the LBP code may be interpreted
as a kernel structure index.
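A minimal MATLAB-style sketch of the basic 3×3 LBP operator (equations 3.37 and 3.38) over the interior pixels of a grayscale image is shown below; the file name and the clockwise neighbour ordering are assumptions (a different ordering only permutes the bit positions):

    I = double(imread('face.pgm'));
    [rows, cols] = size(I);
    dr = [-1 -1 -1  0  1  1  1  0];            % row offsets of the 8 neighbours
    dc = [-1  0  1  1  1  0 -1 -1];            % column offsets
    code = zeros(rows, cols);
    for n = 1:8
        shifted = I((2:rows-1) + dr(n), (2:cols-1) + dc(n));
        s = shifted >= I(2:rows-1, 2:cols-1);  % s(i_n - i_c), eq. 3.38
        code(2:rows-1, 2:cols-1) = code(2:rows-1, 2:cols-1) + double(s) * 2^(n-1);
    end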
Later, Ojala et al. [87] extended their original LBP operator to a circular neighborhood of
different radius size. Their LBPP,R notation refers to P equally spaced pixels on a circle of
radius R. In [80], the LBP8,2 operator illustrated in Fig.3.22 is used to preprocess the input
image before providing it to the face authentication algorithms: the face is represented
with its texture patterns given by the LBP operator at each pixel location as shown in
Fig.3.23.

Figure 3.22: The extended LBP operator with (8,2) neighborhood. Pixel values are interpolated for
points which are not in the center of a pixel.

LBP operator

Figure 3.23: Original image (left) processed by the LBP operator (right).
The conducted experiments in [80] show that the LBP operator provides a texture
representation of the face which improves the performances of two different face
authentication classifiers (PCA-LDA and HMM) as compared to histogram equalization.
Moreover, the obtained results are comparable to those obtained using the GROSS algorithm proposed in
[85] and described previously in section 3.2, on the same databases [88], while LBP removes the
need for parameter selection.

5. Laplacian of Gaussian
In order to remove the influence caused by illumination variations, [91] apply image
processing which is a combination of histogram equalization, Laplacian of Gaussian filter
and contrast adjustment. Histogram equalization is applied first to enhance biased-contrast
images in which pixel values are concentrated in a narrow range of intensity.
In order to remove pixel-intensity information while preserving local features that
are useful for recognition, [91] apply a Laplacian of Gaussian (LoG) filter after the
histogram equalization. The 2D LoG function centered on zero with
Gaussian standard deviation σ has the form:

LoG(x, y) = −(1/(πσ⁴)) [1 − (x² + y²)/(2σ²)] e^(−(x² + y²)/(2σ²))    (3.39)
Fig.3.24 shows the effects of the three steps of this approach. As shown in Fig.3.24 (c),
local features of each face are preserved and the influence of lighting is almost removed.
However, the contrast of the processed image is biased to a certain range. In order to improve
this contrast and emphasize the local features, contrast adjustment is applied. The final
processed image is shown in Fig.3.24 (d).
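The three steps can be sketched in MATLAB-style code as follows, assuming the Image Processing Toolbox; the file name, kernel size and σ are hypothetical choices, not necessarily the values used in [91]:

    I   = imread('face.pgm');
    he  = histeq(I);                               % 1) histogram equalisation
    k   = fspecial('log', 9, 1.5);                 % 2) LoG kernel, eq. 3.39
    lg  = imfilter(im2double(he), k, 'replicate');
    out = imadjust(mat2gray(lg));                  % 3) contrast adjustment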

(a) Illuminated Image (b) HE (c) LoG (d) Contrast Adjustment

Figure 3.24: Effects of applying the image processing steps proposed by [91]

6. Preprocessing Chain Approach (CHAIN)


The work in [109] proposes a preprocessing chain that incorporates a series of steps
chosen to counter the effects of illumination variations, local shadowing and highlights,
while still preserving the essential elements of visual appearance for use in recognition.
The chain consists of the following consecutive steps:
1. Gamma Correction
2. Difference of Gaussian (DoG)
3. Masking
4. Contrast Equalization
Gamma Correction replaces gray level I with I^γ, where γ ∈ [0, 1] is a user-defined
parameter. It has the effect of enhancing the local dynamic range of the image in dark or
shadowed regions, while compressing it in bright regions and at highlights.
Difference of Gaussian (DoG) Filtering. Gamma correction does not remove the
influence of overall intensity gradients such as shading effects. Shading induced by
surface structure is potentially a useful visual cue but it is predominantly low frequency
spatial information that is hard to separate from effects caused by illumination gradients.
High pass filtering can be applied to remove the effects of this shading. Moreover,
suppressing the highest spatial frequencies reduces aliasing and noise, and in practice it
often manages to do so without destroying too much of the underlying signal on which
recognition needs to be based. DoG filtering is a convenient way to obtain the resulting
bandpass behavior. Fine spatial detail is critically important for recognition so the inner
(smaller) Gaussian is typically quite narrow (σ0 ≤ 1 pixel), while the outer one might have
σ1 of 2–4 pixels or more, depending on the spatial frequency at which low frequency
information becomes misleading rather than informative. The work in [109] finds that σ1

≈ 2 typically gives the best results, but values up to about 4 are not too damaging and
may be preferable for datasets with less extreme lighting variations.
Masking. If a mask is needed to suppress facial regions that are felt to be irrelevant or
too variable, it should be applied at this point. Otherwise, either strong artificial gray-
level edges are introduced into the convolution, or invisible regions are taken into
account during contrast equalization.
Contrast Equalization. The final step of the CHAIN approach is to globally rescale the
image intensities to standardize a robust measure of overall contrast or intensity variation.
It is important to use a robust estimator because the signal typically still contains a small
admixture of extreme values produced by highlights, garbage at the image borders and
small dark regions such as nostrils. A simple and rapid approximation based on a two
stage process is applied to accomplish this:
I(x, y) = I(x, y) / [mean(|I(x′, y′)|ᵃ)]^(1/a)    (3.40)

I(x, y) = I(x, y) / [mean(min(τ, |I(x′, y′)|)ᵃ)]^(1/a)    (3.41)

Here, a is a strongly compressive exponent that reduces the influence of large values, τ is
a threshold used to truncate large values after the first phase of normalization, and the
mean is over the whole (unmasked part of the) image.
The resulting image is now well scaled but it can still contain extreme values. To reduce
their influence on subsequent stages of processing, a nonlinear function is finally applied
to compress over-large values. In [109], the hyperbolic tangent I(x, y) = τ tanh(I(x, y)/τ),
is used thus limiting I to the range (−τ, τ).
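The whole chain can be sketched in a few MATLAB-style lines using the default parameters reported in [109] (γ = 0.2, σ0 = 1, σ1 = 2, a = 0.1, τ = 10); the file name and the Gaussian kernel sizes are hypothetical choices, and the masking step is omitted:

    gamma = 0.2; s0 = 1; s1 = 2; a = 0.1; tau = 10;
    I = im2double(imread('face.pgm')) .^ gamma;                 % gamma correction
    I = conv2(I, fspecial('gaussian', 7,  s0), 'same') ...
      - conv2(I, fspecial('gaussian', 13, s1), 'same');         % DoG filtering
    % (an optional mask would be applied here)
    I = I ./ (mean(abs(I(:)).^a))^(1/a);                        % eq. 3.40
    I = I ./ (mean(min(tau, abs(I(:))).^a))^(1/a);              % eq. 3.41
    I = tau * tanh(I / tau);                                    % compress extreme values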
In [109], the default settings of the various parameters of CHAIN approach are
summarized in Table-3.1. Moreover, it’s found that the CHAIN approach gives similar
results over a broad range of parameter settings, which greatly facilitates the selection of
parameters. Fig.3.25 shows the effect of applying CHAIN approach on various
illuminated faces from Extended Yale B database.
Table 3.1: Default parameter settings for the CHAIN approach
Procedure                Parameter   Value
Gamma Correction         γ           0.2
DoG Filtering            σ0          1
                         σ1          2
Contrast Equalization    a           0.1
                         τ           10

(a) Illuminated face images (b) After applying CHAIN approach

Figure 3.25: Examples of images of one person from the Extended Yale-B frontal database. The
columns respectively give images from subsets 1 to 5.

3.4 Comparative Studies & Best-of-Literature Approaches


Here, we give a brief description of nine different comparative studies presented in
the literature. For each comparative study, we state the illumination normalization
approaches that are compared, the database(s) they are compared on, and the result(s) of the
study. At the end, we summarize the results of these studies and
introduce some relationships between them in an attempt to deduce the best
illumination normalization approaches in the literature.

1. Study 1
The study in [71] empirically compares five image-processing-based approaches for
illumination insensitive face recognition, which are:
1. Histogram Equalization (HE)
2. Histogram Matching (HM)
3. Logarithmic Transform (LOG)
4. Gamma intensity correction (GIC)
5. Self-Quotient Image (SQI)
These approaches are compared on the CMU-PIE database [102], the FERET database
[120] and the CAS-PEAL database [103]. The PCA followed by LDA approach is used
for face recognition.
The results on the lighting subsets of the three databases show that HM gives the best
results among the five approaches on FERET and CAS-PEAL, while it comes second after
GIC on CMU-PIE.

2. Study 2
The work in [84] proposes the following three illumination normalization approaches:
1. Gamma Intensity Correction (GIC) method.
2. Region-based strategy combining GIC and the Histogram Equalization (HE).

3. Quotient Illumination Relighting (QIR) method.
Experiments are then conducted to compare the following approaches:
1. HE: Histogram equalization globally over the images.
2. RHE: Region-based Histogram equalization.
3. GIC: Gamma Intensity Correction globally.
4. RGIC: Region-based GIC.
5. GIC+RHE: perform RHE after GIC.
6. RGIC+RHE: perform RHE after RGIC.
7. RHE+RGIC: perform RGIC after RHE.
8. HE+RGIC: perform RGIC after HE.
9. QIR: Quotient Illumination Relighting.
The above approaches are empirically compared on the Yale B database [64] and
Harvard database [106]. The simplest normalized correlation, i.e., cosine of the angle
between two image vectors, is exploited as the distance measurement. And for all
experiments, classification is performed using the nearest neighbor classifier.
The results show that the proposed QIR approach gives the best results on both databases.
However, the strong performance of QIR relies on the assumption that the lighting
modes of the images are known or can be estimated, which is a strong constraint in a
practical application system. In contrast, the RHE combined with RGIC methods are
more general and practical to exploit efficiently in a recognition system, since they
do not need the illumination estimation procedure.

3. Study 3
The work in [96] empirically compares the following 12 approaches for illumination
insensitive face recognition:
1. Correlation method [93].
2. Eigenface method [24].
3. Eigenface method without the first three principle components [32].
4. Nearest Neighbor using 9 training images per subject [93].
5. Linear Subspace [94].
6. Gradient angles [97]
7. Cones-attached [95].
8. Cones-cast [95].
9. Harmonic images (no cast shadow) [96].
10. Harmonic images-cast (with cast shadows) [96].
11. Nine point of lights (9PL) using simulated images [96].
12. Nine point of lights (9PL) using real images [96].
The above approaches are empirically compared on the Yale B database [64]. In all the
experiments on the last four methods, the actual recognition algorithm is straightforward.

For each test image, the usual L2 (Euclidean distance) is computed between the image
and all the subspaces. The identity associated with the subspace that gives the minimal
distance to the image is declared to be its identity.
The results show that only two of the 12 approaches give a 100% recognition rate: the
Cones-cast and the Nine Point of Lights (9PL) using real images.

4. Study 4
The work in [85] introduces a simple and automatic image-processing-based approach for
illumination normalization in face recognition, we call it GROSS approach. It empirically
compares the proposed approach with two other approaches which are Histogram
Equalization (HE) and Gamma Correction (GAMMA).
These approaches are compared on the Yale B database [64] and the CMU PIE database
[102]. In all experiments, the recognition accuracies are reported for two algorithms:
Eigenfaces (Principal Component Analysis (PCA)) and FaceIt [104], a commercial face
recognition system from Identix.
The results show the superiority of the proposed approach (GROSS) over HE and
GAMMA.
Fig.3.26 summarizes the above four studies, showing the best approaches from each
study.

Study 1. Approaches: HE, HM, GIC, LOG, SQI. Databases: CMU-PIE, FERET, CAS-PEAL. Recognition method: PCA→LDA. Best approaches: HM, GIC.
Study 2. Approaches: HE, RHE, GIC, RGIC, GIC+RHE, RGIC+RHE, RHE+RGIC, HE+RGIC, QIR. Databases: Yale B, Harvard. Recognition method: Cosine→NN. Best approaches: QIR, RHE+RGIC.
Study 3. Approaches: Correlation, Eigenface, Eigenface w/o 1st 3 PCs, NN (9 images), LS, GA, Cones-attached, Cones-cast, Harmonic (no cast), Harmonic-cast, 9PL (simulated), 9PL (real). Database: Yale B. Recognition method: Euclidean Distance. Best approaches: 9PL (real), Cones-cast.
Study 4. Approaches: GROSS, HE, GAMMA. Databases: Yale B, CMU PIE. Recognition methods: Eigenfaces, FaceIt. Best approaches: GROSS.

HE: Histogram Equalization; RHE: Region-based Histogram Equalization; HM: Histogram Matching; GIC: Gamma Intensity Correction; RGIC: Region-based Gamma Intensity Correction; QIR: Quotient Illumination Relighting; LS: Linear Subspace; GA: Gradient Angles; NN: Nearest Neighbor; 9PL: Nine Point of Lights; LOG: Logarithmic Function; SQI: Self Quotient Image.

Figure 3.26: Summary of the first four comparative studies. For each study, it lists the compared normalization approaches, the face databases and the face recognition methods, in addition to the best normalization approaches from that study.

5. Study 5
The work in [66] proposes an illumination normalization approach based on applying the
Single Scale Retinex (SSR) followed by histogram matching (HM) to bring all the
images to the same dynamic range of intensity; we denote it by SSR→HM. It then
compares the proposed approach with the histogram matching (HM) approach for
illumination insensitive face recognition.
The Yale B face database [64] is used for the face recognition experiments. A Support Vector
Machine (SVM) is used as the learning scheme [92].
The results show that using the proposed approach, SSR→HM, as an illumination
normalization approach gives better recognition rates than using HM alone.

6. Study 6
The study in [72] empirically compares the following seven image-processing-based
approaches, four global and three local, for illumination insensitive face recognition:
1. Gamma Intensity Correction (GIC).
2. Histogram Equalization (HE).
3. Histogram Matching (HM).
4. Normal Distribution (NORM).
5. Local Histogram Equalization (LHE).
6. Local Histogram Matching (LHM).
7. Local Normal Distribution (LNORM).
These approaches are compared on the Extended Yale B database [96] and the Yale B
database [64]. In all experiments, the nearest neighbor classifier using the Euclidean
distance between the images is used for recognition.
The results of the first experiment on the Extended Yale B database show that LNORM
gives the best results among the six other approaches.
The second experiment on the Yale B database is performed in order to compare the
results of LNORM with results found in the literature. When using LNORM with
a window size of 5 × 5 and training with five randomly selected images from Subset 1, the results in
[72] show that:
1. LNORM outperforms all nine illumination normalization approaches that appear
in Study 2 [84].
2. LNORM performs better than 10 approaches from Study 3 [96], while it gives
comparable results to Study 3's best two approaches, which are Cones-cast and
the Nine point of lights (9PL) using real images.
3. LNORM performs better than the Harmonic Image Exemplars approach [98].
As a result, we can find that LNORM is better than a total of 24 approaches. These
approaches are listed below in Table-3.2 together with the corresponding study for
each approach.

Table 3.2: List of the 24 illumination normalization approaches that LNORM outperforms
#   Approach                                                      Study No.
1 HE: Histogram equalization globally over the images Study 2
2 RHE: Region-based Histogram equalization Study 2
3 GIC: Gamma Intensity Correction globally Study 2
4 RGIC: Region-based GIC Study 2
5 GIC+RHE: perform RHE after GIC Study 2
6 RGIC+RHE: perform RHE after RGIC Study 2
7 RHE+RGIC: perform RGIC after RHE Study 2
8 HE+RGIC: perform RGIC after HE Study 2
9 QIR: Quotient Illumination Relighting Study 2
10 Correlation method [93] Study 3
11 Eigenface method [24] Study 3
12 Eigenface without the first three principal components [32] Study 3
13 Nearest Neighbor using 9 training images per subject [93] Study 3
14 Linear Subspace [94] Study 3
15 Gradient angles [97] Study 3
16 Cones-attached [95] Study 3
17 Harmonic images (no cast shadow) [96] Study 3
18 Harmonic images-cast (with cast shadows) [96] Study 3
19 Nine point of lights (9PL) using simulated images [96] Study 3
20 Harmonic Image Exemplars [98] Study 6
21 Histogram Matching (HM) Study 6
22 Normal Distribution (NORM) Study 6
23 Local Histogram Equalization (LHE) Study 6
24 Local Histogram Matching (LHM) Study 6

7. Study 7
The work in [105] introduces the Logarithmic Total Variation (LTV) model as a
preprocessing technique for face recognition under varying illumination. The proposed
approach is empirically compared with the following four approaches:
1. Quotient Image (QI) [100].
2. Quotient Illumination Relighting (QIR) proposed in Study 2 [84].
3. Self Quotient Image (SQI) [101].
4. Histogram Equalization (HE).
These approaches are compared on Yale B face database [64] and the CMU PIE database
[102]. Then an outdoor database [105] is used for evaluating the performance under
natural lighting conditions. Two different face recognition approaches are used for
evaluation: template matching and PCA.
The results show that the proposed approach (LTV) always gives the best results among
the four other normalization approaches.

Moreover, the results on the Yale B database show that LTV gives a 100% recognition rate
over all five subsets, which is better than the Harmonic Image Exemplar [98] and
comparable to (or possibly better than) the 9PL (real images) and Cones-cast approaches from Study 3 [96],
which give the same recognition rate but on four subsets only (no results are reported
anywhere for the tests on subset 5).
In addition, the proposed approach (LTV) reached similar results to the ones obtained by
Corefaces [99] based on PCA recognition on Yale B and CMU PIE database.
As a result, we can find that LTV is better than the following four normalization
approaches:
1. Quotient Image (QI) [100].
2. Quotient Illumination Relighting (QIR) proposed in Study 2 [84].
3. Self Quotient Image (SQI) [101].
4. Histogram Equalization (HE).
while it gives comparable (or possibly better) results than the following three normalization
approaches:
1. Nine Point of Lights (9PL) from Study 3 [96].
2. Cones-cast from Study 3 [96].
3. Corefaces [99].

8. Study 8
The work in [80] proposes an original preprocessing technique based on Local Binary
Pattern (LBP) for illumination robust face authentication. It empirically compares the
proposed approach with two other approaches which are GROSS approach from Study 4
[85] and Histogram Equalization (HE).
The efficiency of the proposed approach is empirically demonstrated using both an
appearance-based (LDA) and a feature-based (HMM) face authentication system on two
databases: BANCA and XM2VTS (with its darkened set).
Conducted experiments show that the proposed preprocessing approach (LBP) is suitable
for face authentication: results are comparable with or even better than those obtained
using the GROSS approach proposed in Study 4 [85], while it removes the need for
parameter selection.

9. Study 9
The work in [109] presents a simple and efficient preprocessing chain (CHAIN),
described previously in section 3.3.2, that eliminates most of the effects of changing
illumination while still preserving the essential appearance details that are needed for
recognition. It empirically compares the proposed approach with the following
approaches:

1. Histogram Equalization (HE).
2. Multi Scale Retinex (MSR) [89].
3. Self Quotient Image (SQI) [101].
4. Logarithmic Total Variation (LTV) proposed in Study 7 [105].
5. GROSS approach proposed in Study 4 [85].
These approaches are empirically compared over three different databases, namely Face
Recognition Grand Challenge version 1 experiment 4 (FRGC-104) [110], Extended Yale
B [96], and CMU PIE [102]. The efficiency of the proposed approach is empirically
demonstrated using Local Ternary Patterns (LTP) [109], a generalization of the Local
Binary Pattern (LBP), combined with a local distance transform (DT) based similarity
metric as a classifier [111].
The results show that the proposed CHAIN approach outperforms all five approaches
over the three databases. However, the LTV approach gives recognition rates marginally
lower than CHAIN on the Extended Yale B database and equal to it on the CMU PIE database,
but it is about 300 times slower than the CHAIN approach [109].

Fig.3.27 completes the summary that appeared previously in Fig.3.26 for all the above
nine comparative studies, showing some relations between these studies in addition to the
best approaches resulting from all studies.

We can conclude from these nine comparative studies that, among the 38 different
illumination-normalization approaches that appear in Table-3.3, the following 7 approaches
are the best for face recognition:
1. Single Scale Retinex followed by Histogram Matching (SSR→HM) [66].
2. Local Normal Distribution (LNORM) [72].
3. Preprocessing Chain Approach (CHAIN) [109].
4. Nine Point of Lights with real images (9PL) [96].
5. Cones-cast [95].
6. Corefaces [99].
7. Local Binary Patterns (LBP) [80].
Observe that the second approach (LNORM) is proven to be better than 24 out of these
38 different illumination-normalization approaches as shown previously in Table-3.2.

Four out of the above seven best-of-literature approaches will be compared later in
chapter 6 together with the proposed approach in chapter 5. The comparisons are done on
illuminated and non-illuminated face images in order to study both the positive and
negative effects of each approach.

Studies 1–4: as summarized in Fig.3.26 (best approaches: HM and GIC from Study 1; QIR and RHE+RGIC from Study 2; 9PL (real) and Cones-cast from Study 3; GROSS from Study 4).
Study 5. Approaches: HM, SSR→HM. Database: Yale B. Recognition method: SVM. Best approaches: SSR→HM.
Study 6. Approaches: GIC, HE, HM, NORM, LHE, LHM, LNORM, HIE [98]. Databases: Extended Yale B, Yale B. Recognition method: Euclidean Distance. Best approaches: LNORM, 9PL (real), Cones-cast.
Study 7. Approaches: LTV, QI, QIR, SQI, HE, Corefaces [99], HIE [98]. Databases: Yale B, CMU PIE, outdoor database. Recognition methods: Template Matching, PCA. Best approaches: LTV, 9PL (real), Cones-cast, Corefaces.
Study 8. Approaches: LBP, HE, GROSS. Databases: BANCA, XM2VTS. Recognition methods: LDA, HMM. Best approaches: LBP.
Study 9. Approaches: CHAIN, LTV, SQI, MSR, GROSS, HE. Databases: Extended Yale B, CMU PIE, FRGC-104. Recognition method: LTP/DT. Best approaches: CHAIN.
(In the original diagram, arrows between the studies indicate whether an approach is compared by direct implementation or through its published results; approaches cited in brackets are compared through their published results.)

HE: Histogram Equalization; RHE: Region-based HE; HM: Histogram Matching; GIC: Gamma Intensity Correction; RGIC: Region-based GIC; QIR: Quotient Illumination Relighting; LHE: Local Histogram Equalization; LHM: Local Histogram Matching; NORM: Normal Distribution; LNORM: Local Normal Distribution; LS: Linear Subspace; GA: Gradient Angles; NN: Nearest Neighbor; LOG: Logarithmic Function; SQI: Self Quotient Image; QI: Quotient Image; LTV: Logarithmic Total Variation; HIE: Harmonic Image Exemplar; 9PL: Nine Point of Lights; LBP: Local Binary Pattern; SSR: Single Scale Retinex; MSR: Multi Scale Retinex; CHAIN: Preprocessing Chain Approach.

Figure 3.27: Summary of the nine comparative studies, showing some relations between these studies in addition to the final best normalization approaches from all studies. For each study, it lists the compared normalization approaches, the face databases and the face recognition methods, in addition to the best normalization approaches from that study.

Table 3.3: The 38 different illumination normalization approaches that appear in the above nine
comparative studies, together with the corresponding study numbers. (Note that the cited
approaches, 29 to 38, are not described in detail in their corresponding comparative studies)

#   Illumination Normalization Approach                          Study No.
1   Histogram Equalization (HE)                                  1, 2, 4, 6, 7, 8
2   Histogram Matching (HM)                                      1, 5, 6
3   Gamma Intensity Correction (GIC)                             1, 2, 6
4   Logarithmic Function                                         1
5   Normal Distribution                                          6
6   Gamma correction                                             4
7   Local HE                                                     6
8   Local HM                                                     6
9   Local Normal Distribution                                    6
10  Local Binary Pattern                                         8
11  Region-based HE                                              2, 6
12  Region-based GIC                                             2, 6
13  GIC followed by Region-based HE                              2, 6
14  Region-based GIC followed by Region-based HE                 2, 6
15  Region-based HE followed by Region-based GIC                 2, 6
16  HE followed by Region-based GIC                              2, 6
17  Single Scale Retinex followed by HM                          5
18  Quotient Illumination Relighting                             2, 6, 7
19  Self Quotient Image                                          1, 7
20  Quotient Image                                               7
21  Logarithmic Total Variation                                  7
22  GROSS method                                                 4, 8
23  Harmonic images (no cast shadow)                             3
24  Harmonic images-cast (with cast shadows)                     3
25  Nine point of lights (9PL) using simulated images            3, 6, 7
26  Nine point of lights (9PL) using real images                 3
27  Multi Scale Retinex                                          9
28  Preprocessing Chain approach                                 9
29  Harmonic Image Exemplar [98]                                 6, 7
30  Corefaces [99]                                               7
31  Linear Subspace [94]                                         3
32  Gradient Angles [97]                                         3
33  Cones-attached [95]                                          3
34  Cones-cast [95]                                              3
35  Correlation method [93]                                      3
36  Eigenface method [24]                                        3
37  Eigenface without the first three principal components [32]  3
38  Nearest Neighbor using 9 training images per subject [93]    3

After surveying the different face recognition approaches in chapter 2 and the different
illumination normalization approaches in chapter 3, we introduce in the following
chapter a detailed description of the environment that we built in order to test
our proposed illumination normalization approach and the other approaches.
The chapter includes descriptions of the selected face recognition methods and the
selected databases, which cover five different face recognition variations. The experimental
results of the selected methods over each database are also introduced in that chapter. All
experiments are done without applying any illumination normalization approach. These
results are considered a baseline and allow us to study the effects of any illumination
normalization approach on the selected methods over each variation separately.

CHAPTER 4: Setting Up the Environment
4.1 Introduction
This chapter introduces a detailed description of the environment that we built
in order to test our proposed illumination normalization approach and the
other approaches. The chapter includes descriptions of the selected face recognition
methods and the selected databases, in addition to the reasons behind these selections. The
experimental results of the selected methods over each database are also introduced in
this chapter.
In Chapter 2, we reviewed the main face recognition approaches and found that many
face recognition methods fall into the holistic-based approach. One possible reason is that
these methods usually utilize the face as a whole and do not destroy any information by
exclusively processing only certain fiducial points, which makes them generally provide
more accurate recognition results. Moreover, most of these holistic-based approaches
fall into two broad categories, the Eigenspace-based category and the frequency-based category.
To be able to study broadly the effects of any preprocessing/illumination normalization
approach on the two broad holistic-based categories, we chose one method under each
category representing the main characteristics of that category. The two chosen methods
are the Standard Eigenface method [24], which is considered the core of many Eigenspace-based
methods, and the Holistic Fourier Invariant Features (Spectroface) method [43], which
represents the main characteristics of the frequency-based category.
These two methods are compared over five different face recognition variations using
suitable database(s) for each variation. These variations are divided into two geometrical
variations, translation and scaling, and three facial variations, 3D pose, facial expressions,
and non-uniform illumination. No preprocessing/illumination normalization approach is
applied to either method during the comparison.
The aim of these comparisons is to establish a base that can be used for further studying
the effects of any preprocessing/illumination normalization approach on each of the five
variations separately using two methods representing the two broad holistic-based
categories, Eigenspace-based and frequency-based.
The rest of this chapter is organized as follows: section 4.2 describes the two face
recognition methods; section 4.3 describes the face databases used in the comparisons and how
they are prepared and configured for training and testing; section 4.4 contains the
comparison results over each of the five variations, in addition to some observations about
these results; finally, the chapter summary is presented in section 4.5.

4.2 Methods Descriptions
4.2.1 Standard Eigenface Method
The standard Eigenface method [24] approximates the face images by lower dimensional
feature vectors. In the training phase, the projection matrix (W ∈ RN × M) – which achieves the
dimensionality reduction – is obtained using all the database face images, where N and M
denote the dimensions of the image and the feature vector, respectively. Eigenvectors and
eigenvalues are computed from the covariance matrix of the training images. The eigenvectors
corresponding to the M largest eigenvalues are kept – they form the projection matrix W. Finally, the known
individuals are projected into the face space (pk), where p denotes the feature vector
and k the person index. These feature vectors are stored in addition to the mean
face.
The recognition process works as in Fig.4.1: a preprocessing module transforms the face
image into a unitary vector using a normalization module [24]. Then, the mean
face is subtracted from this unitary vector. The resulting vector I is projected using the projection
matrix W. This projection corresponds to a dimensionality reduction of the input, starting
with vector I in RN (where N is the dimension of the image vector) and obtaining the
projected vector q in RM, with M << N. Then, the similarity of q to each of the reduced
vectors (pk) is computed using the Euclidean distance. The class of the most similar vector is
the result of the recognition process, i.e. the identity of the face.

Figure 4.1: Standard Eigenface block diagram


In this work, we use a MATLAB implementation of the standard Eigenface method
that is publicly available on the web and link it with a C# project to facilitate the training and
testing steps.
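For illustration only, the core of the training and projection steps can be sketched in a few MATLAB-style lines; this is not the publicly available implementation used in this work, and the variables X, probe and the number of kept eigenfaces M are hypothetical assumptions:

    % X is an N x K matrix holding one vectorised training image per column
    mu = mean(X, 2);
    A  = X - repmat(mu, 1, size(X, 2));
    [V, D] = eig(A' * A);                       % small K x K eigenproblem
    [~, order] = sort(diag(D), 'descend');
    M = 50;                                     % assumed feature-vector length
    W = A * V(:, order(1:M));                   % N x M projection matrix (eigenfaces)
    W = W ./ repmat(sqrt(sum(W.^2, 1)), size(W, 1), 1);   % normalise the columns
    P = W' * A;                                 % stored feature vectors p_k
    % recognition: project a probe image and take the nearest stored vector
    q = W' * (double(probe(:)) - mu);
    [~, id] = min(sum((P - repmat(q, 1, size(P, 2))).^2, 1));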

4.2.2 Spectroface Method


Spectroface [43] representation is based on the wavelet transform and holistic Fourier
invariant features. Wavelet transform is applied to the face image to eliminate the effect
of facial expressions. Also, decomposing the face image will reduce its resolution, which

in turn, reduces the computation of the recognition system. After decomposing the face
image, the holistic Fourier invariant features (Spectroface) are extracted from the low
frequency subband image by applying Fourier transform twice. The first FFT is applied
to the low frequency subband to make it invariant to the spatial translation. Then the
second Fourier transform is applied to the polar transformation of the result to make it
invariant to scale and on-the-plane rotation. The block diagram of the Spectroface
representation is shown in Fig.4.2.
In the recognition stage, the probe image is translated into the Spectroface representation
and then matched – using the Euclidean distance – against the reference images stored in the
gallery to identify the face image.
Our implementation of this method is done in C++. Differently from the implementation
in [43], we do not use the two preprocessing steps, namely histogram
equalization and intensity normalization, since when we used them on the ORL [112]
database the recognition rates always decreased. One possible reason for this
decrease is that there are no illumination changes in the ORL database. Table-4.1
shows the recognition rates obtained by our implementation against those mentioned in
[43]. It is shown that our implementation gives approximately the same results over the
ORL database. In the Yale database, the results of our implementation are also
approximately the same as those of [43], except when the testing includes the two
non-uniformly illuminated images, in which case the results decrease significantly. The likely reason
for this is that we do not apply the two preprocessing steps mentioned in [43].

Figure 4.2: Spectroface block diagram

Table 4.1: Comparison between the results in Lai et al. [43] and in our implementation (the better rate in each setting is marked with *)

                        ORL DB (rank 1)           Yale DB - not cropped (rank 1)
                        1 training   3 training   1 training image          2 training images
                        image        images       without      with         without      with
                                                   2 illum      2 illum      2 illum      2 illum
Method in [43]          76.38%       94.64%       95.00%*      91.33%*      99.05%       95.56%*
Our implementation      78.61%*      95.00%*      93.33%       82.00%       99.05%       83.70%

4.3 Databases Descriptions


4.3.1 UMIST database
This database [46] is used for studying the pose variation. It consists of 565 images of 20
subjects. The pose varies from frontal to profile with slight variations in the tilt of the
head as well. Each image is cropped to contain only the face and is 112 × 92 pixels in
grayscale. The set of subjects contains 16 males and 4 females. There are 8 subjects with
glasses.
We use only 300 images (15 images per subject) with significant pose effects up to 80˚.
We do not consider the profile images in this study since our experiments show
that profile images should be treated as separate cases in order to be recognized
correctly, which is out of the scope of this comparison. Each image is flipped around the
y-axis to represent pose changes in both directions. The resulting 600 images are
divided into training and testing set. The training set consists of 200 images with 10
images/subject chosen according to the pose as follows: two normal images, two images
with ±10~15˚, two images with ±30˚, two images with ±45˚, and two images with
±75~80˚. The testing set consists of 400 images with 20 images per subject chosen to
cover all pose variations from frontal to 80˚. An example for training and testing sets is
shown in Fig.4.3.

4.3.2 Yale B database


We use the Yale B database [64] – frontal images only – for studying the non-uniform
illumination variation. It consists of 10 subjects (9 males and 1 female) each with 65
(64 illuminations + 1 normal) images. The 64 illuminated images are acquired in about
two seconds, so there are only small changes in head pose and facial expressions. Only
46 out of these 65 images are divided into four subsets according to the angle the light
source direction makes with the camera’s axis (12˚, 25˚, 50˚, and 77˚).

normal 10 -15° pose 30° pose 45° pose 75-80° pose

(a) Training images in one direction only

(b) Testing images in one direction only

Figure 4.3: UMIST: selected images for one subject in both training and testing sets
We use only these four subsets. All images are cropped to include only the head portion.
Subject’s images on each subset are divided into training and testing as follows: subset 1
is divided into 3 training images and 5 testing images; each of subset 2, 3, and 4 is
divided into 4 training images and 8, 8 and 10 testing images, respectively. As a result,
the training set consists of 15 images × 10 subjects while the testing set consists of the
remaining 31 images × 10 subjects. Fig.4.4 shows the training images, randomly selected,
in each subset and the light angle of each image.

Subset 1 Subset 2 Subset 3 Subset 4


Normal Vertical Vertical Horizontal Vertical Horizontal Vertical Horiz & Ver
(+ 10°, - 10°) (+ 25°, - 25°) (+ 20°, - 20°) (+ 50°, - 50°) (+ 45°, - 35°) (+ 70°, - 70°)(+ 65°, ± 35°)

Figure 4.4: Yale B: Training images for one subject in the four subsets with the light angle of each
image

4.3.3 Grimace database


We use the Grimace database [115] for studying the facial expression variation. It
consists of 18 subjects with 20 images for each one. There are 2 females and 16 males.
The images are with major facial expression variation and very little image lighting
variations. There is no hairstyle variation as the images were taken in a single session.
There are no special configurations for this database. Fig.4.5 shows sample images from
this database.

4.3.4 JAFFE database
This database [65] is used for studying the facial expression variation. It contains 213
images of 10 Japanese female models obtained in front of a semi-reflective mirror. Each
subject was recorded three or four times while displaying the six basic emotions and a
neutral face. The camera trigger was controlled by the subjects. The resulting images
have been rated by 60 Japanese women on a 5-point scale for each of the six adjectives.
The rating results are distributed along with the images. Fig.4.6 shows example images
for one subject along with the majority rating. The images are originally printed in
monochrome and then digitized using a flatbed scanner.
In our comparisons, all images are cropped using the face detection function in the Intel
OpenCV library [116] to contain only the head portion, examples are shown in Fig.4.5.

(a) Notterdam
5 images × 70
subjects

(b) Yale
8 images × 15
subjects

(c) JAFFE
213 images
10 subjects

(d) Grimace
20 images × 18
subjects

Figure 4.5: Selected images for one subject from each database used for studying the facial expression variation

Figure 4.6: Example images from the JAFFE database. The images in the database have been rated by
60 Japanese female subjects on a 5-point scale for each of the six adjectives. The majority vote is
shown underneath each image (with neutral being defined through the absence of a clear majority)

4.3.5 Nott-faces database
This database [117] is used for studying the facial expression variation. It consists of 70
males, each with 7 images: 4 frontal images with facial expressions, 1 with a bathing cap
and 2 in 3/4 profile. The images have a non-fixed background. There is some
translation of the face within the image, very small head-scale variation and small linear uniform
illumination variation.
We exclude the 2 images with 3/4 profile from our test and work only on the 5 frontal
images which form a total of 5 images × 70 subjects. All images are cropped using the
face detection function in the Intel OpenCV library [116] to contain only the head
portion. Fig.4.5 shows sample images from this database.

4.3.6 Yale database


Yale database [32] consists of 15 subjects (14 males and 1 female) each with 11 images,
1 normal image, 3 non-uniform illuminated images and 7 images with facial expressions.
We exclude the 3 illuminated images from our test which form a total of 8 images × 15
subjects. This allows us to use this database for studying the facial expression variation.
All images are cropped using the face detection function in the Intel OpenCV library to
contain only the head portion, examples are shown in Fig.4.5.

4.3.7 Face 94 database


We use the Face 94 database [118] for studying the scale variation. It consists of 152
subjects with 20 images each. There are 19 females and 133 males. The images
have a fixed background with neither head-scale nor illumination variations. There are
small expression variations because the subjects were speaking during the acquisition phase.
Only the first 15 images for each subject are used to form the training and testing sets, 5
for training and 10 for testing. For each subject in the training set, two images are scaled
by factors ±8% and two are scaled by factors ±17% and the remaining image is left
without changes. In the testing set, the ten images of each subject are scaled by 10
different factors: ±3%, ±6%, ±9%, ±12.5%, and ±15.5%. Fig.4.7 shows the training and
testing sets of one subject. Note that after scaling up, the image is cropped to be 128 ×
128 while after scaling down, the boundary pixels are repeated to reach 128 × 128.

Figure 4.7: Face 94: 15 images for each subject in both training and testing sets. (a) Training set: 0%, +8%, +17%, −8%, −17%. (b) Testing set: scaled up and scaled down by 3%, 6%, 9%, 12.5% and 15.5%.
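The following is a minimal sketch, assuming OpenCV (cv2) and 128 × 128 grayscale inputs, of the scale-then-crop/pad step described above; it is an illustration of the procedure, not the code used in the experiments.

import cv2
import numpy as np

def rescale_to_128(img, factor):
    """Scale a 128x128 image by `factor` (e.g. +0.08 or -0.17) and return a
    128x128 result: crop the centre after scaling up, replicate the boundary
    pixels after scaling down."""
    size = 128
    new = int(round(size * (1.0 + factor)))
    scaled = cv2.resize(img, (new, new), interpolation=cv2.INTER_LINEAR)
    if new >= size:                              # scaled up: keep the central 128x128
        off = (new - size) // 2
        return scaled[off:off + size, off:off + size]
    pad = size - new                             # scaled down: repeat boundary pixels
    top, left = pad // 2, pad // 2
    return cv2.copyMakeBorder(scaled, top, pad - top, left, pad - left,
                              borderType=cv2.BORDER_REPLICATE)

# Example: the five training images of one subject (0%, +-8%, +-17%):
# training = [rescale_to_128(img, f) for f in (0.0, 0.08, -0.08, 0.17, -0.17)]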

4.4 Experimental Results


In this section, we describe the five different comparisons between the two methods, Eigenface and Spectroface. For each comparison, we state the database(s) used, describe the training and testing methodologies applied to each database, and then state the results of the comparison together with some observations about these results.
In all comparisons, only two preprocessing steps are applied to both methods:
1. Convert each image to grayscale.
2. Resize each image to a fixed size of 128 × 128.
This allows us to establish a base that can be used for further studying the effects of any preprocessing/illumination normalization approach on each of the five variations separately, using two methods representing the two broad holistic-based categories, the Eigenspace-based and frequency-based categories.
In both methods, we use a single feature vector for each training image rather than the average feature vector for each subject, since our experiments show that this gives better results over all variations.
To indicate how significantly one method outperforms another, we calculate the difference between their average recognition rates over all training cases.

4.4.1 Pose Variation

1. Training and Testing Methodologies


We use the UMIST database for studying this variation.
The first two columns in Table-4.2 describe the training cases and the number of training images per subject in each case. There are 12 training cases that differ in the degree of rotation of the chosen images. Note that the normal images are common to all cases. The cases are chosen to cover all possible combinations for training by both four and six images per subject, in addition to training by the two normal images per subject and by all the training images up to 75˚ (10 images per subject). The testing is done using all 400 images of the testing set.

2. Results
We observe the following results from Table-4.2:
First, concerning the comparison between the two methods, the Spectroface method gives better recognition rates than the Eigenface method in 10 out of 12 training cases, with an average difference of 4.6%. This means that for the 3D pose variation, the Spectroface method outperforms the Eigenface method.
Second, concerning the best training case, the top four rates in both methods show that there is no significant difference between training by all five angles – namely 0˚, 10˚, 30˚, 45˚, and 75˚ (10 images/subject) – and training by three angles: 0˚, 75˚, and an in-between angle from [10˚, 45˚] (six images/subject). This means that to achieve the best recognition rate over poses up to 75–80˚, the system should be trained with the normal image, the 75˚ pose and one in-between angle from [10˚, 45˚].
Table 4.2: Pose Variation: recognition rates over 12 training cases (top four rates in each method are
italic)
Training Case # train/subject Eigenface Spectroface
normal only 2 64.0 48.0
normal + 10˚ 4 68.5 67.3
normal + 30˚ 4 75.0 76.0
normal + 45˚ 4 87.0 89.0
normal + 75˚ 4 85.5 89.0
normal + 10˚ + 30˚ 6 74.0 74.5
normal + 10˚ + 45˚ 6 84.5 90.0
normal + 10˚ + 75˚ 6 87.5 94.5
normal + 30˚ + 45˚ 6 85.0 90.3
normal + 30˚ + 75˚ 6 88.5 94.8
normal + 45˚ + 75˚ 6 87.5 95.0
normal+10˚+30˚+45˚+75˚ 10 88.0 95.0

4.4.2 Facial Expressions Variation

1. Training & Testing Methodologies


We use four different databases for studying this variation – namely Grimace [115], Yale [32], JAFFE [65], and Nott-faces [117].
The first four columns in Table-4.3 describe the training cases for each database, in addition to the number of training and testing images per subject in each case. There are three training cases for each database. In each case, the testing is done using all other images that are not included in the training. In Nott-faces, the first two cases are tested twice, once without the capped images and once with them.
In the Eigenface method, we try to improve its results under facial expression variation by first applying the wavelet transform to the original image and then computing the Eigenfaces from the resulting low subband. This low subband contains less information about facial expressions, which is usually found in the LH and HL subbands. All previous training and testing cases are applied to this low subband in order to study its effect on the recognition rates; a sketch of this wavelet-then-PCA step is given below.
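The following is a minimal sketch of that step, assuming PyWavelets for the 2D DWT and plain NumPy for the PCA; the function and parameter names are illustrative, not those of the thesis implementation.

import numpy as np
import pywt

def ll_subband(img):
    """Return the LL (approximation) subband of a single-level 2D Haar DWT."""
    ll, (lh, hl, hh) = pywt.dwt2(img.astype(np.float64), 'haar')
    return ll

def eigenfaces_from_ll(training_images, num_components=20):
    """Compute a PCA (Eigenface) basis from the LL subbands of the training images."""
    data = np.stack([ll_subband(im).ravel() for im in training_images])  # one row per image
    mean = data.mean(axis=0)
    centered = data - mean
    small = centered @ centered.T                 # Turk & Pentland trick: small L x L matrix
    vals, vecs = np.linalg.eigh(small)
    order = np.argsort(vals)[::-1][:num_components]
    basis = centered.T @ vecs[:, order]           # back-project to (subband) image space
    basis /= np.linalg.norm(basis, axis=0)        # unit-norm eigenfaces
    return mean, basis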

2. Results
We observe the following results from Table-4.3:
First, concerning the comparison between the two methods, the Spectroface method gives better recognition rates than the Eigenface on original images in all 14 training cases, with an average difference of 5.4%.
Table 4.3: Expressions Variation: recognition rates over four databases with two Eigenface tests
Database                Training Case (images/subject)   # train × #subject   # test × #subject   Spectroface   Eigenface on Original   Eigenface on Wavelet
Yale                    normal only                       1 × 15               7 × 15              92.4          81.9                    81.9
Yale                    normal + 2 expressions (1)        3 × 15               5 × 15              98.7          96.0                    96.0
Yale                    normal + 3 expressions            4 × 15               4 × 15              98.3          96.7                    98.3
Grimace                 normal only                       1 × 18               19 × 18             100           96.2                    96.8
Grimace                 normal + 2 expressions            3 × 18               17 × 18             100           97.1                    97.1
Grimace                 normal + 4 expressions            5 × 18               15 × 18             100           96.7                    96.7
JAFFE                   normal only                       1 × 10               203 (2)             93.1          84.2                    86.2
JAFFE                   normal + 2 expressions            3 × 10               183                 97.8          90.2                    90.7
JAFFE                   normal + 4 expressions            5 × 10               163                 97.6          89.6                    89.6
Nott-faces (with cap)   normal only                       1 × 70               4 × 70              59.6          55.4                    55.7
Nott-faces (with cap)   normal + 1 expression             2 × 70               3 × 70              60.0          55.7                    56.7
Nott-faces (with cap)   normal + 2 expressions            3 × 70               2 × 70              50.7          47.9                    48.6
Nott-faces (w/out cap)  normal only                       1 × 70               3 × 70              76.2          69.0                    70.5
Nott-faces (w/out cap)  normal + 1 expression             2 × 70               2 × 70              84.3          76.4                    78.6

(1) Normal + N expression(s) means that we train with the normal image plus N images, each containing a single, randomly selected expression.
(2) Number of testing images over all subjects, since in the JAFFE database each subject has a different number of images.

Second, in the Eigenface method, the last two columns show that computing the Eigenfaces from the low subband of the wavelet transform gives better results than computing them from the original image directly. The results are better in 9 training cases, with an average difference of 1.2%, and equal in the remaining cases. However, the Spectroface method also gives better recognition rates than the Eigenface on Wavelet in all 14 training cases, with an average difference of 4.7%.
Thus, under facial expression variation, it is better to use the wavelet low subband, as it contains less information about facial expressions, which is usually found in the LH and HL subbands. However, applying the frequency-based method on the low subband of the wavelet transform is much better than applying the PCA-based method on it. One possible reason is that the wavelet low subband still contains information about the positions of the frequencies; since a change in facial expression causes changes in pixel positions within the face, any information about pixel positions also carries information about facial expressions. As a result, applying the PCA (Eigenface) on the wavelet low subband directly is still affected by some information about facial expressions. In contrast, applying the Fourier transform on the wavelet low subband eliminates the information about pixel positions contained in this subband, which reduces the information about facial expressions.
Briefly, for the facial expressions variation, the Spectroface method outperforms both the Eigenface on the original image and the Eigenface on the low subband of the wavelet transform. However, applying the Eigenface on the low subband is better than applying it on the original image.

4.4.3 Non-Uniform Illumination Variation

1. Training & Testing Methodologies


We use the Yale B database – frontal images only – for studying this variation.
The first two columns in Table-4.4 describe the training cases and their corresponding subsets. There are 25 different training cases. Note that the normal image is common to all cases. We have four elementary subsets – namely subsets 1, 2, 3, and 4 – shown on the left of the table. Each elementary subset is composed of training by the normal image and either the vertical lighting, the horizontal lighting or both of them.
We also have seven combinations of the elementary subsets, shown on the right of the table, where subset 1 is essential in all of them as it contains the lowest illumination. Each combination is composed of training by the normal image and either the vertical lighting or the vertical and horizontal lighting. The testing is done using all 31 × 10 images of the testing set.

2. Results
Concerning the comparison between the two methods, Table-4.4 shows that the Spectroface method gives better recognition rates than the Eigenface method in all 25 training cases, with an average difference of 9.0%. This means that for non-uniform illumination variation, the Spectroface method outperforms the Eigenface method.
Table 4.4: Illumination Variation: recognition rates over 25 training cases (top three rates in each
method are italic)

Four Elementary Subsets
Subsets  Training Case (train. images/subject)   Eigenface   Spectroface
1        nor only                                45.8        48.4
1        nor + 2 ver                             48.7        52.3
2        nor + 2 ver                             54.9        57.7
2        nor + 2 hor                             50.7        58.1
2        nor + 2 ver + 2 hor                     55.2        61.0
3        nor + 2 ver                             55.5        61.6
3        nor + 2 hor                             53.2        56.8
3        nor + 2 ver + 2 hor                     52.6        65.2
4        nor + 2 ver                             45.9        55.5
4        nor + 2 hor                             44.8        48.1
4        nor + 2 ver + 2 hor                     41.9        54.8

Seven Combinations
Subsets      Training Case (train. images/subject)   Eigenface   Spectroface
1, 2         nor + 4 ver                             57.7        58.7
1, 2         nor + 4 ver + 2 hor                     59.7        60.7
1, 3         nor + 4 ver                             60.0        65.5
1, 3         nor + 4 ver + 2 hor                     58.1        69.0
1, 4         nor + 4 ver                             50.0        60.3
1, 4         nor + 4 ver + 2 hor                     46.8        59.7
1, 2, 3      nor + 6 ver                             61.0        66.5
1, 2, 3      nor + 6 ver + 4 hor                     61.0        71.6
1, 2, 4      nor + 6 ver                             55.2        65.2
1, 2, 4      nor + 6 ver + 4 hor                     52.3        68.4
1, 3, 4      nor + 6 ver                             55.5        70.0
1, 3, 4      nor + 6 ver + 4 hor                     53.6        74.5
1, 2, 3, 4   nor + 8 ver                             57.4        71.0
1, 2, 3, 4   nor + 8 ver + 6 hor                     55.5        77.1

(nor: normal, ver: vertical, hor: horizontal)

4.4.4 Translation Variation

1. Training & Testing Methodologies


We use the previous six databases for studying this variation – namely UMIST, Grimace, Yale, JAFFE, Nott-faces, and Yale B.
For training, we chose one training case from each of the six databases, usually the one that gives the best results in one or both methods, as shown in Table-4.5. Thus, we have 6 training cases with the same results as before.
Table 4.5: Translation Variation: chosen cases from the six databases and their recognition rates
Database Training Case Eigenface Spectroface
UMIST normal + 45˚ + 75˚ 87.5 95.0
Grimace normal + 2 expressions 97.1 100.0
Yale normal + 3 expressions 96.7 98.3
JAFFE normal + 2 expressions 90.2 97.8
Nott-faces normal + 1 expression 76.4 84.3
Yale B Subsets 1, 2, 3 normal + 6 vertical 61.0 66.5
The testing is applied in two different ways: first, by translating with circulation, in which the pixels shifted out by the translation are circulated to fill the empty pixels on the opposite side; second, by translating without circulation, in which the empty pixels after translation are filled with a fixed color (gray in our case). In both cases, each test image is translated by 2, 4, 6, and 8 pixels in each of the four directions, which gives 16 new recognition rates for each training case. Then, the decrease in recognition rate is calculated. Finally, the average decrease for each translation value is calculated over the four directions. Examples of the translations are shown in Fig.4.8, and a sketch of the two translation variants is given below.
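The following is a minimal sketch, using only NumPy and assuming 8-bit grayscale images, of the two translation variants; the fill value of 128 for the gray color is an assumption made for illustration.

import numpy as np

def translate_with_circulation(img, dx, dy):
    """Shift by (dx, dy) pixels; pixels shifted out re-enter on the opposite side."""
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

def translate_without_circulation(img, dx, dy, fill=128):
    """Shift by (dx, dy) pixels; vacated pixels are filled with a fixed gray value."""
    out = np.full_like(img, fill)
    h, w = img.shape
    ys, yd = (slice(0, h - dy), slice(dy, h)) if dy >= 0 else (slice(-dy, h), slice(0, h + dy))
    xs, xd = (slice(0, w - dx), slice(dx, w)) if dx >= 0 else (slice(-dx, w), slice(0, w + dx))
    out[yd, xd] = img[ys, xs]
    return out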

2. Results
First, in translating with circulation, Table-4.6 shows that the recognition rates of the Eigenface decrease significantly. On the other hand, the recognition rates of the Spectroface are not affected by these translations, since the maximum decrease over all 24 testing cases is 0.8%. This is expected, because a circular translation changes only the phase of the Fourier transform while leaving its magnitude unchanged, as the short check below illustrates.
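The following is a small NumPy check, added here only for illustration, of the invariance of the Fourier magnitude to circular translation.

import numpy as np

img = np.random.rand(128, 128)                          # any image stands in here
shifted = np.roll(np.roll(img, 4, axis=0), 4, axis=1)   # translate with circulation

mag = np.abs(np.fft.fft2(img))
mag_shifted = np.abs(np.fft.fft2(shifted))
print(np.allclose(mag, mag_shifted))                    # True: the magnitudes are identical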
Figure 4.8: Translation Variation: example of translating with and without circulation (original image and translations to the right by 2, 4, 6 and 8 pixels)

Table 4.6: Translation Variation: average decrease in the recognition rates of both methods after translating with circulation in the four directions

(a) Eigenface Method
Database     2     4     6     8
UMIST        1.7   4.6   10.5  20.7
Grimace      0.1   1.3   11.7  29.9
Yale         2.1   10.5  19.6  31.7
JAFFE        2.9   11.9  20.4  31.5
Nott-faces   4.1   14.3  27.7  39.6
Yale B       2.0   6.1   13.6  18.8
Average      2.2   8.1   17.3  28.7

(b) Spectroface Method
Database     2     4     6     8
UMIST        0.0   0.0   0.0   0.0
Grimace      0.0   0.0   0.0   0.0
Yale         0.0   0.0   0.0   0.0
JAFFE        0.8   0.0   0.8   0.0
Nott-faces   0.0   0.0   0.0   0.0
Yale B       0.0   0.0   0.0   0.0
Average      0.1   0.0   0.1   0.0

Second, in translating without circulation, Table-4.7 shows that the recognition rates of both methods decrease. However, the decrease in the Eigenface is much more significant than in the Spectroface – see the average row.

As a result, it is clear that the Spectroface method is more robust against the translation variation than the Eigenface method.

Table 4.7: Translation Variation: average decreasing in the recognition rates of both methods after
translating without circulation in the four directions
(a) Eigenface Method
Database     2     4     6     8
UMIST        1.5   5.5   10.9  22.2
Grimace      0.2   1.4   12.8  37.7
Yale         2.1   10.9  20.4  31.3
JAFFE        2.5   12.6  22.9  34.3
Nott-faces   3.6   15.1  26.8  38.7
Yale B       2.4   8.5   16.5  22.8
Average      2.1   9.0   18.4  31.2

(b) Spectroface Method
Database     2     4     6     8
UMIST        0.6   1.8   4.5   9.2
Grimace      0.1   2.1   7.6   12.0
Yale         0.0   0.0   0.4   1.2
JAFFE        1.2   1.4   3.5   5.3
Nott-faces   0.0   1.8   6.6   13.2
Yale B       0.1   2.3   7.5   13.3
Average      0.3   1.6   5.0   9.0

4.4.5 Scaling Variation

1. Training & Testing Methodologies


We use the Face 94 database for studying this variation.
There are seven different training cases according to the scaling factors of the chosen
images, as shown in Table-4.8:
Table 4.8: Scaling Variation: description of the training cases
Training Case # train/ subject Description
normal only 1 normal image only
normal + up8 2 normal image & scaled up image with factor 8%
normal + down8 2 normal image & scaled down image with factor 8%
normal + up8 + down8 3 normal image, scaled up & scaled down with factor 8%
normal + up17 2 normal image & scaled up image with factor 17%
normal + down17 2 normal image & scaled down image with factor 17%
normal + up17 + down17 3 normal image, scaled up & scaled down with factor 17%

The testing is done using all the images in the testing set. For each training case, the testing is done twice, before and after scaling, in order to record the decrease in recognition rates after scaling all testing images; see Table-4.9.

2. Results
Concerning the comparison between the two methods, Table-4.9 shows that the Eigenface method gives better results (a smaller decrease in recognition rates) than the Spectroface method in six out of the seven training cases, with an average difference of 2.7%. This means that for the scaling variation, the Eigenface method outperforms the Spectroface method.

Table 4.9: Scaling Variation: decreasing in recognition rates after scaling all images in the testing set
Training Case Eigenface Spectroface
normal only 14.6 19.1
normal + up8 6.6 7.7
normal + down8 9.8 12.8
normal + up8 + down8 0.7 0.7
normal + up17 5.8 8.0
normal + down17 8.7 13.1
normal + up17 + down17 0 0.9

4.5 Summary
In this chapter, we introduce a comparison between two holistic-based face recognition methods chosen to represent the two broad categories of the holistic-based approach – namely the standard Eigenface method from the PCA-based category and Spectroface from the frequency-based category. Seven databases, ranging from small to medium size, are used to compare the two methods against five main variations separately, using suitable database(s) for each variation. All comparisons are applied without using any preprocessing/illumination normalization approach.
The aim of these comparisons is to establish a base that can be used for further studying the effects of any preprocessing/illumination normalization approach on each of the five variations separately, using two methods representing the two broad holistic-based categories, the Eigenspace-based and frequency-based categories.
Moreover, the comparison results show that the Spectroface method outperforms the Eigenface method in four out of the five variations – namely the 3D pose, facial expressions, non-uniform illumination, and translation variations – while the Eigenface method is better for the scaling variation. Also, under facial expression variation, applying the frequency-based method on the low subband of the wavelet transform is much better than applying the PCA-based method on it. One possible reason is that applying the PCA (Eigenface) on the wavelet low subband directly is still affected by some information about pixel positions, which represents information about facial expressions. In contrast, applying the Fourier transform on this subband eliminates this information about pixel positions, which reduces the information about facial expressions.
In the next chapter, we describe the proposed illumination normalization approach together with its results on illuminated databases. The comparisons of the proposed approach with other best-of-literature approaches are discussed in Chapter 6. We use the results of this chapter as a baseline for those of the next two chapters to see the effect of the normalization approaches on different databases.

CHAPTER 5: The Proposed Illumination Normalization
Approach
5.1 Introduction
As we stated previously in Chapter 3, illumination normalization approaches can be classified into two categories: model-based and image-processing based approaches. Although the model-based approaches are ideal in theory, the required assumptions and constraints, in addition to their high computational cost, make them unsuitable for realistic applications. On the other hand, the image-processing based approaches are more commonly used in practical systems for their simplicity and efficiency.
Although most illumination normalization approaches can cope well with illumination variation, some may negatively affect images without illumination variation. In addition, some approaches show large differences in performance when combined with different face recognition approaches. Other approaches require perfect alignment of the face within the image, which is difficult to achieve in practical/real-life systems.
So, in this chapter, we aim to propose an image-processing based illumination normalization approach that is flexible with respect to different face recognition approaches and independent of face alignment. This makes it suitable for practical/real-life systems, as it can be used with different face recognition approaches and does not need any pre-assumptions or constraints concerning face alignment.
This chapter is organized as follows: section 2 describes the idea behind the proposed
illumination normalization approach. Sections 3 to 6 contain the detailed descriptions of
the proposed approach. Experiments appear in section 7. Finally, chapter summary is
presented in section 8.

5.2 Idea of the Proposed Approach


Among the different image-processing based approaches, histogram matching (HM) is considered one of the most common and successful [66], [67], [68], [69], [70]. Some comparative studies in the literature show the superiority of HM over other approaches [71], [72]. For example, [71] compares five different illumination normalization approaches, namely histogram equalization (HE), histogram matching (HM), log transformation (LOG), gamma intensity correction (GIC) and self quotient image (SQI), over three large-scale face databases: FERET, CAS-PEAL and CMU-PIE. The results show that HM gives the best results among the five approaches over FERET and CAS-PEAL, while it comes after GIC over CMU-PIE. Results in [72] over the extended Yale B face database show that HM gives the best results compared with three other globally-applied approaches, namely normal distribution (NORM), HE and GIC.
Moreover, Histogram matching has the following two main advantages:
1. It is a preprocessing step that can be applied with any face recognition approach.
2. It is insensitive to geometrical effects on the image as it’s applied globally and
thus no additional alignment steps are required.
Although enhancing the image resulting from HM can increase the recognition rates over using HM alone, no attempts have been made to combine HM with other image enhancement methods for illumination normalization. Also, the compression function of the Retinal filter [73] has not been used in the literature as an image enhancement method. It is therefore interesting to combine HM with other image enhancement methods as an illumination normalization step for face recognition.
As a result, we introduce a new illumination normalization approach based on enhancing the image resulting from HM. Four different image enhancement methods are used in this study – three of them are common in the literature, namely histogram equalization, log transformation and gamma correction [74], while the fourth, the compression function of the Retinal filter [73], is newly suggested as an image enhancement method in this study. These four image enhancement methods are applied in two different ways throughout this study:
1. After histogram matching, on the image resulting from HM.
2. Before histogram matching, on the reference image before matching the input image to it.
In addition, for each approach, we try to further enhance the results by applying one of these four methods again. Finally, the proposed approach is chosen from these combinations based on the increase in recognition rates over using HM alone, regardless of the following conditions:
1. The face recognition approach that the normalization approach is applied with,
2. The face alignment within the image,
3. The number of training images, and the degree of illumination within these images.
This ensures both the flexibility of the proposed approach among different face recognition approaches and the ability to apply it to practical/real-life systems in which perfect alignment of faces is difficult to achieve. The verification of these conditions is described in detail later in this chapter.
All previous combinations are empirically demonstrated and compared over Yale B
database [64] using the two holistic-based face recognition approaches introduced
previously in Chapter 4, namely, standard Eigenface [24] and Spectroface [43]. These
two approaches are chosen to represent the two broad holistic-based categories,
Eigenspace-based and Frequency-based respectively [22].

The rest of this chapter is organized as follows: section 3 contains the description of the histogram matching algorithm. Section 4 contains the descriptions of the four image enhancement methods. In section 5, the different ways of applying these four methods to enhance the image resulting from HM are introduced. Section 6 describes the verification of the selection conditions using the Yale B database. Experimental results showing the best combinations of HM with different image enhancement methods are presented in section 7. Finally, the chapter summary is presented in section 8.

5.3 Histogram Matching Algorithm


Given an illuminated face image X and a well-lit face image Y, histogram matching [74]
is applied to bring the illumination level of the input image X to that of the reference
image Y. This is done by making the histogram of X approximately "match" the histogram of Y, which makes both images have roughly the same mean and variance in their histograms. Fig.5.1 demonstrates the histogram matching process applied to an illuminated
image. The illuminated image (source) is shown in Fig.5.1 (a) and the corresponding
histogram is shown in Fig.5.1 (b). The well-lit image (target) and its corresponding
histogram are shown in Fig.5.1 (c) and (d) respectively. The resulting image and its
histogram after applying the histogram matching are shown in Fig.5.1 (e) and (f).
To explain the algorithm, let H(i) be the histogram function of an illuminated image X and G(i) be the desired histogram of the well-lit image Y. We wish to map H(i) to G(i) via a transformation F_{H→G}(i). We first compute a transformation function for both H(i) and G(i) that will map the histogram to a uniform distribution U(i). These functions are F_{H→U}(i) and F_{G→U}(i), respectively. Equations 5.1 and 5.2 depict the mapping to a uniform distribution, which is also known as histogram equalization [74].

F_{H \to U}(i) = \frac{\sum_{j=0}^{i} H(j)}{\sum_{j=0}^{N-1} H(j)}    (5.1)

F_{G \to U}(i) = \frac{\sum_{j=0}^{i} G(j)}{\sum_{j=0}^{N-1} G(j)}    (5.2)

where N is the number of discrete intensity levels (N = 256 for 8-bit grayscale images).
To find the mapping function F_{H→G}(i), we invert the function F_{G→U}(i) to obtain F_{U→G}(i). Since the domain and the range of functions of this form are identical, the inverse mapping is trivial and is found by cycling through all values of the function. However, due to the discrete nature of these functions, inverting can yield a function which is undefined for certain values. Thus, we use linear interpolation and assume smoothness to fill undefined points of the inverse function according to the values of well-defined points in the function. As a result, we generate a fully defined mapping F_{U→G}(i) which transforms a uniform histogram distribution to the distribution found in histogram G(i). The mapping F_{H→G}(i) can then be defined as in equation 5.3 [66].

F_{H \to G}(i) = F_{U \to G}\left( F_{H \to U}(i) \right)    (5.3)

Figure 5.1: Histogram matching process applied to an illuminated image: (a) illuminated (source) image, (b) its histogram, (c) well-lit (target) image, (d) its histogram, (e) resulting image, (f) its histogram
It is common in the literature to match all images, in both the training and testing sets, to the histogram of either a single fixed well-lit image, as in [71], [67], or an average image, as in [72]. In this work, the reference image for HM is constructed by calculating the average image of a set of well-lit images – one for each subject – which gives, in our experiments, better results than using a single well-lit image for the whole image set.
The complexity of matching the histogram of the input image to that of the reference image is O(L), where L is the number of bins in the histogram (256 for an 8-bit grayscale image). Applying the new histogram to the input image takes O(N × M), where N and M are the image dimensions. This makes the complexity of the whole HM process O(N × M). A sketch of the matching procedure is given below.
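The following is a minimal NumPy-only sketch, following equations 5.1–5.3, of matching an input image to the histogram of a reference image; the variable names are illustrative and this is not the thesis implementation.

import numpy as np

def histogram_match(input_img, reference_img, levels=256):
    """Match the histogram of an 8-bit input image to that of a reference image."""
    h, _ = np.histogram(input_img, bins=levels, range=(0, levels))
    g, _ = np.histogram(reference_img, bins=levels, range=(0, levels))
    # F_{H->U} and F_{G->U}: normalized cumulative histograms (eqs. 5.1, 5.2)
    f_hu = np.cumsum(h) / max(h.sum(), 1)
    f_gu = np.cumsum(g) / max(g.sum(), 1)
    # F_{U->G}: invert F_{G->U} by linear interpolation over its defined points
    f_ug = np.interp(f_hu, f_gu, np.arange(levels))
    # F_{H->G}(i) = F_{U->G}(F_{H->U}(i)), applied as a lookup table (eq. 5.3)
    lut = np.clip(np.round(f_ug), 0, levels - 1).astype(np.uint8)
    return lut[input_img]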

5.4 Image Enhancement Methods
The principal objective of image enhancement is to process the original image so that it becomes more suitable for the recognition process. Many image enhancement methods are available in the literature; usually, a number of trial-and-error experiments are required before a particular image enhancement method is selected [74]. In this study, four image enhancement methods are chosen. Three of them are common in the literature, namely histogram equalization, log transformation and gamma correction, while the fourth, the compression function of the Retinal filter [73], is newly suggested as an image enhancement method in this study.

5.4.1 Histogram Equalization (HE)


Histogram equalization is one of the most common image enhancement methods [74]. It aims to create an image with a uniform distribution over the whole brightness scale by using the cumulative density function of the image as a transfer function. Thus, for an image of size M × N with G gray levels and cumulative histogram H(g), the transfer function T(g) at a certain level g is given by:

T(g) = \frac{H(g) \times (G - 1)}{M \times N}    (5.4)
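A minimal NumPy sketch of this transfer function, added here for illustration and assuming an 8-bit grayscale input, is:

import numpy as np

def histogram_equalize(img, levels=256):
    """Histogram equalization following eq. 5.4: the scaled cumulative histogram
    is used as the transfer function T(g)."""
    hist, _ = np.histogram(img, bins=levels, range=(0, levels))
    cumulative = np.cumsum(hist)                        # cumulative histogram H(g)
    transfer = cumulative * (levels - 1) / img.size     # T(g) = H(g) * (G - 1) / (M * N)
    return transfer.astype(np.uint8)[img]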

5.4.2 Log Transformation (LOG)


LOG is a frequently used gray-scale transformation. It simulates the logarithmic sensitivity of the human eye to light intensity. The general form of the log transformation [74] is:

s = c log(1 + r)    (5.5)

where r and s are the old and new intensity values, respectively, and c is a gray-stretch parameter used to linearly scale the result into the range [0, 255]. The shape of the log curve in Fig.5.2 shows that this transformation maps a narrow range of dark input gray levels (shadows) into a wider range of output gray levels; the opposite is true for higher values of the input gray levels.

5.4.3 Gamma Correction (GAMMA)


Gamma correction is a technique commonly used in the field of Computer Graphics. It
concerns how to display an image accurately on a computer screen. Images that are not
properly corrected can look either bleached out, or too dark. Gamma correction can
control the overall brightness of an image by changing the Gamma parameter. The
general form of the gamma correction [74] is:

s = c r^{1/γ}    (5.6)

where r and s are the old and new intensity values, respectively, c is a gray-stretch parameter used to linearly scale the result into the range [0, 255] and γ is a positive constant. In our case, γ is chosen to be greater than 1 (empirically, it is set to four) in order to map a narrow range of dark input values (shadows) into a wider range of output values, with the opposite being true for higher values of input levels, as shown in Fig.5.2. Unlike the log transformation, the gamma correction has a family of possible transformation curves obtained simply by varying the value of γ.

Figure 5.2: Transformation functions of LOG and GAMMA (L: number of gray levels)

5.4.4 Compression Function of the Retinal Filter (COMP)


A Retinal filter [75] acts as the human retina by inducing a local smoothing of
illumination variations. It has been successfully used as an illumination normalization
step in the segmentation of facial features in [73], [76]. In this work, we try to use it as an
illumination normalization step in face recognition. However, our empirical results over
both Eigenface and Spectroface methods using an illuminated YALE B database show
that using the Retinal filter as an illumination normalization step is significantly affected
when the faces are not perfectly aligned. One possible reason is that the Retinal filter produces a non-realistic image that retains only the high frequencies (edges), which in turn may require the faces to be perfectly aligned, especially in the holistic-based approaches. Therefore, in this study, we use only the compression function of the Retinal filter as an image enhancement method, since it is applied globally and thus produces a more realistic image (for more details about the Retinal filter, see [75]).
Let G be a Gaussian filter of size 15 × 15 with standard deviation σ = 2 [73]. Let I_in be the input image and let I_G be the result of filtering I_in with G. The image X_0 is defined by:

X_0 = \frac{0.1 + 410 I_G}{105.5 + I_G}    (5.7)

The compression function C is then defined based on X_0:

C = \frac{(255 + X_0) I_{in}}{I_{in} + X_0}    (5.8)
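A minimal sketch of this compression function, assuming NumPy and OpenCV's Gaussian blur and using the 15 × 15, σ = 2 parameters quoted above, is:

import cv2
import numpy as np

def retinal_compression(img):
    """Compression function C of the Retinal filter (eqs. 5.7-5.8) on an 8-bit image."""
    i_in = img.astype(np.float64)
    i_g = cv2.GaussianBlur(i_in, (15, 15), 2)       # I_G: Gaussian-smoothed input
    x0 = (0.1 + 410.0 * i_g) / (105.5 + i_g)        # eq. 5.7
    c = (255.0 + x0) * i_in / (i_in + x0)           # eq. 5.8
    return np.clip(c, 0, 255).astype(np.uint8)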
Fig.5.3 shows the result of applying each of the four enhancement methods on a face with
non-uniform illumination.
Figure 5.3: Effect of the four enhancement methods (HE, LOG, GAMMA, COMP) on an illuminated face

5.5 The Enhanced HM Approaches


A total of 40 different enhancement combinations, obtained by combining HM [74] with the different enhancement methods, are considered and compared in this study in order to improve on the results of applying HM alone [77]. As stated in section 5.3, our reference image for HM is constructed by calculating the average image of a set of well-lit images – one for each subject – which gives, in our experiments, better results than using a single well-lit image. Each of the four enhancement methods is applied in three different ways: 1) after the HM, 2) before the HM, 3) further enhancing the results of 1 and 2.

5.5.1 Enhancement After HM


Each of the image enhancement methods discussed in section 5.4 is applied on the result of HM in order to enhance it, as shown in Fig.5.4. This gives us four combinations, denoted HM-HE, HM-LOG, HM-GAMMA and HM-COMP, corresponding to applying HE, LOG, GAMMA and COMP, respectively, on the result of HM. Fig.5.5 shows the effect of these combinations on an illuminated face.

Figure 5.4: Block diagram of applying the image enhancement method after the HM (the input image is histogram matched to the average well-lit reference image; the matched image is then passed through the image enhancement step to give the output image)

Figure 5.5: Effects of applying the image enhancement methods after applying the HM (illuminated input, HM, HM-HE, HM-LOG, HM-GAMMA, HM-COMP)

5.5.2 Enhancement Before HM


In contrast to the approach in 5.5.1, each of the image enhancement methods is applied on the reference image before matching the input image to it; see Fig.5.6. This gives us another four combinations, denoted HE-HM, LOG-HM, GAMMA-HM and COMP-HM, corresponding to applying HE, LOG, GAMMA and COMP, respectively, on the reference image. Fig.5.7 shows the effect of these combinations on an illuminated face.

Figure 5.6: Block diagram of applying the image enhancement method before the HM (the average well-lit reference image is first passed through the image enhancement step; the input image is then histogram matched to the enhanced reference image to give the output image)

Figure 5.7: Effects of applying the image enhancement methods before applying the HM (illuminated input, HM, HE-HM, LOG-HM, GAMMA-HM, COMP-HM)

5.5.3 Further Enhancement
Here, we further enhance the result of each combination using each of the four enhancement methods, which gives us 8 × 4 = 32 additional combinations. Fig.5.8 shows a block diagram of this further enhancement. The effects of further enhancement on both the HM-GAMMA and GAMMA-HM combinations using each of the four enhancement methods are illustrated in Fig.5.9.

Figure 5.8: Block diagram showing the further enhancement of the combinations in 5.5.1 and 5.5.2 (input image → enhancement approach of 5.5.1/5.5.2 → enhanced image → image enhancement → output image)

Figure 5.9: Effects of further enhancement on both the HM-GAMMA and GAMMA-HM combinations using each of the four enhancement methods (HE, GAMMA, LOG, COMP)

5.6 Verification of the Selection Conditions


As stated in section 5.5, we have 40 different enhancement combinations resulting from combining HM with different enhancement methods in order to improve on the results of applying HM alone. As stated previously in section 5.2, the proposed approach is chosen from these combinations based on the increase in recognition rates over using HM alone, regardless of the following conditions:
1. The face recognition approach that the normalization approach is applied with,
2. The face alignment within the image,
3. The number of training images, and the degree of illumination within these images.
This ensures both the flexibility of the proposed approach among different face recognition approaches and the ability to apply it to practical/real-life systems in which perfect alignment of faces is difficult to achieve.
We use the Yale B database [64] – frontal images only – as described in section 4.3.2 for studying and comparing the 40 enhancement combinations.
In order to verify the first condition, each of the 40 enhancement combinations is applied with the two face recognition methods, Eigenface and Spectroface, representing the two broad holistic-based categories, Eigenspace-based and frequency-based respectively. The better enhancement combination is the one that always improves the recognition results in both methods.
To verify the second condition, all images are cropped in two different ways to include only the head portion:
1. Automatic cropping using the face detection function in the Intel OpenCV library [116] to produce a non-aligned version of the database, which we call YALE B-AUTO.
2. Manual cropping using the landmark coordinates available on the Yale B website [119] to produce an aligned version, which we call YALE B-MANU.
These two versions, shown in Fig.5.10, allow us to test the robustness of each enhancement combination against geometrical changes of the faces within the images. The better enhancement combination is the one that always improves the recognition results whether or not the faces inside the images are aligned.

Figure 5.10: Sample faces from the Yale B database – automatically cropped (YALE B-AUTO) and manually cropped (YALE B-MANU)
To verify the third condition, all 25 different training cases described in section 4.4.3 are used in the testing with this database, as shown in Table-5.1, in which the normal image is common to all cases. These training cases are chosen to cover both the training with each elementary subset – namely subsets 1, 2, 3, and 4 – and the training with the seven combinations of these subsets, where subset 1 is essential in all of them as it contains the lowest illumination. Each elementary subset is composed of training by the normal image and either the vertical, the horizontal or both lightings, while each combination is composed of training by the normal image and either the vertical lighting or the vertical and horizontal lighting.
These training varieties help us to test the robustness of each enhancement combination against the number of training images and the changes in the illumination direction of these images. The better enhancement combination is the one that always increases the recognition rates regardless of the training case.
Table 5.1: The 25 different training cases used in testing
Elementary Subsets
Subsets  Training Case (train. images/subject)
1        nor only
1        nor + 2 ver
2        nor + 2 ver
2        nor + 2 hor
2        nor + 2 ver + 2 hor
3        nor + 2 ver
3        nor + 2 hor
3        nor + 2 ver + 2 hor
4        nor + 2 ver
4        nor + 2 hor
4        nor + 2 ver + 2 hor

Seven Combinations
Subsets      Training Case (train. images/subject)
1, 2         nor + 4 ver
1, 2         nor + 4 ver + 2 hor
1, 3         nor + 4 ver
1, 3         nor + 4 ver + 2 hor
1, 4         nor + 4 ver
1, 4         nor + 4 ver + 2 hor
1, 2, 3      nor + 6 ver
1, 2, 3      nor + 6 ver + 4 hor
1, 2, 4      nor + 6 ver
1, 2, 4      nor + 6 ver + 4 hor
1, 3, 4      nor + 6 ver
1, 3, 4      nor + 6 ver + 4 hor
1, 2, 3, 4   nor + 8 ver
1, 2, 3, 4   nor + 8 ver + 6 hor

(nor: normal, ver: vertical, hor: horizontal)

5.7 Experimental Results


The aim of these experiments is to choose the best enhancement combination among the 40 different combinations described in section 5.5, according to the selection conditions stated in section 5.6. Thus, each combination is applied four different times, corresponding to the Eigenface and Spectroface methods over the YALE B-AUTO and YALE B-MANU versions.
Each time, a combination is tested over the 25 training cases, its average recognition rate is calculated, and the result is compared with the one obtained by applying HM alone. The best enhancement combination is the one that increases the recognition rates over applying HM alone in all of the following:
1. Both face recognition methods (Eigenface and Spectroface),
2. Both the aligned and non-aligned versions (YALE B-MANU and YALE B-AUTO),
3. All 25 training cases.
The first condition ensures the flexibility of the chosen combination among different face recognition approaches, while the second ensures its suitability for practical/real-life systems, in which perfect alignment of the faces inside the images is not a simple task. Finally, by requiring an increase in recognition rates in all 25 training cases, we prove that the chosen combination is affected neither by the number of training images nor by the changes in illumination direction of these images.
As described in section 5.5, 32 out of the 40 enhancement combinations are further-enhancement combinations. To see whether further enhancing the image leads to a further increase in the recognition rates, we plot, for each of the eight single enhancement combinations, the average difference in recognition rates from applying HM alone together with the differences achieved by further enhancing them using HE, GAMMA, LOG or COMP. Fig.5.11 shows this plot for the Eigenface method over the YALE B-AUTO database. Fig.5.12 is dedicated to the Eigenface method over YALE B-MANU, while Figures 5.13 and 5.14 are dedicated to the Spectroface method over YALE B-AUTO and YALE B-MANU, respectively.
We summarize the results of the further enhancement combinations in Table-5.2 as follows: for each of the four further enhancement options, corresponding to applying HE, GAMMA, LOG or COMP after the eight single enhancement combinations, we count how many times the further enhancement leads to an increase in the average difference in recognition rates over the eight single enhancement combinations.

Figure 5.11: Eigenface method over YALE B-AUTO: Effects of further enhancement over the eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP

Figure 5.12: Eigenface method over YALE B-MANU: Effects of further enhancement over the eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP

Figure 5.13: Spectroface method over YALE B-AUTO: Effects of further enhancement over the eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP

Figure 5.14: Spectroface method over YALE B-MANU: Effects of further enhancement over the eight single enhancements using (a) HE, (b) LOG, (c) GAMMA and (d) COMP
Table 5.2: The number of combinations that lead to an increase in the recognition rates after using each of the enhancement methods for further enhancement
Face Recognition Method   Database       HE (of 8)   GAMMA (of 8)   LOG (of 8)   COMP (of 8)
Eigenface                 YALE B-AUTO    0           5              5            8
Eigenface                 YALE B-MANU    0           0              2            8
Spectroface               YALE B-AUTO    0           1              0            5
Spectroface               YALE B-MANU    1           0              0            5
It is clear from Table-5.2 that further enhancing the image using any of the three traditional enhancement methods – namely HE, GAMMA and LOG – does not lead to further improvement in the recognition rates of the Eigenface and Spectroface methods, especially on the YALE B-MANU version (see the second and fourth rows). Only COMP leads to further improvement in the recognition rates of both face recognition methods over the two database versions. For clarification, for the Spectroface method over YALE B-MANU (last row in Table-5.2), when applying HE as further enhancement after each of the eight single combinations, only one of these combinations gets a further increase in its average recognition rate. When applying either GAMMA or LOG as further enhancement, none of the eight single combinations gets a further increase in its average recognition rate. On the other hand, when applying COMP as further enhancement, five out of the eight single combinations achieve higher average recognition rates than they did before applying it.
As a result, only five out of the 40 enhancement combinations satisfy the three previously mentioned conditions; their effect is shown in Fig.5.15:
1. GAMMA-HM, where gamma correction is applied before HM.
2. GAMMA-HM-COMP, where gamma correction is applied before HM and the result is then further enhanced by applying the compression function.
3. HE-HM-COMP, where equalization is applied before HM and the result is then further enhanced by applying the compression function.
4. COMP-HM-COMP, where the compression function is applied before HM and the result is then further enhanced by applying it again.
5. HM-HE-COMP, where equalization is applied after HM and the result is then further enhanced by applying the compression function.

Figure 5.15: Effects of the five enhancement combinations that satisfy the three conditions (illuminated input; GAMMA-HM, GAMMA-HM-COMP, HE-HM-COMP, COMP-HM-COMP, HM-HE-COMP)
Table-5.3 and Table-5.4 show the results of using these combinations with the Eigenface and Spectroface methods, respectively, over both versions of the Yale B database. In addition to the results over the 25 training cases, the last two rows show the average recognition rate of each combination over these training cases and the difference between it and the average recognition rate of applying HM alone.
It appears from Table-5.3 that, with the Eigenface method, the second enhancement combination, namely GAMMA-HM-COMP, gives the best average difference from HM alone (see the last row) among the five combinations on both database versions. For the Spectroface method, Table-5.4 shows that there are no significant differences between the five combinations on either database version; the improvement in recognition rates ranges from 3.7% to 4.2% on YALE B-AUTO and from 6.6% to 7.4% on YALE B-MANU.
As a result, we choose the GAMMA-HM-COMP combination as the best enhancement combination among the 40 different combinations according to the criteria stated above.

Complexity of the Proposed Approach


The GAMMA-HM-COMP approach is based on applying three consecutive steps, namely GAMMA, HM and the compression function of the Retinal filter. For an N × N image, both GAMMA and HM take O(N²). Since the compression function is based on Gaussian filtering of the image by applying a 1D Gaussian filter twice, it takes O(N² × k), where k is the mask size. But since the mask size is fixed and equal to 15 in our case [73], the overall complexity of the GAMMA-HM-COMP approach remains O(N²), which is equal to the complexity of using HM alone.

Table 5.3: Results of using the best five combinations with the Eigenface method over the two
versions of the database. Average recognition rate is calculated over the 25 different training cases.
(The best average differences are italic)
(1: GAMMA-HM, 2: GAMMA-HM-COMP, 3: HE-HM-COMP, 4: COMP-HM-COMP, 5: HM-HE-
COMP, nor: normal, ver: vertical, hor: horizontal)

Subsets   Training Case   |   YALE B-AUTO: HM  1  2  3  4  5   |   YALE B-MANU: HM  1  2  3  4  5
nor only 49.0 61.6 63.2 60.3 57.7 58.7 74.8 81 89.4 84.5 83.2 83.5
1
nor + 2 ver 63.6 68.7 70 67.7 68.1 68.4 89 95.2 95.8 94.5 94.5 94.5
nor + 2 ver 67.1 73.2 74.5 71 71.3 71.9 88.1 96.8 96.5 94.8 95.2 94.8
2 nor + 2 hor 58.7 66.1 68.1 65.2 62.9 65.2 82.6 89.7 92.6 90.6 89.4 90.6
nor + 2 ver + 2 hor 66.8 74.5 74.5 73.2 72.6 73.2 87.7 95.8 96.8 95.5 94.8 95.5
nor + 2 ver 68.7 74.2 73.9 73.2 72.6 72.9 93.5 97.7 98.7 95.8 96.5 95.8
3 nor + 2 hor 63.6 72.9 72.9 71.6 69.7 71.6 87.7 92.3 93.9 89 90.6 89
nor + 2 ver + 2 hor 67.1 73.9 74.8 74.2 73.2 74.5 89.4 96.1 96.5 94.8 95.5 94.8
nor + 2 ver 55.8 60.7 61.9 58.1 57.7 57.4 94.5 96.8 98.1 97.1 97.1 97.1
4 nor + 2 hor 54.2 67.4 68.4 63.5 61.6 63.2 76.8 91 93.9 90.3 88.1 89.4
nor + 2 ver + 2 hor 56.8 64.5 65.5 63.5 62.3 63.5 87.1 95.2 95.2 95.2 94.2 95.2
nor + 4 ver 69.7 74.5 78.7 74.5 72.6 74.2 90.3 96.5 96.5 95.2 95.5 95.2
1, 2
nor + 4 ver + 2 hor 69.4 75.5 76.5 74.2 75.2 74.2 87.1 96.5 96.8 95.2 94.8 95.2
nor + 4 ver 73.9 77.1 76.5 75.8 76.5 75.8 95.8 98.4 98.1 97.4 97.4 97.4
1, 3
nor + 4 ver + 2 hor 70.0 77.4 78.7 75.8 73.9 75.8 91 96.5 97.4 96.8 96.8 96.8
nor + 4 ver 64.2 67.1 67.1 65.5 64.8 65.2 96.8 98.1 98.4 97.7 97.7 97.7
1, 4
nor + 4 ver + 2 hor 64.5 70.3 71 68.1 67.4 68.1 93.5 96.8 96.5 96.5 95.8 96.5
1, 2, nor + 6 ver 75.8 79.4 79.4 78.4 78.4 78.4 96.1 98.1 98.4 97.4 97.7 97.4
3 nor + 6 ver + 4 hor 74.2 79.7 79.4 79.4 77.1 79.4 90 96.5 97.4 96.8 95.8 96.8
1, 2, nor + 6 ver 66.8 71.6 72.9 69.7 70 69 97.7 98.7 98.7 99 98.7 98.7
4 nor + 6 ver + 4 hor 67.7 73.2 74.8 72.9 72.6 72.6 94.5 96.8 96.8 97.4 96.8 97.4
1, 3, nor + 6 ver 72.6 76.5 76.1 74.5 74.5 74.2 98.1 98.1 98.7 99.7 99.4 99.7
4 nor + 6 ver + 4 hor 71.6 76.5 77.7 76.1 76.1 75.5 94.5 96.5 96.5 96.5 96.1 96.8
1,2,3 nor + 8 ver 74.8 77.4 78.1 76.8 75.2 76.8 99 99 99 100 99.4 100
,4 nor + 8 ver + 6 hor 74.5 79.0 79 78.4 76.5 78.7 94.5 97.1 97.1 97.1 96.8 97.1
Average Recognition Rate    66.4 72.5 73.3 71.2 70.4 71.1    90.8 95.6 96.5 95.4 95.1 95.3
Average Difference          -    6.1  6.9  4.8  4.0  4.7     -    4.8  5.7  4.6  4.3  4.5

Table 5.4: Results of using the best five combinations with the Spectroface method over the two
versions of the database. Average recognition rate is calculated over the 25 different training cases.
(The best average differences are italic)
(1: GAMMA-HM, 2: GAMMA-HM-COMP, 3: HE-HM-COMP, 4: COMP-HM-COMP, 5: HM-HE-
COMP, nor: normal, ver: vertical, hor: horizontal)

Subsets   Training Case   |   YALE B-AUTO: HM  1  2  3  4  5   |   YALE B-MANU: HM  1  2  3  4  5
nor only 56.8 62.6 62.9 62.6 63.2 62.6 61.9 69 67.7 69 72.3 70.3
1
nor + 2 ver 68.7 73.2 70.7 75.8 72.6 74.8 73.6 81.3 76.1 80.3 80.7 79.7
nor + 2 ver 71.3 77.4 78.7 77.4 78.1 77.1 76.1 84.2 81.3 86.1 84.8 86.5
2 nor + 2 hor 62.6 66.1 67.7 67.4 66.8 66.8 68.4 74.5 75.5 74.5 78.7 75.5
nor + 2 ver + 2 hor 73.6 78.1 81.0 78.1 78.7 77.7 81.6 85.2 82.6 88.7 87.7 89
nor + 2 ver 72.6 76.8 77.4 77.1 78.1 76.8 80.7 89.7 92.9 89.7 89.7 89.7
3 nor + 2 hor 61.9 64.8 65.5 64.8 66.5 65.2 67.1 74.5 74.8 73.9 77.4 74.2
nor + 2 ver + 2 hor 76.5 78.4 80.3 80.0 79.7 80 85.5 91.6 94.2 92.6 93.2 92.3
nor + 2 ver 67.4 73.6 72.9 72.3 71.6 72.3 80 86.8 88.4 86.5 88.4 86.5
4 nor + 2 hor 56.8 63.2 63.9 62.6 62.6 62.9 63.2 71 74.2 70 73.6 71
nor + 2 ver + 2 hor 67.4 71.9 73.9 71.6 71.0 71.3 80 87.4 89.7 85.5 87.4 85.8
nor + 4 ver 73.6 78.1 78.1 78.4 79.4 78.7 77.4 85.8 81.9 87.1 85.8 87.7
1, 2
nor + 4 ver + 2 hor 74.8 78.4 80.0 78.7 79.7 79.4 81.6 86.1 83.2 88.7 88.1 89.4
nor + 4 ver 77.4 80.7 81.0 80.7 81.3 80.7 83.6 91.6 93.2 91.6 90.7 91
1, 3
nor + 4 ver + 2 hor 82.3 83.9 83.6 83.2 83.6 83.2 87.7 93.2 94.5 93.9 94.2 93.6
nor + 4 ver 74.2 77.4 77.1 79.4 77.4 79.4 85.2 91.9 91.3 91 91.6 90.7
1, 4
nor + 4 ver + 2 hor 73.9 76.5 77.1 78.7 77.1 78.4 85.2 92.6 92.6 90.3 90.7 90
1, 2, nor + 6 ver 77.7 80.0 80.7 80.7 83.6 81.9 82.6 92.3 93.9 91.9 91 91.6
3 nor + 6 ver + 4 hor 82.9 83.9 82.9 82.9 84.5 84.2 87.7 93.6 94.5 93.9 94.2 93.9
1, 2, nor + 6 ver 75.5 79.7 80.0 80.7 80.0 81 85.5 93.2 94.2 91.3 91.9 91.3
4 nor + 6 ver + 4 hor 77.4 80.0 81.6 81.3 80.7 82.3 89.7 94.2 95.8 92.3 93.6 91.9
1, 3, nor + 6 ver 78.7 82.6 82.9 82.6 81.9 82.9 85.8 92.9 93.9 92.3 92.3 91.9
4 nor + 6 ver + 4 hor 83.6 86.1 87.1 86.5 86.5 87.1 90 94.8 95.5 93.6 94.5 93.9
1,2,3 nor + 8 ver 78.7 82.6 82.6 82.3 82.9 82.9 85.2 93.6 94.5 92.6 92.9 92.6
,4 nor + 8 ver + 6 hor 83.9 85.8 86.1 85.2 86.1 86.1 90.3 94.8 95.8 93.9 94.8 93.9
Average Recognition Rate    73.2 76.9 77.4 77.2 77.3 77.4    80.6 87.4 87.7 87.3 88.0 87.4
Difference of Averages      -    3.7  4.2  4.0  4.1  4.2     -    6.8  7.1  6.6  7.4  6.7

5.8 Summary
Many illumination normalization approaches have been proposed in the literature; they can be classified into two categories: model-based and image-processing based approaches. The image-processing based approaches are more commonly used in practical systems for their simplicity and efficiency.
Although most illumination normalization approaches can cope well with illumination variation, some may negatively affect images without illumination variation. In addition, some approaches show large differences in performance when combined with different face recognition approaches. Other approaches require perfect alignment of the face within the image, which is difficult to achieve in practical/real-life systems.
This chapter introduces a new image-processing based illumination normalization approach based on enhancing the image resulting from histogram matching using the gamma correction and the Retinal filter's compression function, which we call the GAMMA-HM-COMP approach. It is based on three consecutive steps (a sketch of the whole pipeline is given after this list):
1. Applying the gamma correction to the reference average well-lit image,
2. Histogram matching the input image to the result of step 1,
3. Applying the Retinal filter's compression function to further enhance the result of step 2.
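The following is a minimal sketch of the GAMMA-HM-COMP pipeline, reusing the histogram_match and retinal_compression helpers sketched in sections 5.3 and 5.4.4; gamma = 4 follows the empirical choice stated in section 5.4.3, and the reference image is the average of one well-lit image per subject.

import numpy as np

def gamma_correct(img, gamma=4.0):
    """Gamma correction (eq. 5.6) with linear stretching of the result to [0, 255]."""
    s = img.astype(np.float64) ** (1.0 / gamma)
    s = 255.0 * (s - s.min()) / max(s.max() - s.min(), 1e-12)
    return s.astype(np.uint8)

def gamma_hm_comp(input_img, well_lit_images):
    """GAMMA-HM-COMP: gamma-correct the reference, histogram match to it, then COMP."""
    reference = np.mean(np.stack(well_lit_images), axis=0).astype(np.uint8)
    enhanced_ref = gamma_correct(reference)               # step 1: GAMMA on the reference
    matched = histogram_match(input_img, enhanced_ref)    # step 2: HM to the enhanced reference
    return retinal_compression(matched)                   # step 3: COMP on the matched image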
Among the 40 different enhancement combinations, the GAMMA-HM-COMP approach proves its flexibility among different face recognition approaches and its independence from face alignment. This makes it suitable for practical/real-life systems, as it can be used with different face recognition approaches and does not need any pre-assumptions or constraints concerning face alignment. The results show that GAMMA-HM-COMP leads to an average increase in recognition rates over HM alone ranging from 4% to 7% for the Eigenface and Spectroface methods using the aligned and non-aligned versions of the Yale B database.
Moreover, in this study, the compression function of the Retinal filter is newly applied as an image enhancement method. It proves more suitable for further enhancement than the other three traditional enhancement methods, which are histogram equalization, gamma correction and log transformation.
In the following chapter, we evaluate the proposed illumination normalization approach (GAMMA-HM-COMP) together with other best-of-literature approaches, introduced previously in Chapter 3, over images with illumination variation and images with other facial and geometrical variations, using the two selected face recognition methods.

CHAPTER 6: Evaluate the Proposed Approach
6.1 Introduction
The aim of this chapter is to establish comparative studies between the proposed illumination normalization approach and the best-of-literature approaches, over images with illumination variation and images with other facial and geometrical variations, using the two selected face recognition methods. This allows us to test which of these approaches is flexible with respect to different face recognition approaches, which is independent of face alignment, and which has fewer side-effects on variations other than illumination.
As introduced previously in Chapter 3, there are seven best-of-literature approaches selected among 38 different illumination normalization approaches based on surveying nine different comparative studies. Here we choose four of these seven approaches to compare with the proposed approach. The chosen approaches are:
1. Single Scale Retinex with Histogram Matching (SSR-HM).
2. Local Normal Distribution (LNORM).
3. Local Binary Patterns (LBP).
4. Preprocessing Chain Approach (CHAIN).
The detailed descriptions of these approaches are introduced previously in Chapter 3.
The rest of this chapter is organized as follows: section 2 describes the implementation parameters of the four approaches and the proposed one, in addition to the differences in results between our implementation of some of these approaches and the published ones. The comparisons between the four approaches and the proposed one over images with illumination variations and over images with other facial and geometrical variations are introduced in sections 3 and 4, respectively. Finally, the chapter summary is presented in section 5.

6.2 Implementation of the Compared Approaches


In all experiments throughout this thesis, only two preprocessing steps are applied to all face images in both the Eigenface and Spectroface methods:
1. Convert each image to grayscale.
2. Resize each image to a fixed size of 128 × 128.
In the following subsections, we describe the implementation parameters of each of the
four approaches and the proposed one in addition to the difference in results between our
implementation of some of these approaches and the published ones. Please refer to
chapter 3 for detailed descriptions about each of the four approaches and chapter 5 for the
proposed approach.

6.2.1 Preprocessing Chain Approach (CHAIN)
As described previously in chapter 3, the CHAIN approach consists of four consecutive
steps, which are:
1. Gamma Correction.
2. Difference of Gaussian (DoG).
3. Masking.
4. Contrast Equalization.
Here we use the original implementation of the CHAIN approach by the authors of
[109], without the masking step. The original implementation can be found in [121]. Also,
we use the same default settings of the various parameters of the CHAIN approach, which
are summarized in Table-6.1. Moreover, the authors of [109] found that the CHAIN
approach gives similar results over a broad range of parameter settings, which greatly
facilitates the selection of parameters.
Table 6.1: Default parameter settings for the CHAIN approach
Procedure               Parameter   Value
Gamma Correction        γ           0.2
DoG Filtering           σ0          1
                        σ1          2
Contrast Equalization   α           0.1
                        τ           10
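For illustration, the following sketch outlines the three steps we apply (gamma correction, DoG filtering and contrast equalization, i.e. without masking) with the default parameters of Table-6.1. It reflects the standard formulation of this preprocessing chain as we understand it from [109]; it is a hedged sketch, not the original code of [121]:

import numpy as np
from scipy.ndimage import gaussian_filter

def chain(img, gamma=0.2, sigma0=1.0, sigma1=2.0, alpha=0.1, tau=10.0):
    # Work on intensities scaled to [0, 1]
    x = img.astype(np.float64) / 255.0
    # 1. Gamma correction (power-law transform)
    x = np.power(x, gamma)
    # 2. Difference of Gaussian (DoG) band-pass filtering
    x = gaussian_filter(x, sigma0) - gaussian_filter(x, sigma1)
    # 3. Two-stage contrast equalization followed by tanh compression
    x = x / np.power(np.mean(np.abs(x) ** alpha), 1.0 / alpha)
    x = x / np.power(np.mean(np.minimum(np.abs(x), tau) ** alpha), 1.0 / alpha)
    return tau * np.tanh(x / tau)

Note that the output of such a chain lies roughly in the range [-τ, τ], which is why the sliding step discussed next is needed before the Spectroface method.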
We use the output of the CHAIN approach as it is, without normalizing it to [0-255].
However, in the Spectroface method, we shift the whole grayscale range of the resulting
image to positive values by adding a fixed value (equal to 15) to all pixels. This sliding
gives much better results on the YALE B database with its two versions (YALE B-AUTO
& YALE B-MANU), as shown in Table-6.2.
Table 6.2: Results of applying CHAIN with and without sliding on the Spectroface method on both
versions of the YALE B database
Database        CHAIN without sliding   CHAIN with sliding
YALE B-AUTO     31.4%                   72.2%
YALE B-MANU     59.1%                   96.5%

The possible reason behind this is that when we shift the whole range to positive values,
the DC component of the FFT is guaranteed to have the maximum magnitude over all
other components. Thus, when we normalize the FFT magnitudes by dividing by the DC
component to remove the scaling factor (refer to section 4.2.2), the relative consistency
between the FFT magnitudes is preserved within each image and across all other images
(since the DC component is always the maximum in all images). On the other hand, if we
do not shift the whole range to positive values, the maximum magnitude may appear at
different frequency locations for different images, even for images of the same person.
So, when we divide all FFT magnitudes by it for normalization, the consistency between
the FFT magnitudes is preserved within the image itself but differs across other images.
This leads to misclassification even between images of the same person, due to the
different location of the maximum value used for normalization.
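This can be illustrated with a short sketch (hedged; the offset of 15 is the value used above for CHAIN, and the normalization step is the one referred to in section 4.2.2):

import numpy as np

def normalized_fft_magnitude(img, offset=15.0):
    # Sliding: add a fixed positive offset so all pixel values are non-negative
    f = np.fft.fft2(img + offset)
    mag = np.abs(f)
    # Divide by the DC component (index (0, 0)) to remove the scaling factor
    return mag / mag[0, 0]

When the shifted image is non-negative, the DC magnitude equals the sum of all pixel values and is therefore guaranteed to be the largest magnitude, so dividing by it gives a consistent scale normalization across images; without the shift, the maximum magnitude may lie at a different frequency in each image.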

6.2.2 Local Normal Distribution (LNORM)


Here we use our implementation of the LNORM with a window size of 7 × 7, as we found
it more suitable for the 128 × 128 image size than the 5 × 5 window originally used by
the authors of [72] for images of size 75 × 85.
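A hedged sketch of this local normalization, assuming LNORM makes each 7 × 7 neighbourhood approximately zero-mean and unit-variance (our reading of [72]; the exact formulation is the one described in chapter 3):

import numpy as np
from scipy.ndimage import uniform_filter

def lnorm(img, win=7, eps=1e-6):
    # Local mean and local standard deviation over a win x win window
    x = img.astype(np.float64)
    local_mean = uniform_filter(x, size=win)
    local_sqmean = uniform_filter(x * x, size=win)
    local_std = np.sqrt(np.maximum(local_sqmean - local_mean ** 2, 0.0))
    # Normalize each pixel with respect to its own neighbourhood
    return (x - local_mean) / (local_std + eps)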
To test whether there is a difference between our implementation and the original one, we
re-implement some of the original experiments of the LNORM published in [72] using the
same recognition approach, Euclidean distance, the same image size, 75 × 85, and the
best window size, 5 × 5. The results in Table-6.3 show that there is no significant
difference between our implementation of the LNORM and the original one: 0.4% in
Extended Yale B and 0.6% in Yale B. These small differences may be due to the
difference in the selection of the five training images from subset 1 (S1) in Extended Yale B
and the randomization of the five training images in Yale B.
Table 6.3: Difference between our implementation of the LNORM and the original one
Normaliz.   Database                   Description          Original Results    Our Results         Differ-
Approach                                                    (Recog. / %)        (Recog. / %)        ence
LNORM 5×5   Extended Yale B            train: 5 from S1,    Euclidean / 97.3    Euclidean / 97.7    +0.4%
                                       test: remain 59
LNORM 5×5   Yale B – pure faces        train: 5 from S3,    Euclidean / 99.4    Euclidean / 100     +0.6%
            (average of 10 random)     test: remain 59

We use the output of the LNORM approach as it is, without normalizing it to [0-255].
However, in the Spectroface method, we shift the whole grayscale range of the resulting
image to positive values by adding a fixed value (equal to 5) to all pixels, as we did before
for CHAIN. This gives much better results on the YALE B database with its two versions
(YALE B-AUTO & YALE B-MANU), as shown in Table-6.4.
Table 6.4: Results of applying LNORM with and without sliding on the Spectroface method on both
versions of the YALE B database
Database        LNORM without sliding   LNORM with sliding
YALE B-AUTO     21.7%                   67.1%
YALE B-MANU     48.1%                   95.6%

6.2.3 Single Scale Retinex with Histogram Matching (SSR-HM)


Here we use our implementation of the SSR-HM with σ = 4, as the authors of [66]
conclude that the illumination correction is best at Retinex scales between σ = 2 and σ =
6. Moreover, the reference image for HM is constructed by calculating the average image
of a set of well-lit images – one for each subject – rather than using a single well-lit image
for the whole image set.
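A hedged sketch of this configuration, assuming the usual single-scale Retinex form R = log I − log(Gσ ∗ I) followed by histogram matching to the average well-lit reference (the quantile-based matching shown here is an illustration and is not necessarily identical to the implementation of [66]):

import numpy as np
from scipy.ndimage import gaussian_filter

def single_scale_retinex(img, sigma=4.0):
    # R = log(I) - log(Gaussian_sigma * I); shift by 1 to avoid log(0)
    x = img.astype(np.float64) + 1.0
    return np.log(x) - np.log(gaussian_filter(x, sigma))

def histogram_match(src, reference):
    # Map the grey-level distribution of src onto that of reference
    # by matching their empirical quantiles
    s_vals, s_idx, s_counts = np.unique(src.ravel(), return_inverse=True,
                                        return_counts=True)
    r_vals, r_counts = np.unique(reference.ravel(), return_counts=True)
    s_quant = np.cumsum(s_counts) / src.size
    r_quant = np.cumsum(r_counts) / reference.size
    matched = np.interp(s_quant, r_quant, r_vals)
    return matched[s_idx].reshape(src.shape)

# reference: built from well-lit images, one per subject (see text)
# normalized = histogram_match(single_scale_retinex(img, sigma=4.0), reference)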
To test whether there is a difference between our implementation and the original one, we
re-implement the original experiments of the SSR-HM published in [66] using the best
sigma of the published work (σ = 2), but with Eigenface as the recognition approach rather
than the SVM used in [66]. The results in Table-6.5 show that there is no
significant difference between our implementation and the original one, except in one
experiment. These differences may be due to the randomization of the training images
and/or the different recognition approach used in our experiments (Eigenface rather than
SVM).
Table 6.5: Difference between our implementation of the SSR-HM and the original one
Normaliz.   Database                     Description                 Original Results    Our Results         Differ-
Approach                                                             (Recog. / %)        (Recog. / %)        ence
SSR→HM      Yale B – head                train: 1 from S1,           SVM / 99.0          Eigenface / 100     +1.0%
(σ = 2)                                  test: remain 63
SSR→HM      Yale B – head                train: 1 from any subset,   SVM / 90.2          Eigenface / 98.2    +8.0%
(σ = 2)     (average of 20 random        test: remain 63
            from the whole database)
SSR→HM      Yale B – head                train: 2 from any subset,   SVM / 99.8          Eigenface / 99.7    -0.1%
(σ = 2)     (average of 20 random        test: remain 62
            from the whole database)

6.2.4 Local Binary Patterns (LBP)


Here we modify the implementation of the LBP found in [121] to be similar to the one
used by the authors of [80]. We use the same parameters that are used in the original
implementation (i.e. applying the LBP operator on 8 equally spaced neighborhood pixels on
a circle of radius 2).
Since the original implementation of the LBP is used in [80] for face authentication rather
than face recognition, as in our case, we are unfortunately not able to compare the results
of both implementations here.
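For illustration, an equivalent operator with this parameterization is available in scikit-image; a hedged sketch is shown below, although our actual implementation is the modified code from [121] and may differ in details such as border handling:

from skimage.feature import local_binary_pattern

def lbp_preprocess(gray_img):
    # Replace each pixel by its LBP code computed from 8 equally spaced
    # neighbours sampled on a circle of radius 2
    return local_binary_pattern(gray_img, P=8, R=2, method='default')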

6.2.5 Proposed Approach (GAMMA-HM-COMP)


As stated previously in chapter 5, the proposed illumination normalization approach
consists of applying three consecutive steps, which are:
1. Gamma correction,
2. Histogram matching,
3. Compression function of the Retinal filter.
The detailed description of the proposed approach can be found in chapter 5. Here we
quickly review the parameters used for this approach during the experiments.
First, the gamma value used in the first step is chosen to be equal to four, and the output of the
gamma correction is normalized to the range [0-255]. Second, the reference image for
HM is constructed by calculating the average image of a set of well-lit images – one for
each subject – rather than using a single well-lit image for the whole image set. Finally, the
parameters of the Gaussian filter used in the compression function of the Retinal filter are
taken from [73], which uses a Gaussian filter of size 15 × 15 with standard
deviation σ = 2.
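Putting these parameters together, a hedged sketch of the proposed pipeline is given below. It reuses a histogram_match routine such as the one sketched in section 6.2.3 and assumes the Retinal-filter compression takes the adaptive form C = (255 + X0)·I / (I + X0), with X0 the Gaussian-smoothed image; the exact definition we use is the one given in chapter 5 and [73]:

import numpy as np
from scipy.ndimage import gaussian_filter

def gamma_hm_comp(img, reference, gamma=4.0):
    # 1. Gamma correction with gamma = 4, rescaled back to [0, 255]
    x = np.power(img.astype(np.float64) / 255.0, gamma)
    x = 255.0 * (x - x.min()) / (x.max() - x.min() + 1e-6)
    # 2. Histogram matching against the average well-lit reference image
    #    (reference: the average image described in the text)
    x = histogram_match(x, reference)
    # 3. Compression function of the Retinal filter (assumed form); the local
    #    adaptation level X0 comes from a Gaussian with sigma = 2 whose
    #    truncation gives a 15 x 15 support
    x0 = gaussian_filter(x, sigma=2.0, truncate=3.5)
    return (255.0 + x0) * x / (x + x0 + 1e-6)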

6.3 Comparison on Illumination Variations


We use the Yale B database (frontal images only) to compare the five approaches on
illuminated face images. To test whether each approach requires perfect face alignment or
not, the comparison is applied on both the aligned and non-aligned versions of the Yale B,
namely YALE B-MANU and YALE B-AUTO (please refer to chapter 5 for descriptions
of both versions). On both versions, the comparison is done on all the 25 training cases
described previously in chapter 4, then the average recognition rate is calculated and used
as a comparison measure. The following are the comparison results on each version using
both the Eigenface and Spectroface methods.

6.3.1 Aligned Faces


Here we use the YALE B-MANU version. Table-6.6 shows the results of applying each
of the five illumination normalization approaches for the 25 training cases on each of the
Eigenface and the Spectroface methods. It also shows the results without applying any of
the five approaches. The average recognition rates over the 25 training cases are shown in
the last row of the table.
Fig.6.1 (a) and (b) show the difference between the average recognition rates before and
after applying each of the five approaches on Eigenface and Spectroface, respectively. It's
clear from the figure that the best illumination normalization approach on both the Eigenface
and Spectroface methods is the SSR-HM approach.

6.3.2 Non-Aligned Faces


Here we use the YALE B-AUTO version. Table-6.7 shows the results of applying each of
the five illumination normalization approaches for the 25 training cases on each of the
Eigenface and the Spectroface methods. It also shows the results without applying any of
the five approaches. The average recognition rates over the 25 training cases are shown in
the last row of the table.
Fig.6.2 (a) and (b) show the difference between the average recognition rates before and
after applying each of the five approaches on Eigenface and Spectroface, respectively.
It's clear from Fig.6.2 that the proposed approach is the best one on the Eigenface
method while the SSR-HM is the best one on the Spectroface method. Note the
significant decrease in the performance of the four best-of-literature approaches on
both methods when the images are not perfectly aligned. Moreover, both the LNORM and
CHAIN approaches bring a negative influence when they are used with the Eigenface method.
This means that these approaches require the images to be perfectly aligned, which is
difficult to achieve in practical/real-life systems.
Fig.6.3 (a) and (b) show the decrease in the performance of each approach due to the
non-alignment of faces on Eigenface and Spectroface, respectively (i.e. the difference
between the performance of each approach on YALE B-MANU and YALE B-AUTO).
It's clear that the approach least affected by the non-alignment of faces on both methods
is the proposed approach.
Table 6.6: Results of applying each of the five illumination normalization approaches with both
Eigenface and Spectroface methods over YALE B-MANU version. Average recognition rate is
calculated over the 25 different training cases.
(0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP, nor: normal, ver:
vertical, hor: horizontal)

Subsets     Training Case           Eigenface (0 / 1 / 2 / 3 / 4 / 5)        Spectroface (0 / 1 / 2 / 3 / 4 / 5)
1           nor only                62.9  100  100   100  100  89.4          51    79    74.5  89.4  88.7  68.1
1           nor + 2 ver             60.6  100  100   100  100  95.8          55.5  87.7  76.8  93.6  96.5  77.7
2           nor + 2 ver             66.8  100  100   100  100  96.5          61.3  91.9  82.6  97.1  97.4  82.9
2           nor + 2 hor             67.1  100  100   100  100  92.6          58.4  92.6  90.3  94.5  91.9  75.5
2           nor + 2 ver + 2 hor     68.7  100  100   100  100  96.8          64.5  94.5  90.7  97.1  97.7  83.6
3           nor + 2 ver             65.8  100  100   100  100  98.7          73.6  96.8  89.4  97.4  98.4  92.3
3           nor + 2 hor             70    100  100   100  100  93.9          61.9  93.9  91.9  96.5  95.5  73.9
3           nor + 2 ver + 2 hor     63.2  100  100   100  100  96.5          79    98.7  93.9  97.7  98.7  93.2
4           nor + 2 ver             62.3  100  100   100  100  98.1          66.8  98.4  89    95.5  99    89
4           nor + 2 hor             57.4  100  99.7  100  100  93.9          52.9  86.1  82.6  94.2  92.9  73.2
4           nor + 2 ver + 2 hor     54.2  100  100   100  100  95.2          66.1  98.4  91.6  95.5  99    89
1, 2        nor + 4 ver             70.6  100  100   100  100  96.5          61.3  90.7  81.9  96.8  97.4  84.2
1, 2        nor + 4 ver + 2 hor     69.7  100  100   100  100  96.8          64.5  93.9  90.3  96.8  97.7  84.2
1, 3        nor + 4 ver             70.6  100  100   100  100  98.1          74.2  96.1  90.7  96.8  98.4  92.9
1, 3        nor + 4 ver + 2 hor     70.6  100  100   100  100  97.4          80    98.7  93.9  97.1  98.7  93.6
1, 4        nor + 4 ver             69.7  100  100   100  100  98.4          69.7  99    91.3  97.1  99.7  93.6
1, 4        nor + 4 ver + 2 hor     58.1  100  99.7  100  100  96.5          68.7  99    93.2  97.4  99.7  93.6
1, 2, 3     nor + 6 ver             68.7  100  100   100  100  98.4          73.9  97.1  89.4  97.4  98.4  93.2
1, 2, 3     nor + 6 ver + 4 hor     73.5  100  100   100  100  97.4          81.9  98.7  94.8  97.4  98.7  94.2
1, 2, 4     nor + 6 ver             70.3  100  100   100  100  98.7          74.8  99.4  93.2  97.4  99.7  95.8
1, 2, 4     nor + 6 ver + 4 hor     62.3  100  100   100  100  96.8          78.1  99.7  96.5  97.7  99.7  96.1
1, 3, 4     nor + 6 ver             70.6  100  100   100  100  98.7          78.1  99.4  94.5  97.7  99.7  95.2
1, 3, 4     nor + 6 ver + 4 hor     60.3  100  100   100  100  96.5          84.2  99.7  96.8  97.7  99.7  96.1
1, 2, 3, 4  nor + 8 ver             70    100  100   100  100  99            77.7  99.7  93.6  97.7  99.7  95.8
1, 2, 3, 4  nor + 8 ver + 6 hor     65.8  100  100   100  100  97.1          86.1  99.7  96.8  97.7  99.7  96.8
Average Recognition Rate            66    100  100   100  100  96.5          69.8  95.6  90    96.4  97.7  88.1

[Bar charts – (a) Eigenface on YALE B-MANU (NONE = 66.0%); (b) Spectroface. Vertical axis: average difference from NONE; horizontal axis: normalization approach (LNORM, LBP, CHAIN, SSR-HM, Proposed).]


Figure 6.1: Average increasing/decreasing in recognition rates after applying each of the five
illumination normalization approaches on YALE B-MANU version
Table 6.7: Results of applying each of the five illumination normalization approaches with both
Eigenface and Spectroface methods over YALE B-AUTO version. Average recognition rate is
calculated over the 25 different training cases. (0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-
HM, 5: GAMMA-HM-COMP, nor: normal, ver: vertical, hor: horizontal)

Subsets     Training Case           Eigenface (0 / 1 / 2 / 3 / 4 / 5)        Spectroface (0 / 1 / 2 / 3 / 4 / 5)
1           nor only                46.5  41.6  47.7  39    53.9  66.8       48.4  46.5  45.8  55.5  62.9  57.1
1           nor + 2 ver             48.7  49.7  53.5  50    66.8  70         52.3  61.6  57.4  67.7  74.2  70
2           nor + 2 ver             54.8  50.6  56.8  44.5  65.2  74.5       57.7  61.9  60    70.3  78.4  77.4
2           nor + 2 hor             50.6  50.3  57.7  54.2  63.9  68.1       58.1  63.9  65.5  68.4  75.8  65.5
2           nor + 2 ver + 2 hor     55.2  57.1  60.6  53.2  68.1  74.5       61    71    69    74.5  81    78.1
3           nor + 2 ver             55.5  49.4  59.7  47.7  65.2  73.9       61.6  55.8  65.2  65.8  79    76.1
3           nor + 2 hor             53.2  43.2  53.9  51.9  60.6  72.9       56.8  62.6  64.5  64.5  73.2  63.9
3           nor + 2 ver + 2 hor     52.6  51    56.5  52.3  69    74.8       65.2  68.1  74.8  71.3  81.6  76.8
4           nor + 2 ver             45.5  44.8  47.4  42.3  59.7  62.3       55.5  55.5  57.1  57.4  71    73.9
4           nor + 2 hor             44.8  43.5  50    40    57.4  68.4       48.1  51.6  54.5  60.7  69.4  61.3
4           nor + 2 ver + 2 hor     41.9  46.1  50.3  44.8  63.2  65.5       54.8  57.7  61.3  62.9  76.1  74.8
1, 2        nor + 4 ver             57.7  52.6  56.5  49.4  66.5  78.7       58.7  66.5  64.5  72.3  77.1  77.4
1, 2        nor + 4 ver + 2 hor     59.7  55.8  63.2  54.5  68.7  76.5       60.7  71.6  71.3  75.2  81.3  77.7
1, 3        nor + 4 ver             60    51.6  56.8  52.6  65.8  76.5       65.5  68.4  70    73.6  78.7  80.7
1, 3        nor + 4 ver + 2 hor     58.1  54.5  60    55.2  69.4  78.7       69    74.8  77.4  78.1  82.3  81.3
1, 4        nor + 4 ver             50    49.4  53.2  52.3  64.2  67.1       60.3  68.4  67.4  73.9  80.7  78.4
1, 4        nor + 4 ver + 2 hor     46.8  51.6  60.3  49.4  66.8  71         59.7  68.4  71.9  75.8  82.9  78.1
1, 2, 3     nor + 6 ver             61    54.5  61    53.9  68.7  79.4       66.5  69.7  71.6  75.2  79    81.3
1, 2, 3     nor + 6 ver + 4 hor     61    60.6  63.2  56.8  72.6  79.4       71.6  76.1  80.7  80.7  82.9  81.6
1, 2, 4     nor + 6 ver             55.2  51.3  54.8  52.6  63.5  72.9       65.2  71.9  71    77.4  83.2  82.3
1, 2, 4     nor + 6 ver + 4 hor     52.3  58.4  62.6  56.5  69.7  74.8       68.4  77.4  79    81.9  86.5  82.6
1, 3, 4     nor + 6 ver             55.5  52.9  55.2  51.9  69    76.1       70    72.6  74.8  76.8  82.6  84.5
1, 3, 4     nor + 6 ver + 4 hor     53.5  53.2  59.7  54.8  69.7  77.7       74.5  80.3  83.2  81.6  87.7  86.8
1, 2, 3, 4  nor + 8 ver             57.4  53.2  60.6  53.9  69.4  78.1       71    74.5  76.5  79.4  83.2  84.8
1, 2, 3, 4  nor + 8 ver + 6 hor     55.5  60.6  65.2  59    69    79         77.1  81.6  86.1  83.9  88.1  85.8
Average Recognition Rate            53.3  51.5  57.1  50.9  65.8  73.5       62.3  67.1  68.8  72.2  79.2  76.7

[Bar charts – (a) Eigenface on YALE B-AUTO (NONE = 53.3%); (b) Spectroface. Vertical axis: average difference from NONE; horizontal axis: normalization approach (LNORM, LBP, CHAIN, SSR-HM, Proposed).]


Figure 6.2: Average increasing/decreasing in recognition rates after applying each of the five
illumination normalization approaches on YALE B-AUTO version

(a) Eigenface (b) Spectroface


Figure 6.3: Performance decrease of each normalization approach due to the non-alignment of faces
(i.e. the performance on YALE B-AUTO subtracted from the performance on YALE B-MANU)

The final conclusion from the comparison on illuminated images is to use the SSR-HM
approach when the illuminated images are properly aligned, and to use the proposed
approach when the illuminated images are not aligned.

6.4 Comparison on Other Variations


The aim of these comparisons is to study the side-effects of each of the five illumination
normalization approaches on variations other than illumination. Therefore, we
compare the five approaches on each of the four face recognition variations described
previously in chapter 4 (excluding illumination). These four variations are divided into
two facial variations, which are 3D pose and facial expressions, and two geometrical
variations, which are translation and scaling. We use the same databases and the same
training cases that are used in chapter 4 for the comparisons.
In the two facial variations, each approach is applied on all training cases of each
database; then its average recognition rate over these training cases is calculated and
compared with the corresponding baseline average rate calculated previously in
chapter 4. In the two geometrical variations, the average recognition rate for each
approach is calculated before and after each variation; then the difference between the
two averages is calculated and compared with the corresponding baseline difference
calculated previously in chapter 4. The following are the comparison results on each face
recognition variation using both the Eigenface and Spectroface methods.

6.4.1 Pose Variations


As in chapter 4, we use the UMIST database with the 12 training cases for this
comparison. Table-6.8 shows the results of applying each of the five illumination
normalization approaches for the 12 training cases on each of the Eigenface and the
Spectroface methods. It also shows the results without applying any of the five approaches
– taken from chapter 4. The average recognition rates are shown in the last row of the
table.
Fig.6.4 (a) and (b) show the difference between the average recognition rates before and
after applying each of the five approaches on Eigenface and Spectroface, respectively.

Table 6.8: Results of applying each of the five illumination normalization approaches with both
Eigenface and Spectroface methods over UMIST database. Average recognition rate is calculated
over all training cases.
(0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP)

Training Case              Eigenface (0 / 1 / 2 / 3 / 4 / 5)        Spectroface (0 / 1 / 2 / 3 / 4 / 5)
normal only 64 25 35.5 23.5 28.5 44.5 48 41.3 48.8 36.8 36.3 40
normal + 10˚ 68.5 30.5 47.3 40.5 54 60 67.3 52.8 63 49 55 55.5
normal + 30˚ 75 37.5 57 43.5 56 71 76 57 74 58.5 54 65.8
normal + 45˚ 87 46.5 64.5 43 65 78 89 71.5 82.5 66.8 67.3 78
normal + 75˚ 85.5 52 67.5 50 61 80.5 89 76.8 84.8 67 73 83.3
normal + 10˚ + 30˚ 74 41.5 57.5 46 56.5 68.5 74.5 58.5 72.5 58 57.5 65
normal + 10˚ + 45˚ 84.5 55.5 69 52.5 73 79 90 76.8 85 69.3 72.3 80.5
normal + 10˚ + 75˚ 87.5 61 75.8 63 74.5 87.5 94.5 86.5 90.8 79.5 83.3 89.8
normal + 30˚ + 45˚ 85 49 71 50 67.5 80 90.3 76.3 85.5 70.8 70.5 80
normal + 30˚ + 75˚ 88.5 62.5 78 57.5 74 88 94.8 88 91 82.5 82.3 90.3
normal + 45˚ + 75˚ 87.5 63 79.3 54 70.5 87.5 95 83.5 89.3 79.5 80.5 87.8
normal+10˚+30˚+45˚+75˚ 88 68.5 83.3 66 81 90 95 90.5 92.5 86 85.8 90.3
Average Recognition Rate   81.3  49.4  65.5  49.1  63.5  76.2       83.6  71.6  80    67    68.2  75.5

(a) Eigenface (b) Spectroface
Figure 6.4: Average difference in recognition rates after applying each of the five illumination
normalization approaches on UMIST database
Although all approaches lead to a decrease in the recognition rates, as shown in Fig.6.4, the
proposed approach has the least side-effect due to the 3D pose variation on the Eigenface
method, while the LBP has the least side-effect on the Spectroface method.

6.4.2 Facial Expressions Variations


As in chapter 4, we use each of the Grimace, Yale, JAFFE, and Nott-faces
databases with their training cases for this comparison. Table-6.9 shows the results of
applying each of the five illumination normalization approaches for the four databases on
each of the Eigenface and the Spectroface methods. It also shows the results without
applying any of the five approaches – taken from chapter 4. The average recognition rates
over all databases are shown in the last row of the table.
Figures 6.5, 6.6, 6.7 and 6.8 show the difference between the average recognition rates
before and after applying each of the five approaches on Yale, Grimace, JAFFE and Nott-
faces, respectively. It's clear from these figures that, in all four databases, the
proposed approach always has the least side-effect due to facial expression variation on
both the Eigenface and Spectroface methods compared with the four other approaches. In the
Nott-faces database, we note that applying the proposed approach, and also the SSR-HM
approach, leads to an increase in the recognition rate over the baseline. One possible reason
for this increase is the uniform illumination effect on the faces of this database; when we
apply an illumination normalization approach, it normalizes this effect and thus gives
better recognition.

Table 6.9: Results of applying each of the five illumination normalization approaches with both
Eigenface and Spectroface methods over the Grimace, Yale, JAFFE, and Nott-faces databases. The average
recognition rate is calculated over all training cases.
(0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP, expr: expression(s))

DB                      Training Case (3)    Eigenface (0 / 1 / 2 / 3 / 4 / 5)       Spectroface (0 / 1 / 2 / 3 / 4 / 5)
Yale                    normal only          81.9  75.2  90.5  66.7  81    88.6      92.4  83.8  91.4  73.3  84.8  93.3
Yale                    normal + 2 expr      96    88    93.3  84    94.7  94.7      98.7  94.7  96    86.7  97.3  97.3
Yale                    normal + 3 expr      96.7  85    91.7  81.7  93.3  93.3      98.3  95    98.3  86.7  98.3  98.3
Grimace                 normal only          96.2  76.6  80.1  82.2  87.1  93.6      100   84.8  93    93.3  95    100
Grimace                 normal + 2 expr      97.1  88.9  93.1  89.2  92.8  96.7      100   94.8  98.7  99    98    99
Grimace                 normal + 4 expr      96.7  91.5  94.1  88.5  91.5  96.7      100   94.4  99.3  99.3  97.8  100
JAFFE                   normal only          84.2  65.5  65    55.7  73.4  79.8      93.1  82.3  81.8  69    84.2  87.7
JAFFE                   normal + 2 expr      90.2  71    82    71.6  78.1  85.2      97.8  89.1  87.4  85.8  91.8  94.5
JAFFE                   normal + 4 expr      89.6  77.9  78.5  75.5  77.9  89.6      97.6  90.8  93.9  84.7  95.7  97.6
Nott-faces (with cap)   normal only          55.4  43.9  52.5  36.1  42.5  59.6      57.1  51.4  46.1  49.3  62.9  66.1
Nott-faces (with cap)   normal + 1 expr      55.7  46.7  51    35.7  45.2  60        62.9  51.9  54.8  54.3  62.4  63.3
Nott-faces (with cap)   normal + 2 expr      47.9  43.6  43.6  33.6  42.1  50.7      56.4  47.9  47.1  51.4  52.9  52.9
Nott-faces (w/out cap)  normal only          69    51    62.9  39.5  48.1  76.2      63.8  65.2  55.2  57.6  80    83.3
Nott-faces (w/out cap)  normal + 1 expr      76.4  55.7  65    40    52.9  84.3      77.1  71.4  70    67.1  87.9  87.9
Average Recognition Rate                     80.9  68.6  74.5  62.9  71.5  79.4      86.3  80.5  80.5  75.1  80.2  86.7

[Bar charts – (a) Eigenface on Yale (NONE = 91.5%); (b) Spectroface. Vertical axis: average difference from NONE; horizontal axis: normalization approach (LNORM, LBP, CHAIN, SSR-HM, Proposed).]


Figure 6.5: Average difference in recognition rates after applying each of the five illumination
normalization approaches on Yale database

3: normal + N expr means that we train with the normal image plus N images, each containing a single
expression, randomly selected.

[Bar charts – (a) Eigenface on Grimace (NONE = 96.7%); (b) Spectroface. Vertical axis: average difference from NONE; horizontal axis: normalization approach (LNORM, LBP, CHAIN, SSR-HM, Proposed).]


Figure 6.6: Average difference in recognition rates after applying each of the five illumination
normalization approaches on Grimace database
[Bar charts – (a) Eigenface on JAFFE (NONE = 88.0%); (b) Spectroface. Vertical axis: average difference from NONE; horizontal axis: normalization approach (LNORM, LBP, CHAIN, SSR-HM, Proposed).]


Figure 6.7: Average difference in recognition rates after applying each of the five illumination
normalization approaches on JAFFE database
[Bar charts – (a) Eigenface on Nott-faces (NONE = 60.9%); (b) Spectroface. Vertical axis: average difference from NONE; horizontal axis: normalization approach (LNORM, LBP, CHAIN, SSR-HM, Proposed).]


Figure 6.8: Average difference in recognition rates after applying each of the five illumination
normalization approaches on Nott-faces database

6.4.3 Translation Variations
Here we use the same training and testing methodologies described in chapter 4 for
testing each of the five illumination normalization approaches. As described previously in
chapter 4, the testing of translation is applied in two different ways: first, translating with
circulation, in which the pixels shifted out by the translation are circulated to fill the empty
pixels on the opposite side; second, translating without circulation, in which the
empty pixels after translation are filled with a fixed color (gray in our case).
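A hedged sketch of the two translation modes, assuming a NumPy grayscale image and an illustrative mid-grey fill value:

import numpy as np

def translate_with_circulation(img, dx, dy):
    # Pixels pushed out of one side re-enter on the opposite side
    return np.roll(np.roll(img, dy, axis=0), dx, axis=1)

def translate_without_circulation(img, dx, dy, fill=128):
    # The emptied pixels are filled with a fixed grey value
    out = np.full_like(img, fill)
    h, w = img.shape[:2]
    dst_rows = slice(max(dy, 0), h + min(dy, 0))
    dst_cols = slice(max(dx, 0), w + min(dx, 0))
    src_rows = slice(max(-dy, 0), h + min(-dy, 0))
    src_cols = slice(max(-dx, 0), w + min(-dx, 0))
    out[dst_rows, dst_cols] = img[src_rows, src_cols]
    return out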
For the translating-with-circulation case, Table-6.10 (a)-(e) shows the average decrease in
recognition rates (see footnote 4) per database on both the Eigenface and Spectroface methods for each of
the five approaches. Table-6.11 (a)-(e) is dedicated to the translating-without-circulation
case. The average decrease in recognition rates over all databases is shown in the last
row of each table.
Table 6.10: Average decreasing in the recognition rates of both methods after translating with
circulation in the four directions while applying (a) LNORM, (b) LBP, (c) CHAIN, (d) SSR-HM and
(e) GAMMA-HM-COMP approaches as preprocessing step.
(a) LNORM Approach
Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 18.3 34.9 41.4 42.8 UMIST 1.2 0.3 1.6 0.4
Grimace 8.5 38.6 58 64 Grimace 0 0.1 0 0.2
Yale 13.3 51.7 70.8 73.7 Yale 0.4 3.3 0.4 2.9
JAFFE 11.8 39.3 53.4 53.8 JAFFE 4.2 2.9 3.8 2.6
Nott-faces 18.2 40.3 51.6 53.6 Nott-faces 4.1 3.2 4.6 2
Yale B 7 24.1 37.5 41 Yale B 1 0 0.6 0.6
Average 12.9 38.2 52.1 54.8 Average 1.8 1.6 1.8 1.5

(b) LBP Approach


Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 6.8 23.2 34.1 40.4 UMIST 0.2 0.7 1 1.1
Grimace 4.4 16.6 28.1 41.6 Grimace 0.2 0.3 0.8 0.6
Yale 1.7 15 37.5 49.6 Yale 0.8 2.9 2.1 2.9
JAFFE 14.7 32.8 46.8 53.4 JAFFE 1.8 1 2.5 3.4
Nott-faces 15.5 34.1 45.2 50.5 Nott-faces 2.3 7.1 6.3 7.1
Yale B 4.3 14.7 26 33.1 Yale B 1.9 2.1 3.1 3.1
Average 7.9 22.7 36.3 44.8 Average 1.2 2.4 2.6 3

4: There are some cases in which the recognition rates marginally increase; we consider this as noise and
set the decrease value to 0 to indicate that there is no decrease due to the translation.

(c) CHAIN Approach
Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 7.6 29.4 41.4 42.3 UMIST 6.3 6.9 8.2 8
Grimace 4 38.5 61.9 73.1 Grimace 0.5 0.7 1.3 0.8
Yale 12.5 44.2 63 69.6 Yale 7.1 11.7 6.7 12.1
JAFFE 12.4 39.2 53.8 55.6 JAFFE 4.2 4.6 4.9 4.6
Nott-faces 8 22.3 31.1 34.3 Nott-faces 7.3 8.8 9.5 7.5
Yale B 6.9 24.8 35.6 38.9 Yale B 2.3 4.1 3.8 3.6
Average 8.6 33.1 47.8 52.3 Average 4.6 6.1 5.7 6.1

(d) SSR-HM Approach


Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 16.8 42.8 54.6 59 UMIST 13.2 25.8 30.1 29.1
Grimace 2.3 27.8 57 68.7 Grimace 0.7 4.9 12.4 8.5
Yale 7.9 31.2 52.9 67.1 Yale 1.7 5 10.4 12.9
JAFFE 5.1 24 41.8 49 JAFFE 0 2.5 4.4 3.8
Nott-faces 10.6 26.7 36.6 42 Nott-faces 8.6 21.1 27.7 26.2
Yale B 5.1 20.8 38.1 47 Yale B 3.7 7.7 11.5 11.7
Average 8 28.9 46.8 55.5 Average 4.7 11.2 16.1 15.4

(e) GAMMA-HM-COMP Approach


Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 1.4 7.4 18.4 31.6 UMIST 0 0.1 0 0
Grimace 0 1.8 15.7 38.4 Grimace 0 0 0 0
Yale 0.8 9.1 20.4 37.1 Yale 0 0 0 0
JAFFE 3 13 26.3 40.5 JAFFE 0 0 0 0
Nott-faces 3.9 16.4 28.9 42.7 Nott-faces 0.9 1.1 0.9 0.9
Yale B 1.7 6.1 13.9 24.8 Yale B 0.3 0.5 0.2 0.3
Average 1.8 9 20.6 35.9 Average 0.2 0.3 0.2 0.2

Table 6.11: Average decreasing in the recognition rates of both methods after translating without
circulation in the four directions while applying (a) LNORM, (b) LBP, (c) CHAIN, (d) SSR-HM and
(e) GAMMA-HM-COMP approaches as preprocessing step.
(a) LNORM Approach
Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 17.9 34 41 43.4 UMIST 1.7 1.2 1.7 2.6
Grimace 8.6 38.7 58.9 63.3 Grimace 0.1 0 0 0.5
Yale 15 52.9 70.4 73.8 Yale 1.7 2.9 1.3 4.2
JAFFE 10.8 38.3 54.2 55 JAFFE 4 3.4 4.4 3.4
Nott-faces 17.8 40.2 51.2 52.3 Nott-faces 4.3 5.9 7.1 10.2
Yale B 7.4 25.9 38.1 40.1 Yale B 1.5 1.9 2.6 3.9
Average 12.9 38.3 52.3 54.7 Average 2.2 2.6 2.9 4.1

(b) LBP Approach


Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 7.1 23.6 35.2 41.1 UMIST 0.3 1.8 2.9 5.5
Grimace 4.5 16.6 29.2 42.8 Grimace 0.3 1.6 2.5 4
Yale 2.1 16.3 40.5 51.7 Yale 0.8 2.5 2.9 6.2
JAFFE 14.4 36.1 50.9 59 JAFFE 1.5 5.6 11.5 15.7
Nott-faces 15.7 37.3 47.7 53.8 Nott-faces 2.5 11.6 21.6 36.1
Yale B 5 15.8 26.7 34.1 Yale B 1.4 4.8 9.8 15.2
Average 8.1 24.3 38.4 47.1 Average 1.1 4.7 8.5 13.8

(c) CHAIN Approach


Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 6.9 30.6 42.1 41.8 UMIST 8.8 11.8 12.9 7.1
Grimace 4.8 38.2 61.4 73.8 Grimace 0.4 0.2 0.7 0.3
Yale 15.9 50.5 63.4 68 Yale 25.4 34.6 37.5 21.3
JAFFE 12 39.6 54.7 55.8 JAFFE 3 3.7 3.4 2.6
Nott-faces 7.5 22.1 30.5 33 Nott-faces 5.7 7.3 7.5 3.2
Yale B 7.5 26.6 34.6 36.9 Yale B 8 12.3 14.6 11.2
Average 9.1 34.6 47.8 51.6 Average 8.6 11.7 12.8 7.6

(d) SSR-HM Approach
Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 14.6 39.4 52.4 55.8 UMIST 8.4 23.9 31.4 32.4
Grimace 2.4 22.8 54 69.4 Grimace 0.7 4.6 9.6 5.1
Yale 7.5 31.6 52.9 64.1 Yale 1.7 5.8 7.1 12.5
JAFFE 4.7 23.6 38.9 46.1 JAFFE 1.2 4.9 6.7 6.3
Nott-faces 9.7 26.8 37 42.7 Nott-faces 7.1 16.2 22.5 24.3
Yale B 4.4 23.1 40.5 45 Yale B 12.3 30.6 38.5 33.4
Average 7.2 27.9 46 53.9 Average 5.2 14.3 19.3 19

(e) GAMMA-HM-COMP Approach


Eigenface Method Spectroface Method
Translation Value Translation Value
Database Database
2 4 6 8 2 4 6 8
UMIST 1.9 8.6 18.4 33.6 UMIST 0 0.1 0.9 3.6
Grimace 0 1.8 16.6 42.5 Grimace 0.5 1.5 2.5 5.6
Yale 0.8 9.1 19.1 34.1 Yale 0 0 0 0
JAFFE 3.2 13.2 26.3 39 JAFFE 0 0.1 0.5 1.4
Nott-faces 4.1 15.7 30.6 41.6 Nott-faces 0.2 0.7 2.9 7
Yale B 2.1 7.2 20.8 30.4 Yale B 0.5 3.3 11.6 22.3
Average 2 9.3 22 36.9 Average 0.2 1 3.1 6.7

Fig.6.9 and Fig.6.10 show the average-decrease curves after translating with and
without circulation, respectively, for both the (a) Eigenface and (b) Spectroface methods.
Observe that in Figures 6.9 and 6.10, for both Eigenface and Spectroface, the two best
curves with the minimum average decrease in recognition rates are the NONE curve and
the proposed approach curve. This means that the proposed approach has the least side-
effect due to the translation variation on both the Eigenface and Spectroface methods compared
with the four other approaches. Note that the performance of the SSR-HM approach is
dramatically affected by the translation variation on both recognition methods.

(a) Eigenface (b) Spectroface


Figure 6.9: Average decreasing in recognition rates after translating with circulation

(a) Eigenface (b) Spectroface
Figure 6.10: Average decreasing in recognition rates after translating without circulation

6.4.4 Scaling Variations


As in chapter 4, we use the Face 94 database with the seven training cases for this
comparison. For each training case, the testing is done twice, before and after
scaling, in order to record the decrease in recognition rates after scaling all testing
images. Table-6.12 shows the decrease in recognition rates when applying each of the
five illumination normalization approaches for the seven training cases on each of the
Eigenface and the Spectroface methods. It also shows the decrease in recognition rates
without applying any of the five approaches (baseline) – taken from chapter 4. The
average decrease in recognition rates is shown in the last row of the table.
Table 6.12: Decreasing in recognition rates after applying each of the five illumination normalization
approaches with both Eigenface and Spectroface methods over Face 94 database. Average decreasing
in recognition rate is calculated over all training cases.
(0: NONE, 1: LNORM, 2: LBP, 3: CHAIN, 4: SSR-HM, 5: GAMMA-HM-COMP)

Training Case              Eigenface (0 / 1 / 2 / 3 / 4 / 5)        Spectroface (0 / 1 / 2 / 3 / 4 / 5)
normal only 14.6 51.4 46.3 38.9 40.9 19.7 19.1 62.4 47.1 62.7 49.9 24
normal + up8 6.6 33.1 30.1 24.9 27.1 9.7 7.7 42.9 30.9 43.6 33.2 9.2
normal + down8 9.8 34.3 30.2 23.9 27.2 15.8 12.8 41.3 27.4 41.9 31.5 17.3
normal + up8 + down8 0.7 19.2 14.7 9.8 11.5 3.2 0.7 21.9 10.5 21.4 14.3 2.1
normal + up17 5.8 40.7 31.8 31.9 32.3 8.9 8 48.7 31 45.2 35.7 9.5
normal + down17 8.7 38.2 33.3 25.1 26.7 14.1 13.1 48.3 29.4 49.8 32.2 17.7
normal + up17 + down17 0 31.7 21.2 19.2 18.1 0.5 0.9 33.9 12.9 32.6 17.8 2.5
Average Decrease
in Recognition Rate        6.6   35.5  29.7  24.8  26.3  10.3       8.9   42.8  27    42.5  30.7  11.8

Fig.6.11 (a) and (b) show the average decrease in recognition rates after applying each
of the five approaches on Eigenface and Spectroface, respectively.

(a) Eigenface (b) Spectroface
Figure 6.11: Average decreasing in recognition rates when applying each of the five illumination
normalization approaches before and after scaling the Face 94 database
It's clear from Fig.6.11 that the proposed approach has the least side-effect due to the
scaling variation on both the Eigenface and Spectroface methods compared with the four other
approaches, while having a side-effect only slightly larger than in the NONE case (i.e. without
applying any approach). Note that the performance of the other four approaches is
dramatically affected by the scaling variation on both recognition methods.

6.5 Summary
In this chapter, we establish comparative studies between the proposed illumination
normalization approach and four best-of-literature approaches over images with
illumination variation and images with other facial and geometrical variations using the
two selected face recognition methods.
When dealing with illuminated images, the results show that the proposed approach is
the best one in the Eigenface method and the second best, after SSR-HM, in the
Spectroface method when the images are not perfectly aligned. However, the SSR-HM
is the best one in both methods when the images are perfectly aligned. In addition, the
proposed approach is the approach least affected (i.e. the most robust) by the non-
alignment of faces on both methods.
When dealing with non-illuminated images, the proposed approach brings the fewest
side-effects, compared with the four other approaches, in both methods for each of the two
facial variations and the two geometrical variations (except in the Spectroface method over
the pose variation, where it comes second after LBP). Moreover, the performance of the
SSR-HM is dramatically affected by the two geometrical variations, translation and
scaling, on both recognition methods.
Thus, we can conclude the following about the proposed approach:
1. It's flexible to different face recognition approaches, as it usually gives the best
results on both methods, whether on illumination variations or on other variations.
2. It's robust to the non-alignment of faces, unlike the other four approaches,
which show a great difference in performance when applied to non-aligned face
images.
3. It has the fewest side-effects, compared with the four other approaches, over both
facial variations and geometrical variations.

The following chapter of this thesis summarizes the conclusions of the work together
with suggestions for future work.

CHAPTER 7: Conclusions and Future Works
7.1 Conclusions
Although many face recognition techniques and systems have been proposed, evaluations
of the state-of-the-art techniques and systems have shown that recognition performance
of most current technologies degrades due to variations in illumination. As proof of
this, the last face recognition vendor test, FRVT 2006, concludes that relaxing the
illumination condition has a dramatic effect on performance. Moreover, it has been
the same face due to illumination are almost always larger than image variations due to
change in face identity.

There has been much work dealing with illumination variation in face recognition.
Although most of these approaches can cope with illumination variation well, some may
bring a negative influence on images without illumination variation. In addition, some
approaches show a great difference in performance when combined with different
recognition methods. Some other approaches require perfect alignment of the face within the
image, which is difficult to achieve in practical/real-life systems.

In this thesis, we propose an illumination normalization approach that is based on
enhancing the image resulting from histogram matching, called GAMMA-HM-COMP.

The proposed approach is compared with four best-of-literature approaches selected
from among 38 different approaches, based on a survey of nine comparative studies. The
comparison is performed on images with illumination variation and images with other
facial and geometrical variations using two face recognition methods. These two methods
are chosen to represent the two broad categories of the holistic-based approach – namely
Standard Eigenface method from the Eigenspace-based category and Spectroface from
the Frequency-based category.

The results show that the proposed approach is the best one in the Eigenface method and
the second best in the Spectroface method when dealing with illuminated images that are
not perfectly aligned. Moreover, the performance of the other approaches is significantly
affected by the alignment of the faces inside the images, in contrast to the proposed approach,
which is not significantly affected by the alignment condition. In addition, the proposed
approach brings the least side-effects, among the four other approaches, in each of the
two methods when dealing with either facial or geometrical variations.

These results lead to the conclusion that the proposed approach:
1. is flexible to different face recognition approaches,
2. is robust to the non-alignment of faces, and
3. brings the fewest side-effects on images with either facial or geometrical variations.

In addition, this thesis establishes an environment that can be used for further studying
the effects of any preprocessing/illumination normalization approach. The environment
consists of:
1. Two face recognition methods representing the two broad categories of the
holistic-based face recognition approach – namely Standard Eigenface method
from the Eigenspace-based category and Spectroface from the Frequency-based
category.
2. Seven databases representing five different face recognition variations, with
suitable database(s) for each variation. The variations include three facial variations, which
are 3D pose, expressions and non-uniform illumination, and two geometrical variations,
which are translation and scaling.

The comparative study between these two face recognition methods over the five
variations shows that the Spectroface method outperforms the Eigenface method in each
of the 3D pose, facial expressions, non-uniform illumination and translation variations, while
the Eigenface method is better under the scaling variation.

Finally, we nominate seven illumination normalization approaches that are considered the
best-of-literature approaches, based on a survey of nine different comparative studies
covering 38 different approaches. These seven approaches can be considered as a
benchmark that can be used for comparing any new/suggested illumination
normalization approach.

7.2 Future Works
As the proposed approach depends mainly on histogram matching, it's possible to
apply the HM in a block-wise manner rather than on the whole face. This can produce
better results, as it will normalize the illumination of each face region separately
according to its own illumination condition (see the sketch below).
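As an illustration of this suggestion only (a hypothetical sketch, not an evaluated implementation), the HM could be applied per block as follows, reusing a histogram_match routine such as the one sketched in section 6.2.3:

import numpy as np

def blockwise_histogram_match(img, reference, block=32):
    # Apply histogram matching independently to each block x block region,
    # so every face region is normalized according to its own illumination
    out = np.empty_like(img, dtype=np.float64)
    h, w = img.shape
    for y in range(0, h, block):
        for x in range(0, w, block):
            src = img[y:y + block, x:x + block]
            ref = reference[y:y + block, x:x + block]
            out[y:y + block, x:x + block] = histogram_match(src, ref)
    return out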
Another possible modification is to apply the region-based GIC to automate the
selection of the gamma value for each region separately, rather than using a single fixed
gamma value over the whole face.

In this work, all illumination normalization approaches are tested on two face recognition
methods representing the two broad categories of the holistic-based approach. It's
important to extend this work to include both local-based and hybrid face recognition
methods in testing these approaches, as they may show differences in performance
when combined with such methods.

Moreover, the environment established in this work can be extended to include additional
databases (especially illuminated ones) and/or other face recognition variations (e.g.
aging).

In this work we nominate seven best-of-literature illumination normalization
approaches as a benchmark, but compare with only four of them. The remaining three approaches
can be implemented and included in comparisons on images with illumination variation
and images with other facial and geometrical variations.

Finally, this work introduces a technology evaluation for the proposed approach and the
other best-of-literature approaches. In order to complete the thorough evaluation cycle,
both scenario and operational evaluations need to be performed for these approaches.

References
[1] Maltoni, D.; Maio D., Jain A.K. & Prabhakar S. Handbook of Fingerprint Recognition, Springer,
New York, 2003.
[2] W. Zhao, R. Chellappa, P. J. Phillips, and A. Rosenfeld, Face recognition: A literature survey, ACM
Computing Surveys (CSUR), 35(4), 399–458, 2003.
[3] Xiaoyang Tan , Songcan Chen , Zhi-Hua Zhou , Fuyan Zhang, Face recognition from a single image
per person: A survey, Pattern Recognition, v.39 n.9, pp.1725-1745, Sep 2006.
[4] Matthew Curtis Hesher, Automated Face Tracking and Recognition, M.Sc. Thesis, the Florida State
University, 2003.
[5] P. J. B. Hancock, V. Bruce, and A. M. Burton, “Recognition of unfamiliar faces,” Trends in
Cognitive Sciences, vol. 4, pp. 330–337, 2000.
[6] Daugman, John. Face and Gesture Recognition: Overview. IEEE Transactions on Pattern Analysis
and Machine Intelligence, vol. 19, issue 7, Jul 1997.
[7] Gong, Shaogang, Stephen J. McKenna, Alexandra Psarrou. Dynamic Vision: From Images to Face
Recognition. Imperial College Press, 2000.
[8] Hochberg, Julian, Ruth Ellen Galper. Recognition of Faces: I. An Exploratory Study. Psychonomic
Science, vol. 9, pp. 619-620, 1967.
[9] Brigham, J. C., A. Maass, L. D. Snyder, K. Spaulding. Accuracy of Eyewitness Identification in a
Field Setting. Journal of Personality and Social Psychology, vol. 42, pp. 673-681, 1982.
[10] Brown, E., K. Deffenbacher, W. Sturgill. Memory for Faces and the Circumstances of Encounter.
Journal of Applied Psychology, vol. 62, pp. 311-318, 1977.
[11] L. Torres, Is there any hope for face recognition?, Proc. of the 5th International Workshop on Image
Analysis for Multimedia Interactive Services, WIAMIS 2004, Lisboa, Portugal, Apr 2004.
[12] P. J. Phillips, W. T. Scruggs, A. J. O'Toole, P. J. Flynn, K.W. Bowyer, C. L. Schott, and M. Sharpe,
”FRVT 2006 and ICE 2006 Large-Scale Results”, National Institute of Standards and Technology,
NISTIR 7408, http://face.nist.gov, 2007.
[13] Y. Adini, Y. Moses, and S. Ullman, “Face recognition: The problem of compensating for changes in
illumination direction” IEEE Tran. PAMI, 19(7), 721-732, 1997.
[14] W. Zhao, and R. Chellappa, “Robust face recognition using symmetric shape-from-shading”
Technical report, Center for Automation Research, University of Maryland, 1999.
[15] Laurenz Wiskott, Jean-Marc Fellous, Norbert Kruger, Christoph von der Malsburg, "Face
Recognition by Elastic Bunch Graph Matching," IEEE Transactions on Pattern Analysis and
Machine Intelligence ,vol. 19, no. 7, pp. 775-779, Jul 1997.
[16] Liao, R. and Li, S. Z. 2000. Face Recognition Based on Multiple Facial Features. In Proceedings of
the Fourth IEEE international Conference on Automatic Face and Gesture Recognition 2000. FG.
IEEE Computer Society, Washington, DC, 239, Mar 2000.
[17] E. Hjelmås. Biometric Systems: A Face Recognition Approach. In Proceedings of the Norwegian
Conference on Informatics, pp. 189-197, 2000.
[18] B. Heisele P. Ho and T. Poggio, “Face Recognition with Support Vector Machines: Global versus
Component-Based Approach,” Proc. Int'l Conf. Computer Vision, vol. 2, pp. 688-694, 2001.
[19] ZANA, Y.; CESAR-JR, R. M.; BARBOSA, R. A. Automatic face recognition system based on local
Fourier-Bessel features. In: BRAZILIAN SYMPOSIUM ON COMPUTER GRAPHICS AND
IMAGE PROCESSING, 18. (SIBGRAPI), Brazil, 2005.
[20] R. Paredes and E. Vidal and F. Casacuberta, Local Features for Biometrics-Based Recognition, 2nd
COST 275 Workshop, Biometrics on the Internet Fundamentals, Advances and Applications, 2004.
[21] Timo Ahonen, Abdenour Hadid, Matti Pietikäinen, "Face Description with Local Binary Patterns:
Application to Face Recognition," IEEE Transactions on Pattern Analysis and Machine
Intelligence ,vol. 28, no. 12, pp. 2037-2041, Dec 2006.
[22] T.I. El-Arief, K.A. Nagaty, and A.S. El-Sayed, “Eigenface vs. Spectroface: A Comparison on the
Face Recognition Problems”, IASTED Signal Processing, Pattern Recognition, and Applications
(SPPRA), Austria, 2007.
[23] M. Kirby and L. Sirovich, Application of karhunen-Loeve procedure for characterization of human
faces, IEEE Patt. Anal. Mach. Intell., 12, 103-108, 1990.

[24] M. Turk and A. Pentland, Eigenfaces for recognition, Journal of Cognitive Neuroscience, 3(1), 71-
86, 1991.
[25] Zhao .W, Chellappa .R and Phillips .P `Subspace Linear Discriminant Analysis for Face
Recognition', technical report, Center for Automation Research, University of Maryland, College
park, 1999.
[26] Liu, C. and Wechsler, H. 2000. Evolutionary Pursuit and Its Application to Face Recognition. IEEE
Trans. Pattern Anal. Mach. Intell. 22, 6, 570-582, Jun 2000.
[27] Déniz, O., Castrillón, M., and Hernández, M. 2003. Face recognition using independent component
analysis and support vector machines. Pattern Recogn. Lett. 24, 13, 2153-2157, Sep 2003.
[28] Bartlett, M.S., Movellan, J.R., T.J., Sejnowski.: Face Recognition by Independent Component
Analysis. IEEE Transactions on Neural Networks, vol. 13, No. 6, Nov 2002.
[29] Wu, J. and Zhou, Z. 2002. Face recognition with one training image per person. Pattern Recogn. Lett.
23, 14, 1711-1719, Dec 2002.
[30] Moghaddam, B., Jebara, T., and Pentland, A. Bayesian modeling of facial similarity. In Proceedings
of the 1998 Conference on Advances in Neural information Processing Systems II D. A. Cohn, Ed.
MIT Press, Cambridge, MA, 910-916, 1999.
[31] Shaohua Zhou Chellappa, R. Moghaddam, B., Intra-personal kernel space for face recognition,
Sixth IEEE International Conference on Automatic Face and Gesture Recognition, Proceedings.
pp. 235- 240, 2004.
[32] P. Belhumeur, J. Hespanha, and D. Kriegman, Eigenfaces vs. Fisherfaces: Recognition Using Class
Specific Linear Projection, Proc. of the Fourth European Conference on Computer Vision, Vol. 1,
Cambridge, UK, pp. 45-58, Apr 1996.
[33] C. Liu and H. Wechsler, Comparative Assessment of Independent Component Analysis (ICA) for
Face Recognition, Second International Conference on Audio- and Video-based Biometric Person
Authentication, Washington D.C., USA, Mar 1999.
[34] K. Baek, B. Draper, J.R. Beveridge, and K. She, PCA vs. ICA: A Comparison on the FERET Data
Set, Proc. of the Fourth International Conference on Computer Vision, Pattern Recognition and
Image Processing, Durham, NC, USA, pp. 824-827, Mar 2002.
[35] B. Moghaddam, Principal Manifolds and Probabilistic Subspaces for Visual Recognition, IEEE
Trans. on Pattern Analysis and Machine Intelligence, Vol. 24, No. 6, pp. 780-788, Oct 2002.
[36] J.R. Beveridge, K. She, B. Draper, and G.H. Givens, A Nonparametric Statistical Comparison of
Principal Component and Linear Discriminant Subspaces for Face Recognition, Proc. of the IEEE
Conference on Computer Vision and Pattern Recognition, Kaui, HI, USA, pp. 535- 542, Dec 2001.
[37] A. Martinez and A. Kak, PCA versus LDA, IEEE Trans. on Pattern Analysis and Machine
Intelligence, Vol. 23, No. 2, pp. 228-233, Feb 2001.
[38] P. Navarrete and J. Ruiz-del-Solar, Analysis and Comparison of Eigenspace-Based Face Recognition
Approaches, International Journal of Pattern Recognition and Artificial Intelligence, Vol. 16, No. 7,
pp. 817-830, Nov 2002.
[39] Delac, K., Grgic, M., and Grgic, S., Independent comparative study of PCA, ICA, and LDA on the
FERET data set, Int'l Journal of Imaging Systems and Technology, 15(5), 252-260, 2005.
[40] J. Ruiz-del-Solar and P. Navarrete, "Eigenspace-based face recognition: a comparative study of
different approaches," IEEE Transactions on Systems, Man and Cybernetics Part C: Applications and
Reviews, vol. 35, no. 3, pp. 315-325, 2005.
[41] Pan, Z. and H. Bolouri, High speed face recognition based on discrete cosine transforms and neural
networks, Technical Report, Univ. of Hertfordshire, UK, 1999.
[42] Spiess, H. and Ricketts, I., Face recognition in Fourier space, Proc. Vision Interface Conf., Montreal,
Canada, 2000.
[43] J. H. Lai, P. C. Yuen, and G. C. Feng, Face recognition using holistic Fourier invariant features,
Pattern Recognition, 34(1), 95–109, 2001.
[44] Dai, D., Feng, G., Lai, J., and Yuen, P. C. Face Recognition Based on Local Fisher Features. In
Proceedings of the Third international Conference on Advances in Multimodal interfaces. T. Tan, Y.
Shi, and W. Gao, Eds. Lecture Notes In Computer Science, vol. 1948. Springer-Verlag, London,
230-236, 2000.
[45] G. C. Feng, P. C. Yuen, and D. Q. Dai, Human face recognition using PCA on wavelet subband,
Journal of Electronic Imaging, 9(2), 226-233, 2000.

[46] D.B. Graham and N.M. Allinson, Characterizing virtual eigen signatures for general purpose face
recognition, Face Recognition: From Theory to Applications, H. Wechsler, P.J. Phillips, V. Bruce, F.
Fogelman-Soulie, and T.S. Huang, eds., 163, 446-456, 1998.
[47] Z. Pan, R. Adams, and H. Bolouri, “Dimensionality reduction of face images using discrete cosine
transforms for recognition." submitted to IEEE Conference on Computer Vision and Pattern
Recognition, 2000.
[48] Zana, Y. and Cesar, R. M. 2006. Face recognition based on polar frequency features. ACM Trans.
Appl. Percept. 3, 1, 62-82, Jan 2006.
[49] Saradha, A. & Annadurai, S. A Hybrid Feature Extraction Approach for Face Recognition Systems.
Institute of Road and Transport Technology, Erode, 2004.
[50] Casasent D et al, “Face Recognition with Pose and Illumination Variations Using New SVRDM
Support-Vector Machine”, Optical Engineering, vol. 43, No. 8, Society of Photo-Optical
Instrumentation Engineers, pp. 1804-1813, Aug 2005.
[51] S. Srisuk and W. Kurutach, Face Recognition using a New Texture Representation of Face Images,
Proceedings of Electrical Engineering Conference, Cha-am, Thailand, pp. 1097-1102, Nov 2003.
[52] A. Nefian, "Embedded Bayesian networks for face recognition," ICME 2002 - IEEE International
Conference on Multimedia and Expo, Lausanne, Switzerland, pp. 133-136, Aug 2002.
[53] Huang, J., Yuen, P. C., Lai, J. H., and Li, C. 2004. Face recognition using local and global features.
EURASIP J. Appl. Signal Process., 530-541, Jan 2004.
[54] P. C. Yuen and J. H. Lai, “Face representation using independent component analysis,” Pattern
Recognition, vol. 35, no. 6, pp. 1247–1257, 2002.
[55] Rao, K. S. and Rajagopalan, A. N. 2005. A probabilistic fusion methodology for face recognition.
EURASIP J. Appl. Signal Process., 2772-2787, Jan 2005.
[56] Tony Mansfield, Gavin Kelly, David Chandler, and Jan Kane, “Biometric Product Testing Final
Report”, CESG/BWG Biometric Test Programme. www.cesg.gov.uk, Mar 2001.
[57] Bolme, D. S., Beveridge, J. R., Teixeira, M. and Draper, B. A., The CSU Face Identification
Evaluation System: Its Purpose, Features, and Structure, ICVS, Graz, Austria, 2003.
[58] V. Jain, Human face classification using neural networks, Project at IITK, Kanpur, India, 1998.
[59] Sebastien M., Yann R., and Guillaume H., On the recent use of local binary patterns for face
authentication, IDIAP, Valais, Swiss, 2006.
[60] J. Zhang, Y. Yan and M. Lades, Face Recognition: Eigenface, Elastic Matching, and Neural Nets,
Proceedings of the IEEE, Vol. 85, No. 9, pp. 1423-1435, 1997.
[61] Alan Brooks and Li Gao, Face recognition: Eigenface and Fisherface performance across pose, Final project
report, Northwestern University, Evanston, IL, 2004.
[62] Y. Zhang, L. Lang, and O. Hamsici, Facial image recognition by subspace learning: a comparative
study, Project at the Ohio State Univ., Ohio, USA, 2004.
[63] Qi Li, Jieping Ye and Chandra Kambhamettu, Linear projection methods in face recognition under
unconstrained illuminations: A comparative study, IEEE Computer Society Conference on Computer
Vision and Pattern Recognition (CVPR'04), Washington, USA, 2004.
[64] Georghiades, A.S., Belhumeur, P.N., and Kriegman, D.J., From few to many: illumination cone
models for face recognition under variable lighting and pose, IEEE Patt. Anal. Mach. Intell., 23(6),
643-660, 2001.
[65] M. J. Lyons, S. Akamatsu, M. Kamachi, and J. Gyoba, Coding facial expressions with gabor
wavelets, Proceedings, Third IEEE Int’l Conf. on Automatic Face and Gesture Recognition, Nara,
Japan, 1998.
[66] Levine, M. D., Gandhi, Maulin R. and Bhattacharyya, Jisnu: “Image Normalization for Illumination
Compensation in Facial Images”. Department of Electrical & Computer Engineering & Center for
Intelligent Machines, McGill University, Montreal, Canada, Unpublished Report (2004)
[67] Yang, J., Chen, X. Kunz, W. and Kundra, H., "Face as an index: Knowing who is who using a PDA",
Inter. Journal of Imaging Systems and Technology, 13(1), pp. 33-41, 2003.
[68] Jebara, T., 3D Pose estimation and normalization for face recognition, Honours Thesis, McGill
University, Canada, 1996.
[69] Dubuisson, S., Davoine, F. and Masson, M., A solution for facial expression representation and
recognition. Signal Process. Image Commun. 17(9). 657-673, 2002.

[70] N. Ikizler, J. Vasanth, L. Wong and D. Forsyth, Finding Celebrities in Video, Technical Report,
EECS Department, University of California, Berkeley, USA, 2006.
http://www.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-77.html
[71] B. Du, S. Shan, L. Qing, W. Gao, “Empirical comparisons of several preprocessing methods for
illumination insensitive face recognition”, Proceedings ICASSP'05, V(2), 2005.
[72] Mauricio Villegas Santamaría and Roberto Paredes Palacios, “Comparison of Illumination
Normalization Methods for Face Recognition”, Third COST 275 Workshop Biometrics on the
Internet, Univ. of Hertfordshire, UK, 2005.
[73] Hammal, Z., Eveno, N., Caplier, A., and Coulon, P. 2006. Parametric models for facial features
segmentation, Signal Process., 86 (2), pp. 399-413, Feb 2006.
[74] R. C. Gonzales and R. E. Woods, Digital Image Processing, Addison Wesley Publishing Company,
Inc., New York, 1993.
[75] W. Beaudot, “The neural information processing in the vertebrate retina: a melting pot of ideas for
artificial vision”, Phd thesis, tirf laboratory, Grenoble, France, 1994.
[76] Z. Hammal, C. Massot, G. Bedoya, and A. Caplier, "Eyes segmentation applied to gaze direction and
vigilance estimation," Proc. ICAPR '05, 236-246, UK, 2005.
[77] A. S. El-Sayed, K. A. Nagaty, T. I. El-Arief, “An Enhanced Histogram Matching Approach using the
Retinal Filter’s Compression Function for Illumination Normalization in Face Recognition”,
ICIAR’08, Springer-Verlag LNCS 5112, pp. 873–883, Portugal, 2008.
[78] Phillips P. J., Grother P., Micheals R. J, Blackburn D.M., Tabassi E., and Bone J. M. “FRVT 2002:
Evaluation Report”, http://www.frvt.org/DLs/FRVT_2002_Evaluation_Report.pdf, Mar 2003.
[79] Peter Belhumeur. “Ongoing Challenges in Face Recognition”, pp. 5-14, Frontiers of Engineering:
Reports on Leading-Edge Engineering from the 2005 Symposium, ISBN-10: 0-309-10102-6, 2006
[80] Heusch, G., Rodriguez, Y., and Marcel, S., “Local Binary Patterns as an Image Preprocessing for
Face Authentication”, In Proceedings of the 7th international Conference on Automatic Face and
Gesture Recognition (Fgr06) – Vol. 00, Washington, Apr 2006.
[81] R. Basri and D.W. Jacobs, “Lambertian reflectance and linear subspaces”, IEEE Trans. on Pattern
Analysis and Machine Intelligence, 25(2):218–233, 2003.
[82] Gross, R., Matthews, I., and Baker, S., “Eigen Light-Fields and Face Recognition Across Pose”. In
Proceedings of the Fifth IEEE international Conference on Automatic Face and Gesture Recognition,
Washington, USA, 2002.
[83] T.Sim, T.Kanade, Combining Models and Exemplars for Face Recognition: An Illuminating
Example, In Proceedings of Workshop on Models versus Exemplars in Computer Vision, CVPR
2001.
[84] Shan, S., Gao, W., Cao, B., and Zhao, D. “Illumination Normalization for Robust Face Recognition
Against Varying Lighting Conditions”, In Proceedings of the IEEE international Workshop on
Analysis and Modeling of Faces and Gestures AMFG, Washington, Oct 2003.
[85] R. Gross and V. Brajovic. An image preprocessing algorithm for illumination invariant face
recognition. In Audio- and Video-Based Biometric Person Authentication, AVBPA’03, 2003.
[86] T. Ojala, M. Pietikäinen, and D. Harwood. A comparative study of texture measures with
classification based on featured distributions. Pattern Recognition, 29(1):51–59, 1996.
[87] T. Ojala, M. Pietikäinen, and T. Mäenpää. Multiresolution gray-scale and rotation invariant
texture classification with local binary patterns. IEEE Trans. on Pattern Analysis and Machine
Intelligence, 24(7):971–987, 2002.
[88] J. Short, J. Kittler, and K. Messer. Photometric normalization for face verification. In Audio- and
Video-based Biometric Person Authentication, AVBPA’05, 2005.
[89] D. J. Jobson, Z. Rahman, G. A. Woodell, “A Multiscale Retinex for Bridging the Gap Between
Color Images and the Human Observation of Scenes,” IEEE Transactions on Image Processing,
Volume: 6, No: 3, Page(s): 965-976, Jul 1997.
[90] E. Land, “The Retinex Theory of Color Vision,” Scientific American, Page(s): 108-129, Dec 1977.
[91] H. Ando, N. Fuchigami, M. Sasaki and A. Iwata, "Robust Face Recognition Methods under
Illumination Variations toward Hardware Implementation on 3DCSS", The Fourth Hiroshima
International Workshop on Nanoelectronics for Tera-Bit Information Processing, pp. 139-141, Sep
2005.
[92] V. Vapnik. The Nature of Statistical Learning Theory. Springer, New York, 1995.

[93] R. Brunelli and T. Poggio, Face Recognition: Features versus Templates, IEEE Trans. Pattern Anal.
Mach. Intelligence, 15(10):1042-1053, 1993.
[94] P. Belhumeur, D. Kriegman and A. Yuille, The bas-relief ambiguity, In Proc. IEEE Conf. on Comp.
Vision and Patt. Recog., pp. 1040-1046, 1997.
[95] A. Georghiades, D. Kriegman, and P. Belhumeur, “From Few to Many: Generative Models for
Recognition under Variable Pose and Illumination,” IEEE Trans. Pattern Analysis and Machine
Intelligence, vol. 23, no. 6, pp. 643-660, 2001.
[96] Kuang-Chih Lee, Jeffrey Ho, David J. Kriegman, "Acquiring Linear Subspaces for Face Recognition
under Variable Lighting," IEEE Transactions on Pattern Analysis and Machine
Intelligence, vol. 27, no. 5, pp. 684-698, May 2005.
[97] H. Chen, P. Belhumeur, and D. Jacobs, “In Search of Illumination Invariants,” Proc. IEEE Conf.
Computer Vision and Pattern Recognition, pp. 1-8, 2000.
[98] L. Zhang and D. Samaras. Face recognition under variable lighting using harmonic image exemplars.
In CVPR, volume 1, pages 19–25. IEEE Computer Society, 2003.
[99] M. Savvides, B. V. K. Vijaya Kumar, and P. K. Khosla, “Corefaces - Robust Shift Invariant PCA
based Correlation Filter for Illumination Tolerant Face Recognition”, CVPR, 2004.
[100] A. Shashua and T. Riklin-Raviv, “The quotient image: Class-based re-rendering and recognition with
varying illuminations”, IEEE Trans. PAMI, Vol. 23(2), pp. 129-139, 2001.
[101] H. Wang, S. Z. Li, and Y. Wang, “Generalized Quotient Image”, CVPR, 2004.
[102] Sim, T., Baker, S., Bsat, M.: The CMU Pose, Illumination, and Expression (PIE) database. In: IEEE
Int. Conf. on Automatic Face and Gesture Recognition, 2002.
[103] W. Gao, B. Cao, S. Shan, D. Zhou, X. Zhang, D. Zhao, “The CAS-PEAL Large-Scale Chinese Face
Database and Baseline Evaluations”, IEEE Transactions on Systems, Man and Cybernetics, Part A,
38(1), pp. 149-161, 2008. http://www.jdl.ac.cn/peal/index.html.
[104] Blackburn, D., Bone, M., Phillips, P.: Facial recognition vendor test 2000: evaluation report (2000)
[105] Yin, W. Total Variation Models for Variable Lighting Face Recognition. IEEE Trans. Pattern Anal.
Mach. Intell. 28, 9, 2006.
[106] Peter Hallinan, Alan Yuille, and David Mumford, Harvard Face Database.
[107] Haitao Wang, Stan Z. Li, Yangsheng Wang, "Face Recognition under Varying Lighting Conditions
Using Self Quotient Image," Sixth IEEE International Conference on Automatic Face and
Gesture Recognition (FG'04), p. 819, 2004.
[108] Press, W., Teukolsky, S., Vetterling, W., Flannery, B.: Numerical Recipes in C. Cambridge
University Press, 1992.
[109] X. Tan and B. Triggs. Enhanced local texture feature sets for face recognition under difficult lighting
conditions. In IEEE Conf. on AMFG, pp. 168-182, 2007.
[110] Phillips, P.J., Flynn, P.J., Scruggs, W.T., Bowyer, K.W., Chang, J., Hoffman, K., Marques, J., Min,
J., Worek, W.J.: Overview of the face recognition grand challenge. In: Proc. CVPR 2005, San Diego,
CA, pp. 947–954, 2005.
[111] Borgefors, G.: Distance transformations in digital images. Comput. Vision Graph. Image Process.
34(3), 344–371, 1986.
[112] F. Samaria and A. Harter, Parameterization of a stochastic model for human face identification, Proc.
2nd IEEE Workshop on Applications of Computer Vision, Sarasota, USA, 1994.
[113] “Introduction to Biometrics” section of the Biometrics Catalog, http://www.biometricscatalog.org.
[114] “Face Recognition Vendor Test”, www.FRVT.org.
[115] Grimace database, http://cswww.essex.ac.uk/mv/allfaces/grimace.html.
[116] Intel OpenCV Library, http://sourceforge.net/projects/opencvlibrary/.
[117] Nott-faces database, http://pics.psych.stir.ac.uk.
[118] Face 94 database, http://cswww.essex.ac.uk/mv/allfaces/face94.html.
[119] Yale B face database, http://cvc.yale.edu/projects/yalefacesB/yalefacesB.html.
[120] “The Facial Recognition Technology (FERET) Database”,
http://www.itl.nist.gov/iad/humanid/feret/feret_master.html.
[121] Face normalization and descriptor code, http://lear.inrialpes.fr/people/triggs/src/amfg07-demo-v1.tar.gz.
[122] AcSys Biometrics, http://www.acsysbiometrics.com.
[123] Berninger Software GmbH, http://www.berningersoftware.de.
[124] Biometrica Systems, Inc., http://www.biometrica.com.

[125] Cognitec Systems, http://cognitec-systems.de.
[126] Dialog Communication Systems Inc., http://www.bioid.com.tw/.
[127] FaceKey Corp., http://www.facekey.com.
[128] Humanscan GmbH, http://www.bioid.com.
[129] Identix, Inc., http://www.identix.com.
[130] Imagis Technologies, Inc., http://www.imagistechnologies.com.
[131] Keyware Technologies N.V., http://www.keyware.com.
[132] Neurodynamics Limited, http://www.neurodynamics.com.
[133] Viisage Technology, http://www.viisage.com/facetools.htm.
[134] VisionSphere Technologies, http://www.visionspheretech.com.
[135] ZN Vision Technologies, http://www.zn-ag.com.

Thesis Summary

Although many face recognition systems and techniques have been proposed over the past years, the most recent evaluations of these systems and techniques have shown that the performance of most of them degrades under illumination variations. The latest vendor test of these systems, in 2006, concluded that relaxing the illumination conditions has a dramatic effect on their performance. Moreover, it has been proven both theoretically and experimentally that the differences between images of the same person caused by illumination are often larger than the differences caused by a change in that person's identity.

Many studies have proposed different methods for dealing with the effects of illumination in face recognition. Although most of these methods can overcome illumination variations effectively, some of them may have a negative effect on images that contain no illumination variations, others show a large gap in performance when combined with different face recognition methods, and others require the face to be aligned within the image, which can be difficult to achieve in real practical systems.

Therefore, in this thesis we propose a new method for dealing with illumination variations that shows no large gap in performance when combined with different face recognition methods and that does not require the face to be aligned within the image, which makes it easy to use in real practical systems.

To verify that the proposed method shows no large gap in performance when combined with different face recognition methods, and that it does not require the face to be aligned within the image, it was tested with two different methods representing two major classes of face recognition methods that operate on the whole face rather than on parts of it. With each method, a database of images containing severe illumination variations was used to test the proposed method twice, once with the face aligned within the image and once without alignment, so that we could judge whether or not the proposed method needs this condition.

In order to compare the proposed method with comparable methods, we surveyed nine comparative studies and selected the best four methods out of the 38 different methods covered by these studies. All five methods were tested and compared using the two previously selected face recognition methods, on images with illumination variations and on other images with variations other than illumination.

On the images with illumination variations, the results showed that the proposed method was the best of all the methods with the first recognition method, and the second best with the second recognition method, when the face is not aligned within the image. In addition, the proposed method's results were the least affected by the lack of face alignment within the image, compared with the other four methods, with both recognition methods. On the images with variations other than illumination, the results showed that the proposed method has the smallest negative effect among the other four methods, again with both recognition methods.

All of the illumination-handling methods in this work were tested with two face recognition methods that represent two major classes of methods operating on the whole face rather than on parts of it. It is therefore important to extend this work to cover methods that operate on parts of the face as well, because a large gap in performance may arise when such methods are combined with illumination-handling methods.

Moreover, this work presents a technology evaluation of the proposed method and the other selected methods. It is therefore important to also carry out a scenario evaluation and an operational evaluation of the performance of these methods in order to achieve a comprehensive assessment of them.
Ain Shams University
Faculty of Computer and Information Sciences
Computer Science Department

A New Method for Handling the Illumination Problem in Face Recognition

A thesis submitted to the Department of Computer Science, Faculty of Computer and Information Sciences, Ain Shams University, in partial fulfillment of the requirements for the degree of Master of Computer and Information Sciences

By

Ahmed Salah ELDin Mohammed ELSayed

Teaching Assistant, Computer Science Department
Faculty of Computer and Information Sciences
Ain Shams University

Supervised by

Prof. Dr. Taha Ibrahim ELAreif

Professor, Computer Science Department
Faculty of Computer and Information Sciences
Ain Shams University

Dr. Haitham ELMessairy

Lecturer, Computer Science Department
Faculty of Computer and Information Sciences
Ain Shams University

2009
