Problem description:
To picture the problem in a simple way, let's assume we have a parking gate at a company, an airport, or a high-security area, and we do not want any unauthorized cars or drivers to get inside. The first step is to build a database that contains the cars and drivers who have the right of access to this place. The problem comes in the second step: how to check every car and driver at the gate and use the database to grant them access or prevent them from entering.
Problem description (cont.):
Here you have two ways:
- The automated way: develop a real-time system that can recognize the car plate number and the driver and order the gate to open or close without any human interference.
- The manual way: hire a person to be in charge of the gate and check each car plate number and each driver manually before letting them in.
And here comes our project: to develop the software that can perform the functions of the automated system, as the first step toward applying it as a real-time system that can work in the real world.
First Phase
Why is our face detection work different and special compared with other types of face detection?
The answer comes from the fact that we deal with drivers whose faces are behind the glass of their cars, not faces in normal conditions, and we want to detect the faces of those drivers. This divides our experimental work (as we will see later) into two steps: first, testing the algorithms on normal images (without glass); second, taking the algorithms that succeed in the first step and testing them again on new images (with glass).
The algorithms used so far in this field can be categorized into three main categories:
i. Techniques based on information theory: these treat the image from a mathematical point of view and apply analytical methods to detect the face. Ex: the eigenfaces and fisherfaces techniques.
ii. Techniques based on skin segmentation in color spaces other than RGB: the idea is that human skin occupies fixed, remarkable ranges in some color spaces, so the face area can be identified from this property. Ex: the HSV and YCrCb color spaces.
iii. Building a skin model and then template matching: in this algorithm, different small skin samples are used to build a Gaussian distribution of the skin; the image is then scanned to mark the areas that fall inside this distribution (candidate areas), and template matching with an average face selects the best area as the face.
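To make the color-space idea in category ii concrete, here is a minimal Python/OpenCV sketch of skin segmentation in the YCrCb space. The specific Cr/Cb ranges are assumed values commonly quoted for skin in the literature, not necessarily the ranges used in our experiments.

```python
import cv2
import numpy as np

def skin_mask_ycrcb(bgr_image,
                    cr_range=(133, 173),   # assumed Cr range for skin
                    cb_range=(77, 127)):   # assumed Cb range for skin
    """Return a binary mask of candidate skin pixels using fixed YCrCb ranges."""
    ycrcb = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2YCrCb)
    _, cr, cb = cv2.split(ycrcb)
    mask = ((cr >= cr_range[0]) & (cr <= cr_range[1]) &
            (cb >= cb_range[0]) & (cb <= cb_range[1]))
    return mask.astype(np.uint8) * 255

# Usage: mask = skin_mask_ycrcb(cv2.imread("driver.jpg"))
```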
After studying the algorithm of building the skin model and template matching, we found that the problem arises in the phase of building the skin model: the small skin samples used to build the distribution come from normal skin, yet we use this distribution to detect an abnormal face (behind glass). At this point we found ourselves with a big problem and had practically returned to square one, as all our algorithms had failed. We realized that the only solution was to modify those algorithms with new ideas of our own to face the new element, the glass. We spent a period of time thinking and experimenting to solve the problems of our algorithms.
The direct idea that came to solve the problem of the model-building algorithm was to generate new skin samples from skin behind glass, to replace the old samples taken in normal conditions without glass. We generated new skin samples from the new images and built a new 3D distribution from them (with glass) to detect faces of the same type of images, so the model and the faces were similar, which made sense to us. The template matching part had no problems after that. The problem was solved and the results were very good, with an accuracy of about 92-93% (compared with the zero results before solving the problem), but there was a small drawback: we had to collect a larger number of skin samples than for normal skin, and the execution time was a little long. This directed us to try to fix the other algorithm in an attempt to get better performance.
Now we will talk about how we solved the problems of the skin detection algorithm. At the beginning we tried to modify the fixed skin range in the YCrCb domain, but this did not work: if we draw the distribution of the new range, it is not as precise as the old one, and its variance is very large, so it mixes with the range of the background and cannot be identified. The solution came from a new idea: isolate and remove the layer of the glass so that the old range can be used normally, as with the original images. This idea was a great success and led to an accuracy of 98-99% with a short execution time. Finally, we decided to use the skin detection algorithm, after adding the idea of removing the glass layer, as our technique.
[Comparison table: the model-building + template-matching algorithm passed the glass test with an accuracy of about 92-93% but a long execution time, while the skin detection algorithm with glass-layer removal passed with an accuracy of 98-99% and a convenient execution time; the unmodified algorithms did not pass.]
Conclusion:
After studying the main algorithms and trying them, first on the normal images (without glass) and then on the new images (with glass), we found that the best results always come from the algorithms based on the skin detection idea, not the template matching ones. So we decided to use the skin detection algorithm after adding the idea of removing the glass layer.
Second Phase
Introduction:
The face is the primary focus of attention in social life, playing a big role in conveying identity and emotion. The human ability to recognize faces is remarkable; without any doubt, the human brain is the greatest face recognition system on earth. The computational approach taken in this system, called 'eigenfaces', is motivated by information theory as well as by the practical requirements of near-real-time performance and accuracy. We also used principal component analysis as the dimensionality reduction method of the algorithm.
Why is this system different from other face recognition systems? The different point here is that the system will be trained, tested, and used on new images, as described in the face detection phase, i.e. faces behind glass, so we will study its performance and decide whether its results are acceptable. We searched a lot for a database that meets our requirement of images with glass, so that we could train and test the system on the kind of images it will deal with in practice, but we did not find any ready-made database of this type. So we decided to build the database ourselves to match our specifications.
Experimental work:
Implementing the eigenfaces algorithm:
i. Obtain face images I1, I2, ..., IM (the training faces) as matrices.
ii. Represent every image Ii as a single face vector Γi (N² x 1).
iii. Put all those vectors beside each other to form the matrix A (N² x M), called the face space, then calculate the average (mean face) of this matrix:
    Ψ = (1/M) * Σ_{i=1..M} Γi
iv. Subtract the mean face from each vector of A: Φi = Γi − Ψ, where q = [Φ1 Φ2 ... ΦM].
v. Compute the eigenvectors ui and eigenvalues λi of the covariance of q (an N² x N² matrix).
vi. Order the eigenvalues downwards from the maximum eigenvalue, and order the eigenvector matrix (e) accordingly, so that most of the weight of the data is concentrated in the highest eigenvalues; then reduce the size of the eigenvector matrix to a specific number of components according to the required efficiency.
vii. Project the matrix q onto the reduced eigenvector matrix: F = (e)^T * q
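As a rough illustration of steps i-vii, the following NumPy sketch builds the eigenface space. Variable names are illustrative, and it uses the common small-matrix trick (eigenvectors of the M x M matrix q^T q) instead of forming the full N² x N² covariance; that is an implementation assumption, not necessarily what we did.

```python
import numpy as np

def train_eigenfaces(images, num_components):
    """images: list of M grayscale face images of equal size."""
    A = np.stack([img.astype(float).ravel() for img in images], axis=1)  # N^2 x M face space
    mean_face = A.mean(axis=1, keepdims=True)                            # Psi
    Q = A - mean_face                                                    # mean-subtracted faces
    # Small-matrix trick: eigenvectors of (M x M) Q^T Q map to those of the big covariance
    eigvals, V = np.linalg.eigh(Q.T @ Q)
    order = np.argsort(eigvals)[::-1][:num_components]                   # largest eigenvalues first
    U = Q @ V[:, order]                                                  # N^2 x k eigenfaces
    U /= np.linalg.norm(U, axis=0, keepdims=True)
    F = U.T @ Q                                                          # projected training faces (k x M)
    return mean_face, U, F
```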
Classification:
i. Now we have an input image coming from the face detection phase and we want to recognize it by comparing it with the face space, so we also convert it to a vector Γ0.
ii. We subtract the mean face from Γ0 to get Φ0 and project it onto the reduced eigenvector matrix: Test_f = (e)^T * Φ0
iii. The next step is to find the column of the projected face space that has the minimum Euclidean distance to Test_f; this column corresponds to a training image, and that image is our decision.
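A matching sketch of this classification step, assuming the mean_face, U, and F produced by the training sketch above:

```python
import numpy as np

def recognize(test_image, mean_face, U, F, train_labels):
    """Project a detected face into the reduced eigenface space and
    return the label of the nearest training face (minimum Euclidean distance)."""
    f0 = test_image.astype(float).ravel()[:, None] - mean_face   # mean-subtracted test vector
    test_f = U.T @ f0                                            # k x 1 projection
    distances = np.linalg.norm(F - test_f, axis=0)               # distance to every training column
    return train_labels[int(np.argmin(distances))]
```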
Conclusion:
After studying the results of the system, which gave an accuracy of about 99% on our limited database, we found that we have a very good and acceptable performance. The performance of the system was not affected by using the new images (with glass); it gave approximately the same accuracy as with the original images (without glass).
Third Phase
Introduction:
Here we will talk about a very important step in this project: the localization of the vehicle plate number and its segmentation. After that we can segment the characters written on the plate and pass them to the next step of the project, optical character recognition (OCR), so that the automatic system can recognize those characters and compare them with the stored database of plates that have the right to access the gate.
Experimental Work:
1. Smearing algorithm (first experiment):
A. Convert the color image to a gray image.
B. Convert the gray image to a binary image.
C. Apply thresholds and process the image along vertical and horizontal runs.
D. Apply dilation and then erosion to keep only the useful area.
[Figures: the gray image and the cropped plate image.]
C. Convert the resultant image to binary, using a gray-level threshold for the conversion.
D. Calculate the number of objects in the last image, then loop over them and apply conditions on the area and height of each extracted object; this yields the numbers and characters of the license plate.
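A minimal sketch of the first experiment's smearing idea (closing short background runs along rows and columns, then dilation and erosion), assuming OpenCV and illustrative gap thresholds; it only approximates the steps above, not our exact code.

```python
import cv2
import numpy as np

def smear(binary, max_gap):
    """Fill background gaps shorter than max_gap along each row (classic run-length smearing)."""
    out = binary.copy()
    for row in out:
        run = 0
        for x, v in enumerate(row):
            if v:                          # foreground pixel ends a background run
                if 0 < run <= max_gap:
                    row[x - run:x] = 1     # close the short gap
                run = 0
            else:
                run += 1
    return out

def locate_plate_mask(bgr, h_gap=20, v_gap=15):            # gap thresholds are assumed values
    gray = cv2.cvtColor(bgr, cv2.COLOR_BGR2GRAY)
    binary = cv2.threshold(gray, 0, 1, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)[1]
    horiz = smear(binary, h_gap)                           # smear along horizontal runs
    vert = smear(binary.T, v_gap).T                        # smear along vertical runs
    mask = (horiz & vert).astype(np.uint8)
    kernel = np.ones((3, 3), np.uint8)
    return cv2.erode(cv2.dilate(mask, kernel), kernel)     # keep only the dense (useful) areas
```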
B. Second experiment:
A. Convert RGB to gray.
B. Enhance the contrast of the resulting gray image.
C. Scan vertically to get the maximum edge density.
D. Get the right and left positions around the standard deviation value.
E. Use these positions to crop the license plate from the image.
F. Convert the cropped image to binary and extract its letters, as in steps 3 and 4 of the first experiment.
Results: We reached an accuracy of about 60%.
Problems: same as the first experiment, but we got some progress because of using the standard deviation.
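For illustration, a NumPy sketch of the second experiment's localization idea (vertical edge density plus positions around the standard deviation); the band height and the use of a weighted standard deviation are assumptions.

```python
import numpy as np

def plate_band_by_edge_density(gray):
    """Find the horizontal band and the left/right limits with the highest
    vertical-edge density; widths based on the standard deviation are assumptions."""
    edges = np.abs(np.diff(gray.astype(float), axis=1))       # vertical edge strength
    row_density = edges.sum(axis=1)
    center_row = int(np.argmax(row_density))                   # row of maximum edge density
    band = slice(max(0, center_row - 20), center_row + 20)     # assumed band height
    col_profile = edges[band].sum(axis=0)
    cols = np.arange(col_profile.size)
    mean_c = np.average(cols, weights=col_profile + 1e-9)
    std_c = np.sqrt(np.average((cols - mean_c) ** 2, weights=col_profile + 1e-9))
    left, right = int(mean_c - std_c), int(mean_c + std_c)     # positions around the std. deviation
    return band, max(0, left), min(gray.shape[1], right)
```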
C. Third experiment:
A. Steps 1-5: same as the second experiment.
B. Using the positions of the two points around the standard deviation, crop the license plate together with some area above and below it, so that even if there is an error in the detection we still get the whole plate.
C. Extract the plate letters and numbers from the cropped image.
Results: We reached an accuracy of 72%.
Problems: the choice of the binary conversion threshold.
D. Fourth Experiment:
A. Repeat the same steps as in the third experiment until the plate area is cropped.
B. Convert the cropped plate using different binary thresholds (0.25, 0.35, 0.5, 0.65, 0.75, 0.85).
C. Choose between the thresholds based on which binary image contains more white pixels and has 6 to 8 objects, matching the number of letters and numbers on the plate (see the sketch below).
D. Extract the letters and numbers.
Results: We reached an accuracy of 83%.
Disadvantages:
The choice of the best binary threshold.
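A hedged sketch of this threshold-selection idea, using SciPy's connected-component labeling to count objects; the exact selection rule we used may differ.

```python
import numpy as np
from scipy import ndimage

CANDIDATE_THRESHOLDS = (0.25, 0.35, 0.5, 0.65, 0.75, 0.85)    # thresholds from the fourth experiment

def best_binarization(plate_gray):
    """plate_gray: cropped plate image scaled to [0, 1]. Returns the binary image
    whose object count is 6-8, preferring the candidate with the most white pixels."""
    best, best_white = None, -1
    for t in CANDIDATE_THRESHOLDS:
        binary = plate_gray > t
        n_objects = ndimage.label(binary)[1]                   # number of connected components
        white = int(binary.sum())
        if 6 <= n_objects <= 8 and white > best_white:
            best, best_white = binary, white
    return best
```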
E. Fifth Experiment:
A. Same first steps as in the fourth experiment.
B. We noticed that most of the characters and numbers are centered in the middle of the image, so the choice between thresholds is based on which binary image contains more objects centered at the middle of the image.
F. Sixth Experiment:
A. Repeat the same steps as in the third experiment until the plate area is cropped.
B. We noticed that the best binary conversion threshold was around the global threshold.
C. Choose between the thresholds based on which binary image contains more white pixels and has 6 to 8 objects, matching the number of letters and numbers on the plate.
D. After choosing the best threshold, extract the letters and numbers.
G. Seventh Experiment:
A. Repeat the same steps as in the third experiment until the plate area is cropped.
B. The choice between thresholds in this experiment depends on which binary image contains the largest number of white pixels; the difference here is that we build more than one pattern, each containing more than one threshold, so we choose the pattern first and then the threshold.
C. After choosing the best threshold, extract the characters and numbers.
Results: We reached an accuracy of 99%.
Problems:
In 99% of our pictures the characters and numbers were extracted, but together with some unwanted objects.
[Table: accuracies of different methods for removing the unwanted objects, improving from below 10% up to about 95% — sorting the differences between the object centroids and the image centroid (taking the minimum 6), using a neural network, scanning row by row from above and below (stopping when 6 objects are found), and trapping the plate with a rectangle to extract it only.]
1. Introduction:
1. What is OCR?
OCR is the acronym for Optical Character Recognition. This technology allows a machine to automatically recognize characters through an optical mechanism.
2. History of OCR:
In 1914, Emanuel Goldberg developed a machine that read characters and converted them into standard telegraph code, and Edmund Fournier d'Albe developed the Optophone.
2. Background:
Previous work:
1. Gurumukhi Script OCR.
3. Features:
A. 32 Slope Feature:
1. Extract each letter in the smallest possible area and resize it.
2. Get the centroid of each segmented letter.
3. Record the rows and columns of every data (foreground) pixel.
4. Calculate the slope for each data pixel as: (y of point − y of centroid) / (x of point − x of centroid).
5. Divide the segmented letter into 32 slope lines.
6. Get the average of the rows of all data pixels that lie on each slope line, and similarly for the columns.
7. Calculate the Euclidean distance between each row of the resulting matrix and the centroid.
Features cont.:
B. 16 Sectors Feature:
1. Extract each letter in the smallest possible area.
2. Resize it.
3. Get the centroid of each segmented letter.
4. Divide the segmented letter into 16 equal sectors.
5. Obtain the center of gravity of each sector.
6. Calculate the Euclidean distance between the center of gravity of each sector and the centroid of the segmented letter. The result is D = [d1 d2 d3 ... dn], where di is the Euclidean distance between the center of gravity of sector i and the centroid of the segmented letter, and n = 16.
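A possible NumPy sketch of the 16 sectors feature (the 32 slope feature can be computed in a very similar way); dividing the letter into sectors by angle around the centroid is an assumption about the sector definition.

```python
import numpy as np

def sector_feature_16(letter, sectors=16):
    """letter: binary array of a segmented character (already cropped and resized).
    Returns D = [d1 ... d16], the distance from each sector's center of gravity
    to the character centroid; empty sectors contribute 0."""
    ys, xs = np.nonzero(letter)
    cy, cx = ys.mean(), xs.mean()                               # centroid of the letter
    angles = np.arctan2(ys - cy, xs - cx)
    sector_idx = ((angles + np.pi) / (2 * np.pi) * sectors).astype(int) % sectors
    D = np.zeros(sectors)
    for s in range(sectors):
        sel = sector_idx == s
        if sel.any():
            gy, gx = ys[sel].mean(), xs[sel].mean()             # center of gravity of this sector
            D[s] = np.hypot(gy - cy, gx - cx)
    return D
```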
Features cont.:
C. 144 Combined Feature:
To understand it, we first explain these two features: the sixteen block feature and the 8 sectors feature. The 144 combined feature is then built from two features:
1. Modified eight sectors feature: the resulting vector has size 8 x 16 = 128 elements, F = [E(1,1) E(2,1) ... E(8,16)], where E(i,j) is the center of gravity of sector number i in block number j.
2. Sixteen block feature.
Features cont.:
We combined the two previous features as follows:
F = [E(1,1) E(2,1) ... E(8,16), α*[D(1) D(2) ... D(16)]]
where E(i,j) is the center of gravity of sector number i in block number j, D is the 16-element vector of the second feature, and α is a weighting factor.
Features cont.:
From the graph we chose α = 0.2, as it gives the best accuracy:
4. Classifiers
A. Nearest Neighbour Classifier (NN):
The expected class is the class of the training sample whose feature vector is nearest (least Euclidean distance) to the feature vector of the test sample.
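A minimal sketch of this rule:

```python
import numpy as np

def nn_classify(test_vec, train_vecs, train_labels):
    """Nearest-neighbour rule: return the label of the training sample whose
    feature vector has the least Euclidean distance to the test vector."""
    distances = np.linalg.norm(train_vecs - test_vec, axis=1)   # train_vecs: samples x features
    return train_labels[int(np.argmin(distances))]
```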
The algorithm of SVM is shown in the next figure.
Classifiers cont.:
Classifiers cont.:
- The result will be 325 binary classifiers (25 + 24 + 23 + ... + 3 + 2 + 1 = 325, one for each pair of classes).
- Each of them classifies the feature vector of the test sample, so the result is a vector of 325 elements (each representing a class).
- We take the most repeated class as the expected class.
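A sketch of this pairwise (one-vs-one) voting scheme, using scikit-learn's SVC as an illustrative binary SVM; the kernel and library choice are assumptions about the implementation.

```python
from itertools import combinations
import numpy as np
from sklearn.svm import SVC

def train_pairwise_svms(X, y):
    """Train one binary SVM per pair of classes (26 classes -> 325 classifiers)."""
    models = {}
    for a, b in combinations(np.unique(y), 2):
        mask = (y == a) | (y == b)
        clf = SVC(kernel='linear')            # kernel choice is an assumption
        clf.fit(X[mask], y[mask])
        models[(a, b)] = clf
    return models

def predict_by_voting(models, x):
    """Each pairwise SVM votes on the test feature vector; the most repeated
    class among the votes is the expected class."""
    votes = [clf.predict(x.reshape(1, -1))[0] for clf in models.values()]
    values, counts = np.unique(votes, return_counts=True)
    return values[np.argmax(counts)]
```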
Classifiers cont.:
D. Combining all three classifiers:
For each test sample we take the three predicted classes, one per classifier. If a certain class appears more than once, we choose it; if all three classes are different, we choose the class predicted by the classifier whose accuracy for this feature is the best.
5. Experimental Work:
Checking Our Features:
We tried them on handwritten English numbers (0 1 2 3 4 5 6 7 8 9).
[Table: recognition accuracy of the different features with the NN and SVM classifiers; the 144 combined feature with NN gave the best result of about 94.82%, with the other feature/classifier combinations ranging roughly from 84% to 92%.]
[Table: accuracies of the individual classifiers — NN about 72.35%, SVM about 82.52%, neural network about 91.26% — compared with the combined classifier, about 85.38%, 88.49%, and 92.69% in the corresponding columns.]
[Table: accuracy (%) versus the reduced feature dimension (16, 24, 32, 40, 48, 56, 64, 72, 80) for the different feature/classifier combinations; accuracies range from about 60% up to about 85%, with the best values around dimensions 16-32.]
Accuracy of NN: 89.6639%.
K-Means Algorithm:
1. Initialize the centroids of the K clusters randomly.
2. Assign each sample to the nearest centroid.
3. Recalculate the centroids (means) of the K clusters.
4. If the centroids are unchanged, stop; otherwise, go to step 2.
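A compact NumPy sketch of these four steps:

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain K-Means: random initial centroids, assign samples to the nearest
    centroid, recompute the means, and stop when the centroids are unchanged."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        assign = dists.argmin(axis=1)                       # nearest centroid per sample
        new_centroids = np.array([X[assign == j].mean(axis=0) if np.any(assign == j)
                                  else centroids[j] for j in range(k)])
        if np.allclose(new_centroids, centroids):           # converged
            break
        centroids = new_centroids
    return centroids, assign
```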
[Table: accuracy versus the number of clusters K (K = 2 and K = 5): about 92.02% / 91.01% in the first row and 89.41% / 87.73% in the second.]
Feature                                      NN          SVM
144 combined                                 95.0378%    90.0757%
144 combined with K-Means (k=20)             95.0378%    88.9823%
32 reduced from 144                          92.6829%    87.9731%
32 reduced from 144 with K-Means (k=20)      92.3465%    85.7864%
Conclusion
The first confusion matrix corresponds to the best accuracy before replacing the error test data, 93.5294%. The second confusion matrix corresponds to the best accuracy after replacing the error test data, 95.0378%, and fortunately this happened with K-Means with K = 20, i.e. with a smaller memory size.