Abstract
A recognition system for identifying handwritten Indian (Arabic) numerals one to nine (١ to ٩) has been developed. A graphical user interface was developed using advanced object-oriented techniques that incorporate Matlab® as a technical tool. The process involved extracting a feature vector to represent the handwritten sketch based on the object centroid and boundary points. A template vector was derived for each digit by taking the average feature vector of 30 handwritten sketches made by 30 different students. The test sketch is compared against all nine templates, and a distance measure is used to make the recognition. An overall hit ratio of 87.22% was achieved in the preliminary results. The ratio reached 100% for some of the digits, but there was misinterpretation between similar digits such as (7) and (9). This study is meant to be a seed toward building a recognition system for Arabic language characters.
1. Introduction
The advances made in recent years in the broad field of pattern recognition and analysis have encouraged researchers to develop several techniques for online recognition of language characters [1-5]. The motivation behind their efforts was to enhance human interaction with advanced computer machines. When the input stream of characters is not available from the keyboard but rather through handwriting, a special recognition system is required as an interface so that the computer can read and interpret the written text. Several projects have been accomplished for the recognition of Latin, Japanese, and Chinese characters [2-5]. However, few accomplishments have been registered for Arabic character recognition [1].
Every language has its own features and unique complexity. Arabic is characterized by several forms of the same character, depending on its location in the word. In addition, handwriting comes in many styles such as Naskh, Kufi, and others. This makes the recognition problem more difficult and hence calls for more advanced approaches. The scope of this paper is limited to developing an approach for recognizing the handwritten Indian numerals one to nine (١ to ٩) used in Arabic writing, Figure (1). This project is meant to be a seed toward tackling the problem of recognizing handwritten Arabic.
have enough flexibility to treat variations in color, sketch width, rotation and translation of the digit sketch.
2.3. Skeletonization
Simple erosion is the process of eliminating all the boundary points from an object [11]. Given an object $X$ and a structuring element $B$, the erosion of $X$ by $B$ is defined as the set of all points $x$ such that $B_x$, the structuring element translated to $x$, is included in $X$, that is:
$$X \ominus B = \{\, x \mid B_x \subseteq X \,\} \qquad (1)$$
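As an illustration of equation (1), the following is a minimal sketch of binary erosion in Python with NumPy. It is not the Matlab code used in this work; the function name `erode` and the choice to anchor the structuring element at its center are assumptions made for the example.

```python
import numpy as np

def erode(image, structuring_element):
    """Binary erosion: keep point x only if B translated to x lies inside X.

    Hypothetical helper; B is anchored at its center element.
    """
    img = np.asarray(image, dtype=bool)
    se = np.asarray(structuring_element, dtype=bool)
    sh, sw = se.shape
    ph, pw = sh // 2, sw // 2
    # Pad with background so that B never fits across the image border.
    padded = np.pad(img, ((ph, sh - 1 - ph), (pw, sw - 1 - pw)),
                    constant_values=False)
    out = np.ones_like(img)
    for dy in range(sh):
        for dx in range(sw):
            if se[dy, dx]:
                # Every active element of B must land on a foreground pixel.
                out &= padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

# Usage: a 3x3 block eroded by a 3x3 element keeps only its center pixel.
block = np.zeros((5, 5), dtype=bool)
block[1:4, 1:4] = True
eroded = erode(block, np.ones((3, 3), dtype=bool))
```

With a 3x3 element, one pass removes exactly the boundary layer of the object, which is the behavior described in the text.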
Erosion can be programmed as a two-step process that does not break objects. The first step is a conditional erosion in which pixels are marked as candidates for removal but are not actually removed. In the second pass, those candidates that can be removed without destroying connectivity are eliminated, while those that cannot are retained. This process is known as thinning. Thinning is therefore a major preprocessing step: it represents the digit shape by a skeletonized image consisting of a set of one-pixel-wide lines that highlight the significant features of the original sketch. The thinning algorithm developed by Zhang and Suen in 1984 was adopted [12]. It processes the binary image obtained after the thresholding process in a manner similar to the discussion above. Regardless of the brush width used by the writer, a single-pixel line representing the digit object is obtained.
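The mark-then-remove idea can be sketched as follows. This is a plain NumPy rendering of the standard Zhang and Suen two-subiteration rules, written for illustration; it is not the implementation used in this work, and the function name `zhang_suen_thin` is an assumption.

```python
import numpy as np

def zhang_suen_thin(image):
    """Thin a binary image (nonzero = foreground) to one-pixel-wide lines."""
    # One pixel of background padding keeps all neighbor lookups in bounds.
    img = np.pad(np.asarray(image) != 0, 1).astype(np.uint8)
    changed = True
    while changed:
        changed = False
        for step in (0, 1):
            to_delete = []
            rows, cols = np.nonzero(img)
            for r, c in zip(rows, cols):
                # Neighbors P2..P9, clockwise, starting from the pixel above.
                p = [int(v) for v in (
                    img[r - 1, c], img[r - 1, c + 1], img[r, c + 1],
                    img[r + 1, c + 1], img[r + 1, c], img[r + 1, c - 1],
                    img[r, c - 1], img[r - 1, c - 1])]
                b = sum(p)  # number of foreground neighbors
                # a = number of 0 -> 1 transitions in the cycle P2, ..., P9, P2
                a = sum(p[i] == 0 and p[(i + 1) % 8] == 1 for i in range(8))
                if not (2 <= b <= 6 and a == 1):
                    continue
                p2, p4, p6, p8 = p[0], p[2], p[4], p[6]
                if step == 0:
                    ok = p2 * p4 * p6 == 0 and p4 * p6 * p8 == 0
                else:
                    ok = p2 * p4 * p8 == 0 and p2 * p6 * p8 == 0
                if ok:
                    to_delete.append((r, c))  # marked only, not yet removed
            # Second pass: remove all marked candidates together.
            for r, c in to_delete:
                img[r, c] = 0
            changed = changed or bool(to_delete)
    return img[1:-1, 1:-1].astype(bool)

# Usage: a bar three pixels thick collapses to a single-pixel line.
bar = np.zeros((5, 12), dtype=int)
bar[1:4, 1:11] = 1
thin = zhang_suen_thin(bar)
```

The connectivity test (`a == 1`) is what separates thinning from repeated plain erosion: a pixel whose removal would split the stroke is never a candidate.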
$$\bar{x} = \frac{M_{10}}{M_{00}} \quad \text{and} \quad \bar{y} = \frac{M_{01}}{M_{00}} \qquad (2)$$
where $M_{jk}$ is the $(j,k)$th moment of the skeletonized object $S(x,y)$, defined as
$$M_{jk} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^{j} y^{k} S(x,y)\, dx\, dy \qquad (3)$$
The center of gravity (COG) of the object represents the origin of the shape. Hence, the origin in the image is determined relative to the position of the written sketch rather than the absolute center of the image template. This eliminates any effect of the absolute position of the sketch in the image, so the developed recognition system is free of translation effects. The next step in the process is to find the orientation of the object and traverse the object's pixels with respect to that axis. Orientation is defined as the angle $\theta$ of the axis of least moment of inertia [9]. This angle is obtained by forcing the second-order moment $\mu_{11}$ to vanish, and is hence defined according to equation (4).
The coordinate axes $x'$, $y'$ at an angle $\theta$ from the $x$, $y$ axes are called the principal axes of the object. The shape is traversed from the principal axis instead of the absolute axes, thereby eliminating any effects of rotation.
A feature vector, $V$, is composed for the shape based on the COG and the angle of orientation. Each entry in the feature vector represents the length and direction of a line segment connecting the COG to a pixel point, as depicted in Figure (3). Beginning from the orientation of the object, one measurement vector is taken every 5°. To traverse the entire space (i.e., a 0° to 360° offset), a maximum of 72 measurement vectors are recorded. For each measurement vector, the vector length and offset angle are recorded as an entry in the feature vector, $V$, as indicated above. However, not all angle offsets are recorded, simply because a numeral digit does not have boundary pixels all around the space. Typically 30-45 measurement vectors are recorded. Furthermore, some digit shapes can have two intersection points along the same line segment. An example of this is the number four, as shown in Figure (4). In this case, two or more entries are recorded in the feature vector $V$, one for each intersection point.
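The feature extraction described above (COG, orientation, then one measurement every 5° from the principal axis) can be sketched in Python as follows. Several points are assumptions rather than details from this work: the moments are computed as discrete sums over skeleton pixels, the orientation uses the standard central-moment form $\theta = \tfrac{1}{2}\arctan\bigl(2\mu_{11}/(\mu_{20}-\mu_{02})\bigr)$, and rays are sampled in half-pixel steps with each run of consecutive hits counted as one intersection. The names `cog_and_orientation` and `radial_features` are hypothetical.

```python
import numpy as np

def cog_and_orientation(skel):
    """Return ((x_bar, y_bar), theta) for a binary skeleton image."""
    ys, xs = np.nonzero(skel)
    cx, cy = xs.mean(), ys.mean()        # M10/M00 and M01/M00
    dx, dy = xs - cx, ys - cy
    mu20, mu02, mu11 = (dx ** 2).sum(), (dy ** 2).sum(), (dx * dy).sum()
    # Assumed standard form of the orientation angle (axis of least inertia).
    theta = 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)
    return (cx, cy), theta

def radial_features(skel, step_deg=5.0):
    """List of (length, offset_angle_deg) entries measured from the COG."""
    skel = np.asarray(skel, dtype=bool)
    (cx, cy), theta = cog_and_orientation(skel)
    h, w = skel.shape
    radii = np.arange(0.0, np.hypot(h, w), 0.5)   # half-pixel ray samples
    features = []
    for k in range(int(round(360.0 / step_deg))):
        offset = k * step_deg
        # Image rows grow downward, so angles run clockwise on screen.
        phi = theta + np.deg2rad(offset)
        cols = np.rint(cx + radii * np.cos(phi)).astype(int)
        rows = np.rint(cy + radii * np.sin(phi)).astype(int)
        inside = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
        hit = np.zeros(len(radii), dtype=bool)
        hit[inside] = skel[rows[inside], cols[inside]]
        # The first sample of each run of hits marks one intersection.
        starts = np.flatnonzero(hit & ~np.r_[False, hit[:-1]])
        features.extend((float(radii[i]), offset) for i in starts)
    return features

# Usage: a hollow rectangle, wider than tall, so its orientation is 0.
ring = np.zeros((15, 19), dtype=bool)
ring[4, 2:17] = ring[10, 2:17] = True
ring[4:11, 2] = ring[4:11, 16] = True
(cx, cy), theta = cog_and_orientation(ring)
feats = radial_features(ring)
```

Because every ray starts at the COG and the first ray lies along the principal axis, shifting or rotating the sketch changes neither the lengths nor the offset angles, up to pixel quantization. A ray that crosses the stroke twice yields two entries with the same offset angle.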
The templates were designed by collecting sketches from 30 people. Each person was asked to provide 9 sketches, one for each digit. A feature vector, $V$, was extracted for each image according to the algorithm explained in the previous section. An average feature vector, $T_k$, was then taken over the 30 feature vectors of each digit. In this manner, 9 template vectors were prepared.
The developed system offers high flexibility for several reasons. First, the system is translation invariant. This comes as a result of processing the object corresponding to the sketched digit with respect to its center of gravity (COG) rather than the absolute position of the sketch in the image template. Second, the developed system is rotation invariant. Instead of traversing the object from the basic axes, the orientation of the object was found. Then, the object was traversed starting from the principal axis that corresponds to the orientation angle. As a result, no artifacts arise when the sketch is tilted slightly to the left or right. Finally, the developed system is scaling invariant. This was achieved by normalizing the feature vector lengths: once a feature vector is extracted, it is scanned for the maximum length, and all entries in the feature vector are then divided by that maximum, leaving the entries in the range 0.0 to 1.0. This eliminates any variations in the sketches that are related to the sketch size. In addition, scaling does not alter the shape of the sketch, as it is a linear operation.
The experiments that were carried out fall into two categories. The first category was the template preparation process. This was accomplished by asking 30 different people to make their own handwritten sketches of the different digits, for a total of 270 images. These images were then divided into 9 groups of 30 images each, with sketches of the same digit placed in the same group. Finally, a template was produced for each of these groups. These templates were used in the second category of experiments, the test experiments.
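The normalization and averaging steps can be sketched as below. The paper does not state how feature vectors with different numbers of entries are averaged, so this sketch assumes each vector has already been laid out on a fixed grid (for example 72 slots, one per 5° offset, with 0.0 where no boundary pixel was found). That layout and the names `normalize_lengths` and `build_template` are assumptions.

```python
import numpy as np

def normalize_lengths(lengths):
    """Divide all lengths by the maximum so entries fall in 0.0 to 1.0."""
    lengths = np.asarray(lengths, dtype=float)
    peak = lengths.max()
    return lengths / peak if peak > 0 else lengths

def build_template(feature_vectors):
    """Average equally sized, normalized feature vectors of one digit."""
    return np.mean(np.stack(feature_vectors), axis=0)

# Usage: the same shape drawn at two sizes gives the same normalized vector.
small = np.array([2.0, 4.0, 8.0, 0.0])
large = 3.0 * small
template = build_template([normalize_lengths(small), normalize_lengths(large)])
```

Dividing by the maximum is what makes a sketch and its uniformly enlarged copy indistinguishable to the matcher, since scaling multiplies every length by the same factor.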
In the second phase of experiments, 20 different people were asked to make their own sketches. Each person made 9 sketches corresponding to the 9 digits. Table (1) presents the results obtained in this phase. Figure (5) presents an illustrative example showing the processing steps toward recognition.
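As a quick consistency check on the test phase, the reported overall ratio can be related to the number of test sketches. The count of 157 correct recognitions below is inferred from the reported 87.22% and the 180 test sketches; it is not a figure stated in the paper.

```python
def hit_ratio(correct, total):
    """Percentage of test sketches recognized correctly."""
    return 100.0 * correct / total

testers, digits = 20, 9
total_sketches = testers * digits                   # 20 x 9 test sketches
reported = 87.22                                    # overall ratio, percent
implied_correct = round(reported / 100.0 * total_sketches)
recomputed = round(hit_ratio(implied_correct, total_sketches), 2)
```

Under this reading, one more or one fewer correct recognition would move the overall ratio by about 0.56 percentage points, which is the resolution of the reported figure.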
Figure 4: An illustrative example portraying more than one intersection along the measurement vector.
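process is performed using a Euclidean distance measure given according to equation (6):
$$\delta_k = \lVert V_p - T_k \rVert = \sqrt{\sum_i \bigl(V_p(i) - T_k(i)\bigr)^2} \qquad (6)$$
where $V_p$ is the feature vector of the test sketch $p$, and $T_k$ is the feature vector of the $k$th template. The sketch is recognized as the digit that corresponds to the template producing the minimum distance, $\delta_k$.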
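The matching rule can be sketched as a nearest-template search. The sketch assumes the test vector and all templates are equal-length numeric arrays; the function name `classify` and the dictionary layout of the templates are illustrative choices, not part of the original system.

```python
import numpy as np

def classify(test_vector, templates):
    """Return (digit, distances) for the template nearest to test_vector.

    templates: dict mapping a digit label to its template vector.
    """
    v = np.asarray(test_vector, dtype=float)
    distances = {
        digit: float(np.linalg.norm(v - np.asarray(t, dtype=float)))
        for digit, t in templates.items()
    }
    # The recognized digit is the one with the minimum Euclidean distance.
    best = min(distances, key=distances.get)
    return best, distances

# Usage with two toy templates.
toy_templates = {1: [1.0, 0.0, 0.0], 2: [0.0, 1.0, 0.0]}
digit, dists = classify([0.9, 0.1, 0.0], toy_templates)
```

Returning all distances, not only the winner, makes it easy to see how close the runner-up template was, which is useful when studying confusions between similar digits.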
Table 1: Experimental results presenting the detection ratio in the test phase
4. Conclusions
An algorithm for recognizing handwritten Indian numerals was developed, achieving an overall hit ratio of 87.22% in preliminary tests. The proposed recognition system is translation-, rotation-, and scaling-invariant. It therefore gives the writer considerable freedom in making his or her sketches. The scope of this project was limited to the recognition of the Indian (Arabic) numerals. The promising results have led to expanding the project to include the Arabic alphabet. The author is currently tackling two main issues associated with this project. The first is to enhance the developed algorithm to boost the detection ratio. To achieve that, the author is considering the use of neural network and neuro-fuzzy approaches. Second, the author is widening the scope of the developed recognition system to include the Arabic characters as well. Enhancements to the contents of the feature vector are also under way.
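Table 1 (fragment; the remaining entries did not survive): column headers 1 2 3 4 5 6 7; surviving entries: 16, 80%; overall detection ratio: 87.22%.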
5. Acknowledgements
The author would like to express his sincere gratitude to his students who participated by making handwritten sketches. Special thanks to S. Jawarneh, K. Foqaha, and T. Qudah for their great help in implementing and coding the developed system.
Figure 5: An illustrative example showing original image as submitted from the user (upper left), image after thinning (upper right), image after finding COG (lower left), and the image after finding the orientation (lower right). The feature vector is extracted based on these results.
References
[1] F. Bousalma, Structural and Fuzzy Techniques in the Recognition of Online Arabic Characters, Int. J. Pattern Recognition and Artificial Intelligence, Vol. 13, No. 7, pp. 1027-1040.
[2] T. Chang and S. Chen, Character Segmentation Using Convex-Hull Techniques, Int. J. Pattern Recognition and Artificial Intelligence, Vol. 13, No. 6 (1999), pp. 833-858.
[3] M. Cheriet, Extraction of Handwritten Data From Noisy Gray-Level Images Using A Multiscale Approach, Pattern Recognition and Artificial Intelligence, Vol. 13, No. 5 (1999), pp. 665-684.
[4] M. Okamoto, Online Handwriting Character Recognition Method Using Directional and Direction-Change Features, Int. J. Pattern Recognition and Artificial Intelligence, Vol. 13, No. 7 (1999), pp. 1041-1059.
[5] K. Qian, Gray Image Skeletonization with Hollow Preprocessing Using Distance Transformation, Pattern Recognition and Artificial Intelligence, Vol. 13, No. 6 (1999), pp. 881-892.
[6] A. Halawani, Recognition of Gestures in Arabic Sign Language Using Neuro-Fuzzy Systems, Master's thesis, Electrical Eng. Dept., Jordan University of Science and Tech., Irbid, Jordan (2000).
[7] M. Hussien, Automatic Recognition of Sign Language Gestures, Master's thesis, Electrical Eng. Dept., Jordan University of Science and Tech., Irbid, Jordan (1999).
[8] X. Yi and O. I. Camps, Line-Based Recognition Using A Multidimensional Hausdorff Distance, IEEE Trans. Pattern Analysis and Machine Intelligence, Vol. 21, No. 9 (1999), pp. 901-917.
[9] A. K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, Inc. (1989).
[10] S. Umbaugh, Computer Vision and Image Processing: A Practical Approach Using CVIPtools, Prentice Hall, Inc. (1998).
[11] K. Castleman, Digital Image Processing, Prentice Hall, Inc. (1996).
[12] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 1st Edition, Prentice Hall, Inc. (1992).