
ISSN: 2277-4629 (Online) | ISSN: 2250-1827 (Print)

CPMR-IJT Vol. 2, No. 2, December 2012

Detection and De-occlusion of Text in Images


S. Bhuvaneswari* Dr. T. S. Subashini** Dr. V. Ramalingam***

ABSTRACT
The proposed work aims to automatically detect the pixel locations corresponding to text regions in an image using connected component labelling (CCL) and template matching. The detected pixels are used as the inpainting mask to de-occlude the text from the image using the fast marching inpainting algorithm. The work is done in two steps: the first step detects the text region in the image without the user manually marking it, and the second step de-occludes the text from the image using the fast marching inpainting algorithm.
Keywords: Niblack's algorithm, CCL, Selection criteria, Templates, Fast marching algorithm.

I. INTRODUCTION
An image can be understood as a two-dimensional function f(x, y), where x and y are spatial coordinates, and the amplitude of f at any pair of coordinates (x, y) is called the intensity or gray level of the image at that point [1]. The text present in an image varies in font style, size, orientation and colour, and mostly appears against a complex background. Extracting text data and manipulating objects in digital media is essential for understanding, editing and retrieving information. In this work the text region in the image is detected and concealed using the fast marching technique. The procedure is divided into four steps: detection, localization, extraction, and inpainting. The detection step roughly classifies text regions and non-text regions. The localization step determines the exact boundaries of text strings. The extraction step filters out background pixels in the image, so that only the text pixels are left for inpainting. Inpainting restores a degraded image or video in such a way that the changes are not apparent to a casual observer. The photography and film industries are now adopting this technique to remove artifacts and scratches, and to remove undesired objects and text from the foreground or the background.

Fig. 1 (a) Input image (b) Inpainted image

The common requirement for all image inpainting algorithms is that the region to be inpainted should be manually selected by the user.

*, **, *** Department of Computer Science and Engineering, Annamalai University, Tamilnadu, India

www.cpmr.org.in

CPMR-IJT: International Journal of Technology


As a first step the user manually selects the portions of the image that will be restored. This is usually done as a separate step and involves the use of other image processing tools. Image restoration is then done automatically, by filling these regions with new information coming from the surrounding pixels or from the whole image. Fig. 1 shows how the flowers in the original image have been inpainted [9] after manually selecting the flower region. Image inpainting has many applications, including restoration of photographs and films, removal of occlusions such as text, subtitles, logos, stamps and scratches, and red-eye removal.

The rest of the paper is organized as follows. Section II describes the related work in this area. Section III presents the key observations and methodology of this work. Section IV shows the experimental results. Section V concludes the paper.

II. RELATED WORK
The work in [2] presents an algorithm to inpaint images using cellular neural networks. The results show that an almost blurred image can be recovered with visually good effect. In [3] holes were inpainted using morphological component analysis designed for separation of linearly combined texture in a given image. The authors in [4] used a convolution mask that was decided interactively and requires user intervention; this algorithm works only for small regions and cannot inpaint large regions in the image. [5] employs the horizontal projection and geometric properties for region segmentation and selection of text regions. [6] uses the texture property to identify text using an SVM. [7] analyses a connected component based algorithm and an edge based algorithm for text region extraction, and concludes that the connected component algorithm is robust and invariant to scale, lighting and orientation compared to the edge based algorithm. The work in [8] applies an algorithm which localizes the text in images. The detected text is binarized so that it can be used directly in the text recognition process. The work in [9] introduces a novel algorithm for digital inpainting of still images without user intervention. It removed the target region from digital photographs using a region filling algorithm. However, the technique proposed here does not require any user intervention to select the region to be inpainted. Since we intend to de-occlude text from images, we propose a method to automatically detect text, which is then used as the inpainting mask during the inpainting phase.

III. METHODOLOGY
This work is done in two steps. The first step detects the text region in the image without the user manually marking it. In the second step the text detected in the first step is used as a mask for de-occluding it from the image using the fast marching inpainting algorithm. The proposed work is shown in Fig. 2.

Input image -> Resize the image to 400x400 -> Check for a light background; if absent, change the background to a light background -> CCL is applied to obtain connected regions -> Regions are subjected to selection criteria -> New candidate regions are obtained and the remaining regions are resized to 24x42 -> Candidate regions are compared with templates -> Matching candidates are extracted as text -> Extracted text is the mask for inpainting -> Apply Fast Marching method -> Inpainted image

Fig. 2 The methodology of the proposed work
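The connected-region and selection-criteria stages of Fig. 2 can be illustrated with a minimal sketch. It assumes the image has already been binarized into a list of rows of 0/1 values with text pixels equal to 1. The function names, the use of 8-connectivity and every threshold value are hypothetical choices made for illustration, since the paper does not state them, and the perimeter criterion is omitted for brevity.

```python
from collections import deque

def label_components(binary):
    """Group 8-connected foreground (1) pixels; return each region's box and area."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            top, left, bottom, right, area = r, c, r, c, 0
            while queue:
                y, x = queue.popleft()
                area += 1
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            # Bounding box is (top, left, bottom, right), inclusive.
            regions.append({"box": (top, left, bottom, right), "area": area})
    return regions

def is_text_candidate(region, min_area=4, max_area=200,
                      min_aspect=0.2, max_aspect=2.0):
    """Keep regions whose area and width/height ratio fall in assumed ranges."""
    top, left, bottom, right = region["box"]
    height = bottom - top + 1
    width = right - left + 1
    aspect = width / height
    return (min_area <= region["area"] <= max_area
            and min_aspect <= aspect <= max_aspect)

# Toy input: a character-like blob, a one-pixel speck and a long horizontal line.
image = [
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
]
regions = label_components(image)
candidates = [region for region in regions if is_text_candidate(region)]
print(candidates)
```

In this toy input only the blob survives: the speck is rejected by the assumed minimum area and the line by the assumed aspect-ratio limit. Real thresholds would have to be tuned to the 400x400 working resolution.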



1. The image background and foreground are adjusted so that the text region is dark against a light background.
2. The resultant image is resized to 400 x 400 to reduce unnecessary computation.
3. The resized image is binarized using Niblack's algorithm [10], which uses the local mean and standard deviation to determine the threshold for binarizing the image.
4. Connected component labelling (CCL) scans an image and groups its pixels into components based on pixel connectivity: all pixels in a connected component share similar intensity values and are connected with each other. Once all groups have been determined, each pixel is labelled with a gray level or a colour according to the component it was assigned to [1]. CCL is therefore applied to obtain the connected regions, and each region is enclosed in a bounding box.
5. The selection criteria, namely the area, perimeter and aspect ratio of each region, are calculated to eliminate non-text regions.
6. The remaining candidate text regions are resized to 42 x 24.
7. Each resized component is matched against the templates to check whether it is text. Templates of all the text characters (A-Z), (a-z) and (0-9) have been created and stored in a template file; the character images were resized to 42 x 24 before being stored.
8. This further eliminates non-text regions, and the resulting text region is taken as the inpainting mask.
9. The image is inpainted using the fast marching algorithm [11], a technique for producing distance maps of the points in a region from the boundary of the region. This is combined with a way of painting the points inside the boundary in order of increasing distance from the boundary.
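The Niblack binarization in step 3 can be sketched as follows. Niblack's method sets a per-pixel threshold T = m + k * s, where m and s are the mean and standard deviation of the intensities in a window around the pixel. The window size and the value of k used below are assumptions, as the paper does not report them, and the function name is hypothetical. The sketch marks pixels darker than the threshold as text, matching the dark-text-on-light-background convention of step 1.

```python
import math

def niblack_binarize(gray, window=3, k=-0.2):
    """Return a 0/1 mask where 1 marks pixels darker than the local threshold."""
    rows, cols = len(gray), len(gray[0])
    half = window // 2
    mask = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Collect the neighbourhood, clipped at the image border.
            values = [
                gray[y][x]
                for y in range(max(0, r - half), min(rows, r + half + 1))
                for x in range(max(0, c - half), min(cols, c + half + 1))
            ]
            mean = sum(values) / len(values)
            variance = sum((v - mean) ** 2 for v in values) / len(values)
            # Niblack threshold: local mean plus k times local standard deviation.
            threshold = mean + k * math.sqrt(variance)
            mask[r][c] = 1 if gray[r][c] < threshold else 0
    return mask

# Toy input: a light 5x5 patch with one dark pixel in the centre.
patch = [[200] * 5 for _ in range(5)]
patch[2][2] = 20
for row in niblack_binarize(patch):
    print(row)
```

This direct form recomputes the window statistics for every pixel, which is adequate for a sketch; on a 400 x 400 image a practical implementation would more likely use running sums or an integral image.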

IV. EXPERIMENTAL RESULTS


Images with simple and complex backgrounds were collected as input. The algorithm was applied to the images, and the outcomes are shown below:

Fig. 3 The text on a T-shirt has been detected and inpainted: (a) original image, (b) image binarized using Niblack's algorithm, (c) labelled image, (d) target regions before applying criteria, (e) target regions after applying criteria, (f) inpainted image.

Fig. 4 The text in a poster is detected and inpainted: (a) original image, (b) image binarized using Niblack's algorithm, (c) labelled image, (d) target regions before applying criteria, (e) target regions after applying criteria, (f) inpainted image.

Fig. 5 The text on a car name plate is detected and inpainted: (a) original image, (b) image binarized using Niblack's algorithm, (c) labelled image, (d) target regions before applying criteria, (e) target regions after applying criteria, (f) inpainted image.

V. CONCLUSION
In this paper, we have presented a method to detect text automatically in colour images. The detected text region is the target region for inpainting. Niblack's algorithm was applied to binarize the image, and CCL and template matching were carried out to detect the text. The detected text was de-occluded using the fast marching algorithm. Experimental results have demonstrated that the proposed method can be used effectively to detect text automatically. One limitation of this work is that the proposed method fails to detect joined running text and fancy text fonts.

VI. REFERENCES
[1] R. C. Gonzalez and R. E. Woods, Digital Image Processing, 2nd ed., Pearson Education, 2002.
[2] M. J. Fadili, J. L. Starck and F. Murtagh, Inpainting and Zooming using Sparse Representations, The Computer Journal, pp. 64-79, 2009.
[3] A. C. Kokaram, R. D. Morris, W. J. Fitzgerald and P. J. W. Rayner, Interpolation of missing data in image sequences, IEEE Transactions on Image Processing, Vol. 11, No. 4, pp. 1509-1519, 1995.


[4] A. Criminisi, P. Perez and K. Toyama, Region Filling and Object Removal by Exemplar-Based Image Inpainting, IEEE Transactions on Image Processing, Vol. 13, No. 9, pp. 1-7, 2004.
[5] Rodolfo P. Dos Santos, Gabriela S. Clemente, Tsang Ing Ren and George D. C. Cavalcanti, Text Line Segmentation Based on Morphology and Histogram Projection, IEEE Proceedings of the 10th International Conference on Document Analysis and Recognition (ICDAR), pp. 651-655, 2009.
[6] Qixiang Ye, Wen Gao, Weiqiang Wang and Wei Zeng, A Robust Text Detection Algorithm in Images and Video Frames, IEEE Proceedings of the International Conference on Information, Communication and Signal Processing (ICICS), Vol. 2, pp. 802-806, 2003.
[7] J. Sushma and M. Padmaja, Text Detection in Color Images, IEEE Proceedings of the International Conference on Intelligent Agent & Multi-Agent Systems (IAMA), pp. 1-6, 2009.
[8] Uday Modha and Preethi Dave, Image Inpainting: Automatic Detection and Removal of Text from Images, International Journal of Engineering Research and Applications (IJERA), Vol. 2, No. 2, pp. 930-932, 2012.
[9] S. Bhuvaneshwari, T. S. Subashini, S. Soundharya and V. Ramalingam, A Novel and Fast Exemplar Based Approach for Filling Portions in an Image, IEEE Proceedings of the International Conference on Recent Trends in Information Technology (ICRTIT), pp. 91-96, 2012.
[10] Graham Leedham, Chen Yan, Kalyan Takru, Joie Hadi Nata Tan and Li Mian, Comparison of Some Thresholding Algorithms for Text/Background Segmentation in Difficult Document Images, IEEE Proceedings of the Seventh International Conference on Document Analysis and Recognition (ICDAR), pp. 859-864, 2003.
[11] Alexandru Telea, An Image Inpainting Technique Based on the Fast Marching Method, Journal of Graphics Tools, Vol. 9, No. 1, pp. 25-38, 2004.
