Compound Image Compression

Scanned document images are one category of compound images, and their compression has been intensively studied in the past several years. In order to apply different compression algorithms to different image types, a scanned image is usually first segmented into different classes before compression.

Module Description

1. Segmentation

The segmentation is a two-step procedure, consisting of block classification and refinement segmentation. The first step is to classify 16 × 16 nonoverlapping blocks into text/graphics blocks and picture blocks by thresholding the number of colors in each block. Each block is scanned to count the number of different colors. If the color number is more than a certain threshold (is used for SPEC), the block is classified as a picture block. Otherwise, it is classified as a text/graphics block. At the same time, 1-byte index data is generated for text/graphics blocks. This reduces the time needed to encode these blocks.

The underlying reason for thresholding the number of different colors is that continuous-tone pictures generally have a large number of different colors even in a small region, while text or graphics have only a small number of colors even in a large region. Block classification based on counting different colors can be extremely fast: typical webpage images can be classified within 40 ms, and wallpaper images within 20 ms. Most blocks are correctly classified, except for those on the boundary between text/graphics and pictures.

The above block classification is a coarse segmentation, because classified picture blocks may contain text or graphics pixels, and text/graphics blocks may also contain pictorial pixels. Therefore, a refinement segmentation follows, which extracts text and graphics pixels from picture blocks to enhance the results. Pictorial pixels in text/graphics blocks are not segmented, for two reasons. First, with a proper color number threshold, the number of pictorial pixels in a text/graphics block is relatively small; thus, these pixels can be efficiently coded with lossless methods. Second, for images with large regions of text and graphics, the coarse segmentation is computationally efficient, whereas applying the refinement segmentation to all blocks can be very time-consuming.
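The coarse block classification described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the image is taken to be a 2-D list of hashable color values (for example RGB tuples), the function name `classify_blocks` is hypothetical, and `color_threshold` is left to the caller because the value used by SPEC is not given here.

```python
BLOCK = 16  # block size named in the text


def classify_blocks(image, color_threshold):
    """Label each 16 x 16 block by counting its distinct colors.

    image: 2-D list (rows) of hashable color values, e.g. (r, g, b) tuples.
    color_threshold: caller-chosen value (an assumption, not the SPEC value).
    Returns a dict mapping (block_row, block_col) to a label string.
    """
    height, width = len(image), len(image[0])
    labels = {}
    for top in range(0, height, BLOCK):
        for left in range(0, width, BLOCK):
            # Collect the distinct colors inside this block (clipped at edges).
            colors = {
                image[y][x]
                for y in range(top, min(top + BLOCK, height))
                for x in range(left, min(left + BLOCK, width))
            }
            labels[(top // BLOCK, left // BLOCK)] = (
                "picture" if len(colors) > color_threshold else "text/graphics"
            )
    return labels
```

A block with only two colors (typical of text) stays below any reasonable threshold, while a block in which every pixel differs is labelled as a picture block.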

2. Extraction of Text

Text extraction from the images was implemented in four steps:

a). Template creation

1. Create the template images according to the given specifications.
2. Select the image from which the text has to be extracted.
3. Get the colour of the printed text embedded in the image (i.e., the text pixel intensity).
4. Disintegrate the whole image into pixels and store their intensity values in an array.
5. Set the threshold value for choosing the text pixels from the image according to the colour chosen (i.e., set the colour of the text pixels as 8 and all other pixel values as 1).
6. Once the threshold value has been set, the pixel intensity values are stored in a 2-dimensional array.
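As a rough illustration of steps 3 to 6, the following Python sketch marks text pixels with 8 and all other pixels with 1. It assumes the image is already available as a 2-D list of grayscale intensities; the function name `binarize` and the `tolerance` parameter are hypothetical additions for the example.

```python
TEXT, BACKGROUND = 8, 1  # marker values named in step 5


def binarize(pixels, text_color, tolerance=0):
    """Return a 2-D array holding TEXT for text pixels and BACKGROUND elsewhere.

    pixels: 2-D list of grayscale intensities (an assumption of this sketch).
    text_color: intensity of the printed text chosen in step 3.
    tolerance: hypothetical allowance around text_color.
    """
    return [
        [TEXT if abs(value - text_color) <= tolerance else BACKGROUND
         for value in row]
        for row in pixels
    ]
```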
b). Text pixel segmentation

1. Get the 2-dimensional array containing the pixel intensities of the image.
2. Scan the array row-wise and identify a pair of boundaries, the upper and the lower line boundaries, for each line of text in the image, and store them in a separate array.
3. Within each pair of line boundaries, identify the word boundaries wherever a few columns containing all 1’s are encountered, and store them in another 2-dimensional array.
4. Within each of these word boundaries, determine the character boundaries wherever one or more columns of pixel intensity values containing all 1’s are encountered, and store them in another 2-dimensional array.
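The line, word and character boundaries described above can be found by looking for runs of all-background rows and columns. The sketch below is one possible reading of those steps, not the original implementation: the names `runs` and `segment` are hypothetical, and `word_gap=3` is an assumed value for "a few columns".

```python
BACKGROUND = 1  # value of non-text pixels in the 2-dimensional array


def runs(is_blank, min_gap=1):
    """Return (start, end) spans of non-blank entries; end is exclusive.

    A blank stretch shorter than min_gap does not split a span.
    """
    spans, start, gap = [], None, 0
    for i, blank in enumerate(is_blank):
        if not blank:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                spans.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        spans.append((start, len(is_blank) - gap))
    return spans


def segment(bitmap, word_gap=3):
    """Return a list of ((top, bottom), words) per text line.

    Each word is a list of (left, right) character column spans.
    """
    result = []
    row_blank = [all(v == BACKGROUND for v in row) for row in bitmap]
    for top, bottom in runs(row_blank):
        band = bitmap[top:bottom]
        col_blank = [all(row[x] == BACKGROUND for row in band)
                     for x in range(len(band[0]))]
        words = []
        # Wide blank gaps separate words; any blank column separates characters.
        for w_start, w_end in runs(col_blank, min_gap=word_gap):
            chars = [(w_start + s, w_start + e)
                     for s, e in runs(col_blank[w_start:w_end])]
            words.append(chars)
        result.append(((top, bottom), words))
    return result
```

Characters that touch each other share no blank column and would be returned as one span, which is a known limit of this projection approach.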
c). Edge detection

1. For each character, obtain the segmentation boundaries and the pixel mapping from the 2-dimensional array.
2. Identify the topmost left corner text pixel in this pixel mapping as the starting point for edge detection. Set the current pixel position to the start pixel position.
3. Store the direction of movement as right.
4. Trace the whole boundary of the character using the 8-neighbourhood technique in a clockwise sequence by repeating the following steps:
   4.1) Store the current pixel position as an (x, y) co-ordinate entry in one of two 2-dimensional arrays, one for the input and the other for the template.
   4.2) Identify all possible neighbourhood pixels in the text for edge tracing.
   4.3) Move to the next possible adjacent pixel in clockwise sequence from the current position, according to the direction of movement.
   4.4) Store the last direction of movement.
   4.5) Break if the current pixel position reaches the start pixel.
5. Repeat steps (1) to (4), with intermediate delimiters, until all characters are processed.
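The boundary trace for a single character can be sketched as a clockwise 8-neighbourhood walk. This is a minimal illustration, assuming text pixels are stored as 8 and that each step resumes the clockwise search two positions before the direction opposite to the last move; the function name `trace_boundary` and the safety cap on the loop are additions for the example. It uses the simple stop rule given above (stop on returning to the start pixel), which can end early on shapes that pass through the start pixel more than once.

```python
TEXT = 8  # value of text pixels in the 2-dimensional array

# Clockwise neighbour offsets (dx, dy), y growing downward:
# E, SE, S, SW, W, NW, N, NE.
OFFSETS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]


def trace_boundary(bitmap):
    """Return the boundary of one character as a list of (x, y) positions."""
    height, width = len(bitmap), len(bitmap[0])

    def is_text(x, y):
        return 0 <= x < width and 0 <= y < height and bitmap[y][x] == TEXT

    # Topmost row first, then leftmost pixel within it.
    start = next(
        ((x, y) for y in range(height) for x in range(width) if is_text(x, y)),
        None,
    )
    if start is None:
        return []
    boundary = [start]
    current, direction = start, 0  # initial direction of movement: right
    for _ in range(8 * width * height):  # safety cap, not part of the text
        for step in range(8):
            # Begin the clockwise search just past the known background side.
            candidate = (direction + 6 + step) % 8
            dx, dy = OFFSETS[candidate]
            neighbour = (current[0] + dx, current[1] + dy)
            if is_text(*neighbour):
                break
        else:
            break  # isolated pixel: no text neighbour at all
        current, direction = neighbour, candidate
        if current == start:
            break
        boundary.append(current)
    return boundary
```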

d). Comparison

When the edge-detected arrays for the templates and the input image are ready, do the following:
1. Store the (x, y) co-ordinate values between the delimiters in a separate 2-dimensional array to represent a single character from the input image.
2. Use a separate variable for the lowest mismatch count found so far and set it to some high value.
3. Until the character boundaries in the template array are exhausted:
   3.1) a) Compare each of the (x, y) co-ordinate positions stored in the template array with those of the input array.
        b) A tolerance of +1 or -1 pixel is allowed in each pixel comparison.
        c) If a position does not match, increment the mismatch count by one.
   3.2) Update the lowest mismatch count if the mismatch count for the current template character is less than the existing value.
4. Choose the template character with the fewest mismatches as the selected character and write it to the text file.
5. Add delimiters to indicate the next character, word and line.
6. Repeat steps (1) to (5) until there are no more boundary values available in the array used for the input text.
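The comparison step can be illustrated as follows. The sketch assumes that boundary coordinates of the input character and of each template are expressed relative to the same origin, and it reads the one-pixel allowance as "a template point matches if any input point lies within one pixel of it"; both are assumptions, as are the names `count_mismatches` and `best_match`.

```python
def count_mismatches(template_points, input_points, tolerance=1):
    """Count template boundary points with no input point within tolerance."""
    input_set = set(input_points)
    mismatches = 0
    for x, y in template_points:
        matched = any(
            (x + dx, y + dy) in input_set
            for dx in range(-tolerance, tolerance + 1)
            for dy in range(-tolerance, tolerance + 1)
        )
        if not matched:
            mismatches += 1
    return mismatches


def best_match(templates, input_points):
    """Return the template character with the fewest mismatches.

    templates: dict mapping a character to its list of (x, y) boundary points.
    """
    best_char, best_count = None, float("inf")  # "some high value"
    for char, template_points in templates.items():
        count = count_mismatches(template_points, input_points)
        if count < best_count:
            best_char, best_count = char, count
    return best_char
```

The selected character would then be written to the text file, followed by the appropriate delimiter.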
3. Extraction of Graphics and Picture Pixels

The procedure for extracting shape primitives of graphics in a picture block is as follows. It is similar to the rectangle decomposition procedure. Each picture block is scanned from left to right and from bottom to top, starting from the bottom-left pixel. If the current pixel has been included in a previously extracted shape primitive, the scanning procedure skips it and goes to the next pixel. Otherwise, the current pixel is set as a beginning point, and the scanning procedure then searches rightward and upward to extract a shape primitive. The result may be a rectangle, a horizontal line, a vertical line, or at least an isolated pixel. An irregular region is represented by a list of the four types of shape primitives, and the representation is not unique.
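A greedy version of this scan can be sketched in Python. It is one possible decomposition, not necessarily the one SPEC uses: from each uncovered starting pixel it first extends rightward as far as the color continues, then grows upward while the whole row segment matches. The function name `extract_primitives` and the `shape_colors` argument (the set of colors treated as text/graphics) are assumptions.

```python
def extract_primitives(block, shape_colors):
    """Decompose text/graphics pixels of a block into shape primitives.

    block: 2-D list of color values, row 0 at the top.
    shape_colors: set of colors to extract (an assumption of this sketch).
    Returns tuples (kind, x, y, width, height, color), where (x, y) is the
    bottom-left pixel of the primitive.
    """
    height, width = len(block), len(block[0])
    covered = [[False] * width for _ in range(height)]
    primitives = []
    for y in range(height - 1, -1, -1):      # bottom row first
        for x in range(width):               # left to right
            color = block[y][x]
            if covered[y][x] or color not in shape_colors:
                continue
            # Search rightward along the current row.
            w = 1
            while (x + w < width and not covered[y][x + w]
                   and block[y][x + w] == color):
                w += 1
            # Search upward while the full row segment has the same color.
            h = 1
            while y - h >= 0 and all(
                not covered[y - h][px] and block[y - h][px] == color
                for px in range(x, x + w)
            ):
                h += 1
            for py in range(y - h + 1, y + 1):
                for px in range(x, x + w):
                    covered[py][px] = True
            if w > 1 and h > 1:
                kind = "rectangle"
            elif w > 1:
                kind = "horizontal line"
            elif h > 1:
                kind = "vertical line"
            else:
                kind = "pixel"
            primitives.append((kind, x, y, w, h, color))
    return primitives
```

Extending rightward first is an arbitrary choice; growing upward first would give a different but equally valid list, which is the non-uniqueness mentioned above.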

4. Compression

Our lossless coding of text/graphics pixels is mainly based on shape primitives, which give a compact representation of text and graphics. To compress text/graphics blocks, shape primitives are extracted first. The extraction procedure is similar to that for picture blocks. In addition, during block classification we can find the color that covers the most pixels, and this color is recorded as the background color. For a text/graphics block, the background color is usually segmented into interior text regions and graphics regions. Because the background color is recorded, the coding of those background color pixels can be skipped. The shape primitives in other colors are extracted from text/graphics blocks, and all shape primitives extracted from text/graphics blocks and picture blocks are losslessly coded.
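The background color selection can be illustrated with a short sketch that picks the most frequent color in a block and lists the remaining pixels that still need coding. The function names `background_color` and `foreground_pixels` are hypothetical, and the block is assumed to be a 2-D list of hashable color values.

```python
from collections import Counter


def background_color(block):
    """Return the color that covers the most pixels in the block."""
    counts = Counter(color for row in block for color in row)
    return counts.most_common(1)[0][0]


def foreground_pixels(block):
    """Return (x, y, color) for every pixel that is not the background color."""
    background = background_color(block)
    return [
        (x, y, color)
        for y, row in enumerate(block)
        for x, color in enumerate(row)
        if color != background
    ]
```

Only the pixels returned by `foreground_pixels` would be passed on to shape primitive extraction; the background color itself is stored once per block.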

SPEC compresses pictorial pixels in picture blocks using a simple JPEG coder. In order to reduce ringing artifacts and to achieve a higher compression ratio, text/graphics pixels in the picture block are removed before the JPEG coding. These pixels are coded by the lossless algorithm. Their values in the picture block can therefore be chosen arbitrarily, but it is better if they are similar to the neighboring pictorial pixels, since this produces a smooth picture block. We therefore fill in these holes with the average color of the pictorial pixels in the block.
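The hole filling before JPEG coding can be sketched as below. The sketch assumes the block is a 2-D list of (r, g, b) tuples and that a boolean mask of the same shape marks the removed text/graphics pixels; the function name `fill_holes` and the mask representation are assumptions, and the JPEG coding itself is not shown.

```python
def fill_holes(block, is_text):
    """Replace text/graphics pixels with the average pictorial color.

    block: 2-D list of (r, g, b) tuples.
    is_text: 2-D list of booleans, True where a text/graphics pixel was removed.
    """
    rows, cols = len(block), len(block[0])
    pictorial = [
        block[y][x] for y in range(rows) for x in range(cols)
        if not is_text[y][x]
    ]
    if not pictorial:
        # Nothing to average over; return an unchanged copy.
        return [row[:] for row in block]
    # Integer average per channel over the pictorial pixels only.
    average = tuple(
        sum(pixel[c] for pixel in pictorial) // len(pictorial) for c in range(3)
    )
    return [
        [average if is_text[y][x] else block[y][x] for x in range(cols)]
        for y in range(rows)
    ]
```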

5. Transmission and receiving

The SPEC compressed images are transmitted from the sender to the authenticated user. A suitable encoding process is implemented for sending them securely through public networks. The authenticated user downloads the files with the necessary decoding algorithm and then proceeds to reconstruct the original image.

7. Decompression and reconstruction of image at receiver side
The SPEC compressed image file is decompressed first. The corresponding decompression algorithm is applied to return the image file to its original uncompressed form. The image is then reconstructed by applying the inverse of the segmentation process performed earlier on the client side (sender side). The pixel information and text information are restored to their normal locations and with true quality.

Thus, the compound image, comprising pictures and text/graphics, is reconstructed for use and further processing.
