
A hierarchical classifier for visual label identification

based on binary features

R Massen and J. Gassler

In order to control the flow of photo cassettes in an automatic sorting system, a local photo printer
laboratory has developed a unique moulded container which accepts and holds in a
fixed position the different film cassette types currently in use: 24/36 roll films, pocket size, minipocket.
The containers pass different automatic magazining stations equipped with robotic handlers.
Each handler has the task of picking up a specific class of film and stacking it into a magazine for further
processing. Our goal was the optical recognition of the film type by company brand, number of
exposures (24/36 exposures) and sensitivity (100/200 ASA). 30 classes have to be automatically
recognized in 0.5 sec each. Some 10,000 films are classified per day.
The only cue available for optical classification was the printed pattern on the label. As colour
was not available, we chose 3 types of visual grey-level features for automatic classification:
a) the average brightness in a small window; BRIGHT
b) the presence of an edge in a small window; EDGE
c) the presence of a 4 or a 6 digit in a small window; 4
All features are binarized in a first step: feature a) by comparing the average intensity in the 20 by
20 pixel window to an average in a much larger neighbourhood; feature b) by comparing the local
variance in the small window against a fixed threshold; feature c) by matching the window to 2 templates with a 4 and a 6 reference digit and deciding for the highest degree of correlation. We thus
obtain for each window a set of 3 binarized features expressing the presence of a light or dark background (BRIGHT), the presence of an edge (EDGE) and the presence of a 24 exposure film (4).
If the binarization of a feature is borderline, we mark the window as non-classifiable, so that we actually have a
ternary result for the BRIGHT, the EDGE and the 4 feature in every window.
The position and the shape of typically 30 windows are fixed interactively by looking at difference images
and choosing those parts of an image which are the most powerful in discriminating several
classes of films (Fig. 1).
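
The ternary BRIGHT and EDGE features described above can be sketched as follows. This is only an illustration under stated assumptions, not the code of the original system: the function names, the `margin` that decides when a window counts as non-classifiable, the size of the larger neighbourhood (`surround`) and the variance threshold are all hypothetical. The template-matching feature for the 4 digit is not shown.

```python
from statistics import mean, pvariance

# Ternary outcome: True, False, or None for a non-classifiable window.


def crop(image, top, left, height, width):
    """Return the grey values of a rectangular window as a flat list."""
    return [image[r][c]
            for r in range(top, top + height)
            for c in range(left, left + width)]


def ternary(value, threshold, margin):
    """Binarize value against threshold; borderline cases give None."""
    if abs(value - threshold) <= margin:   # assumed rule for "non-classifiable"
        return None
    return value > threshold


def bright_feature(image, top, left, size=20, surround=3, margin=5.0):
    """BRIGHT: window mean compared with the mean of a much larger neighbourhood."""
    window = crop(image, top, left, size, size)
    pad = surround * size                  # assumed neighbourhood extent
    r0, c0 = max(0, top - pad), max(0, left - pad)
    r1 = min(len(image), top + size + pad)
    c1 = min(len(image[0]), left + size + pad)
    neighbourhood = crop(image, r0, c0, r1 - r0, c1 - c0)
    return ternary(mean(window), mean(neighbourhood), margin)


def edge_feature(image, top, left, size=20, threshold=400.0, margin=50.0):
    """EDGE: local variance of the window compared with a fixed threshold."""
    window = crop(image, top, left, size, size)
    return ternary(pvariance(window), threshold, margin)
```

An image is assumed here to be a list of rows of grey values. A window that straddles a dark-to-light transition has a high variance and yields EDGE = True, while a uniform window yields EDGE = False.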

Fig. 1 Ternary features for brightness, edge and 4 digit are measured in 20 fixed-size
windows to classify 30 different types of film cassettes
R. Massen, J. Gassler, Transfer Centre Constance for Image Processing, D-7750 Constance

The classification is based on the principle of exclusion. Every window feature can discriminate
between two sets of classes, each set containing at least one member. Non-classifiable windows
are simply discarded. The simplified situation in Fig. 2 may demonstrate our approach. We want to classify 4 types of films: Kodak-24, Fuji-24, Fuji-36
and Agfa-100. A set of 6 feature windows is used: 3
BRIGHT windows (F14, F23, F34), 2 EDGE windows (F12, F13) and one 4 window (F24).
Feature F14 is able to discriminate between class 1 and class 4, feature F23 between class 2 and
class 3, etc.
[Fig. 2 labels: class 1 Kodak-24, class 2 Fuji-24, class 3 Agfa-100, class 4 Fuji-36; window types BRIGHT and EDGE]

Fig. 2 Simplified example for the classification of 4 types of films using 6 windows.
We arrange the features Fij into a binary decision matrix (Fig. 3). If feature F12 is TRUE, it excludes
the presence of class 2. If feature F12 is FALSE, it excludes the presence of class 1.
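
The exclusion step can be illustrated with a small sketch. The decision matrix below is an assumption made for illustration: only the F12 row is taken directly from the text, the F14 row is chosen so that F14 = FALSE is consistent with class 1, and the remaining rows are hypothetical. The names `DECISION_MATRIX`, `exclude` and `classify_by_exclusion` are likewise not from the original system.

```python
# For each feature: the class excluded when it is TRUE and when it is FALSE.
DECISION_MATRIX = {
    "F12": {True: 2, False: 1},   # stated in the text
    "F14": {True: 1, False: 4},   # chosen so that FALSE is consistent with class 1
    "F13": {True: 3, False: 1},   # hypothetical
    "F23": {True: 3, False: 2},   # hypothetical
    "F24": {True: 4, False: 2},   # hypothetical
    "F34": {True: 4, False: 3},   # hypothetical
}


def exclude(candidates, feature, value, matrix=DECISION_MATRIX):
    """Apply one ternary window feature to the set of candidate classes."""
    if value is None:             # non-classifiable window: simply discarded
        return set(candidates)
    return set(candidates) - {matrix[feature][value]}


def classify_by_exclusion(observations, classes=(1, 2, 3, 4)):
    """Return the classes that survive all observed features."""
    candidates = set(classes)
    for feature, value in observations.items():
        candidates = exclude(candidates, feature, value)
    return candidates


# Three windows are enough to isolate class 1 in this hypothetical matrix.
remaining = classify_by_exclusion({"F12": True, "F34": True, "F13": True})
```

Because every feature only removes a class, the order in which the windows are evaluated does not change the surviving set; the order only affects how many windows have to be computed.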

[Fig. 3 labels: Fij = false, Fij = true, classes]

Fig. 3 Binary decision matrix for excluding a class depending on the binarised window feature

We still have a free choice of the window with which to start the classification procedure. We compute a measure of the power of discrimination D(Fij) for each window feature which takes into
account the following a-priori knowledge: the joint probability of class i and class j (p(Cij)), the
total number of class pairs which can be discriminated by a single feature Fij (Nij) and the computing time needed to calculate a feature Fij (Tij):

Dij = p(Cij) * Nij / Tij
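
As a worked illustration of this measure, the sketch below ranks three features by Dij. All numbers (probabilities, pair counts and computing times) are invented for the example, and the names `FeatureStats`, `discrimination_power` and `rank_features` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureStats:
    name: str
    p_pair: float    # p(Cij): joint probability of class i and class j
    n_pairs: int     # Nij: number of class pairs the feature can discriminate
    cost_ms: float   # Tij: computing time of the feature


def discrimination_power(f):
    """Dij = p(Cij) * Nij / Tij"""
    return f.p_pair * f.n_pairs / f.cost_ms


def rank_features(features):
    """Most powerful feature first."""
    return sorted(features, key=discrimination_power, reverse=True)


# Invented example values, for illustration only.
features = [
    FeatureStats("F12", p_pair=0.5, n_pairs=1, cost_ms=5.0),    # D = 0.10
    FeatureStats("F24", p_pair=0.3, n_pairs=1, cost_ms=30.0),   # D = 0.01
    FeatureStats("F14", p_pair=0.4, n_pairs=2, cost_ms=5.0),    # D = 0.16
]
ranking = [f.name for f in rank_features(features)]
```

With these values the slow template-matching window F24 falls to the bottom of the list, even though its class pair is not much less probable than the others.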


Features pointing to classes with high absolute probability, which are able to discriminate between large sets of classes, and which can be computed in a short time rank at the top.
Classification starts in this ranking order. The features are only extracted at this moment, so that
only features which are really used have to be computed. Fig. 4 shows the binary decision tree with
the ranked features. After each decision (node in the graph), the remaining set of features is reordered
in an updated ranking list and classification steps forward to the next node. The graph also shows that
the set of features is redundant. This redundancy is used as an additional check by backtracing through
the classification tree to check the consistency of the remaining features. If, for instance, we have found that a
class 1 film is present through the inspection of the window features F12, F34 and F13, we backtrace
through feature F14, which must be FALSE for a consistent result. Non-consistent results lead to a
reject decision; the films are routed to a manual inspection station.
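
The ranked, lazy traversal with the final consistency check might be sketched as below. This is a sketch under assumptions, not the original implementation: the text does not say how the ranking list is updated, so the re-ranking rule used here (prefer features that still separate two remaining candidates, then the higher power value) is an assumption, and the decision matrix, the power values and the simulated readings are hypothetical.

```python
REJECT = None  # routed to manual inspection


def classify(measure, matrix, power, classes):
    """measure(name) returns True, False or None and is called only when needed."""
    candidates = set(classes)
    unused = set(matrix)
    while len(candidates) > 1:
        # Re-rank after every decision (assumed rule, see lead-in).
        ranked = sorted(
            (f for f in unused if candidates & set(matrix[f].values())),
            key=lambda f: (len(candidates & set(matrix[f].values())), power[f]),
            reverse=True,
        )
        if not ranked:
            return REJECT                  # ambiguous: no usable window left
        feature = ranked[0]
        unused.discard(feature)
        value = measure(feature)           # feature is extracted only now
        if value is not None:              # non-classifiable windows are skipped
            candidates.discard(matrix[feature][value])
    (winner,) = candidates
    # Backtrace: redundant features involving the winner must not exclude it.
    for feature in sorted(unused):
        if winner in matrix[feature].values():
            value = measure(feature)
            if value is not None and matrix[feature][value] == winner:
                return REJECT
    return winner


# Hypothetical decision matrix and power values for the 4-class example.
matrix = {
    "F12": {True: 2, False: 1}, "F14": {True: 1, False: 4},
    "F13": {True: 3, False: 1}, "F23": {True: 3, False: 2},
    "F24": {True: 4, False: 2}, "F34": {True: 4, False: 3},
}
power = {"F12": 0.16, "F34": 0.12, "F13": 0.10,
         "F14": 0.08, "F23": 0.05, "F24": 0.01}

# Simulated window readings for a class 1 film.
readings = {"F12": True, "F34": True, "F13": True,
            "F14": False, "F23": True, "F24": True}
measured = []


def measure(name):
    measured.append(name)
    return readings[name]


result = classify(measure, matrix, power, classes=(1, 2, 3, 4))
```

With these assumed values the windows are evaluated in the order F12, F34, F13, followed by the backtrace through F14; F23 and F24 are never computed. If F14 were TRUE instead, the result would be a reject.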

Fig. 4 Ranked classification tree for the example of Fig. 2 and Fig. 3

This classifier was implemented on a 68000 VME-Bus vision system developed at the Transfer
Centre and installed on-line. Classification took 400 ms on average, with typically a 92% rate of
non-ambiguous correct classification, an 8% reject rate and a misclassification rate too low to measure. The rather high reject rate resulted from a significant number of films whose labels
had been torn off or written on by the customer. It was no problem to classify those films by hand.
