M.Tech.
in
COMMUNICATION SYSTEMS
By
DECEMBER 2010
BONAFIDE CERTIFICATE
in partial fulfillment of the requirements for the award of the degree of Master of
Technology in Communication Systems of the NATIONAL INSTITUTE OF
TECHNOLOGY, TIRUCHIRAPPALLI, during the year 2010-2011.
S. DEIVALAKSHMI
Early detection of breast cancer increases the survival rate and widens the
treatment options. One of the most powerful techniques for early detection of breast
cancer is based on the digital mammogram. To detect breast cancer, the
radiologist usually searches the mammograms visually for specific abnormalities.
However, visual analysis of mammograms is a difficult task for radiologists. Computer
Aided Diagnosis (CAD) technology helps in identifying abnormalities and assists radiologists in
making the final decision.
The proposed CADx system involves three major steps: Lesion Detection,
Feature Extraction and Classification.
Malignant and benign masses are abnormal/tumour cells present in the breast.
Malignant masses are treated as cancerous tumours, whereas benign masses are non-cancerous.
Classification of a lesion as Benign or Malignant is performed using Canonical Discriminant
analysis for all five feature sets, and their classification rates are
compared. In short, five classification schemes are discussed.
The proposed method can allow the radiologist to focus rapidly on the relevant parts
of the mammogram and it can increase the effectiveness and efficiency of radiology
clinics.
Finally, I would like to thank all the teaching staff, my classmates and the
computer support group staff for their sincere help, without which I would have been unable to
complete this project.
TABLE OF CONTENTS
ABSTRACT
ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
ABBREVIATIONS
CHAPTER 1 INTRODUCTION
1.1 Motivation
1.2 Objectives and Approach
1.3 Study Outline
3.1 Pre-Processing
3.2 Lesion Detection
3.3 Region selection
CHAPTER 4 FEATURE EXTRACTIONS & CLASSIFICATION
4.1 Feature extraction
4.2 Feature classification
CHAPTER 5 RESULTS AND DISCUSSION
REFERENCES
LIST OF FIGURES
3.14 Dashed line indicates the PDF pI (x). Two solid lines indicate
4.3 Steps to determine CDF for all four feature sets extracted from respective channels
4.4 Steps to determine CDF for feature set 1 extracted from Lesion
Mammograms
5.19 Lesion and its one stage level-1 DWT decomposition for mdb002 using db4
5.20 Lesion and its one stage level-1 DWT decomposition for mdb028 using db4
5.21 CDF histogram plot for a) Benign & b) Malignant groups using feature set-1
5.23 CDF histogram plot for a) Benign & b) Malignant groups using feature set-2
5.25 CDF histogram plot for a) Benign & b) Malignant groups using feature set-3
5.27 CDF histogram plot for a) Benign & b) Malignant groups using feature set-4
5.28 Classification result for validation dataset (unknown mammograms) (N=4)
5.29 CDF histogram plot for a) Benign & b) Malignant groups using feature set-5
set 1 to 5
LIST OF TABLES
CC Craniocaudal
DA Discriminant Analysis
DS Discriminant score
Breast cancer is the second leading cause of cancer death in women today (after
lung cancer). An estimated 40,230 breast cancer deaths are expected in 2010. According to the
National Cancer Institute, one out of eight women will develop breast cancer during her
lifetime.
Breast cancer stages range from stage 0 (very early form of cancer) to stage IV
(advanced, metastatic breast cancer). Early-stage breast cancers are associated with higher
survival rates than late-stage cancers.
The key to surviving breast cancer is early detection and treatment. According to
ACS, when breast cancer is confined to the breast, the five-year survival rate is almost 100%.
Breast cancer screening has been shown to reduce breast cancer mortality. The high survival
rates of early detection of breast cancer can be attributed to utilization of mammography
screening as well as high level of awareness of the disease symptoms in the population.
Beginning in their early 20s, women should be told about the benefits and
limitations of breast self-examination (BSE). For women in their 20s and 30s, it is
recommended that clinical breast examination (CBE) be part of a periodic health
examination, preferably at least every three years. Asymptomatic women aged 40 and over
should continue to receive a clinical breast examination as part of a periodic health
examination, preferably annually and prior to mammography, and should begin annual
mammography at age 40.
1.1 Motivation
Fig. 1.1 Structure of Breast Fig 1.2 CC view & MLO view.
Fig. 1.3 Two basic views of mammographic image: (a) CC view, (b )MLO view.
The ultimate aim of the CADx system is to help the radiologist in making
recommendations for patient management.
CAD systems consist primarily of the following processing stages: Pre-processing,
Segmentation, Feature extraction and Classification. Pre-processing is performed
to suppress noise, to enhance the mammogram and to remove the background region
in the MLO view of the mammogram. Segmentation, i.e. lesion detection, is performed
using an adaptive threshold technique. Shape features are then extracted from the detected
ROIs of the respective mammograms and processed towards classification of the abnormality.
The database consists of 40 mammograms with 20 Benign and 20 Malignant cases.
Five feature sets were extracted from the ROIs and provided as input to the classification stage
using DA, and the classification results were compared. Once the Canonical Discriminant
functions for all five feature sets are known, analysis is performed on a database of 65 unknown
mammograms as a validation process, and the algorithm for CADx is finalized based on the most
significant feature set.
This project thesis is organized as follows. Chapter 2 reviews the literature and
background of breast cancer and CAD systems in mammography. The Materials and
Methods used in this study are discussed in Chapter 3 and Chapter 4. Chapter 3 deals with
the first part, lesion detection, and Chapter 4 with feature extraction and classification
of the given lesion. Chapter 5 provides the results and discussion, and
Chapter 6 concludes the thesis with future directions.
CHAPTER-2
LITERATURE REVIEW
In this chapter, important literature on CADx systems and their algorithms in
mammography is reviewed, along with the literature required for the proposed CADx
system.
SPSS ver. 14 manual on algorithms titled “Discriminant” explains all steps involved in
classification based on CDF coefficients.
Ingrid Daubechies (1987) invented the first smooth orthogonal wavelet with compact support,
now known as the db-N family. In her text “Ten Lectures on Wavelets” she explains the theory
behind wavelets and gives a good tour of the wavelet era.
Olivier Rioul (1993) described Multiresolution analysis and synthesis for discrete time
signals, in “A Discrete-Time Multiresolution Theory”. Concepts of scale and resolution are
first reviewed in discrete time. The resulting framework allows one to treat the discrete
wavelet transform, octave-band perfect reconstruction filter banks, and pyramid transforms
from a unified standpoint.
Xiao-Ping Zhang and Mita D. Desai (2001) suggested a general systematic method for
the detection and segmentation of bright targets, in “Segmentation of Bright Targets Using
Wavelets and Adaptive Thresholding”. A method is developed which adaptively chooses
thresholds to segment targets from background, by using a multiscale analysis of the image
probability density function (PDF). A performance analysis based on a Gaussian distribution
model is used to show that the obtained adaptive threshold is often close to the Bayes
threshold. The method has proven robust even when the image distribution is unknown.
Examples are presented to demonstrate the efficiency of the technique on a variety of
targets.
H.D. Cheng et.al (2003) surveyed the most important parts of CADx algorithms in “Computer-
aided detection and classification of Micro calcifications in mammograms: a survey”. In
that paper they summarized and compared the methods used in various stages of
computer-aided detection (CAD) systems. In particular, the enhancement and segmentation
algorithms, mammographic features, classifiers and their performances are studied and
compared. Remaining challenges and future research directions are also discussed.
Gonzalez R. et.al (2004) gave a detailed discussion of shape and margin features in
chapter 11 of the text “Digital image processing using MATLAB”.
Alfonso Rojas Dominguez & Asoke K. Nandi (2008) presented a method for automatic
detection of mammographic masses, in “Detection of masses in mammograms via
statistically based enhancement, multilevel-thresholding segmentation, and region
selection”. As part of this method, an enhancement algorithm that improves image contrast
based on local statistical measures of the mammograms is proposed. After enhancement,
regions are segmented via thresholding at multiple levels, and a set of features is computed
from each of the segmented regions. For feature extraction they used shape- and margin-based
properties.
Jelena Bozek et.al (2009) surveyed Algorithms, in “A Survey of Image Processing Algorithms
in Digital Mammography”. This chapter gives a survey of image processing algorithms that
have been developed for detection of masses and calcifications. An overview of algorithms
in each step (segmentation step, feature extraction step, feature selection step,
classification step) of the mass detection algorithms is given. Wavelet detection methods
and other recently proposed methods for calcification detection are presented. An overview
of contrast enhancement and noise equalization methods is given as well as an overview of
calcification classification algorithms.
B. Surendiran et.al (2009) performed Discriminant Analysis for classifying the masses
present in mammogram, in “Classifying Digital Mammogram Masses using Univariate
ANOVA Discriminant Analysis”. This approach combines the 19 shape properties of the mass
regions and classifies the masses as benign or malignant using Univariate ANOVA. The DDSM
database along with ground truth details is used for the experiment. According to that work,
malignant and benign masses are abnormal/tumour cells present in the breast; malignant
masses are treated as cancerous tumours and benign masses are non-cancerous.
Kai Hu et.al (2010) proposed a novel algorithm for lesion detection in “Detection of Suspicious
Lesions by Adaptive Thresholding Based on Multiresolution Analysis in Mammograms”. They
combine two thresholding segmentations, a coarse segmentation and a fine segmentation,
to segment suspicious lesions in multiscale images. The coarse segmentation is used first to
get a rough representation of the localization of suspicious lesions, and the fine segmentation
then improves the rough representation to generate more precise segmentation results. This
algorithm avoids the deficiencies of the histogram-based and the window-based thresholding
algorithms and improves the segmentation accuracy effectively.
CHAPTER-3
LESION DETECTION
A CAD system consists of a few typical steps, depicted in Fig. 3.1. Screen-film
mammographic images need to be digitized prior to image processing. This is one of the
advantages of digital mammography, where the image can be processed directly.
Fig. 3.1 Typical steps of a CAD system: Pre-processing (pre-processing algorithms), Segmentation, Feature extraction, Feature selection, Classification.
In this chapter we discuss all the algorithms used for Lesion Detection.
After obtaining the region portion, the user needs to decide whether to treat the mammogram as normal
or to send it for further classification. The key point is that, for a normal mammogram, the lesion
detection output is either an all-black image (all zeros) or an image containing only noise or a
region from the background.
3.1 Pre-Processing
To remove noise
To remove background
The median filter is a non-linear filter and is efficient in removing salt-and-pepper noise. It
tends to preserve the sharpness of image edges while removing noise. Noise is removed
more effectively as the size of the window increases, but this stronger noise suppression
comes at the expense of blurring of edges.
The Median Filter block replaces the central value of an M-by-N neighbourhood with its
median value.
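The neighbourhood replacement described above can be sketched in a few lines. This is a minimal illustration in Python rather than the MATLAB block used in this work; the function name `median_filter`, the square window and the truncated windows at the image border are assumptions made for the example.

```python
from statistics import median_low


def median_filter(image, size=3):
    """Replace each pixel by the median of its size-by-size neighbourhood.

    `image` is a list of rows of gray levels. Windows are truncated at the
    border (an assumption; other border rules are equally possible).
    median_low is used so the output is always an existing pixel value.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    filtered = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            ]
            filtered[r][c] = median_low(window)
    return filtered


# A flat 5x5 patch with one "salt" pixel in the centre.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
clean = median_filter(noisy, size=3)
```

In this toy patch the isolated bright pixel is removed because it is never the median of any 3-by-3 window.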
Fig. 3.4 images with salt and pepper noise and after passing through Median filter
Fig. 3.5 original image and BW1,BW2 (Binary versions with different Threshold)
Algorithm:
If input image is I,
Then,
Skin line=BW1-BW2
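A minimal sketch of the subtraction above, assuming BW1 and BW2 are binary versions of the same image obtained with a lower and a higher threshold respectively (as suggested by the caption of Fig. 3.5). The threshold values and the function names are hypothetical and chosen only for illustration.

```python
def binarize(image, threshold):
    """Return a binary image: 1 where the gray level exceeds the threshold."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


def skin_line(image, low_threshold, high_threshold):
    """Skin line = BW1 - BW2, with BW1 from the lower threshold.

    Because every pixel above the higher threshold is also above the lower
    one, the difference is 1 only in the band between the two thresholds.
    """
    bw1 = binarize(image, low_threshold)
    bw2 = binarize(image, high_threshold)
    return [[a - b for a, b in zip(row1, row2)] for row1, row2 in zip(bw1, bw2)]


# Background (0), a faint skin band (20) and bright breast tissue (200).
toy_image = [[0, 20, 200], [0, 20, 200]]
band = skin_line(toy_image, low_threshold=10, high_threshold=100)
```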
After detecting the skin line, it is necessary to detect the type of MLO view: whether it is
left-sided (LMLO) or right-sided (RMLO).
Step 1:
The input image undergoes both the RMLO and the LMLO test, which gives two images,
I1=‘RMLO’
I2= ‘LMLO’
Step 3: If pixel is black then replace it with white and move to next pixel and repeat step 3,
Step 5: replace current pixel with black and move to next pixel.
Step 6: Repeat step 2 to 4 for next row unless you exceed all rows
Step 2: scan from right most column1 towards left side (here next pixel means Left one).
Step 3: If pixel is black then replace it with white and move to next pixel and repeat step 3,
Step 5: replace current pixel with black and move to next pixel.
Step 2:
To remove background portion, apply Mask1 obtained after knowing type of MLO.
Fig 3.9 Masking to remove background-Mask1 & after masking mammogram (Image2)
3.1.4 To remove rib portion
Fig 3.10: After Local threshold for rib bone removing (Image 3) & after removing rib (Image4)
Algorithm:
Step 1
Apply a local threshold to Image2, i.e. convert it to binary with Threshold = 173 (Fig
3.10: Image3). Do not consider the first 200 rows, for label removal.
Step 2
As we know View is Left or Right, Scan from right or left direction for respective cases.
Step 3
Now travel up to First zero pixel and replace all travelled pixels with zero.
Step 4
a) For every row, the POLE should not exceed the number of pixels travelled in the previous row.
b) If rule (a) is violated 5 consecutive times, then, keeping the 45° slope in
mind, decrease the pole position for the next row by 1 and replace all pixels with zero up
to the calculated POLE.
After performing these steps you will get image as shown in Fig. 3.8
Step 5
Step 6
Fig. 3.11 Rib portion (RIB Part) , after removing Rib (Pre-processed mammogram) and Given
mammogram
3.2.1 Segmentation
The aim of the segmentation is to extract ROIs containing all masses and locate the
suspicious mass candidates from the ROI. Segmentation of the suspicious regions on a
mammographic image is designed to have a very high sensitivity, and a large number of false
positives is acceptable since they are expected to be removed in a later stage of the
algorithm. Researchers have used several segmentation techniques and their combinations:
1. Thresholding Techniques
2. Region-Based Techniques
3. Edge Detection Techniques
4. Hybrid Techniques
Thresholding techniques include global thresholding, local thresholding, local adaptive
thresholding and adaptive thresholding based on multiresolution analysis.
Local thresholding is slightly better than global thresholding, adaptive
thresholding is better than both, and adaptive thresholding based on
multiresolution analysis is superior to the other methods.
According to Zhang and Desai (2001), after the mammograms are wavelet
transformed, the gray-level distributions of the target and the background regions of the
images approach a Gaussian distribution.
Fig. 3.13 Block diagram for adaptive segmentation method adapted by Zhang
The segmentation of possible targets can be modelled by the following classification
problem. For an ideal image I(m,n), there are pixels belonging to two classes: 1) the
background Cb and 2) the target Ct.
Here,
pb(x) PDF of class Cb
P(Cb) a priori probability of class Cb in image I
pt(x) PDF of class Ct
P(Ct) a priori probability of class Ct in image I
Fig 3.14 Dashed line indicates the PDF pI(x). Two solid lines indicate P(Cb)pb(x) and P(Ct)pt(x),
respectively. The Bayes threshold and the proposed threshold are indicated.
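Assuming that fb(x) and ft(x) have one point of intersection, as illustrated in the figure above, the
above classifier is equivalent to the following threshold detection criterion:
\[ I(m,n) < \lambda \;\Rightarrow\; I(m,n) \in C_b \quad \text{(pixel belongs to the background class)} \]
\[ I(m,n) > \lambda \;\Rightarrow\; I(m,n) \in C_t \quad \text{(pixel belongs to the target class)} \]
where the threshold \(\lambda\) satisfies
\[ f_b(\lambda) = f_t(\lambda) \]
The segmented image at scale j using the Bayes classifier can then be expressed as
\[ I_{seg,j}(m,n) = \begin{cases} 1, & I_j(m,n) > \lambda \\ 0, & I_j(m,n) < \lambda \end{cases} \]
where \(I_j(m,n)\) denotes the image at scale j.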
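As a worked illustration of the condition fb(λ) = ft(λ), the sketch below solves for the intersection point under two assumptions that go beyond the text here: that fb(x) = P(Cb)pb(x) and ft(x) = P(Ct)pt(x) (the two solid curves of Fig 3.14), and that both class PDFs are Gaussian. Taking logarithms of both sides gives a quadratic in λ, and the root lying between the two class means is kept. All numeric values are made up for the example.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density with mean mu and standard deviation sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))


def bayes_threshold(mu_b, sigma_b, prior_b, mu_t, sigma_t, prior_t):
    """Solve prior_b * N(x; mu_b, sigma_b) = prior_t * N(x; mu_t, sigma_t) for x."""
    # Coefficients of a*x^2 + b*x + c = 0 obtained by equating the logarithms.
    a = 1 / (2 * sigma_t ** 2) - 1 / (2 * sigma_b ** 2)
    b = mu_b / sigma_b ** 2 - mu_t / sigma_t ** 2
    c = (mu_t ** 2 / (2 * sigma_t ** 2) - mu_b ** 2 / (2 * sigma_b ** 2)
         + math.log((prior_b * sigma_t) / (prior_t * sigma_b)))
    if abs(a) < 1e-12:  # equal variances: the equation is linear
        return -c / b
    disc = b * b - 4 * a * c
    if disc < 0:
        raise ValueError("the weighted PDFs do not intersect")
    roots = [(-b + sign * math.sqrt(disc)) / (2 * a) for sign in (1, -1)]
    low, high = sorted((mu_b, mu_t))
    between = [r for r in roots if low <= r <= high]
    # Prefer the intersection between the class means; otherwise the closest root.
    return between[0] if between else min(roots, key=lambda r: abs(r - (low + high) / 2))


# Hypothetical background and target gray-level statistics.
lam = bayes_threshold(mu_b=60, sigma_b=15, prior_b=0.8, mu_t=180, sigma_t=25, prior_t=0.2)
```

With equal variances and equal priors the threshold reduces to the midpoint of the two means, which is a quick sanity check on the formula.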
Wavelet transforms are used in the new method and the Bayes classifier is
employed for the segmentation problem. An approach for choosing the threshold adaptively
by looking for the global local minima of the PDFs of wavelet transformed images is
proposed. Based on the assumption of Gaussian distributions, the adaptive threshold by the
new method is compared with the Bayes threshold. It is shown that in general practical
cases, the performance of the proposed threshold is often very close to the Bayes threshold,
which is the optimal threshold from the statistical point of view
Fig. 3.15 Bayes threshold λ1 and the proposed candidate threshold λ3 are indicated.
Flow of the thresholding step: Pre-processed Mammogram, Histogram, Smoothing (moving average), Second derivative, Threshold = first zero crossing from right side, Global Thresholding using given Threshold, Segmented Portion.
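The flow above (histogram, moving-average smoothing, second derivative, first zero crossing from the right, global thresholding) can be sketched as follows. This is only an illustration under stated assumptions: the window length, the tolerance used to treat tiny values as zero, the handling of the histogram ends and all function names are choices made for this example, not details taken from the implementation in this work.

```python
def moving_average(values, window=5):
    """Smooth a sequence with a centred moving average (window truncated at the ends)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def second_derivative(values):
    """Discrete second derivative; the two end samples are set to 0."""
    d2 = [0.0] * len(values)
    for i in range(1, len(values) - 1):
        d2[i] = values[i - 1] - 2 * values[i] + values[i + 1]
    return d2


def first_zero_crossing_from_right(d2, eps=1e-9):
    """Scan right to left and return the index where the sign first changes."""
    previous_sign = 0
    for i in range(len(d2) - 1, -1, -1):
        if abs(d2[i]) <= eps:
            continue  # treat near-zero samples as "no sign"
        sign = 1 if d2[i] > 0 else -1
        if previous_sign and sign != previous_sign:
            return i
        previous_sign = sign
    return None


def adaptive_threshold(histogram, window=5):
    """Histogram -> smoothing -> second derivative -> first zero crossing from the right."""
    return first_zero_crossing_from_right(second_derivative(moving_average(histogram, window)))


def segment(image, threshold):
    """Global thresholding with the threshold found above."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in image]


# Synthetic 256-bin histogram: a large background mode and a small bright mode.
toy_histogram = [max(0, 100 - abs(g - 60)) + max(0, 40 - abs(g - 200)) for g in range(256)]
threshold = adaptive_threshold(toy_histogram)
```

On this synthetic histogram the scan from the right stops close to the bright mode, so pixels brighter than that gray level are kept as the segmented portion.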
Feature extraction and classification are then done using the selected region or lesion. Here the
radiologist needs to decide whether to send it for feature extraction and classification to
find the type of abnormality, i.e. Benign or Malignant. The next chapter deals with the remaining
part of the proposed CADx algorithm.
CHAPTER-4
FEATURE EXTRACTION & CLASSIFICATION
After selecting the lesion part as explained in the last chapter, this chapter explains the feature
extraction and classification algorithms. In this work five different platforms are
given for classification of the lesion, and their classification rates are compared both for the known
database (training using 40 sample mammograms) and the unknown database (validation
using 65 sample mammograms). This chapter introduces those platforms; the results and
their comparison are discussed in a later part of the thesis.
Fig 4.1 shows the algorithm which is followed by all five classification
schemes. The first scheme performs shape-based feature extraction and classification using DA
on the lesion itself. For the other four platforms, the same shape-based feature extraction and
classification using DA are performed on each of the four level-1 DWT channels of the lesion. The
respective features are called feature set (N), where N ranges from 1 to 5.
In this step, the following 13 features are extracted from the given input image containing the ROI.
This set of 13 features is feature set (N), where N is decided by the input image
applied.
Fig 4.2 One-stage 2-D DWT decomposition of the Lesion with the db4 family: low-pass (h) and high-pass (g) filtering, each followed by downsampling by 2, yields channels such as Lesion_CA and Lesion_CHD.
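A one-stage 2-D decomposition of this kind can be sketched in plain Python as below. Several points are assumptions made for the illustration and are not taken from this work: the periodic border extension, the even image dimensions, the row-then-column filtering order, and the mapping of the four outputs to the names CA, CHD, CVD and CDD (approximation, horizontal, vertical and diagonal detail). The low-pass taps are the commonly tabulated 8-tap db4 scaling coefficients.

```python
# 8-tap db4 scaling (low-pass) filter h; its taps sum to sqrt(2).
DB4_LO = [
    0.2303778133088965, 0.7148465705529157, 0.6308807679298589,
    -0.027983769416859854, -0.18703481171909309, 0.030841381835560764,
    0.0328830116668852, -0.010597401785069032,
]
# Quadrature-mirror high-pass filter g[k] = (-1)^k * h[L-1-k].
DB4_HI = [((-1) ** k) * DB4_LO[len(DB4_LO) - 1 - k] for k in range(len(DB4_LO))]


def analyze_1d(signal, taps):
    """Filter and downsample by 2, wrapping around at the border (periodic extension)."""
    n = len(signal)
    return [sum(t * signal[(2 * i + k) % n] for k, t in enumerate(taps))
            for i in range(n // 2)]


def filter_columns(matrix, taps):
    """Apply analyze_1d to every column and return the result as rows."""
    columns = [analyze_1d(list(col), taps) for col in zip(*matrix)]
    return [list(row) for row in zip(*columns)]


def dwt2_level1(image):
    """One-stage 2-D DWT; image must have even numbers of rows and columns."""
    low_rows = [analyze_1d(row, DB4_LO) for row in image]
    high_rows = [analyze_1d(row, DB4_HI) for row in image]
    return {
        "CA": filter_columns(low_rows, DB4_LO),    # approximation
        "CHD": filter_columns(low_rows, DB4_HI),   # horizontal detail
        "CVD": filter_columns(high_rows, DB4_LO),  # vertical detail
        "CDD": filter_columns(high_rows, DB4_HI),  # diagonal detail
    }


flat = [[10.0] * 8 for _ in range(8)]
channels = dwt2_level1(flat)
```

For a constant image the three detail channels vanish and the approximation channel is twice the gray level, since each low-pass stage scales a constant by the square root of 2. Each channel has half the rows and half the columns of the input.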
After extracting feature set (N) for all training cases, i.e. mammograms
with known abnormality (called the ground truth), we determine the unstandardized
coefficients (N) along with the group centroids (N). Using any platform, in other words any
classification scheme (N: 1 to 5), we can then determine the Discriminant score (N) for an unknown
mammogram. Classification is done based on a threshold rule which is obtained from the
group centroids (N).
For n1 Benign & n1 malignant cases which are there in known database or Training cases,
Where,
=A1G
C=
A2
\[ B = T - W \]
Then,
\[ V = U^{-1} X \]
\[ f = D_0 + X D \]
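After substituting the value Xinput in the obtained CDF we get finput; this value is
the Discriminant Score (DS) for the given input feature vector.
Now,
If DS > Thd, the lesion is assigned to the group whose centroid lies above the threshold.
If DS < Thd, the lesion is assigned to the group whose centroid lies below the threshold.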
Fig 4.3 Steps to determine CDF for all four feature sets extracted from respective channels (known database of 20 Benign & 20 Malignant cases; channels such as Lesion_CA and Lesion_CHD).
Fig 4.4 Steps to determine CDF for feature set 1 extracted from Lesion (N=1): training database of 20 Benign & 20 Malignant cases, Lesion, feature set 1, unstandardized coefficients (1).
Fig 4.5 Steps involved for classification of given Lesion by determining Discriminant
score (N) - Validation process: unstandardized coefficients (N), classification based on CDF, Benign or Malignant.
Figure 4.3 and Figure 4.4 describe the algorithm adopted during the training period to
obtain the CDF and the threshold. Figure 4.5 shows the methodology applied to classify a
mammogram after the training period is over, i.e. in the testing phase. Chapter 3 and
Chapter 4 together thus describe the proposed CADx system for lesion detection and classification
in mammograms based on adaptive thresholding and DA.
The next chapter presents the results and discussion for the implementation of the
proposed CADx system.
CHAPTER-5
RESULTS AND DISCUSSION
As the proposed method was explained and implemented in two parts, this chapter is also
organized in two parts: the first part deals with the results of lesion detection and the second with the results
obtained during classification.
The mammogram images used in this experiment were taken from the mini
mammography database of MIAS (http://peipa.essex.ac.uk/ipa/pix/mias/). All images are
held as 8-bit gray level scale images with 256 different gray levels (0-255) and physically in
portable gray map (pgm) format with size 1024 pixels x 1024 pixels.
5.1 Lesion detection
[Figure: level-1 DWT sub-bands LL1, LH1, HL1, HH1]
In the next part, three mammograms of type Normal, Benign and Malignant, namely MDB135,
MDB226 and MDB115, are used as examples for lesion detection.
To find Threshold 1, the LL1 image is enhanced in such a way that its minimum
pixel value is zero and its maximum is 255 (fig. 5.1).
To calculate Threshold 2, the result is scaled according to the maximum pixel value present in the
original image, which maps the threshold back to the spatial domain of the given mammogram.
Fig. 5.14 (MDB115) Derivative plot of histogram of LL1 channel of given mammogram-
Threshold 1=231
Threshold 2 = Threshold 1*225/255 = 231*225/255 = 204
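The two scaling steps can be checked with a short sketch. It assumes a linear min-max stretch for the enhancement of LL1 and a simple proportional mapping back to the original gray-level range; the function names are hypothetical. The original maximum is taken as 225, which is the value that reproduces the reported Threshold 2 of 204.

```python
def stretch_to_8bit(image):
    """Linearly map the image so its minimum becomes 0 and its maximum 255."""
    low = min(min(row) for row in image)
    high = max(max(row) for row in image)
    if high == low:
        return [[0 for _ in row] for row in image]
    return [[round((pixel - low) * 255 / (high - low)) for pixel in row] for row in image]


def rescale_threshold(threshold_1, original_max):
    """Map a threshold found on the 0..255 stretched image back to the original range."""
    return round(threshold_1 * original_max / 255)


stretched = stretch_to_8bit([[50, 100], [150, 250]])
threshold_2 = rescale_threshold(231, 225)  # 231 * 225 / 255 = 203.8..., rounded to 204
```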
As per the notation in the previous chapter, here P=13 features and n1=20 cases. For
performing DA, i.e. evaluating the CDF for all five feature sets, a training dataset of 40
mammograms is used, from which the lesion part is detected first. The 20 benign and 20
malignant cases (detected lesion parts) are shown in Figure 5.17 and 5.18 respectively.
For the validation part, classification is done based on the discriminant score obtained and the
threshold corresponding to the respective platform (N = 1 to 5) chosen for feature
extraction and classification. For the validation database, 38 Benign and 27 Malignant cases
from the MIAS database are considered, which were previously used for the lesion detection
algorithm. In the training part, for one case in each group, the feature vectors corresponding to
each feature set are extracted; they are tabulated in Table 5.1 and Table 5.2.
Sections 5.2.1 to 5.2.5 summarize the results obtained in the feature classification
part for the respective platforms (N=1 to 5). To evaluate feature set N, 13 features
are extracted from the corresponding input image,
which gives the feature vectors listed column-wise in Table 5.2.
Select N, i.e. the platform, and, following the procedure explained in the previous
chapter, form the matrices A1 and A2. By performing the steps described earlier, the
group centroids and the Canonical Discriminant function can be identified from the
unstandardized coefficients and the constant.
20 Benign cases
Fig. 5.17 Lesion part detected from training dataset of 20 Benign mammograms
20 Malignant cases
Fig 5.18 Lesion part detected from training dataset of 20 malignant mammograms
Fig 5.19 Lesion and its one stage level-1 DWT decomposition for mdb002 using db4.
Feature      Feature set 1   Feature set 2   Feature set 3   Feature set 4   Feature set 5
Area         5599            2098            2098            2098            2098
Perimeter    719.74          279.14          279.4           279.14          279.14
rmin         13.11           6.19            3.81            1.23            15.08
rmax         105.37          51.37           20.22           2.02            50.44
convexarea   9026            2778            2788            2778            2788
eno          -10             0               0               0               0
ect          0.97            0.96            0.96            0.96            0.96
en           0.13            0.2             1.28            128.5           0.21
solidity     0.62            0.76            0.76            0.76            0.76
c1           0.4             0.5             1.28            12.79           0.51
dp           0.02            0.02            0.01            0               0.02
esd          111.96          159.17          35.07           41.41           26.58
si           3.42            2.72            6.9             69.08           2.77
Fig 5.20 Lesion and its one stage level-1 DWT decomposition for mdb028 using db4.
Feature Feature set 1 Feature set 2 Feature set 3 Feature set 4 Feature set 5
Area 6153 1893 1893 1983 1893
Perimeter 336.78 171.88 171.8 171.88 171.88
Rmin 34.81 10.28 0.93 0.92 18.31
Rmax 53.97 34.3 3.31 12.27 26.75
Convex area 6590 1988 1988 1988 1988
Eno 1 1 1 1 1
Ect 0.54 0.53 0.53 0.53 0.53
En 0.53 0.4 43.22 3.14 0.66
Solidity 0.93 0.95 0.95 0.95 0.95
C1 0.82 0.72 7.42 2 0.92
Dp 0.01 0.02 0 0.01 0.01
Esd 97 21.2 29.88 31.35 135.24
Si 3.12 2.51 29.57 7.01 3.21
5.2.1 Feature Classification – Feature set 1
Unstandardized coefficients
Area         0.000136053
Perimeter    0.002093029
rmin         0.032719388
rmax         -0.006014262
convexarea   -0.000136239
eno          0.048984735
ect          0.229593731
en           1.948941623
solidity     6.236294906
c1           -18.67941482
dp           50.23437946
esd          -0.074649649
si           0.574652977
constant     10.25357206
Benign Group Centroid = 1 & Malignant Group Centroid = -1
Threshold 0
The discriminant score is calculated using the unstandardized coefficients and the extracted
feature set. For MDB028, feature set 1 (Table 5.1 first column):
DS=-1.58861251052099
Since DS < Threshold, the case falls in the Malignant group, whose centroid is also below the threshold.
The given mammogram is therefore classified as malignant, which agrees with the
ground truth.
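The score computation f = D0 + XD can be reproduced with the sketch below, which uses the feature-set-1 coefficients listed above and the feature-set-1 column of the MDB028 feature table. Because the tabulated feature values are rounded to two decimals, the result of this sketch is close to, but not identical with, the reported DS; the sign and therefore the decision are the same. The function names and the label arguments are assumptions made for the illustration.

```python
# Unstandardized coefficients and constant for feature set 1 (from the listing above).
COEFFICIENTS_SET1 = {
    "Area": 0.000136053, "Perimeter": 0.002093029, "rmin": 0.032719388,
    "rmax": -0.006014262, "convexarea": -0.000136239, "eno": 0.048984735,
    "ect": 0.229593731, "en": 1.948941623, "solidity": 6.236294906,
    "c1": -18.67941482, "dp": 50.23437946, "esd": -0.074649649, "si": 0.574652977,
}
CONSTANT_SET1 = 10.25357206

# Feature set 1 of MDB028, as tabulated (rounded values).
MDB028_SET1 = {
    "Area": 6153, "Perimeter": 336.78, "rmin": 34.81, "rmax": 53.97,
    "convexarea": 6590, "eno": 1, "ect": 0.54, "en": 0.53, "solidity": 0.93,
    "c1": 0.82, "dp": 0.01, "esd": 97, "si": 3.12,
}


def discriminant_score(features, coefficients, constant):
    """Evaluate f = D0 + X.D for one feature vector."""
    return constant + sum(coefficients[name] * features[name] for name in coefficients)


def classify(score, threshold=0.0, above_label="Benign", below_label="Malignant"):
    """Threshold rule: the labels follow the side on which each group centroid lies."""
    return above_label if score > threshold else below_label


score = discriminant_score(MDB028_SET1, COEFFICIENTS_SET1, CONSTANT_SET1)
label = classify(score)
```

The `c1` and `esd` terms dominate the sum for this case, which is why small rounding differences in those two features shift the score noticeably.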
Fig 5.21: CDF histogram plot for a) Benign & b) Malignant groups using feature set-1
                             Predicted Group Membership
                     type    .00      1.00     Total
Original         Count  .00     16       4        20
                        1.00    4        16       20
                 %      .00     80.0     20.0     100.0
                        1.00    20.0     80.0     100.0
Cross-validated(a) Count .00    14       6        20
                        1.00    6        14       20
                 %      .00     70.0     30.0     100.0
                        1.00    30.0     70.0     100.0
80.0% of original grouped cases correctly classified.
70.0% of cross-validated grouped cases correctly classified.
a Cross validation is done only for those cases in the analysis. In cross validation, each case is
classified by the functions derived from all cases other than that case.
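The percentages quoted with the table follow directly from the counts: the correctly classified cases are the diagonal entries of the count table. A minimal sketch, with a hypothetical helper name:

```python
def classification_rate(confusion):
    """Percentage of cases on the diagonal of a square table of counts."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return 100.0 * correct / total


# Counts for feature set 1: rows are true groups, columns are predicted groups.
original_counts = [[16, 4], [4, 16]]
cross_validated_counts = [[14, 6], [6, 14]]

original_rate = classification_rate(original_counts)                 # (16 + 16) / 40
cross_validated_rate = classification_rate(cross_validated_counts)   # (14 + 14) / 40
```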
Classification rate
(Validation rate)
= 63.079%
Fig 5.22: Classification result for validation dataset (unknown mammograms) (N=1)
F=R-GT
Unstandardized
coefficients
Area -0.001296512
Perimeter -0.007166919
Rmin 0.05612585
Rmax -0.052294284
convexarea 0.001359857
Eno 0.126160304
Ect -3.019447119
en -2.244389496
solidity 14.81575929
c1 -5.089357262
dp -148.6243504
esd 0.056719493
si 1.327386923
constant -13.20729376
Threshold 0
                             Predicted Group Membership
                     type    .00      1.00     Total
Original         Count  .00     18       2        20
                        1.00    3        17       20
                 %      .00     90.0     10.0     100.0
                        1.00    15.0     85.0     100.0
Cross-validated(a) Count .00    14       6        20
                        1.00    5        15       20
                 %      .00     70.0     30.0     100.0
                        1.00    25.0     75.0     100.0
87.5% of original grouped cases correctly classified.
72.5% of cross-validated grouped cases correctly classified.
Fig 5.23: CDF histogram plot for a) Benign & b) Malignant groups using feature set-2
Classification rate
(Validation rate)
= 33.84%
Fig 5.24: Classification result for validation dataset (unknown mammograms) (N=2)
Unstandardized coefficients
Area         0.000513
Perimeter    -0.00761
rmin         0.075119
rmax         -0.01532
convexarea   -0.00028
eno          0.023116
ect          2.107331
en           0.009295
solidity     -12.1234
c1           -0.82893
dp           143.4587
esd          0.029166
si           0.129811
constant     8.274857
Benign Group Centroid = 1.8 & Malignant Group Centroid = -1.8
Threshold 0
Fig 5.25: CDF histogram plot for a) Benign & b) Malignant groups using feature set-3
Classification rate
(Validation rate)
= 67.69%
Fig 5.26: Classification result for validation dataset (unknown mammograms) (N=3)
Unstandardized coefficients
Area         -0.00019
Perimeter    -0.01045
rmin         0.094519
rmax         -0.02209
convexarea   0.000349
eno          -0.06987
ect          4.259276
en           0.007625
solidity     -3.66475
c1           -0.79561
dp           -85.9958
esd          -0.02715
si           0.14592
constant     3.963368
Benign Group Centroid = 0.96 & Malignant Group Centroid = -0.96
Threshold 0
Fig 5.27: CDF histogram plot for a) Benign & b) Malignant groups using feature set-4
Classification rate
(Validation rate)
= 69.23%
Fig 5.28: Classification result for validation dataset (unknown mammograms) (N=4)
Unstandardized coefficients
Area         -0.000324559
Perimeter    -0.017567012
rmin         0.21035515
rmax         -0.098717383
convexarea   0.000721538
eno          -0.674295328
ect          4.619567598
en           -0.008232511
solidity     -18.20872834
c1           0.649064184
dp           64.58390762
esd          -0.065053371
si           -0.087046045
constant     16.34716167
Benign Group Centroid = 1.44 & Malignant Group Centroid = -1.44
Threshold -1.78E-15
                             Predicted Group Membership
                     type    .00      1.00     Total
Original         Count  .00     19       1        20
                        1.00    2        18       20
                 %      .00     95.0     5.0      100.0
                        1.00    10.0     90.0     100.0
Cross-validated(a) Count .00    17       3        20
                        1.00    5        15       20
                 %      .00     85.0     15.0     100.0
                        1.00    25.0     75.0     100.0
92.5% of original grouped cases correctly classified.
80.0% of cross-validated grouped cases correctly classified.
Fig 5.29: CDF histogram plot for a) Benign & b) Malignant groups using feature set-5
Classification rate
(Validation rate)
= 75.38%
Fig 5.30: Classification result for validation dataset (unknown mammograms) (N=5)
Feature set     Original grouped           Cross-validated grouped case   Validation
                classification rate (%)    classification rate (%)        classification rate (%)
Feature set 1   80                         70                             63.08
Feature set 2   87.5                       72.5                           33.84
Feature set 3   97.5                       85                             67.69
Feature set 4   82.5                       60                             69.23
Feature set 5   92.5                       80                             75.38
CHAPTER 6
After completing the training part, the CDF functions and respective thresholds can be used to
classify the lesion part of a given mammogram as either Benign or Malignant.
Our CADx allows the radiologist to select any one of the classification schemes.
The decision can also be taken by giving different weightage to the different schemes before
reaching a final conclusion.
All front-end and back-end algorithms are implemented in the MATLAB environment.
Figure 6.1 shows the GUI of the developed CADx system.
Matlab ver. 7 is used, with the Image Processing, Wavelet and Signal Processing toolboxes.
For Discriminant analysis, SPSS package ver. 14 is used; its results
match those of the Matlab implementation, but the SPSS software was found to be faster
and more user friendly.
Fig. 6.1 GUI interface for CADx (Trained). Panels: Select mammogram, Lesion, Select classification scheme, Result: type of abnormality.
CHAPTER-7
CONCLUSION AND FUTURE WORK
The proposed CADx system is implemented with the help of Matlab toolboxes. Figure 7.1
compares the classification rates obtained by using DA on the respective feature sets. It is clearly
visible that feature sets 3 and 5, i.e. the 13 shape-based features extracted
from the Horizontal and Diagonal detail channels obtained after one-stage DWT
decomposition of the Lesion in the given mammogram using the db4 wavelet family, give better
results than simple feature extraction from the Lesion itself (feature set 1).
Fig. 7.1 Comparison chart for classification of Lesion using DA on Feature set 1 to 5 (bar chart over feature sets 1 to 5; series: original grouped classification rate, cross-validated grouped case classification rate, validation classification rate)
As part of further work, one can explore different mother wavelets or wavelet
families so as to improve the classification rate. Also, instead of using a statistical method of
classification, an ANN model can be generated.
REFERENCES
3. Ingrid Daubechies (1992) “Ten Lectures on Wavelets”, society for industrial and
applied mathematics.
6. H.D. Cheng, Xiaopeng Cai, Xiaowei Chen, Liming Hu, Xueling Lou (2003);
Computer-aided detection and classification of Micro calcifications in mammograms:
a survey, Pattern Recognition 36, pp. 2967-2991.
7. Gonzalez R. C., Woods R. E., Eddins S. L. (2004), Digital image processing using
MATLAB.
9. Jelena Bozek , Mario Mustra ,Kresimir Delac and Mislav Grgic (2009); A Survey of
Image Processing Algorithms in Digital Mammography, Recent Advan. In Mult. Sig.
Process. And Communication, SCI 231, pp. 631-657.
10. B. Surendiran et.al (2009); Classifying Digital Mammogram Masses using Univariate
ANOVA Discriminant Analysis. Int .Conf. on Advances in Recent Technologies in
Communication and Computing.
11. Kai Hu et.al (2010); Detection of Suspicious Lesions by Adaptive Thresholding Based
on Multiresolution Analysis in Mammograms, IEEE Trans. On Instrument and