
International Journal of Application or Innovation in Engineering & Management (IJAIEM)

Web Site: www.ijaiem.org Email: editor@ijaiem.org


Volume 4, Issue 7, July 2015

ISSN 2319 - 4847

Automatic Cancer Detection Using Segmentation, Supervised and Unsupervised Techniques
V. S. Takate1, M. B. Anap2

1 Assistant Professor, Instrumentation & Control Engineering Department, Pravara Rural Engineering College, Loni, Maharashtra, India

2 Assistant Professor, Instrumentation & Control Engineering Department, Pravara Rural Engineering College, Loni, Maharashtra, India

Abstract
This paper proposes a new diagnosis technique for brain cancer detection. Features are extracted and classified with supervised and unsupervised techniques, and a segmentation technique is used to diagnose the brain as normal or abnormal. The MRI images are given as input to the discrete wavelet transform (DWT) for filtering. The filtered image is then given to PCA (Principal Component Analysis) to reduce the size of the matrix obtained from the DWT. The supervised technique FP-ANN (feed-forward back-propagation artificial neural network) and the unsupervised technique K-NN (K-nearest neighbor) are used to predict whether the brain image is normal or abnormal. In addition, the input brain images are segmented using K-means clustering to identify the mass of tissue or tumor. Features such as skewness, kurtosis, mean, variance, standard deviation, energy, and entropy are extracted. The sensitivity, specificity, accuracy, PPV (positive predictive value), NPV (negative predictive value), FDR (false discovery rate), and MCC (Matthews correlation coefficient) are also calculated. The accuracy of the proposed system is found to be 98.5% with K-NN and 95.71% with FP-ANN.

Keywords: Brain Cancer detection, DWT, FP-ANN, K-NN, MRI

1. INTRODUCTION
Cancer affects people of all ages, with the risk for most types increasing with age. It caused about 13% of all human deaths in 2011 (7.6 million). Cancers are primarily an environmental disease, with 90-95% of cases due to lifestyle and environmental factors and 5-10% due to genetics. Common environmental factors leading to cancer death include tobacco (25-30%), diet and obesity (30-35%), infections (15-20%), radiation, stress, lack of physical activity, and environmental pollutants. These environmental factors cause abnormalities in the genetic material of cells. The brain is the kernel part of the body and has a very complex structure. It is hidden from direct view by the protective skull, which protects the brain from injury but also hinders the study of its function in both health and disease. The brain can be affected by a problem that changes its normal structure and behavior. This problem is known as a brain tumor, an abnormal growth of cells in the brain. The cells in the arteries that supply the brain are tightly bound together, so routine laboratory tests are inadequate for analyzing the chemistry of the brain. Brain cancer is a major problem nowadays, and early detection helps to reduce the mortality rate.

2. BRAIN CANCER
Brain cancer, the second most common cancer among all cancers, is increasing day by day and is a major problem facing radiologists [4]. The brain is the kernel part of the body, and all functions of the body are governed by it. The brain can be affected by problems that cause a mass to form inside it, called a brain tumor. It is very important to analyze whether the tumor is malignant or benign.

3. MRI (MAGNETIC RESONANCE IMAGING) IMAGES
Brain tissues are very soft tissues that are protected by the skull [1]. The brain can be examined with different techniques, such as computed tomography, X-ray, and biopsy, but MRI images are used here because they show soft tissue well. For soft tissue, MRI is superior to the other imaging techniques, and it does not expose the patient to harmful ionizing radiation. MRI relies on the magnetic resonance of hydrogen nuclei (protons). Image intensity in MRI depends upon four parameters. One is the proton density (PD), which is determined by the relative concentration of water molecules. The other three parameters are the T1, T2, and T2* relaxation times, which reflect different features of the local environment of individual protons.


4. TECHNIQUE USED
Segmentation of the brain and feature extraction are very important in brain cancer analysis. The DWT is used to decompose the morphological components of the image, and the Haar wavelet is used to remove noise. Using PCA, the size of the matrix is reduced. The unsupervised technique K-NN is used for classification, and FP-ANN is used as a supervised technique for classification [1]. Features such as energy, entropy, skewness, kurtosis, mean, variance, and standard deviation are calculated. In the second part, a median filter is used to remove noise without damaging the image, and K-means clustering gives the extracted part of the tumor.

Figure 1. Process Flow Diagram


This paper is organized as follows: Section 5 presents the methodology of the proposed system. Section 6 gives the results and discussion oriented to feature extraction. Finally, the conclusion is given in Section 7.

5. METHODOLOGY
The methodology of the proposed system is as follows:
A. Wavelet Transform
A wavelet transform has properties such as sub-band coding, multi-resolution analysis, and time-frequency localization [3]. The wavelet is a powerful mathematical tool for feature extraction and has been used to extract the wavelet coefficients from MR images. Wavelets are localized basis functions, which are scaled and shifted versions of some fixed mother wavelet. The main advantage of wavelets is that they provide localized frequency information about a function or signal, which is particularly beneficial for classification [3]. The Haar wavelet is the simplest of all wavelet transforms. In it, the low-frequency wavelet coefficients are generated by averaging two pixel values, and the high-frequency coefficients are generated by taking half of the difference of the same two pixels. The four bands obtained are LL, LH, HL, and HH, as shown in Figure 3. The LL band is called the approximation band; it consists of the low-frequency wavelet coefficients and contains the important part of the spatial-domain image.
Consider a real or complex-valued continuous-time function $\psi(t)$ with the following properties.
1. Zero average: the function integrates to zero.

$$\int_{-\infty}^{\infty} \psi(t)\,dt = 0 \qquad (1)$$

2. It is square integrable or, equivalently, has finite energy.

$$\int_{-\infty}^{\infty} |\psi(t)|^{2}\,dt < \infty \qquad (2)$$

Then the function $\psi(t)$ is a mother wavelet. Let $f(t)$ be any square-integrable function. The CWT of $f(t)$ with respect to the wavelet $\psi(t)$ is defined as

$$W(a,b) = \int_{-\infty}^{\infty} f(t)\,\frac{1}{\sqrt{|a|}}\,\psi^{*}\!\left(\frac{t-b}{a}\right) dt \qquad (3)$$

where
$f(t)$ = square-integrable function,
$\psi(t)$ = complex-valued continuous function,
$a, b$ = scaling and translation parameters,
$*$ denotes the complex conjugate,
$a$ = time scaling or dilation variable,
$b$ = time shift or translation,
$\frac{1}{\sqrt{|a|}}$ = normalizing factor.

B. Haar Wavelet
The Haar wavelet is a certain sequence of functions and is the first known wavelet. This sequence was proposed in 1909 by Alfred Haar. It is the simplest possible wavelet. Its technical disadvantage is that it is not continuous and therefore not differentiable. This property can, however, be an advantage for the analysis of signals with sudden transitions, such as the monitoring of tool failure in machines.
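The mother wavelet $\psi(t)$ of the Haar wavelet can be described as

$$\psi(t) = \begin{cases} 1, & 0 \le t < \tfrac{1}{2} \\ -1, & \tfrac{1}{2} \le t < 1 \\ 0, & \text{otherwise} \end{cases} \qquad (4)$$

and the scaling function $\phi(t)$ can be defined as

$$\phi(t) = \begin{cases} 1, & 0 \le t < 1 \\ 0, & \text{otherwise} \end{cases} \qquad (5)$$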

Figure 2. Haar Wavelet

Figure 3. Image Decomposition Using DWT

h(n) = Impulse response of low pass filter
g(n) = Impulse response of high pass filter
A = Approximate coefficient
D = Detailed coefficient
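The averaging and half-difference rule described for the Haar wavelet can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the toy 4 x 4 image, and the convention used to name the LH and HL bands are assumptions made for the example.

```python
def haar_1d(values):
    """One Haar step: pairwise averages (low pass) and half-differences (high pass)."""
    low = [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]
    high = [(values[i] - values[i + 1]) / 2 for i in range(0, len(values), 2)]
    return low, high


def transpose(matrix):
    return [list(column) for column in zip(*matrix)]


def haar_2d(image):
    """One decomposition level of an image with even height and width.

    Returns the sub-bands LL, LH, HL, HH, each half the size of the input.
    Band naming here is an assumed convention: first letter = row filter,
    second letter = column filter.
    """
    # Filter along every row first.
    row_low, row_high = [], []
    for row in image:
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)

    def filter_columns(half_image):
        low_cols, high_cols = [], []
        for column in transpose(half_image):
            low, high = haar_1d(column)
            low_cols.append(low)
            high_cols.append(high)
        return transpose(low_cols), transpose(high_cols)

    ll, lh = filter_columns(row_low)
    hl, hh = filter_columns(row_high)
    return ll, lh, hl, hh


# Toy image made of four constant 2 x 2 blocks (hypothetical data).
image = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
ll, lh, hl, hh = haar_2d(image)
print(ll)  # approximation band: one average per 2 x 2 block
```

Because every 2 x 2 block of the toy image is constant, all detail bands are zero and the LL band carries the whole image at half resolution, which is why the approximation band is the one passed on for further processing.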
C. PCA (Principal Component Analysis)
After decomposition, the image is given as input to the PCA algorithm to reduce the dimensionality of the matrix. Figure 4 shows the different stages of principal component analysis. PCA reduces the dimensionality of the extracted features: using this algorithm, the 1024 features are reduced to seven essential features. It is basically a dimensionality reduction technique.
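The reduction from 1024 features to seven can be illustrated with a small PCA sketch based on the singular value decomposition. The function name `pca_reduce`, the use of NumPy, and the random stand-in data are assumptions for illustration; the paper does not state how its PCA was implemented.

```python
import numpy as np


def pca_reduce(features, n_components=7):
    """Project row vectors onto their first n_components principal components."""
    # Center each feature column so the components describe variance.
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    # Fraction of the total variance captured by each kept component.
    explained = singular_values[:n_components] ** 2 / np.sum(singular_values ** 2)
    return centered @ components.T, explained


# Stand-in data: 20 images, each with 1024 wavelet coefficients (random, hypothetical).
rng = np.random.default_rng(0)
coefficients = rng.normal(size=(20, 1024))
reduced, explained = pca_reduce(coefficients, n_components=7)
print(reduced.shape)  # (20, 7)
```

Each image is then represented by a seven-element vector, which is the input size used by the classifiers.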

Figure 4. Image Resizing Using PCA

Figure 5 shows the result at the output of the discrete wavelet transform and principal component analysis.

Figure 5. DWT + PCA


D. K-means Clustering
The MRI brain images are segmented to extract the tumor [4]. Segmentation is the process of partitioning a digital image into multiple segments, or sets of pixels. It is used to locate objects and boundaries in the image, so that the extracted tumor and the calculated properties can be matched with each other. Segmentation is done using the k-means clustering method [6]. The segmentation of the input MRI images is shown graphically in Figure 6.
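A minimal sketch of k-means segmentation on gray levels is given below. It clusters pixel intensities only, spreads the initial centers evenly over the intensity range, and treats the brightest cluster as the region of interest; all three choices, along with the function names and the toy image, are assumptions for illustration rather than details taken from the paper.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Cluster scalar intensities into k groups; returns (centers, labels)."""
    lo, hi = min(values), max(values)
    # Assumed initialization: centers spread evenly over the intensity range.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest center.
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        # Update step: each center moves to the mean of its pixels.
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            if members:
                centers[c] = sum(members) / len(members)
    return centers, labels


def segment(image, k=2):
    """Return a binary mask marking the pixels of the brightest cluster."""
    flat = [pixel for row in image for pixel in row]
    centers, labels = kmeans_1d(flat, k)
    brightest = max(range(k), key=lambda c: centers[c])
    width = len(image[0])
    mask = [1 if label == brightest else 0 for label in labels]
    return [mask[r * width:(r + 1) * width] for r in range(len(image))]


# Toy image: dark background with a bright 2 x 2 region (hypothetical data).
image = [
    [12, 15, 14, 13],
    [11, 200, 210, 14],
    [13, 205, 198, 12],
    [14, 12, 13, 15],
]
mask = segment(image)
for row in mask:
    print(row)
```

The mask isolates the bright block in the middle of the toy image. On real MRI slices more clusters and some pre-filtering would normally be needed.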

Figure 6. Image Segmentation Using K-means Clustering


E. K-NN (K-Nearest Neighbor)
1. K-NN is a non-parametric method, treated in this work as the unsupervised technique, and is one of the simplest machine learning techniques for classification.
2. Classification is done by determining the k closest training vectors according to a suitable distance metric.
3. The input vector obtained from PCA is then assigned to the class to which the majority of those k nearest neighbors belong.
4. The algorithm for the k-nearest neighbor rule is given below. Let x be the unknown feature vector and assume a distance measure is given. Then:
   a. Out of the N training vectors, identify the k nearest neighbors of x.
   b. Out of these k samples, identify the number of vectors ki that belong to class wi, i = 1, 2, ..., M.
   c. Assign x to the class wi with the maximum number ki of samples.
Here we have used the Euclidean distance measure.
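The k-nearest neighbor rule above can be sketched as follows. The function name, the choice k = 3, and the two-dimensional toy vectors are assumptions for illustration; in the paper the input would be the seven-element vector produced by PCA. Note that the rule relies on training vectors whose class labels are already known.

```python
import math
from collections import Counter


def knn_classify(x, training_vectors, training_labels, k=3):
    """Assign x to the majority class among its k nearest training vectors."""
    # Euclidean distance from x to every training vector, nearest first.
    distances = sorted(
        (math.dist(x, vector), label)
        for vector, label in zip(training_vectors, training_labels)
    )
    # Count how many of the k nearest neighbors fall in each class.
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Hypothetical two-dimensional training data with known classes.
training_vectors = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (5.0, 5.0), (5.2, 4.8), (4.9, 5.1),
]
training_labels = ["normal", "normal", "normal",
                   "abnormal", "abnormal", "abnormal"]

print(knn_classify((1.1, 1.0), training_vectors, training_labels))
print(knn_classify((5.1, 5.0), training_vectors, training_labels))
```

An odd k is a common choice for two-class problems because it avoids tied votes.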
F. FP-ANN (Feed-Forward Back-Propagation Artificial Neural Network)
The image is then given as input to the FP-ANN classifier. An artificial neural network is a supervised technique for analysis. An ANN is a mathematical model consisting of a number of highly interconnected processing elements organized into layers, whose geometry and functionality resemble those of the human brain.

Figure 7. The Architecture of Basic ANN

The basic architecture of the artificial neural network is shown in Figure 7. Here x1, x2, x3 are the input features, and w denotes the weights that are to be adjusted. The input layer, hidden layer, and output layer are all shown in the figure. The weights in the hidden layer are adjusted according to the required output.

The ANN may be regarded as possessing learning capabilities inasmuch as it has a natural propensity for storing experimental knowledge and making it available for later use. The most frequently used training algorithm in classification problems is the back-propagation (BP) algorithm. The neural network is trained to adjust the connection weights and biases in order to produce the desired mapping. At the training stage, the feature vectors are applied as input to the network, and the network adjusts its variable parameters, the weights and biases, to capture the relationship between the input patterns and the outputs.

The MRI brain image is taken as the input image. The DWT is used to filter that image, and the FP-ANN classifier is then used to classify the image as normal or abnormal. The neural network employed as the classifier has three layers. The first layer consists of 7 input elements, in accordance with the 7 features selected from the wavelet coefficients by PCA. The number of neurons in the hidden layer is four. The single neuron in the output layer is used to represent a normal (benign) or abnormal (malignant) human brain.
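A network with the 7-4-1 layout described above can be sketched with NumPy as follows. Only the layer sizes come from the paper; the sigmoid activations, mean squared error loss, full-batch gradient descent, learning rate, epoch count, and synthetic training data are all assumptions made to keep the example self-contained.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def train(features, targets, hidden=4, epochs=500, lr=0.5, seed=0):
    """Train a one-hidden-layer network by back-propagation; returns (params, losses)."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(scale=0.5, size=(features.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    losses = []
    for _ in range(epochs):
        # Forward pass: input -> hidden -> output.
        h = sigmoid(features @ w1 + b1)
        out = sigmoid(h @ w2 + b2)
        error = out - targets
        losses.append(float(np.mean(error ** 2)))
        # Backward pass: propagate the error through the sigmoid derivatives.
        # The constant factor 2 of the squared-error gradient is absorbed into lr.
        delta_out = error * out * (1 - out)
        delta_hidden = (delta_out @ w2.T) * h * (1 - h)
        # Gradient-descent updates of weights and biases.
        w2 -= lr * h.T @ delta_out / len(features)
        b2 -= lr * delta_out.mean(axis=0)
        w1 -= lr * features.T @ delta_hidden / len(features)
        b1 -= lr * delta_hidden.mean(axis=0)
    return (w1, b1, w2, b2), losses


def predict(params, features):
    w1, b1, w2, b2 = params
    return sigmoid(sigmoid(features @ w1 + b1) @ w2 + b2)


# Synthetic stand-in for 7-element PCA feature vectors (hypothetical data).
rng = np.random.default_rng(1)
normal = rng.normal(loc=0.0, scale=1.0, size=(20, 7))
abnormal = rng.normal(loc=2.0, scale=1.0, size=(20, 7))
features = np.vstack([normal, abnormal])
targets = np.vstack([np.zeros((20, 1)), np.ones((20, 1))])  # 0 = normal, 1 = abnormal

params, losses = train(features, targets)
predictions = predict(params, features)
print(losses[0], losses[-1])
```

The single output can be thresholded at 0.5 to label an image as normal or abnormal. The training loss printed at the end gives a quick check that the weights are moving in the right direction.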


Figure 8.Performance Plot using Neural Network

6. FEATURE EXTRACTION
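Features were extracted for the prediction of benign and malignant tumors. Features such as mean, variance, standard deviation, energy, entropy, skewness, and kurtosis are calculated. For a random variable $X$ with $n$ outcomes $x_i$ ($i = 1, 2, \ldots, n$) and probabilities $p(x_i)$:

Mean:
$$\mu = \sum_{i=1}^{n} x_i\,p(x_i) \qquad (6)$$

Standard deviation $\sigma$, defined through the variance $\sigma^{2}$:
$$\sigma^{2} = \sum_{i=1}^{n} (x_i - \mu)^{2}\,p(x_i) \qquad (7)$$

Skewness:
$$\mu_3 = \sigma^{-3} \sum_{i=1}^{n} (x_i - \mu)^{3}\,p(x_i) \qquad (8)$$

Kurtosis:
$$\mu_4 = \sigma^{-4} \sum_{i=1}^{n} (x_i - \mu)^{4}\,p(x_i) \qquad (9)$$

Energy:
$$E = \sum_{i=1}^{n} \left[p(x_i)\right]^{2} \qquad (10)$$

Entropy:
$$H = -\sum_{i=1}^{n} p(x_i)\,\log_2\!\left[p(x_i)\right] \qquad (11)$$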
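Equations (6) to (11) can be evaluated directly from the gray-level histogram of an image. The sketch below follows the equations as stated; the function name, the dictionary of results, and the four-pixel example are assumptions for illustration, and the code assumes the image is not constant (so that the standard deviation is non-zero).

```python
import math
from collections import Counter


def statistical_features(pixels):
    """Compute the features of equations (6)-(11) from a flat list of pixel values."""
    n = len(pixels)
    # p(x_i): relative frequency of each distinct gray level.
    prob = {value: count / n for value, count in Counter(pixels).items()}

    mean = sum(x * p for x, p in prob.items())
    variance = sum((x - mean) ** 2 * p for x, p in prob.items())
    std = math.sqrt(variance)
    skewness = sum((x - mean) ** 3 * p for x, p in prob.items()) / std ** 3
    kurtosis = sum((x - mean) ** 4 * p for x, p in prob.items()) / std ** 4
    energy = sum(p ** 2 for p in prob.values())
    entropy = -sum(p * math.log2(p) for p in prob.values())

    return {
        "mean": mean,
        "variance": variance,
        "std": std,
        "skewness": skewness,
        "kurtosis": kurtosis,
        "energy": energy,
        "entropy": entropy,
    }


# Two gray levels, each with probability 0.5 (hypothetical data).
features = statistical_features([0, 0, 2, 2])
for name, value in features.items():
    print(name, value)
```

For this symmetric two-level example the mean, variance, kurtosis, and entropy all equal 1, the skewness is 0, and the energy is 0.5, which makes it a convenient hand check of the formulas.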

The proposed system has been implemented on a real human brain MRI dataset. The input dataset used for classification consists of axial, T2-weighted, 256 x 256 pixel MR brain images. These images were collected from the Harvard Medical School website (http://med.harvard.edu/AANLIB/). Some of the data were collected from the Pravara Medical Trust, Loni. We have studied 300 cases using both methods. The above properties are calculated for all cases, which helps in diagnosing brain images as normal or malignant.
TABLE 1. Extracted Feature Values for 20 Samples

Test Image   Mean     Variance    Std. Deviation   Entropy   Kurtosis   Skewness   Energy         Status
Img 1        136      24689.94    157.20           1.10      1.42       1.14       44330683.55
Img 2        120      23396.14    153.03           1.15      1.94       1.30       38789449.18
Img 3        139      25177.25    158.75           1.25      1.40       1.13       45578518.12
Img 4        137      25315.10    159.18           1.25      1.37       1.12       45144284.86
Img 5        140      24923.92    157.95           1.18      1.37       1.12       45632174.65
Img 6        167      22770.61    150.97           0.93      1.28       1.10       51945215.31
Img 7        140      25493.98    159.74           1.19      1.41       1.13       46430018.59
Img 8        132      25324.80    159.21           1.09      1.41       1.13       43889664.51
Img 9        131      25532.28    159.86           1.14      1.49       1.16       43871194.47
Img 10       128      25275.64    159.06           1.07      1.51       1.17       42800672.68
Img 11       65.06    3965.73     63.00            0.73      1.35       1.12       8395732.87     M
Img 12       64.48    3848.01     62.06            0.74      1.36       1.13       8197981.31     M
Img 13       62.29    3712.34     60.95            0.78      1.42       1.14       7775737.28     M
Img 14       62.73    3676.60     60.66            0.77      1.44       1.15       7794506.76     M
Img 15       64.85    3684.21     60.72            0.73      1.44       1.16       8079573.82     M
Img 16       65.16    3640.04     60.36            0.72      1.41       1.15       8076179.89     M
Img 17       64.01    3568.61     59.76            0.70      1.39       1.14       7849994.45     M
Img 18       62.96    3341.38     57.83            0.72      1.35       1.13       7481091.51     M
Img 19       65.23    3300.58     57.47            0.68      1.30       1.11       7737956.64     M
Img 20       66.28    3303.82     57.50            0.72      1.27       1.10       7881710.92     M

Table 1 gives the calculated values of the different properties, which help to determine whether the tumor is benign or malignant. Of the 20 samples, the first 10 are normal images and samples 11-20 are abnormal images. The above properties for the 20 sample MRI images are plotted graphically here.

Figure 9. Graphical Representation of Extracted Features


7. CONCLUSION
A wavelet transform with the Haar wavelet is used to decompose the image. Segmentation of the image by K-means clustering is a good method for finding the tumor. Principal component analysis reduces the size of the image data for the calculation of features. The TP, TN, FP, and FN values are used to calculate the accuracy of the system. The resulting data table is helpful for predicting normal and abnormal brain images programmatically. The K-NN and FP-ANN classifiers help to increase the accuracy of the system, which is found to be 98.5% with K-NN and 95.71% with FP-ANN. This method can help to reduce the cost of brain cancer detection, since it combines high accuracy with low error.
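The performance measures named in this paper follow from the TP, TN, FP, and FN counts through their standard definitions. The sketch below is illustrative only: the function name is an assumption and the counts in the example are hypothetical, not the results of this study.

```python
import math


def diagnostic_metrics(tp, tn, fp, fn):
    """Standard performance measures from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "ppv": tp / (tp + fp),                    # positive predictive value
        "npv": tn / (tn + fn),                    # negative predictive value
        "fdr": fp / (tp + fp),                    # false discovery rate = 1 - PPV
        "mcc": (tp * tn - fp * fn) / math.sqrt(
            (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
        ),                                        # Matthews correlation coefficient
    }


# Hypothetical counts, chosen only to exercise the formulas.
metrics = diagnostic_metrics(tp=45, tn=40, fp=10, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The function assumes that none of the denominators is zero, which holds whenever both classes occur in the test set and the classifier predicts each class at least once.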

References
[1] Yudong Zhang, Zhengchao Dong, Lenan Wu, Shuihua Wang, "A hybrid method for MRI brain image classification," Expert Systems with Applications, vol. 38, pp. 10049-10053, 2011.
[2] Rafael Gonzalez, Richard Woods, Steven Eddins, "Digital Image Processing Using MATLAB," 2007.
[3] Raghuveer M. Rao, Ajit S. Bopardikar, "Wavelet Transforms: Introduction to Theory and Applications," Pearson Education Asia, pp. 1-22, 2002.
[4] "Biomechanics and biophysics of cancer cells," Acta Materialia, vol. 55, pp. 3989-4014, 2007.
[5] William K. Pratt, "Digital Image Processing," third edition, pp. 1-15.
[6] T. Logeswari, M. Karnan, "An improved implementation of brain tumor detection using segmentation based on soft computing," Journal of Cancer Research and Experimental Oncology, vol. 2(1), pp. 006-014, March 2010.
[7] Qurat-ul-Ain, Ghazanfar Latif, Sidra Batool Kazmi, M. Arfan Jaffar, Anwar M. Mirza, "Classification and Segmentation of Brain Tumor using Texture Analysis," Recent Advances in Artificial Intelligence, Knowledge Engineering and Data Bases, ISSN: 1790-5109, ISBN: 978-960-474-154-0, pp. 147-155.
