
CHAPTER V

RESULTS AND DISCUSSION

5.1 NAIVE BAYES EXPERIMENTAL RESULT


Artificial Intelligence techniques have been shown to perform well in many
applications, and data mining is applicable to virtually any kind of information
repository. The performance of a classification algorithm is usually examined by
evaluating the accuracy of the classification. Bayesian networks are used to
construct classifiers from a given set of training examples with class labels
(Naïve Bayes classifier, 2011). In the current research work, the risk of
diabetic patients developing heart disease is estimated using different data
mining classification techniques.
In the first experiment of the current work, Weka has been used as the tool,
and the results of the Naïve Bayes experiment are shown in Figure 5.1. The
proposed Naïve Bayes model was able to classify 74% of the input instances
correctly, exhibiting an average precision of 71%, an average recall of 74%,
and an average F-measure of 71.2%. The results clearly show that the proposed
method performs well compared to similar methods in the literature, considering
that the attributes taken for analysis are not direct indicators of heart disease.


Figure 5.1 Result screen for Naïve Bayes classification in WEKA


5.2 SVM EXPERIMENTAL RESULT


In the second experiment, an SVM classifier with a radial basis function (RBF)
kernel has been used to diagnose heart disease vulnerability in diabetic patients
with reasonable accuracy. The ROC curve for the classifier, obtained in the Weka
tool, is shown in Figure 5.2.

Figure 5.2 ROC curve for the SVM classifier characteristics

A Receiver Operating Characteristic (ROC) graph is a technique for
visualizing, organizing and selecting classifiers based on their performance. ROC
analysis has been extended for use in visualizing and analyzing the behavior of
diagnostic systems. An attempt is made to determine not only the accuracy but
also the receiver operating characteristic, or simply ROC curve, a graphical plot
which illustrates the performance of a binary classifier system as its
discrimination threshold is varied. It is created by plotting the fraction of true
positives out of all positives (the true positive rate, TPR) against the fraction
of false positives out of all negatives (the false positive rate, FPR) at various
threshold settings. TPR is also known as sensitivity, and FPR is one minus the
specificity (the true negative rate). ROC graphs are two-dimensional graphs in
which the true positive rate is plotted on the Y axis and the false positive rate
on the X axis. An ROC graph depicts the relative trade-off between benefits
(true positives) and costs (false positives).
According to John Peter et al. (2012), a confusion matrix displays the number
of correct and incorrect predictions made by the model compared with the actual
classifications in the test data. The matrix is n-by-n, where n is the number of
classes, and from it the accuracy of each classification algorithm is calculated.
Table 5.1 A simple confusion matrix table

                              Predicted class
                        C1                  C2
Actual    C1      True positives      False negatives
Class     C2      False positives     True negatives

Table 5.1 shows a confusion matrix for a two-class classification problem.


Thuy Thi Thu Nguyen (2009) described the confusion matrix as follows: if an
instance is positive and it is classified as positive, it is counted as a TP; if
it is classified as negative, it is counted as an FN. If an instance is negative
and it is classified as negative, it is counted as a TN; if it is classified as
positive, it is counted as an FP. When measuring the performance of a medical
test, true positives and true negatives are the correct classifications. For
example, some people have the disease, and the test says they are positive; they
are called true positives. Some have the disease, but the test claims they do
not; they are called false negatives. Some do not have the disease, and the test
says they do not: true negatives. Finally, there might be healthy people who
have a positive test result: false positives.
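The four counts described above can be sketched in a few lines of Python. This is an illustrative example only, not code from this work; the labels and records are hypothetical.

```python
# Illustrative sketch: counting TP, FN, TN and FP for a binary
# "disease" test, following the definitions above.

def confusion_counts(actual, predicted, positive="sick"):
    """Return (TP, FN, TN, FP) for binary labels."""
    tp = fn = tn = fp = 0
    for a, p in zip(actual, predicted):
        if a == positive:
            if p == positive:
                tp += 1   # sick person classified as sick
            else:
                fn += 1   # sick person classified as healthy
        else:
            if p == positive:
                fp += 1   # healthy person classified as sick
            else:
                tn += 1   # healthy person classified as healthy
    return tp, fn, tn, fp

actual    = ["sick", "sick", "healthy", "healthy", "sick"]
predicted = ["sick", "healthy", "healthy", "sick", "sick"]
print(confusion_counts(actual, predicted))  # (2, 1, 1, 1)
```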

This analysis can help prevent diabetic patients from being affected by heart
disease, thereby resulting in lower mortality rates as well as reduced healthcare
costs for the state. SVMs have proven to be a classification technique with
excellent predictive performance, and they have been investigated here with the
help of ROC curves for both training and testing data.
The confusion matrix indicating the accuracy of the SVM classifier for the
given data set is shown in Table 5.2.
Table 5.2 The confusion matrix of the SVM classifier

                True low    True high    Class precision
pred. low          355          24            93.67%
pred. high           3         118            97.52%
class recall     99.16%      83.10%

Overall accuracy: 94.60% +/- 2.01% (micro: 94.60%)

From the results obtained, it can be seen that the classifier exhibits a very
high classification accuracy, i.e., 94.60% overall. It also shows a very high
precision for the positive class (97.52%), and the recall of the positive class
is quite good (83.10%). For the negative class, the classifier exhibits high
precision (93.67%) as well as high recall (99.16%).
The possibility of diagnosing heart disease vulnerability in diabetic patients
with reasonable accuracy has thus been shown. Classifiers of this kind can help
in the early detection of a diabetic patient's vulnerability to heart disease,
so that patients can be forewarned to change their lifestyles. Hence, this SVM
model can be recommended for the classification of the diabetic dataset.
5.3 COMPARING SUPPORT VECTOR MACHINE AND DECISION TREE
EXPERIMENTAL RESULTS
In the third experiment, RapidMiner has been used as the tool due to its
learning operators and operator framework, which allow nearly arbitrary
processes to be formed. Apart from accuracy, the ROC, AUC and lift chart are
determined for measuring the performance.
5.3.1 ROC / AUC and Performance for the SVM classifier
In data mining and association rule learning, lift is a measure of the
performance of a targeting model at predicting or classifying cases as having an
enhanced response, measured against a random-choice targeting model. Lift is
simply the ratio of target response to average response. This operator creates a
lift chart based on a plot of the discretized confidence values for the given
example set and model.
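The lift ratio described above can be sketched as follows. The counts used here are hypothetical illustrative numbers, not figures from this experiment.

```python
# Hedged sketch (not RapidMiner code): lift as the ratio of the response
# rate within a model-targeted segment to the average response rate.

def lift(targeted_responses, targeted_total, all_responses, all_total):
    """Lift = target response rate / average response rate."""
    target_rate = targeted_responses / targeted_total
    average_rate = all_responses / all_total
    return target_rate / average_rate

# Hypothetical numbers: the model's top segment contains 50 patients of
# whom 20 are high-risk, while 100 of 1000 patients are high-risk overall.
print(lift(20, 50, 100, 1000))  # 4.0
```

A lift of 4.0 means the targeted segment contains four times as many responders as random selection would.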
The AUC of a classifier is equivalent to the probability that the classifier
will rank a randomly chosen positive instance higher than a randomly chosen
negative instance, and it is also used for comparing classifiers. However, the
ROC and AUC here use a single training and testing pair.
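The rank interpretation of the AUC stated above can be computed directly by comparing every positive score against every negative score. The scores below are hypothetical classifier confidences, not values from this experiment.

```python
# Illustrative sketch of AUC as the probability that a randomly chosen
# positive instance receives a higher score than a randomly chosen
# negative instance.

def auc_rank(pos_scores, neg_scores):
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))

pos = [0.8, 0.4]   # scores of actual positives
neg = [0.6, 0.2]   # scores of actual negatives
print(auc_rank(pos, neg))  # 0.75
```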
The value of C has been set to 5 and γ to 1 for the SVM to operate with the
RBF (Radial Basis Function) kernel type, using the C-SVC type, which is the
standard regularized algorithm.

A common method is to calculate the area under the ROC curve, abbreviated AUC.
Since the AUC is a portion of the area of the unit square, its value will always
be between 0 and 1.0. However, because random guessing produces the diagonal
line between (0, 0) and (1, 1), which has an area of 0.5, no realistic
classifier should have an AUC less than 0.5. The AUC measures the discriminating
ability of a binary classification model: the larger the AUC, the higher the
likelihood that an actual positive case will be assigned a higher probability of
being positive than an actual negative case.
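The RBF kernel used by the C-SVC model above can be sketched as a small function, with γ = 1 as in the experiment. The vectors here are hypothetical feature vectors, not patient records from the dataset.

```python
# Hedged sketch of the RBF (Radial Basis Function) kernel:
# k(x, y) = exp(-gamma * ||x - y||^2), with gamma = 1.
import math

def rbf_kernel(x, y, gamma=1.0):
    """Similarity of two feature vectors under the RBF kernel."""
    sq_dist = sum((xi - yi) ** 2 for xi, yi in zip(x, y))
    return math.exp(-gamma * sq_dist)

a = [1.0, 2.0]
print(rbf_kernel(a, a))                       # 1.0: identical points
print(round(rbf_kernel(a, [2.0, 2.0]), 4))    # exp(-1), about 0.3679
```

The kernel decays with squared distance, so nearby patient records receive high similarity and distant ones similarity near zero.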

Figure 5.3 Area Under Curve of the SVM classifier

The result of the area under the curve for the SVM classifier in the RapidMiner
tool is shown in Figure 5.3. The red curve indicates the ROC and the blue curve
indicates the ROC threshold.


The performance of the SVM classifier indicating the accuracy with two
classes high and low for the given data set is shown in Table 5.3.
Table 5.3 Performance of SVM with an accuracy of 79.8% (High and Low)

                True low    True high    Class precision
pred. low          755         209            78.32%
pred. high           0          36           100.00%
class recall    100.00%      14.69%

5.3.2 ROC / AUC and Performance for the Decision Tree

Figure 5.4 Area Under Curve of the Decision Tree


The result of the area under the curve for the decision tree in the RapidMiner
tool is shown in Figure 5.4. In the case of the decision tree the AUC is 0.907,
whereas the AUC of the support vector machine is 0.710.


The performance of the decision tree indicating the accuracy with two
classes high and low using information gain as split parameter for the given data set
is shown in Table 5.4.
Table 5.4 Performance of Decision tree with an accuracy of 89.2% (High and
Low) using Information gain as split parameter

                True low    True high    Class precision
pred. low          654           7            98.94%
pred. high         101         238            70.21%
class recall     86.62%      97.14%

As shown in Table 5.5, of the two models, the decision tree appears to be the
more effective, as it has the higher percentage of correct predictions (89.2%)
for patients with heart disease, followed by the support vector machine.
Table 5.5 Accuracy of SVM and Decision Tree (High and Low)

Technique                 Accuracy in percentage
Decision tree                    89.20
Support Vector Machine           79.80

The use of RapidMiner has simplified the efforts on k-fold validation and the
generation of the AUC and ROC, which has helped in properly evaluating the
performance of the learning models, so that the best classifier can be
identified and further refined for better prediction.
5.4 COMPARING NAIVE BAYES, SUPPORT VECTOR MACHINE AND
DECISION TREE EXPERIMENTAL RESULTS
In the final experiment, RapidMiner has again been used as the tool for
evaluating and comparing the three classification techniques, using three
classes (high, medium and low) with the diabetic patient dataset, to determine
possible ways to predict the risk of heart disease for diabetic patients.
In general, the Bayes theorem formula is

P(h|D) = P(D|h) P(h) / P(D)

where

P(h)   - prior probability of hypothesis h
P(D)   - prior probability of training data D
P(h|D) - probability of h given D
P(D|h) - probability of D given h

The Naive Bayes algorithm uses the Bayes formula to calculate the probability
of a patient record Y having the class label Cj, where the label can be High,
Medium or Low:

P(label = Cj | Y) = P(Y | label = Cj) * P(Cj) / P(Y)
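The classification rule above can be sketched in a few lines. This is an illustrative example, not the thesis implementation; the priors and likelihoods below are hypothetical numbers chosen only to show the normalization.

```python
# Hedged sketch of the Bayes rule above: the posterior P(Cj | Y) is
# P(Y | Cj) * P(Cj) / P(Y), where P(Y) normalizes over all classes.

priors = {"High": 0.2, "Medium": 0.3, "Low": 0.5}          # P(Cj)
likelihoods = {"High": 0.09, "Medium": 0.06, "Low": 0.01}  # P(Y | Cj)

# Unnormalized posteriors P(Y | Cj) * P(Cj); their sum is P(Y).
unnormalized = {c: likelihoods[c] * priors[c] for c in priors}
evidence = sum(unnormalized.values())                      # P(Y)
posterior = {c: unnormalized[c] / evidence for c in unnormalized}

for label, p in posterior.items():
    print(label, round(p, 3))
```

The posteriors always sum to 1, and the record is assigned the label with the largest posterior.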


5.4.1 Naïve Bayes distribution graphs and tables


In the Naïve Bayes classification technique, it is assumed that the probability
distribution for a continuous attribute follows a normal (Gaussian) distribution.

Figure 5.5 Bayes distribution plot by age attribute

Age, sex, smoking, alcohol and cholesterol HDL have been taken as the prime
attributes to evaluate Naïve Bayes, and their plots are shown in Figures 5.5,
5.6, 5.7, 5.8 and 5.9 respectively. In Figure 5.5, the X axis denotes age and
the Y axis denotes the density. At the age of 52 the risk is high, at the age of
54 the risk is medium, and at the age of 55 the risk is low.
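The Gaussian class-conditional density underlying these plots can be sketched as follows, using the per-class mean and standard deviation of the age attribute reported in Table 5.6 (Low: 54.92/12.03, Medium: 53.62/11.90, High: 51.24/11.33). This is an illustrative sketch, not the RapidMiner computation.

```python
# Hedged sketch of the Gaussian density Naive Bayes assigns to a
# continuous attribute such as age.
import math

def gaussian_pdf(x, mean, std):
    """Normal density N(x; mean, std)."""
    coeff = 1.0 / (std * math.sqrt(2.0 * math.pi))
    return coeff * math.exp(-0.5 * ((x - mean) / std) ** 2)

# Per-class (mean, standard deviation) of age, from Table 5.6.
age_params = {"Low": (54.92, 12.03), "Medium": (53.62, 11.90),
              "High": (51.24, 11.33)}

# Class-conditional density of observing a patient aged 52.
for label, (mean, std) in age_params.items():
    print(label, round(gaussian_pdf(52, mean, std), 5))
```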

Table 5.6 Naïve Bayes distribution table for age attribute

Attribute    Parameter             Low      Medium    High
Age          Mean                 54.92     53.62    51.24
Age          Standard deviation   12.03     11.90    11.33

Similarly, the distribution tables for the above five attributes are shown in
Tables 5.6, 5.7, 5.8, 5.9 and 5.10. In Figure 5.6, the X axis denotes sex and
the Y axis denotes the density.

Figure 5.6 Bayes distribution plot by sex attribute


Table 5.7 Naïve Bayes distribution table for sex attribute

Attribute    Parameter    Low     Medium    High
Sex          value = M    0.61     0.55     0.49
Sex          value = F    0.40     0.45     0.50

Figure 5.7 Bayes distribution plot by smoking attribute


Table 5.8 Naïve Bayes distribution table for smoking attribute

Attribute    Parameter          Low      Medium    High
Smoking      value = yes        0.163    0.151    0.173
Smoking      value = - (no)     0.485    0.459    0.534
Smoking      value = unknown    0.350    0.389    0.293

Figure 5.8 Bayes distribution plot by alcohol attribute


Table 5.9 Naïve Bayes distribution table for alcohol attribute

Attribute    Parameter          Low     Medium    High
Alcohol      value = yes        0.09     0.08     0.04
Alcohol      value = - (no)     0.56     0.48     0.62
Alcohol      value = unknown    0.35     0.43     0.35

Figure 5.9 Bayes distribution plot by cholesterol HDL attribute



Table 5.10 Naïve Bayes distribution table for cholesterol HDL attribute

Attribute      Parameter            Low     Medium    High
Cholesterol    Mean                 3.92     4.38     4.89
Cholesterol    Standard deviation   0.78     0.86     0.97

5.4.2 Naive Bayes, Support Vector Machine, Decision Tree performances


In order to validate the final results of the research presented, experiments
were carried out by applying the three techniques, and the performances of the
Bayes theorem, SVM and decision tree classifiers are shown in Tables 5.11, 5.12
and 5.14 respectively.

Table 5.11 Performance of Bayes Theorem with an accuracy of 81.58%

                True low    True medium    True high    Class precision
pred. low          631           62            21            88.38%
pred. medium        39           98            26            60.12%
pred. high          11           25            86            70.49%
class recall     92.66%       52.97%        64.66%


Table 5.12 Performance of Support vector machine with an accuracy of 61.26%

                True low    True medium    True high    Class precision
pred. low          398           25            15            90.87%
pred. medium       283          155            59            31.19%
pred. high           0            5            59            92.19%
class recall     58.44%       83.78%        44.36%

The decision tree has been tried using various split criteria, namely gain
ratio, information gain and Gini index, which give different levels of accuracy
as shown in Table 5.13.

Table 5.13 Accuracy by split methods using decision tree

Split method criteria    Accuracy in percentage    Classification error in percentage
Gain ratio                      88.19                         11.81
Information gain                90.79                          9.21
Gini Index                      87.69                         12.31


For classification problems, it is natural to measure a classifier's
performance in terms of the error rate. The classifier predicts the class of
each instance; if the prediction is correct, it is counted as a success, and if
not, as an error. The error rate is the proportion of errors made over the whole
set of instances, and it measures the overall performance of the classifier.
Using information gain as the split parameter in the decision tree, the results
are exhibited in terms of average precision and recall, and the accuracy of this
technique was found to be 90.79%.
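The error-rate relation stated above is simply the complement of the accuracy, as a quick check against Table 5.13 shows (the instance count of 10000 below is a hypothetical illustration, not the dataset size).

```python
# Error rate = errors / total instances = 1 - accuracy.

def error_rate(errors, total):
    return errors / total

accuracy = 0.9079                    # information-gain tree, Table 5.13
print(round(1.0 - accuracy, 4))      # 0.0921, i.e. the 9.21% error
print(error_rate(921, 10000))        # 0.0921 for hypothetical counts
```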

Table 5.14 Performance of Decision tree with an accuracy of 90.79% using
information gain as split parameter

                   pred. low    pred. medium    pred. high    class recall
True low              657            18              6           96.48%
True medium            25           148             12           80.00%
True high              11            20            102           76.69%
class precision     94.81%        79.57%         85.00%

Niyati Gupta et al. (2013) have defined accuracy as the proportion of
instances that are correctly classified. It is calculated as the total number of
correctly predicted high-risk (true positive) and correctly predicted low-risk
(true negative) instances over the total number of classifications:

Accuracy = (TP + TN) / (TP + TN + FP + FN)

For a multiclass classification problem, TP, FP, TN and FN are defined per
class: TPi, FPi, TNi and FNi for each class i. From these, parameters such as
the true positive rate (TPR), precision and F-measure for each class, as well
as the overall accuracy, can be calculated.
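The per-class definitions can be sketched against the Naïve Bayes confusion matrix of Table 5.11 (rows are predicted low/medium/high, columns are true low/medium/high). This is an illustrative sketch, not the tool's own computation.

```python
# Per-class TPi, FPi, FNi, TNi and overall accuracy from a multiclass
# confusion matrix; counts below are those of Table 5.11.

matrix = [
    [631, 62, 21],   # pred. low
    [39,  98, 26],   # pred. medium
    [11,  25, 86],   # pred. high
]

def per_class_counts(m, i):
    """Return (TPi, FPi, FNi, TNi) for class index i."""
    total = sum(sum(row) for row in m)
    tp = m[i][i]
    fp = sum(m[i][j] for j in range(len(m)) if j != i)  # predicted i, wrong
    fn = sum(m[j][i] for j in range(len(m)) if j != i)  # true i, missed
    tn = total - tp - fp - fn
    return tp, fp, fn, tn

accuracy = sum(matrix[i][i] for i in range(3)) / sum(sum(r) for r in matrix)
print(round(accuracy, 4))            # 0.8158, the 81.58% of Table 5.11
print(per_class_counts(matrix, 0))   # (631, 83, 50, 235) for class "low"
```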

Table 5.15 Accuracy of various classification techniques (High, Medium, Low)

Technique                 Accuracy in percentage
Decision tree                    90.79
Naïve Bayes                      81.58
Support Vector Machine           61.26

As shown in Table 5.15, of the three models, the decision tree appears to be
the most effective, as it has the highest percentage of correct predictions
(90.79%) for patients with heart disease, followed by Naïve Bayes and the
support vector machine. The performance in terms of accuracy, precision,
sensitivity, specificity and F-score is shown graphically in Figures 5.10,
5.11, 5.12, 5.13 and 5.14 respectively.

Figure 5.10 Performance in terms of accuracy



When more than two classes are dealt with, accuracy alone might not be
sufficient. So precision, sensitivity, specificity and F-score have been
evaluated along with accuracy to determine the right classifier.
According to Sheik Abdullah et al. (2012), precision is the fraction of
retrieved instances that are relevant, and recall is the fraction of relevant
instances that are retrieved.
The precision can be calculated as

Precision = TP / (TP + FP)

However, the TP rate alone is not sufficient to fully measure the performance
of the classifier in a single class. Therefore the precision for class i is
computed as

Precision_i = TP_i / (TP_i + FP_i)

The sensitivity is the proportion of positive instances that are correctly
classified as positive (e.g. the proportion of sick people that are classified
as sick); it is also called recall. It can be calculated as

Sensitivity = TP / (TP + FN)

The specificity is the proportion of negative instances that are correctly
classified as negative (e.g. the proportion of healthy people that are
classified as healthy). It can be calculated as

Specificity = TN / (TN + FP)

F-score or F-measure is a measure of a test's accuracy; it is the harmonic
mean of precision and recall, which can be calculated as

F-score = 2 * (Precision * Recall) / (Precision + Recall)
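The four formulas above translate directly into small functions. The counts used in the example are hypothetical, not values from the experiments.

```python
# Illustrative sketch of the precision, sensitivity, specificity and
# F-score formulas given above.

def precision(tp, fp):
    return tp / (tp + fp)

def sensitivity(tp, fn):          # also called recall
    return tp / (tp + fn)

def specificity(tn, fp):
    return tn / (tn + fp)

def f_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)

tp, fp, fn, tn = 80, 20, 10, 90   # hypothetical binary counts
p = precision(tp, fp)             # 0.8
r = sensitivity(tp, fn)           # 80/90, about 0.889
print(round(f_score(p, r), 3))    # 0.842
```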


The F-measure for class i can also be computed as

F_i = 2 * (Precision_i * TP_rate_i) / (Precision_i + TP_rate_i)

The risks calculated using the various classification techniques are shown in
Table 5.16, and a detailed per-class analysis of performance is shown in
Table 5.17.
Table 5.16 Performance of sensitivity, specificity and F-Score

Classification            Accuracy    Precision    Sensitivity    Specificity    F-Score
Decision tree               90.79       0.86          0.84           0.93          0.84
Naïve Bayes                 81.58       0.72          0.69           0.85          0.70
Support Vector Machine      61.26       0.71          0.61           0.80          0.65

Figure 5.11 Performance in terms of precision



Figure 5.12 Performance in terms of sensitivity

Figure 5.13 Performance in terms of specificity



Figure 5.14 Performance in terms of F-Score

Table 5.17 Performance of sensitivity, specificity and precision by Class

Classification           Overall        Precision per class     Recall (Sensitivity)    Specificity per class
models                   accuracy       Low    Med    High      Low    Med    High      Low    Med    High
                         in percent
Decision tree              90.79        0.94   0.79   0.85      0.96   0.80   0.76      0.87   0.95   0.97
Naïve Bayes                81.58        0.88   0.60   0.70      0.92   0.52   0.64      0.68   0.92   0.95
Support Vector Machine     61.26        0.90   0.31   0.92      0.58   0.83   0.44      0.84   0.57   0.99


5.5 SUMMARY
The main focus of this chapter is on the application of three different data
mining algorithms, namely Naïve Bayes, support vector machine and decision tree,
to the diabetes dataset to predict the risk of heart disease based on their
predictive accuracy. A comparison of the outcomes of the various classification
techniques has been made, and the decision tree is found to have the highest
degree of accuracy. The performances are compared through accuracy, sensitivity,
specificity and F-score. For future research, stacking techniques can be used to
increase the accuracy of decision trees and reduce the number of leaf nodes.
