International Journal of Information Technology (IJIT) Volume 3 Issue 2, Mar - Apr 2017

RESEARCH ARTICLE OPEN ACCESS

Improved Heart Disease Prediction Using Data Mining Techniques


S.Sharmila [1], M.P.Indragandhi [2]
Research Scholar [1], Department of Computer Science, Mother Teresa Womens University, Kodaikanal
Assistant Professor [2], Department of Computer Science, Mother Teresa Womens University, Kodaikanal
Tamil Nadu - India

ABSTRACT
Data mining is a highly effective field for health care organizations. It is used to discover knowledge from different
datasets and to find hidden information in data. Heart disease directly affects the heart and is linked to conditions and
habits such as smoking and obesity. Data mining reduces manual tasks on medical datasets and improves efficiency. The main
objective of this paper is to improve Naive Bayes performance.
Keywords:- Data mining, Naive Bayes, Decision tree.

I. INTRODUCTION
Data mining is the process of extracting information from
large datasets. As an analytical tool, it can be used to
predict heart disease. It is used to maintain electronic data
and can also reduce manual tasks. Medical data mining is
critical and sensitive. Prediction plays an important role in
medical science. The health care industry is the backbone of
our society. It holds large datasets about patients,
generated from electronic patient records and devices. Data
mining provides a set of techniques and tools that are
applied to these datasets to find hidden information. Heart
disease is a major problem in human life. Different data
mining techniques are used to identify and predict heart
disease.

A. Heart Disease
The heart is the main organ of the body. If the heart does
not work properly, it affects other parts of the body such as
the brain and kidneys. Life depends on the proper working of
the heart. Heart disease is a main cause of death in
developing countries due to work culture, changes in
lifestyle and food habits. According to the survey, India has
the highest number of people affected by heart disease
compared with other countries. Most of the people who died
had experienced symptoms. Medical practitioners need to
predict heart disease before it occurs in patients. The main
factors of heart disease are:

Smoking
Lack of physical exercise
High blood pressure
High cholesterol
Poor diet
High sugar

II. DATA SOURCE
The dataset used in this research work is collected from the
UCI Machine Learning Repository, a repository of databases
and data generators. It contains only 14 attributes:

1. Age
2. Sex
3. Chest pain type
4. Resting blood pressure
5. Serum cholesterol
6. Fasting blood sugar
7. Resting electrocardiographic result
8. Maximum heart rate achieved
9. Exercise-induced angina
10. Old peak
11. Slope
12. Number of major vessels
13. Thal
14. Class

III. TOOL
Weka (Waikato Environment for Knowledge Analysis) was
developed at the University of Waikato, New Zealand. It is
popular software for machine learning algorithms. Weka is
free software implemented in Java. It contains tools for data
pre-processing, classification, regression, clustering,
association rules and visualization. It also supports the
development of new machine learning algorithms.

IV. DATA MINING ALGORITHM
Data mining uses two strategies: supervised and unsupervised
learning. Each data mining algorithm serves a different
purpose. The two common objectives are classification and
prediction. Classification models predict discrete values.
Prediction models predict continuous values. Classification
algorithms are decision tree and neural network. Prediction
algorithms are regression, association rules and clustering.

ISSN: 2454-5414 www.ijitjournal.org Page 38
A. Naive Bayes
Naive Bayes is a machine learning and data mining method.
It analyzes the contribution of each attribute
independently and uses it to create a model with
predictive capabilities. It can be viewed as both a
descriptive and a predictive algorithm. This approach can
also handle missing values.
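
The independence assumption and the treatment of missing values can be illustrated with a minimal sketch. It is not the Weka implementation used in this work: the toy records, the attribute meanings (smoker, blood pressure), the Laplace smoothing and the choice to skip missing values (`None`) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)   # (attribute index, class) -> value counts
    distinct_values = defaultdict(set)    # attribute index -> observed values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            if value is None:             # missing value: leave it out of the counts
                continue
            value_counts[(i, label)][value] += 1
            distinct_values[i].add(value)
    return class_counts, value_counts, distinct_values

def predict(model, row):
    """Return the class with the highest unnormalised posterior score."""
    class_counts, value_counts, distinct_values = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total             # prior P(class)
        for i, value in enumerate(row):
            if value is None:             # missing value: skip this attribute
                continue
            counts = value_counts[(i, label)]
            k = len(distinct_values[i]) or 1
            # Laplace-smoothed estimate of P(value | class); attributes are
            # multiplied independently, which is the "naive" assumption
            score *= (counts[value] + 1) / (sum(counts.values()) + k)
        scores[label] = score
    return max(scores, key=scores.get)

# Hypothetical toy records: (smoker, blood pressure); None marks a missing value
toy_rows = [("yes", "high"), ("yes", "normal"), ("no", "high"),
            ("no", "normal"), ("yes", None), ("no", "normal")]
toy_labels = ["disease", "disease", "disease", "healthy", "disease", "healthy"]

model = train_naive_bayes(toy_rows, toy_labels)
print(predict(model, ("yes", "high")))
print(predict(model, ("no", None)))
```

Because each attribute contributes its own factor, a record with a missing attribute can still be scored from the remaining ones.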
B. Decision Tree
The decision tree is a simple and popular algorithm. It
does not require domain knowledge or parameter setting. It
can handle multidimensional data. The resulting tree can,
however, suffer from repetition and replication. Our
research work uses the J48 decision tree for classification.
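
J48 is Weka's implementation of the C4.5 algorithm, which selects the splitting attribute at each node with an entropy-based criterion (gain ratio). The sketch below computes plain information gain, the quantity that gain ratio builds on, for hypothetical toy records. It only illustrates the idea of split selection and is not the J48 code.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())

def information_gain(rows, labels, attribute_index):
    """Entropy reduction obtained by splitting on one attribute."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute_index], []).append(label)
    # weighted entropy of the subsets produced by the split
    remainder = sum(len(g) / len(labels) * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

def best_attribute(rows, labels):
    """Index of the attribute with the largest information gain."""
    return max(range(len(rows[0])),
               key=lambda i: information_gain(rows, labels, i))

# Hypothetical toy records: (smoker, blood pressure)
split_rows = [("yes", "high"), ("yes", "normal"), ("no", "high"), ("no", "normal")]
split_labels = ["disease", "disease", "healthy", "healthy"]

print(best_attribute(split_rows, split_labels))
```

In this toy data the first attribute separates the classes perfectly, so it would be chosen for the root node.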

Fig. 1 Implementation step

V. COMPARISON OF ALGORITHM
The algorithms are applied in the same experiment and the
best one is selected based on accuracy, TP rate, FP rate,
and the time to build a model.

Table I. COMPARISON OF VARIOUS ALGORITHMS

Algorithm        Accuracy (%)
Naive Bayes      83.7
Decision Tree    76.6

The table above presents the accuracy of the algorithms
on the heart disease data. The Naive Bayes classifier
produces 83.7% accuracy. The decision tree algorithm
produces 76.6% accuracy.

Fig. 2 Comparison of various Algorithms

The other measurements are the TP rate, FP rate, and time
for each algorithm. For Naive Bayes, the TP rate and FP rate
were 0.837 and 0.172, and the time taken was 0.03 seconds.
For the decision tree, the TP rate and FP rate were 0.767
and 0.240, and the time taken was 0.09 seconds.
Naive Bayes scores the highest TP rate, precision and
recall, the lowest FP rate, and the fastest execution time
compared with the other algorithm. Based on this result, we
select the Naive Bayes classifier to predict heart disease.

VI. PROPOSED SYSTEM

Fig. 3 Proposed method implementation

The proposed system follows a step-by-step process:
1. The dataset contains patient details.
2. Select the attributes for predicting heart disease.
3. Pre-processing methods are applied to the dataset.
4. The Naive Bayes classifier is applied to the
   pre-processed data to predict heart disease.
5. Measure the accuracy of the Naive Bayes classifier.
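
The measures used to compare the classifiers (accuracy, TP rate, FP rate) can all be derived from a confusion matrix. The following sketch shows the arithmetic for a two-class problem. The counts are hypothetical and are not the confusion matrix of this experiment, which the paper does not report.

```python
def binary_metrics(tp, fn, fp, tn):
    """Standard measures from the four cells of a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,    # share of correctly classified records
        "tp_rate": tp / (tp + fn),        # recall of the positive class
        "fp_rate": fp / (fp + tn),        # negatives wrongly flagged as positive
        "precision": tp / (tp + fp),      # share of predicted positives that are right
    }

def weighted_tp_rate(tp, fn, fp, tn):
    """TP rate averaged over both classes, weighted by class size."""
    total = tp + fn + fp + tn
    positive_recall = tp / (tp + fn)
    negative_recall = tn / (tn + fp)
    return (positive_recall * (tp + fn) + negative_recall * (tn + fp)) / total

# Hypothetical counts for 100 patients
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
print(weighted_tp_rate(tp=40, fn=10, fp=5, tn=45))
```

The class-weighted TP rate always equals the overall accuracy. If the reported TP rates are weighted averages over both classes (an assumption, since the paper does not say), this would explain why the Naive Bayes TP rate of 0.837 mirrors its 83.7% accuracy.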

VII. CONCLUSION AND FUTURE WORK
Our research is mainly focused on data mining in health
care. Heart disease is a main cause of death. The dataset
is available online in the UCI repository. The Naive Bayes
and decision tree classification algorithms are applied to
the same dataset and compared on accuracy, TP rate, FP rate
and execution time. The Weka tool supports datasets in ARFF
format. The aim is to improve the accuracy of the Naive
Bayes algorithm.
In future work, we aim to increase the accuracy and improve
the performance of the Naive Bayes classifier by removing
irrelevant attributes from the dataset.
