Study and Analysis of Prediction Model for Heart Disease Data Using Machine Learning Techniques
- SASTRA University, India
Abstract
Heart disease is the leading cause of death across all population groups in developed countries and a major problem for emerging nations as well. The number of available doctors has not kept pace with the present demand for healthcare, so there is a pressing need for a support system that helps save lives. Drawing on recent ML frameworks and large data repositories, we aim to design a machine learning model that predicts heart disease as early as possible, helps prioritize hospital consultations and improves accuracy. For this study, several analyses were carried out on the Cleveland heart disease data set, which contains 303 patient records, using five different classifiers: Support Vector Machine (SVM), Random Forests, Ordinal Regression, Logistic Regression and Naïve Bayes. Feature selection using the chi-squared statistical test, combined with careful hyperparameter tuning, raised the classification accuracy of the Support Vector Machine (radial basis function kernel) from 40% to 85%. Incorporating rules based on the observed statistical patterns further improved this figure to 95%. Separately, treating the task as a 5-class classification problem, the multi-class imbalance issue was addressed using suitable sampling techniques, which resulted in 96% accuracy on the 5-class data. We evaluated model performance using k-fold cross-validation and the confusion matrix. This study shows that classification accuracy can be significantly improved by balancing the data set through sampling and by properly tuning hyperparameters after feature selection.
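As a rough illustration of two steps named in the abstract, the sketch below scores categorical features with the Pearson chi-squared statistic and balances classes by sampling. It is not the authors' implementation: the abstract does not say which sampling technique was used, so random oversampling is an assumption here, and the function names `chi2_score` and `random_oversample` as well as the toy data are hypothetical and unrelated to the Cleveland records.

```python
import random
from collections import Counter, defaultdict


def chi2_score(feature, labels):
    """Pearson chi-squared statistic for a categorical feature vs. class labels."""
    n = len(labels)
    observed = Counter(zip(feature, labels))  # contingency-table cell counts
    feature_totals = Counter(feature)         # row totals
    label_totals = Counter(labels)            # column totals
    score = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            expected = f_count * c_count / n  # count expected under independence
            score += (observed[(f, c)] - expected) ** 2 / expected
    return score


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes up to the largest class size.

    Assumption: random oversampling stands in for the unspecified
    "suitable sampling techniques" mentioned in the abstract.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    target = max(len(group) for group in by_class.values())
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_rows.extend(group + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels


# Toy, made-up data (NOT the Cleveland data set): two categorical features, 3 classes.
labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
informative = ["a", "a", "a", "a", "a", "a", "b", "b", "b", "c"]
noise = ["x", "y", "x", "y", "x", "y", "x", "y", "x", "y"]

# A higher score means a stronger association between feature and class.
scores = {
    "informative": chi2_score(informative, labels),
    "noise": chi2_score(noise, labels),
}

rows = list(zip(informative, noise))
balanced_rows, balanced_labels = random_oversample(rows, labels)
class_counts = Counter(balanced_labels)
```

Features would then be ranked by score and the top few retained. When sampling is combined with k-fold cross-validation, it is generally applied to the training folds only, so that duplicated rows do not leak into the held-out fold and inflate the measured accuracy.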
DOI: https://doi.org/10.3844/jcssp.2020.344.354
Copyright: © 2020 B. Santhi and K. Renuka. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Keywords
- Machine Learning
- Cardiovascular Disease
- Cardiology
- Support Vector Machine
- Accuracy
- Hyperparameter Optimization
- Sampling Techniques