Heart disease classification comparison among patients and normal subjects using machine learning and artificial neural network techniques

doi:10.15406/ijbsbe.2021.07.00216

International Journal of Biosensors & Bioelectronics

eISSN: 2573-2838

Mini Review Volume 7 Issue 3

Pavan Kota,¹ Aishwarya Madenahalli,² Rachana Guturi,³ BT Nukala,⁴ Sunil Nagaraj,⁴ Santosh Kota,⁴ Purna Chandra Neeli⁵

¹University of North Carolina Greensboro, USA
²VF Corporation, USA
³IEEE Member, USA
⁴Qorvo Inc, USA
⁵TessolveDTS Inc, INDIA

Correspondence: BT Nukala, Qorvo Inc, USA

Received: May 04, 2021 | Published: May 12, 2021

Citation: Kota P, Madenahalli A, Guturi R, et al. Heart disease classification comparison among patients and normal subjects using machine learning and artificial neural network techniques. Int J Biosen Bioelectron. 2021;7(3):77-79. DOI: 10.15406/ijbsbe.2021.07.00216

Abstract

Machine learning (ML) and artificial neural networks (ANN) have been used successfully for classification in many prediction models. These algorithms give good accuracy in applications such as fall detection.¹ In this work, we compared the performance of ML and ANN on the heart disease data available from the Cleveland database.² The data has 76 attributes, but we considered the 13 best features, selected by computing correlations and keeping the features with the best correlation index, to train our ML and ANN models. The data covers 303 persons, and we trained our algorithms on 80% of the data and used the remaining 20% for testing. SVM showed 84% accuracy with a sensitivity of 78.5% and a specificity of 87.8%, whereas ANN gave an accuracy of 87% with a sensitivity of 85% and a specificity of 88.2%. Overall, both ML and ANN give good accuracy in distinguishing people with heart disease from people without it.

Keywords: machine learning, artificial neural network, heart disease, classification

Introduction

Heart disease (HD) is among the most complex and deadliest diseases that affect human beings. In this condition the heart can no longer pump enough blood to sustain the body's functions, which ultimately leads to heart failure.³ According to the American Heart Association, the United States has a very high heart disease rate.⁴ The symptoms of heart disease are shortness of breath, weakness of body parts, swelling of the feet, fatigue, and associated signs caused by functional cardiac or non-cardiac anatomical abnormalities, for example elevated jugular venous pressure and peripheral edema.⁵ Identifying heart disease early has been challenging for investigators, and the results of complex testing have affected patients' standard of living.⁶ In developing countries, the diagnosis and treatment of heart disease are often complex, especially since diagnostic apparatus, doctors, and other resources are often unavailable, which degrades the prediction and treatment of heart patients.⁷ It is essential to reduce the potential risks associated with heart disease and to improve heart safety by diagnosing heart disease risk in patients accurately and properly.⁸ To resolve these complexities in invasive diagnosis of heart disease, various researchers have developed noninvasive medical decision support systems based on machine learning predictive models such as the support vector machine (SVM) and the artificial neural network (ANN).⁹,¹⁰ These systems are widely used for heart disease diagnosis, and with such machine-learning-based expert medical decision systems the rate of heart disease deaths has decreased.¹¹

Many algorithms have been used for prediction models; among them, ML and ANN models train well enough to distinguish successfully between the actual and predicted classes. As stated above, the work in this paper focuses on comparing the performance of ML and ANN on the heart disease dataset available from the Cleveland database. This dataset covers 303 persons and contains attribute information such as age, sex, chest pain type, blood pressure, cholesterol in mg/dl, blood sugar, and maximum heart rate.

Materials and methods

The aim of the proposed classification solution is to design a machine-learning-based intelligent medical decision support system for the diagnosis of heart disease. In the present study, machine learning predictive models, namely ANN and SVM, have been used to classify people with heart disease and healthy people. We used 13 attributes in our training algorithms, of which the 13th attribute is the target value: either 1 (presence of heart disease) or 0 (absence of heart disease). We also used a custom feature, cholesterol/age, as an additional feature in our training set. Data correlation was computed among all 13 attributes to see how strongly each attribute is related to every other attribute. The methodology of the proposed system is structured into stages that include (1) feature selection, (2) the cross-validation method, (3) the machine learning classifiers, and (4) the classifiers' performance evaluation methods. The data correlation among the training attributes is shown in Figure 1. Both SVM and ANN were implemented in Python 7.1.
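The correlation step and the custom cholesterol/age feature described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the Pearson coefficient as the correlation index (the text does not name one), and the six rows of values, the `pearson` helper, and the variable names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy rows for illustration only (NOT taken from the Cleveland data).
age = [63, 37, 41, 56, 57, 44]
chol = [233, 250, 204, 236, 354, 263]  # cholesterol in mg/dl
target = [1, 0, 0, 1, 1, 0]            # 1 = heart disease present, 0 = absent

# Custom feature described in the text: cholesterol divided by age.
chol_per_age = [c / a for c, a in zip(chol, age)]

features = {"age": age, "chol": chol, "chol_per_age": chol_per_age}

# Correlation of every candidate feature with the target attribute.
correlation_with_target = {
    name: pearson(values, target) for name, values in features.items()
}

for name, r in correlation_with_target.items():
    print(f"{name}: r = {r:+.3f}")
```

Ranking the features by the absolute value of `r` would be one simple way to pick the attributes with the best correlation index; the same `pearson` helper applied to every pair of attributes gives a full correlation matrix like the one in Figure 1.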

Figure 1 Correlation matrix of the input attributes set.

Support vector machine

In our SVM, we used a linear kernel function for classification and a training-testing split of 80%-20% on the dataset. Support Vector Machine (SVM) is a supervised learning method for pattern recognition and classification into two categories. SVM-light, one of the implementations of SVM, proposed by Thorsten Joachims,¹²,¹³ is applied to classify patients with heart disease against normal persons. In a linear SVM, a hyperplane or a set of hyperplanes serves as the separating boundary in classification. In general, the larger the margin of separation that can be created between the classes, the better the classification result the linear SVM can achieve.
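To make the role of the hyperplane and the margin concrete, the sketch below evaluates a linear decision function and the width of its margin. It is only an illustration under assumptions: the weight vector `w` and bias `b` are hypothetical two-dimensional values rather than a trained model, and the helper names are ours, not part of SVM-light.

```python
import math


def decision_value(w, b, x):
    """Signed score w.x + b; its sign tells which side of the hyperplane x lies on."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def predict(w, b, x):
    """Return 1 (heart disease) for a non-negative score, otherwise 0 (healthy)."""
    return 1 if decision_value(w, b, x) >= 0 else 0


def margin_width(w):
    """Distance between the hyperplanes w.x + b = +1 and w.x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical separating hyperplane in two dimensions (illustrative values only).
w = [3.0, 4.0]
b = -5.0

print(predict(w, b, [3.0, 4.0]))  # score 3*3 + 4*4 - 5 = 20, so class 1
print(predict(w, b, [0.0, 0.0]))  # score -5, so class 0
print(margin_width(w))            # 2 / ||w|| = 2 / 5 = 0.4
```

Because the margin equals 2/||w||, training a linear SVM amounts to finding the smallest-norm `w` that still separates the classes (up to slack for misclassified points), which is why a larger margin is associated with better generalization.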

Artificial neural network

For training the feed-forward ANN classifier, back propagation (BP) was applied according to Duda et al.,¹⁴ and a 3-layer system was chosen as the standard BP ANN.¹⁵ The input layer of the network has twelve neurons, which correspond to the twelve input feature values. There is one hidden layer with 8 hidden neurons, a number that was optimized by adjusting the hidden layer size, and two output neurons corresponding to the two target classes the network needs to differentiate.
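The 12-8-2 layout described above can be illustrated with a forward pass through a network of that shape. This is a sketch under assumptions, not the trained model: the text does not state the activation functions or the initialization, so the sigmoid hidden units, the softmax output, and the uniform random weights below are our choices, and the back-propagation training step is omitted.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 12, 8, 2  # layer sizes stated in the text


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def softmax(zs):
    """Numerically stable softmax: subtract the maximum before exponentiating."""
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Random weights in [-0.5, 0.5] and zero biases (assumed initialization)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(weights, biases, x):
    """Compute W x + b for one input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, biases)]


def forward(params, x):
    """Input layer -> sigmoid hidden layer -> softmax over the two classes."""
    (w1, b1), (w2, b2) = params
    hidden = [sigmoid(z) for z in affine(w1, b1, x)]
    return softmax(affine(w2, b2, hidden))


rng = random.Random(0)
params = [init_layer(N_INPUT, N_HIDDEN, rng), init_layer(N_HIDDEN, N_OUTPUT, rng)]

x = [rng.random() for _ in range(N_INPUT)]  # hypothetical, already scaled feature vector
probs = forward(params, x)                  # one probability per target class

# Trainable parameters: 12*8 weights + 8 biases + 8*2 weights + 2 biases = 122.
n_parameters = N_INPUT * N_HIDDEN + N_HIDDEN + N_HIDDEN * N_OUTPUT + N_OUTPUT
print(probs, n_parameters)
```

With only 122 trainable parameters the network is small relative to the 303-person dataset, which is one reason a single hidden layer of 8 neurons is a plausible size to tune by trial.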

Results and discussion

After training both the ML and ANN classifiers, we compared them using performance indices such as accuracy, sensitivity, and specificity. The results are shown in Table 1. ANN shows better results than SVM. For both classifiers the training-testing split is 80%-20%, so there are a total of 61 test samples to classify. The indices were calculated for each of the SVM and ANN algorithms from the confusion matrix defined in Figure 2. The confusion matrices for SVM and ANN are shown in Figure 3 and Figure 4. Accuracy, sensitivity, and specificity were calculated using Equations 1 to 3 below.
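The count of 61 test samples follows from the 303-person dataset and the 20% test fraction. The short sketch below shows the arithmetic, assuming the test count is rounded up; the `split_sizes` helper is a hypothetical name introduced for illustration.

```python
import math


def split_sizes(n_samples, test_fraction):
    """Return (n_train, n_test), rounding the test count up (assumed convention)."""
    n_test = math.ceil(n_samples * test_fraction)
    return n_samples - n_test, n_test


n_train, n_test = split_sizes(303, 0.20)
print(n_train, n_test)  # 303 * 0.20 = 60.6, rounded up to 61 test and 242 training samples
```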

Performance index	SVM (%)	ANN (%)
Accuracy	83.61	86.89
Sensitivity	78.57	85.19
Specificity	87.88	88.24

Table 1 Performance comparison of SVM and ANN

Figure 2 Definition of confusion matrix.

Figure 3 Confusion matrix of SVM classifier.

Figure 4 Confusion matrix of ANN classifier.

$\text{Accuracy} = \frac{TP + TN}{\text{Total}}$ Equation 1

$\text{Sensitivity} = \frac{TP}{TP + FN}$ Equation 2

Sensitivity (recall): This measures the proportion of actual members of the positive class that are correctly identified as such. It is also referred to as the true positive rate (TPR). It is defined as the fraction of positive examples predicted correctly by the classification model. Classifiers with high sensitivity misclassify very few positive examples as the negative class.

Specificity: This is also known as the true negative rate. It is defined as the fraction of total negative examples that are predicted correctly by the model/classifier.

$\text{Specificity} = \frac{TN}{TN + FP}$ Equation 3
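As a worked check of Equations 1 to 3, the sketch below recomputes the Table 1 values from confusion-matrix counts. The counts are an assumption: Figures 3 and 4 are not reproduced here, so the values were inferred as integer counts that sum to 61 test samples and reproduce the reported percentages. The function names are ours.

```python
def accuracy(tp, tn, fp, fn):
    """Equation 1: correct predictions over all test samples."""
    return (tp + tn) / (tp + tn + fp + fn)


def sensitivity(tp, fn):
    """Equation 2: true positive rate."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Equation 3: true negative rate."""
    return tn / (tn + fp)


# Inferred counts (assumption, not read from the paper's figures).
inferred_counts = {
    "SVM": {"tp": 22, "fn": 6, "tn": 29, "fp": 4},
    "ANN": {"tp": 23, "fn": 4, "tn": 30, "fp": 4},
}

metrics = {}
for name, c in inferred_counts.items():
    metrics[name] = {
        "accuracy": 100 * accuracy(c["tp"], c["tn"], c["fp"], c["fn"]),
        "sensitivity": 100 * sensitivity(c["tp"], c["fn"]),
        "specificity": 100 * specificity(c["tn"], c["fp"]),
    }
    print(name, {k: round(v, 2) for k, v in metrics[name].items()})
```

Under these inferred counts, the SVM accuracy is 51/61 and the ANN accuracy is 53/61, so the gap between the two classifiers corresponds to two additional correctly classified test samples.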

Conclusion

In this study, a hybrid intelligent machine-learning-based predictive system was proposed for the diagnosis of heart disease. The system was tested on the Cleveland heart disease dataset to classify HD and healthy subjects. Two well-known classifiers, SVM and ANN, were studied with feature selection. Based on our analysis using the performance indices, ANN shows an accuracy of 87% compared to an SVM accuracy of 84%. The three-layer feed-forward neural network has better sensitivity and specificity than the linear SVM, and both the sensitivity and the specificity of the ANN exceed 85%. The diagnosis system is based on two classifiers, one cross-validation method, and performance measurement metrics. Designing a decision support system through a machine-learning-based method will be more suitable for the diagnosis of heart disease. Some irrelevant features slowed down the diagnosis system and increased computation time. In the future, we will perform more experiments to increase the performance of these predictive classifiers for heart disease diagnosis by using other feature selection algorithms and optimization techniques.