A Multicriteria Weighted Vote-Based Classifier Ensemble for Heart Disease Prediction

Article ID: iaor20164046
Volume: 32
Issue: 4
Start Page Number: 615
End Page Number: 645
Publication Date: Nov 2016
Journal: Computational Intelligence
Authors: , ,
Keywords: decision theory: multiple criteria, decision, forecasting: applications, statistics: regression
Abstract:

The availability of large amounts of medical data creates a need for intelligent disease prediction and analysis tools that can extract hidden information. Many data-mining and statistical analysis tools are used for disease prediction. Single data-mining techniques show an acceptable level of accuracy for heart disease diagnosis. This article focuses on the prediction and analysis of heart disease using a weighted vote-based classifier ensemble technique. The proposed ensemble model overcomes the limitations of conventional data-mining techniques by combining five heterogeneous classifiers: naive Bayes, a decision tree based on the Gini index, a decision tree based on information gain, an instance-based learner, and support vector machines. We used five benchmark heart disease data sets taken from the UCI repository. Each data set contains a different set of features that ultimately leads to the prediction of heart disease. The effectiveness of the proposed ensemble classifier is investigated by comparing its performance with techniques reported by other researchers. Tenfold cross-validation is used to handle the class imbalance problem. Moreover, confusion matrices and analysis of variance statistics are used to show the prediction results of all classifiers. The experimental results verify that the proposed ensemble classifier can deal with all types of attributes, and that it achieved a diagnosis accuracy of 87.37%, sensitivity of 93.75%, specificity of 92.86%, and F-measure of 82.17%. An F-ratio higher than the F-critical value and a p-value less than 0.01 at the 95% confidence level indicate that the results are statistically significant for all the data sets.
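To make the idea of weighted voting concrete, the following is a minimal sketch of how the votes of five base classifiers might be combined. It is not the authors' implementation: the abstract does not state how the multicriteria weights are derived, so the function name `weighted_vote`, the classifier names, and all weight values below are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the class label with the largest total weight.

    predictions: dict mapping classifier name -> predicted label
    weights:     dict mapping classifier name -> non-negative weight
    """
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]
    # Ties are broken deterministically in favor of the smallest label.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical weights (assumed values, not taken from the article).
weights = {
    "naive_bayes": 0.9,
    "tree_gini": 0.5,
    "tree_info_gain": 0.5,
    "instance_based": 0.5,
    "svm": 0.9,
}

# Hypothetical predictions for one patient (1 = disease, 0 = no disease).
predictions = {
    "naive_bayes": 1,
    "tree_gini": 0,
    "tree_info_gain": 0,
    "instance_based": 0,
    "svm": 1,
}

print(weighted_vote(predictions, weights))
```

In this made-up case a simple majority vote would pick class 0 (three votes to two), whereas the weighted vote picks class 1 because the two classifiers voting for it carry more total weight (1.8 versus 1.5).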
