Performance Analysis of Various Missing Value Imputation Methods on Heart Failure Dataset

Mohammad Al Khaldy; Chandrasekhar Kambhampati

Book Chapter


Springer (2018), pp. 415–425

DOI: 10.1007/978-3-319-56991-8_31



Abstract

Missing data is a fundamental challenge in the analysis and classification of data. The classification performance on incomplete data can be affected, yielding different accuracy results from those obtained on complete data. In this work we compare six scalable imputation methods, applied to a Heart Failure dataset. The comparison is based on the performance metrics of three classification methods, namely J48, REPTree, and Random Forest. The aim of the research is to find the classifier that achieves the best performance after the missing data have been imputed with the different imputation methods. The results show that, in general, Random Forest achieves the best results in comparison with the decision trees J48 and REPTree. Furthermore, classification performance improved when the missing values were imputed by concept most common (CMC) and support vector machine (SVM).
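The abstract does not define concept most common (CMC) imputation. As an illustration only, the sketch below follows the definition commonly given for it: a missing value is replaced by the most common value (nominal attribute) or the mean (numeric attribute) among records of the same class. The function name `cmc_impute`, the use of `None` as the missing marker, and the toy records are all assumptions made for this example; they are not taken from the chapter or its dataset.

```python
from collections import Counter, defaultdict
from statistics import mean

MISSING = None  # assumed marker for a missing value


def cmc_impute(rows, labels):
    """Return a copy of rows with missing values filled per class.

    Numeric columns use the class mean; nominal columns use the class mode.
    This is a sketch of the usual CMC idea, not the chapter's implementation.
    """
    # Collect the observed (non-missing) values for each (class, column) pair.
    observed = defaultdict(list)
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            if value is not MISSING:
                observed[(label, j)].append(value)

    def fill_value(label, j):
        values = observed[(label, j)]
        if not values:
            raise ValueError(f"no observed values for class {label!r}, column {j}")
        if all(isinstance(v, (int, float)) for v in values):
            return mean(values)
        return Counter(values).most_common(1)[0][0]

    # Build new rows so the input is left unchanged.
    return [
        [fill_value(label, j) if value is MISSING else value
         for j, value in enumerate(row)]
        for row, label in zip(rows, labels)
    ]


# Hypothetical toy records: [numeric attribute, nominal attribute]
rows = [
    [60, "yes"],
    [None, "yes"],
    [70, None],
    [40, "no"],
    [None, "no"],
    [50, "no"],
]
labels = ["dead", "dead", "dead", "alive", "alive", "alive"]

filled = cmc_impute(rows, labels)
```

Because the fill value depends on the class label, this kind of imputation can only be applied where labels are known, which is worth keeping in mind when comparing it against label-free methods.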


Citation (APA)

Al Khaldy, M., & Kambhampati, C. (2018). Performance Analysis of Various Missing Value Imputation Methods on Heart Failure Dataset. In Lecture Notes in Networks and Systems (Vol. 16, pp. 415–425). Springer. https://doi.org/10.1007/978-3-319-56991-8_31
