Performance Analysis of Missing Values Imputation Methods Using Machine Learning Techniques

Abstract

Real-world data often contain missing values. Data mining techniques have been actively used to address this problem by imputing the missing values. In particular, handling missing data is an important pre-processing task before applying any classification model, because it affects the quality of the classification results. An appropriate imputation method can produce a complete dataset and thereby improve the classifier's performance. Many approaches to handling missing values have been proposed in machine learning and data mining, and imputation techniques can be divided into single and multiple methods. Examples include random forests, CART, k-NN imputation, and the mean method, as well as removing attributes and observations and predicting missing values with Multivariate Imputation by Chained Equations (MICE). In this study, various approaches to treating missing values were applied with different decision tree algorithms to investigate how these techniques can be used effectively to improve the performance of the selected classifiers. The Stroke data set was used in the experiments to check how well the methods of handling missing values work. Moreover, the paper reports how data imputation methods affect classification results. The best results are obtained when the variables that have more missing data than the other attributes are removed before classification. This work analyses the chosen techniques to investigate their strengths and weaknesses in handling missing values, and reports that both imputation methods (MIM and MICE) are efficient and yield similar accuracy.
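To make the strategies named in the abstract concrete, the following is a minimal sketch in plain Python of three of them: removing the attributes with the most missing data, mean imputation, and k-NN imputation. It is not the authors' code and does not use the Stroke data set. The function names, the toy rows, the 0.5 missing-fraction threshold, and the choice of k = 2 are all assumptions made for illustration. MICE is not implemented here; it would iteratively model each incomplete attribute from the others.

```python
from statistics import mean

# Missing values are represented as None (an assumption of this sketch).

def drop_sparse_columns(rows, max_missing_frac):
    """Keep only columns whose fraction of missing entries is <= the threshold."""
    n = len(rows)
    keep = [j for j in range(len(rows[0]))
            if sum(r[j] is None for r in rows) / n <= max_missing_frac]
    return [[r[j] for j in keep] for r in rows], keep

def mean_impute(rows):
    """Replace each None with the mean of the observed values in its column.
    Assumes every column has at least one observed value."""
    col_means = [mean(r[j] for r in rows if r[j] is not None)
                 for j in range(len(rows[0]))]
    return [[col_means[j] if v is None else v for j, v in enumerate(r)]
            for r in rows]

def knn_impute(rows, k=2):
    """Replace each None with the column mean over the k nearest donor rows.
    Distance is the root mean squared difference over features observed in
    both rows; donors must have the target column observed."""
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b)
                  if x is not None and y is not None]
        if not shared:
            return float("inf")
        return (sum((x - y) ** 2 for x, y in shared) / len(shared)) ** 0.5

    result = []
    for i, row in enumerate(rows):
        new_row = list(row)
        for j, v in enumerate(row):
            if v is None:
                donors = sorted(
                    (distance(row, other), other[j])
                    for m, other in enumerate(rows)
                    if m != i and other[j] is not None
                )[:k]
                new_row[j] = mean(val for _, val in donors)
        result.append(new_row)
    return result

# Hypothetical toy data: the third attribute is missing in 3 of 4 rows.
rows = [
    [1.0, 10.0, None],
    [2.0, None, None],
    [3.0, 30.0, 5.0],
    [4.0, 40.0, None],
]

# Step 1: remove attributes that are missing in more than half of the rows.
reduced, kept = drop_sparse_columns(rows, max_missing_frac=0.5)

# Step 2: fill the remaining gaps with two alternative single-imputation methods.
mean_filled = mean_impute(reduced)
knn_filled = knn_impute(reduced, k=2)

print("kept columns:", kept)
print("mean-imputed:", mean_filled)
print("k-NN-imputed:", knn_filled)
```

In an experiment like the one described, each completed dataset would then be passed to the same classifier so that differences in accuracy can be attributed to the missing-value treatment rather than to the model.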

Cite

Rado, O., Fanah, M. A., & Taktek, E. (2019). Performance Analysis of Missing Values Imputation Methods Using Machine Learning Techniques. In Advances in Intelligent Systems and Computing (Vol. 997, pp. 738–750). Springer Verlag. https://doi.org/10.1007/978-3-030-22871-2_51
