Comparison of Selected Multiple Imputation Methods for Continuous Variables – Preliminary Simulation Study Results

Małgorzata Aleksandra Misztal

Journal Article, Open Access

Acta Universitatis Lodziensis. Folia Oeconomica (2019) 6(339) 73-98

DOI: 10.18778/0208-6018.339.05

Abstract

The problem of incomplete data and its implications for drawing valid conclusions from statistical analyses is not confined to any particular scientific domain; it arises in economics, sociology, education, behavioural sciences and medicine. Almost all standard statistical methods presume that every object has information on every variable included in the analysis, and the typical approach to missing data is simply to delete the incomplete observations. However, this leads to ineffective and biased analysis results and is not recommended in the literature. The state-of-the-art technique for handling missing data is multiple imputation. In the paper, selected multiple imputation methods were taken into account, with special attention paid to using principal components analysis (PCA) as an imputation method. The goal of the study was to assess the quality of PCA‑based imputations as compared to two other multiple imputation techniques: multivariate imputation by chained equations (MICE) and missForest. The comparison was made by artificially simulating different proportions (10–50%) and mechanisms of missing data in 10 complete data sets from the UCI repository of machine learning databases. Missing values were then imputed with MICE, missForest and the PCA‑based method (MIPCA). The normalised root mean square error (NRMSE) was calculated as a measure of imputation accuracy. On the basis of the conducted analyses, missForest can be recommended as a multiple imputation method providing the lowest rates of imputation errors for all types of missingness. PCA‑based imputation does not perform well in terms of accuracy.
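The evaluation loop described in the abstract (delete values from a complete data set, impute them, score the result with NRMSE) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the study's actual code: the data are synthetic instead of a UCI data set, only a missing-completely-at-random mechanism is simulated, and column-mean imputation stands in as a simple single-imputation baseline in place of MICE, missForest or MIPCA. The NRMSE uses one common normalisation (mean squared error over the deleted entries divided by the variance of their true values); the abstract does not state which normalisation the study used. The helper names `make_mcar_mask`, `mean_impute` and `nrmse` are hypothetical.

```python
import numpy as np


def make_mcar_mask(shape, proportion, rng):
    """Boolean mask, True where a value is deleted completely at random."""
    return rng.random(shape) < proportion


def mean_impute(x_missing):
    """Baseline only: replace each NaN with the observed mean of its column."""
    col_means = np.nanmean(x_missing, axis=0)
    return np.where(np.isnan(x_missing), col_means, x_missing)


def nrmse(x_true, x_imputed, mask):
    """NRMSE over deleted entries: sqrt(MSE / variance of the true values)."""
    diff = x_true[mask] - x_imputed[mask]
    return np.sqrt(np.mean(diff ** 2) / np.var(x_true[mask]))


rng = np.random.default_rng(0)
x_true = rng.normal(size=(200, 4))           # synthetic stand-in for a complete data set
mask = make_mcar_mask(x_true.shape, 0.3, rng)  # roughly 30% of values removed
x_missing = np.where(mask, np.nan, x_true)

x_imputed = mean_impute(x_missing)
score = nrmse(x_true, x_imputed, mask)
print(f"NRMSE of mean imputation: {score:.3f}")
```

Under this normalisation a value of 0 means the deleted entries were recovered exactly, while values near 1 mean the imputation is about as informative as a constant guess, which is what mean imputation on uncorrelated columns tends to produce. Swapping `mean_impute` for another imputation routine would leave the rest of the loop unchanged.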

Cite

Misztal, M. A. (2019). Comparison of Selected Multiple Imputation Methods for Continuous Variables – Preliminary Simulation Study Results. Acta Universitatis Lodziensis. Folia Oeconomica, 6(339), 73–98. https://doi.org/10.18778/0208-6018.339.05
