Improving accuracy of missing data imputation in data mining

  • Ali N
  • Omer Z

Abstract

Raw data in the real world is dirty. Each large data repository contains various types of anomalous values that influence the results of analysis. In data mining, good models usually need good data, yet real-world databases are not always clean and may include noise, incomplete data, duplicate records, inconsistent data, and missing values. Missing data is a common drawback in many real-world data sets. In this paper, we propose an algorithm that improves the imputation approach of the MIGEC algorithm for dealing with missing values. We apply grey relational analysis (GRA) to attribute values instead of instance values. The missing data are initially imputed by mean imputation and then estimated by our proposed algorithm (PA); each estimated value is then used as a complete value when imputing the next missing value. We compare our proposed algorithm with several other algorithms, such as MMS, HDI, KNNMI, FCMOCS, CRI, CMI, NIIA, and MIGEC, under different missingness mechanisms. Experimental results demonstrate that the proposed algorithm has lower RMSE values than the other algorithms under all missingness mechanisms.
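
The building blocks named in the abstract (mean imputation, grey relational analysis, and RMSE) can be illustrated with a minimal Python sketch. This is not the authors' proposed algorithm, whose attribute-wise procedure is not detailed in the abstract. The function names, the use of `None` to mark a missing value, and the distinguishing coefficient `rho = 0.5` are assumptions made for illustration; the grey relational grade follows the standard textbook formulation.

```python
import math


def mean_impute(column):
    """Replace None entries with the mean of the observed entries.

    Assumes missing values are marked with None (an illustrative choice)
    and that the column has at least one observed value.
    """
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def grey_relational_grades(reference, candidates, rho=0.5):
    """Grey relational grade of each candidate sequence against a reference.

    Standard formulation: the coefficient at each position is
    (d_min + rho * d_max) / (d + rho * d_max), where d is the absolute
    difference at that position and d_min, d_max are taken over all
    candidates and positions. The grade is the mean coefficient.
    rho = 0.5 is a commonly used default, assumed here.
    """
    deltas = [[abs(r - c) for r, c in zip(reference, cand)]
              for cand in candidates]
    flat = [d for row in deltas for d in row]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0:
        # Every candidate is identical to the reference.
        return [1.0 for _ in candidates]
    grades = []
    for row in deltas:
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in row]
        grades.append(sum(coeffs) / len(coeffs))
    return grades


def rmse(true_values, imputed_values):
    """Root mean squared error between true and imputed values."""
    squared = [(t - p) ** 2 for t, p in zip(true_values, imputed_values)]
    return math.sqrt(sum(squared) / len(squared))
```

In a sketch like this, a higher grey relational grade marks a sequence that is more similar to the reference, and a lower RMSE between the imputed and the true values marks a more accurate imputation.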

Cite

Ali, N. A., & Omer, Z. M. (2017). Improving accuracy of missing data imputation in data mining. Kurdistan Journal of Applied Research, 2(3), 66–73. https://doi.org/10.24017/science.2017.3.30
