Single Imputation Using Statistics-Based and K Nearest Neighbor Methods for Numerical Datasets

Abstract

Missing values are often unavoidable in real datasets. Imputation is generally preferred over deleting every record that contains a missing value, because deletion shrinks the dataset and can lead to poor research results if too little data remains. Records with missing values can also lead to wrong conclusions. This study therefore tests several simple imputation methods: statistics-based imputation and k-nearest-neighbor imputation (kNNI). Errors measured with RMSE and MAPE show that kNNI gives much better imputation results than statistics-based imputation. Under the standard used for MAPE, almost all kNNI results are very good (error below 10%), the exceptions being three results on dataset 1 at k=10, k=15, and k=20. The statistics-based results are only good (error between 10% and 20%), and one of them exceeds 20%. Although kNNI outperforms statistics-based imputation, the value of k must be chosen carefully to obtain the best imputation results.
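The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dataset, the function names (`mean_impute`, `knn_impute`, `rmse`, `mape`), the choice of the column mean as the statistic, and the distance (Euclidean over the columns observed in both rows) are all assumptions made for illustration. The data is contrived so that neighboring rows are informative, and the numbers it produces say nothing about the paper's results.

```python
import math

def mean_impute(rows):
    """Statistics-based imputation: fill each None with its column mean."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if r[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(r)] for r in rows]

def knn_impute(rows, k):
    """kNN imputation: fill each None with the mean of that column over
    the k nearest rows that have the column observed."""
    def distance(a, b):
        # Assumed metric: Euclidean over columns observed in both rows.
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            donors = [(distance(row, other), other[j])
                      for m, other in enumerate(rows)
                      if m != i and other[j] is not None]
            donors.sort(key=lambda d: d[0])
            nearest = [v for _, v in donors[:k]]
            filled[i][j] = sum(nearest) / len(nearest)
    return filled

def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical complete dataset; two cells are hidden to simulate missing values.
complete = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0],
            [8.0, 80.0], [9.0, 90.0], [10.0, 100.0]]
hidden = [(1, 1), (4, 1)]
masked = [[None if (i, j) in hidden else v for j, v in enumerate(r)]
          for i, r in enumerate(complete)]

truth = [complete[i][j] for i, j in hidden]
mean_filled = mean_impute(masked)
knn_filled = knn_impute(masked, k=2)
mean_pred = [mean_filled[i][j] for i, j in hidden]
knn_pred = [knn_filled[i][j] for i, j in hidden]

print("mean imputation: RMSE", rmse(truth, mean_pred), "MAPE", mape(truth, mean_pred))
print("kNN imputation:  RMSE", rmse(truth, knn_pred), "MAPE", mape(truth, knn_pred))
```

Rerunning the sketch with a larger `k` pulls in more distant rows, which is one way to see why the choice of k matters.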

Cite

Fadlil, A., Herman, & Dikky Praseptian, M. (2023). Single Imputation Using Statistics-Based and K Nearest Neighbor Methods for Numerical Datasets. Ingenierie Des Systemes d’Information, 28(2), 451–459. https://doi.org/10.18280/isi.280221
